Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending for examination. Claims 1, 11, and 20 are independent.

Response to Arguments
Applicant's arguments filed 10/07/2021 have been fully considered but they are not persuasive. 
Applicant Argues regarding Claim 1: “The term "probabilities" is only mentioned once throughout the whole application, which is in Paragraph 25, and it simply refers to static probabilities as part of an action policy that includes a mapping between player states, game events, actions, and rewards. The static probabilities, which simply define the likelihood that a certain reward will occur if a certain action is taken, are actually hardcoded in Jaatinen, and thus cannot be determined based on the entity history for each entity. Therefore, Jaatinen never describes how the probabilities are determined, and thus cannot disclose "for each of the plurality of entities, determining, via the machine-learned intervention selection model based at least in part on the entity history for each entity, a respective probability.”
Examiner respectfully disagrees. Jaatinen describes a machine learning model (i.e. RNN) that generates a probability as its output (i.e. action policy). In para 0025, Jaatinen states that the action policy can “include rules, heuristics, and probabilities for matching a player state (including game events) with one or more engine actions in order to maximize a future reward… The machine learning system 122 uses RNN-2 125 in a reinforcement learning scenario in order for RNN-2 125 to create the policy and then continuously update the policy based on user actions over time.” Examiner reads the RNN as outputting a non-static probability that is continuously updated according to user actions over time (i.e. history of an entity).
Applicant Argues “Kumar simply describes a recommendation to maximize the business objective but does not describe using an intervention to actually improve the business objective. Given that Kumar does not have a plurality of available interventions that will improve an objective value, Kumar cannot determine a probability for each of the plurality of available interventions. Therefore, Kumar cannot disclose "a respective probability that each of a plurality of available interventions will improve an objective value that is determined based at least in part on a measure of continued use of the computer application by the entity."”
Examiner respectfully disagrees. Kumar describes an intervention to improve the business objective in para 0094-0096. Kumar states in para 0094: “The Supervised learning module 234 selects a particular supervised learning method to build the model based on the one or more business objectives. The business objectives for which the model(s) can be optimized may include …overall engagement, total time spent on an application or user interaction time, user acquisition or retention, etc.” Under broadest reasonable interpretation, Examiner reads the actions performed by the supervised learning module (i.e. machine learning model), such as optimization of model parameters (see para 0095) according to a business objective, as using an intervention 
	Applicant Argues “With regards to claim 3, the Office Action points to paragraphs 222 and 224 of Grosso for disclosing "prior to determining the respective probabilities, randomly providing one or more of the plurality of available interventions to the plurality of entities during an exploratory time period." Paragraph 222 of Grosso describes the classic multi-armed bandit problem of allocating resources among different slot machines when the probability of the slot machine payout is unknown. Paragraph 224 of Grosso describes assigning users randomly in multiple test segments. Grosso never describes a plurality of available interventions, so cannot randomly provide one or more of the plurality of available interventions to a plurality of entities during an exploratory time period. Randomly providing available interventions to different entities during an exploratory time period is very different than simply assigning users randomly to different test segments. Therefore, none of the cited references disclose "prior to determining the respective probabilities, randomly providing one or more of the plurality of available interventions to the plurality of entities during an exploratory time period," as claimed in claim 3.”
	Examiner respectfully disagrees. Under broadest reasonable interpretation, assigning users randomly can be read as randomly providing an intervention (i.e. the action of randomly assigning) to a plurality of entities (i.e. users) during an exploratory time period (i.e. learning period). According to the specification (Para 0006), an intervention can “include an action and/or operational change by the developer and/or application taken with respect to one or more users (e.g., with the goal of preventing user churn)”. Examiner reads the random assignment of users in order to maximize the reliability of the model's results as an action by the developer and/or application taken with the goal of preventing user churn (i.e. intervention).
Applicant Argues “With regards to claim 6, the Office Action again points to paragraphs 222 and 224 of Grosso for disclosing "in addition to the measure of continued use of the computer application by the entity, the objective value is further determined based at least in part on an allocation of resources by the entity within the computer application." As previously mentioned, Grosso does not disclose a plurality of available interventions, so cannot determine a probability for each of the plurality of available interventions that will improve the objective value, where the objective value is determined based on an allocation of resources by the entity within the computer application. Therefore, none of the cited references disclose "in addition to the measure of continued use of the computer application by the entity, the objective value is further determined based at least in part on an allocation of resources by the entity within the computer application," as claimed in claim 6.” 
Examiner respectfully disagrees. Claim 6 does not mention a plurality of available interventions and is not dependent on claim 3 (see the rejection of claim 1 for this limitation). Regarding claim 6, Examiner interprets Grosso (Para 0224) as disclosing the limitations: in addition to the measure of continued use of the computer application (i.e. game in use) by the entity (i.e. customer/user), the objective value (i.e. yield) is further determined based at least in part on an allocation of resources by the entity within the computer application (i.e. virtual currency used in a game).
Response Dated: October 7, 2021; Office Action Dated: July 7, 2021
Applicant Argues “With regards to claim 8, the Office Action points to paragraph 32 of Jaatinen for disclosing "the operations comprise identifying the plurality of available interventions from a plurality of defined interventions, the plurality of available interventions being a subset of the plurality of defined interventions that satisfy one or more developer-supplied intervention criteria at a time of selection." Jaatinen does not disclose identifying the plurality of available interventions from a plurality of defined interventions. Additionally, Jaatinen does not disclose that the plurality of available interventions are a subset of the plurality of defined interventions. In fact, Jaatinen never mentions having available interventions that is a subset of defined interventions. Therefore, none of the cited references disclose "the operations comprise identifying the plurality of available interventions from a plurality of defined interventions, the plurality of available interventions being a subset of the plurality of defined interventions that satisfy one or more developer-supplied intervention criteria at a time of selection," as described in claim 8.”
Examiner respectfully disagrees. Jaatinen discloses, in para 0032, reinforcement learning being applied to the RNN (i.e. machine learning model). At each decision-making point, an engine action (i.e. defined intervention) is evaluated to achieve some reward, in this case a maximum amount of monetary rewards (i.e. satisfying developer-supplied intervention criteria). The policy is optimized to determine the best set of engine actions (i.e. available interventions as a subset of defined interventions/each action) for a specific individual user at a specific time.
Applicant Argues “With regards to claim 19, the Office Action points to paragraph 98 of Kumar for disclosing "performing, by the one or more computing 
Examiner respectfully disagrees. Tuning an optimization model by changing its parameters allows the model to provide recommendations (i.e. on the computer application) with favorable predicted responses aimed at targeted users. The model parameters modify what may be transmitted to a user or how an interaction between the user and application will result (See para 0118), which, under broadest reasonable interpretation, modifies the operating parameters of the computer application.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
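A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
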
Claims 11-12, 15, and 17 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Jaatinen et al. (US 20190180319 A1, hereinafter Jaatinen).

Regarding claim 11, 
Jaatinen discloses a computer-implemented method, comprising: 
obtaining, by one or more computing devices, entity history data associated with an entity associated with a computer application (Jaatinen fig. 1 elements 102 and 130 & [0017] and [0027] recites “The user device 102 is a computing device capable of providing a gameplay experience (e.g., a game, including video games) to the user 101. [0027] At operation 203 of the method 200, the game event data, reward data and context data are recorded by the client module 120 in the database 156. In accordance with an embodiment, there is provided a logging system (not separately shown in the Figures) to record the game event data reward data, and context data. At operation 204 of the method 200, the LTV optimization server module 138 communicates with the database 156 to extract the game event data, reward data, and context data for the game player 101.” User device and/or server (i.e. computing device), data (i.e. entity history), user (i.e. entity), and game (i.e. computer application)); 
inputting, by the one or more computing devices, the entity history data into a machine-learned intervention selection model that is configured to process the entity history data to select one or more interventions from a plurality of available interventions (Jaatinen fig. 2 element 206 & [0028] and [0029] recite, in part, “Referring back to FIG. 2, and in accordance with an embodiment, at operation 206 the server module 138 provides a player state (e.g., player state data from the output of RNN-1 123) and reward data into a machine learning system 122 that includes the second neural network RNN-2 125. The second neural network RNN-2 125 uses the state data from RNN-1 123 and the reward data to determine an engine action policy. [0029] In accordance with an embodiment, during operation 208 of the method 200, the client module 120 implements the chosen engine action (e.g., places a specific advertisement within a game at a specific time and place) chosen by the server module 138 using the policy. In the embodiment, the client module 120 implements the decision made by the server module 138.” (Emphasis added.) Machine learning system RNN model 2 (i.e. machine learned intervention selection model) and action (i.e. intervention)); 
receiving, by the one or more computing devices, a selection of the one or more interventions by the machine-learned intervention selection model based at least in part on the entity history data (Jaatinen [0030] recites “In accordance with another embodiment, during operation 208 of the method 200, the client module 120 uses the engine action policy and state data (e.g., the first state) to choose one or more engine actions, and to subsequently implement (e.g., place) the chosen one or more engine actions in the game environment.” Chosen actions (i.e. selection of interventions)); and 
in response to the selection, performing, by the one or more computing devices, the one or more interventions for the entity (Jaatinen [0030] recites “In the embodiment, the client module 120 both chooses and implements the engine action.” Implement action (i.e. perform intervention)).  

Regarding claim 12, 
Jaatinen discloses the computer-implemented method of claim 11, wherein at least some of the plurality of available interventions are defined by a developer of the computer application (Jaatinen [0022] recites, in part, “The underlying details for a game event is created in the game code by a game developer and can be customized to include specific events. In accordance with an embodiment, in order to maximize the effectiveness of the systems and methods described herein, it is expected that a developer include a plurality of specific game events in a game.”).

Regarding claim 15, 
Jaatinen discloses the computer-implemented method of claim 11, wherein the machine-learned intervention selection model comprises an intervention agent that learns via reinforcement learning (Jaatinen [0025] recites, in part, “The policy includes information that describes a relationship (e.g., a mapping) between player states, game events, engine actions and rewards… The machine learning system 122 uses RNN-2 125 in a reinforcement learning scenario in order for RNN-2 125 to create the policy and then continuously update the policy based on user actions over time. RNN-2 125 learns a functional relationship between possible engine actions and cumulative future rewards.” Player (i.e. agent)).

Regarding claim 17,
Jaatinen discloses the computer-implemented method of claim 11, wherein the computer application comprises a mobile application, a gaming application, or a website (Jaatinen [0048] recites “The applications 420 include built-in applications 440 and/or third-party applications 442. Examples of representative built-in applications 440 may include, but are not limited to, a contacts application, a browser application, a book reader application, a location application, a media application, a messaging application, and/or a game application.”).  

Claim Rejections - 35 USC § 103
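In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.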
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-5, 7-10, 16 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jaatinen et al. (US 20190180319 A1, hereinafter Jaatinen) in view of Kumar et al. (US 20170061286 A1, hereinafter Kumar).


Regarding claim 1,
Jaatinen discloses a computing system (Jaatinen fig. 1 element 100 and fig. 5 element 500), comprising: 
one or more processors (Jaatinen fig. 5 element 510 & [0052] recites “Although FIG. 5 shows multiple processors, the machine 500 may include a single processor with a single core, a single processor with multiple cores (e.g., a multi-core processor), multiple processors with a single core, multiple processors with multiples cores, or any combination thereof.”); and 
one or more non-transitory computer-readable media that collectively store (Jaatinen fig. 5 element 538 & [0053]-[0054] recites, in part, “The storage unit 536 and memory 532, 534 store the instructions 516 embodying any one or more of the methodologies or functions described herein... Accordingly, the memory 532, 534, the storage unit 536, and the memory of processors 510 are examples of machine-readable media. [0054] The term “machine-readable medium” excludes signals per se.”): 
a machine-learned intervention selection model configured to select interventions on an entity-by-entity basis based at least in part on respective entity histories associated with entities (Jaatinen [0027]-[0028] recites “At operation 204 of the method 200, the LTV optimization server module 138 communicates with the database 156 to extract the game event data, reward data, and context data for the game player 101. As part of operation 204, the LTV optimization server module 138 provides the data to RNN-1 123 which creates representations from the data. The representations can include a time dependent representation of a player state, which includes a representation for context data, and a representation for engine actions. [0028] The engine action policy is used by the LTV optimization server module 138 as a guide to make the optimum decision at each moment (e.g., in real-time), taking into account past events (e.g., previous player states and rewards) and future impacts (e.g., predicted changes in the player state and predicted future rewards). The decision includes the choice of engine actions to employ given a current user state and engine action policy. Using the method 200 to follow a single player, the LTV optimization server module 138 uses RNN-2 125 within the machine learning system 122 to learn over time an optimum engine action policy on a per player basis, not on player segments or other groupings of game players.” (Emphasis added.) RNN within LTV optimization module (i.e. a machine-learned intervention selection model), actions (i.e. interventions), player data including past events (i.e. history), users (i.e. entities)); and 
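instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising (Jaatinen [0053] recites, in part, “The storage unit 536 and memory 532, 534 store the instructions 516 embodying any one or more of the methodologies or functions described herein...”):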
obtaining an entity history of each of a plurality of entities that use a computer application (Jaatinen [0027] recites “At operation 203 of the method 200, the game event data, reward data and context data are recorded by the client module 120 in the database 156.” Recording data associated to game on client device (i.e. obtaining entity history)); 
for each of the plurality of entities, determining, via the machine-learned intervention selection model based at least in part on the entity history for each entity, a probability (Jaatinen [0025] and [0028] recite “In accordance with an embodiment, RNN-2 125 creates an engine action policy (e.g., or just ‘policy’) for the LTV optimization server module 138 (or alternately for the LTV optimization client module 120). The policy includes information that describes a relationship (e.g., a mapping) between player states, game events, engine actions and rewards. The policy can include rules, heuristics, and probabilities for matching a player state (including game events) with one or more engine actions in order to maximize a future reward. [0028] Using the method 200 to follow a single player, the LTV optimization server module 138 uses RNN-2 125 within the machine learning system 122 to learn over time an optimum engine action policy on a per player basis, not on player segments or other groupings of game players.” (Emphasis added.)); and 
providing one or more interventions of the plurality of available interventions to one or more entities of the plurality of entities based at least in part on the respective probabilities determined via the machine-learned intervention selection model (Jaatinen [0028]-[0029] recites “At operation 208, the LTV optimization server module 138 uses the engine action policy and current state data (e.g., first state data) to choose one or more engine actions to be implemented in the game environment. The engine action policy is used by the LTV optimization server module 138 as a guide to make the optimum decision at each moment (e.g., in real-time), taking into account past events (e.g., previous player states and rewards) and future impacts (e.g., predicted changes in the player state and predicted future rewards). [0029] In accordance with an embodiment, during operation 208 of the method 200, the client module 120 implements the chosen engine action (e.g., places a specific advertisement within a game at a specific time and place) chosen by the server module 138 using the policy. In the embodiment, the client module 120 implements the decision made by the server module 138.” (Emphasis added.) Client implements decision made by server (i.e. providing intervention)).  
	Although Jaatinen discloses for each of the plurality of entities, determining, via the machine-learned intervention selection model based at least in part on the entity history for each entity, a probability, Jaatinen does not explicitly disclose a respective probability that each of a plurality of available interventions will improve an objective value that is determined based at least in part on a measure of continued use of the computer application by the entity.
	Kumar teaches a respective probability that each of a plurality of available interventions will improve an objective value that is determined based at least in part on a measure of continued use of the computer application by the entity (Kumar [0094]-[0096] recites, in part, “The business objectives for which the model(s) can be optimized may include a dollar value (revenue, profit, etc.),… total time spent on an application…, etc. [0095] Taking overall profit as an example of a business objective to be optimized in a model, the supervised learning module 234a may tune parameters of the model so that products with higher margins or profits may be recommended over those with a higher likelihood of purchase, but a lower margin or profit... In some implementations, the proxy value can based on a user response... For example, the proxy value can be based on an amount of time the user plays a video on a video service, …, or other such user responses that can be optimized for achieving the business objective. [0096] In some implementations, the model tuned by supervised learning module 234a may recommend A (even though A may have an ever so slightly lower probability of purchase) because the proxy value (e.g., margin X probability of purchase) of A is higher (or is an optimized value) compared to the proxy value of B.” (Emphasis added.) Business objective (i.e. objective value), total time spent on an application (i.e. measure of continued use of computer application)).
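Kumar and Jaatinen are both directed to machine learning. In view of the teachings of Kumar, it would have been obvious to one of ordinary skill in the art to apply the teachings of Kumar to Jaatinen before the effective filing date of the claimed invention in order to optimize business objectives like user retention by incorporating features of usage behavior and demographics with supervised learning thereby improving Jaatinen (cf. Kumar [0009]-[0010] recites, in part, “For instance, the features further include the similar attribute as including one from a group of usage behavior and demographics. For instance, the features further include the business objective as including one from a group of profit, revenue, user retention, number of user interactions, user interaction time, and user interaction type… [0010] The present disclosure is particularly advantageous because it formulates the generation of recommendation as supervised learning. In particular, such formulation allows business goals (e.g., profit) and business rules (e.g., arbitrary business requirement to honor contractual or vested interest) to be directly optimizable by being integrated in a supervised learning model. Another advantage of the approach is its natural ability to incorporate data or features from multiple data sources—items, users, user devices, and such.”).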

Regarding claim 2, 
The Jaatinen/Kumar Combination teaches the computing system of claim 1, wherein the computer application comprises at least one of: a mobile application, a web browser application, or a game application (Jaatinen [0048] recites “The applications 420 include built-in applications 440 and/or third-party applications 442. Examples of representative built-in applications 440 may include, but are not limited to, a contacts application, a browser application, a book reader application, a location application, a media application, a messaging application, and/or a game application.”).  
Please see motivation for claim 1 above.

Regarding claim 4,
The Jaatinen/Kumar Combination teaches the computing system of claim 1, wherein the machine-learned intervention selection model is trained using supervised learning techniques (Kumar fig. 1 elements 1106 and 1108 & [0131] recites, in part, “At block 1106, the supervised learning module 234 selects a supervised learning method. At block 1108, the supervised learning module 234 builds a model based on the supervised learning method and a first dataset restricted to the subset of features and the subset of rows in the master dataset.”). 
Please see motivation for claim 1 above.

Regarding claim 5,
The Jaatinen/Kumar Combination teaches the computing system of claim 1, wherein the machine-learned intervention selection model comprises an intervention agent in a reinforcement learning scheme (Jaatinen [0025] recites, in part, “The policy includes information that describes a relationship (e.g., a mapping) between player states, game events, engine actions and rewards… The machine learning system 122 uses RNN-2 125 in a reinforcement learning scenario in order for RNN-2 125 to create the policy and then continuously update the policy based on user actions over time. RNN-2 125 learns a functional relationship between possible engine actions and cumulative future rewards.” Player (i.e. agent)).
Please see motivation for claim 1 above.
  
Regarding claim 7, 
The Jaatinen/Kumar Combination teaches the computing system of claim 1, wherein at least some of the plurality of available interventions are specified by a developer of the computer application (Jaatinen [0022] recites, in part, “The monetization events include completing an action prompted by an advertisement. The underlying details for a game event is created in the game code by a game developer and can be customized to include specific events. In accordance with an embodiment, in order to maximize the effectiveness of the systems and methods described herein, it is expected that a developer include a plurality of specific game events in a game.” Game developer (i.e. developer), game events to prompt action created in game code (i.e. specified by developer)).  
Please see motivation for claim 1 above.

Regarding claim 8, 
The Jaatinen/Kumar Combination teaches the computing system of claim 1, wherein the operations comprise identifying the plurality of available interventions from a plurality of defined interventions, the plurality of available interventions being a subset of the plurality of defined interventions that satisfy one or more developer-supplied intervention criteria at a time of selection (Jaatinen [0032] recites, in part, “In accordance with an embodiment the LTV optimization server module 138 uses a form of reinforcement learning (e.g. using RNN-2 125) at each decision-making point, to learn a policy connecting a player state, each engine action, future predicted rewards and predicted future player states. The LTV optimization server module 138 creates (e.g., via the recursion of reinforcement learning of RNN-2 125) an evolving engine action policy for a player so that over time a game (e.g., the developer of the game) receives the maximum amount of monetary rewards from the player. Furthermore, because of the use of the player state (e.g., with player state history) and RNN-1 123, which uses memory of past player states (e.g. using LSTM), the policy is optimized on an individual player level and in an ongoing and dynamic way (e.g., the policy determines the best set of engine actions with a specific individual user, at a specific time in a specific game context).” Engine action policy for a player (i.e. plurality of defined interventions) and engine actions (i.e. subset of defined interventions, the available interventions), developer of the game providing monetizing events (i.e. developer supplied intervention)).  
Please see motivation for claim 1 above.

Regarding claim 9, 
The Jaatinen/Kumar Combination teaches the computing system of claim 1, wherein the machine-learned intervention selection model is located within a server computing device that serves the computer application (Jaatinen fig. 1 element 130 & [0020] recites “The server 130 also includes a memory 132 for storing a LTV optimization server module (“server module”) 138 that provides various LTV optimization functionality as described herein…” Server (i.e. server computing device)).  
Please see motivation for claim 1 above.

Regarding claim 10, 
The Jaatinen/Kumar Combination teaches the computing system of claim 1, wherein the machine-learned intervention selection model is located within the computer application on a user computing device (Jaatinen fig. 1 element 102 & [0019] and [0020] recite, in part, “The user device 102 also includes a memory 104 configured to store a game engine 106 (e.g., executed by the CPU 108 or GPU 110) that communicates with the display device 124 and also with other hardware such as the input device(s) 114 to present a game to the user 101. The game engine 106 includes a LTV optimization client module (“client module”) 120 that provides various LTV optimization functionality as described herein. [0020] During operation, the LTV optimization client module 120 and the LTV optimization server module 138 perform the various LTV optimization functionalities described herein.” User device (i.e. user computing device)).  
Please see motivation for claim 1 above.

Regarding claim 16, 
Jaatinen discloses the computer-implemented method of claim 11 and the machine-learned intervention selection model (Jaatinen fig. 1 elements 138, 122, 123, 125).  
Although Jaatinen discloses the computer-implemented method of claim 11 and the machine-learned intervention selection model, Jaatinen does not disclose wherein the model has been trained on a set of training data via supervised learning.
Kumar teaches wherein the model has been trained on a set of training data via supervised learning (Kumar fig. 1 elements 1106 and 1108 & [0131] recites, in part, “At block 1106, the supervised learning module 234 selects a supervised learning method. At block 1108, the supervised learning module 234 builds a model based on the supervised learning method and a first dataset restricted to the subset of features and the subset of rows in the master dataset.”).
Kumar and Jaatinen are both directed to machine learning. In view of the teachings of Kumar, it would have been obvious to one of ordinary skill in the art to apply the teachings of Kumar to Jaatinen before the effective filing date of the claimed invention in order to optimize business objectives like user retention by incorporating features of usage behavior and demographics with supervised learning thereby improving Jaatinen (cf. Kumar [0009]-[0010] recites, in part, “For instance, the features further include the similar attribute as including one from a group of usage behavior and demographics. For instance, the features further include the business objective as including one from a group of profit, revenue, user retention, number of user interactions, user interaction time, and user interaction type… [0010] The present disclosure is particularly advantageous because it formulates the generation of recommendation as supervised learning. In particular, such formulation allows business goals (e.g., profit) and business rules (e.g., arbitrary business requirement to honor contractual or vested interest) to be directly optimizable by being integrated in a supervised learning model. Another advantage of the approach is its natural ability to incorporate data or features from multiple data sources—items, users, user devices, and such.”).

Regarding claim 19, 
Jaatinen discloses the computer-implemented method of claim 11, performing, by the one or more computing devices, the one or more interventions and the computer application (Jaatinen [0012] and [0030] recite “A method for optimizing LTV related to …” Implement action (i.e. perform intervention)).
Although Jaatinen discloses the computer-implemented method of claim 11, performing, by the one or more computing devices, the one or more interventions and the computer application, Jaatinen does not disclose modifying, by the one or more computing devices, one or more operating parameters of the computer application.
Kumar teaches modifying, by the one or more computing devices, one or more operating parameters of the computer application (Kumar [0098] recites, in part, “In some implementations, the supervised learning module 234a tunes a model of the chosen type by optimizing its parameters to maximize a desired aspect of performance. For example, if the supervised learning model is predicting a numerical measure of user-item interaction such as the duration of video watching by user, or the user rating of items, the L2 score (i.e. the Euclidian distance between the observed and predicted values of the interaction measure), L1 score (i.e. Manhattan distance), or other scores that quantify the discrepancy between numerical predictions and observed values can be used as a performance measure.” Tuning the model of the chosen type by optimizing its parameters (i.e. modifying operating parameters)).
Kumar and Jaatinen are both directed to machine learning. In view of the teachings of Kumar, it would have been obvious to one of ordinary skill in the art to apply the teachings of Kumar to Jaatinen before the effective filing date of the claimed invention in order to optimize business objectives like user retention by incorporating features of usage behavior and demographics with supervised learning thereby improving Jaatinen (cf. Kumar [0009]-[0010] recites, in part, “For instance, the features further include the similar attribute as including one from a group of usage behavior and demographics. For instance, the features further include the business objective as including one from a group of profit, revenue, user retention, number of user interactions, user interaction time, and user interaction type… [0010] The present disclosure is particularly advantageous because it formulates the generation of recommendation as supervised learning. In particular, such formulation allows business goals (e.g., profit) and business rules (e.g., arbitrary business requirement to honor contractual or vested interest) to be directly optimizable by being integrated in a supervised learning model.”).

Regarding claim 20, 
Jaatinen discloses one or more non-transitory computer-readable media that store instructions that, when executed by one or more computing devices, cause the one or more computing devices to perform operations, the operations comprising (Jaatinen fig. 5 elements 538, 516 and [0053] recites “The storage unit 536 and memory 532, 534 store the instructions 516 embodying any one or more of the methodologies or functions described herein. The instructions 516 may also reside, completely or partially, within the memory 532, 534, within the storage unit 536, within at least one of the processors 510 (e.g., within the processor's cache memory), or any suitable combination thereof, during execution thereof by the machine 500.”): 
obtaining entity history data associated with an entity associated with a computer application (Jaatinen [0027] recites “At operation 203 of the method 200, the game event data, reward data and context data are recorded by the client module 120 in the database 156.”); 
inputting the entity history data into a machine-learned intervention selection model that is configured to process the entity history data to select one or more interventions from a plurality of available interventions (Jaatinen fig. 2 element 206 & [0028]-[0029] recite, in part, “Referring back to FIG. 2, … at operation 206 the server module 138 provides a player state ... and reward data into a machine learning system 122 that includes the second neural network RNN-2 125. The second neural network RNN-2 125 uses the state data from RNN-1 123 and the reward data to determine an engine action policy. [0029] In accordance with an embodiment, during operation 208 of the method 200, the client module 120 implements the chosen engine action (e.g., places a specific advertisement within a game at a specific time and place) chosen by the server module 138 using the policy. In the embodiment, the client module 120 implements the decision made by the server module 138.” (Emphasis added.) Machine learning system RNN model 2 (i.e. machine learned intervention selection model), action (i.e. intervention), and policy (i.e. available interventions)), 
wherein at least some of the plurality of available interventions are defined by a developer of the computer application (Jaatinen [0022] recites, in part, “The underlying details for a game event is created in the game code by a game developer and can be customized to include specific events. In accordance with an embodiment, in order to maximize the effectiveness of the systems and methods described herein, it is expected that a developer include a plurality of specific game events in a game.”); 
receiving a selection of the one or more interventions by the machine-learned intervention selection model based at least in part on the entity history data (Jaatinen [0030] and [0033] recite “In accordance with another embodiment, during operation 208 of the method 200, the client module 120 uses the engine action policy and state data (e.g., the first state) to choose one or more engine actions, and to subsequently implement (e.g., place) the chosen one or more engine actions in the game environment. [0033] The decisions include selecting one or more advertising placements and/or IAP placements from the promotion system 140 (e.g., selecting specific ad/AIP data 310) to include in the one or more engine actions 312 that are sent to the client module 120 to be exposed to the user in the game environment 302.” Chosen actions (i.e. selection of interventions), actions sent to client module (i.e. receiving interventions)); and 
in response to the selection, performing the one or more interventions for the entity within the computing application (Jaatinen [0030] recites “In the embodiment, the …” Implement action (i.e. perform intervention)).
Although Jaatinen discloses wherein the machine-learned intervention selection model is configured to make the selection of the one or more interventions, Jaatinen does not disclose wherein the model is configured to make the selection to optimize an objective function, wherein the objective function measures entity engagement with the computer application.
Kumar teaches wherein the model is configured to make the selection to optimize an objective function, wherein the objective function measures entity engagement with the computer application (Kumar [0094]-[0096] recites, in part, “The business objectives for which the model(s) can be optimized may include a dollar value (revenue, profit, etc.),… overall engagement, total time spent on an application…, etc.” (Emphasis added.) Business objectives for which the model can be optimized (i.e. objective function) and overall engagement (i.e. measures entity engagement)).
Kumar and Jaatinen are both directed to machine learning. In view of the teachings of Kumar, it would have been obvious to one of ordinary skill in the art to apply the teachings of Kumar to Jaatinen before the effective filing date of the claimed invention in order to optimize business objectives like user retention by incorporating features of usage behavior and demographics with supervised learning thereby improving Jaatinen (cf. Kumar [0009]-[0010] recites, in part, “For instance, the features further include the similar attribute as including one from a group of usage behavior and demographics. For instance, the features further include the business objective as including one from a group of profit, revenue, user retention, number of user interactions, user interaction time, and user interaction type… [0010] The present disclosure is particularly advantageous because it formulates the generation of recommendation as supervised learning. In particular, such formulation allows business goals (e.g., profit) and business rules (e.g., arbitrary business requirement to honor contractual or vested interest) to be directly optimizable by being integrated in a supervised learning model.”).

Claims 3 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Jaatinen in view of Kumar and in further view of Grosso (US 20180361253 A1).

Regarding claim 3, 
The Jaatinen/Kumar Combination teaches the computing system of claim 1.  
However, The Jaatinen/Kumar Combination does not teach wherein the operations further comprise, prior to determining the respective probabilities, randomly providing one or more of the plurality of available interventions to the plurality of entities during an exploratory time period.
Grosso teaches wherein the operations further comprise, prior to determining the respective probabilities, randomly providing one or more of the plurality of available interventions to the plurality of entities during an exploratory time period (Grosso [0222] and [0224] recite, in part, “The multi-armed bandit problem is derived from the context in which a gambler confronts several slot machines, and must decide how to allocate resources among them when the probability of a payout from each of the machines is unknown. The classic problem in such a context is that a player must tradeoff resources against two goals—exploration (learning about the payoff behavior of each machine) and exploitation (using the knowledge already gained to maximize results based on that limited knowledge). Multi-armed bandit techniques may be useful to a seller of goods or service when the characteristics and preferences and behavior of customers are hidden, or when some aspects of these are known, but their implications for the effect of changes in the seller's offerings are unknown. [0224] As previously discussed, such data may include information about the device used, how often the user plays the game, etc. In step 1504, a segmenting schema is generated. Segmentation can be based on a variety of schemas. For some optimization processes it will make sense to create a control segment and multiple test segments, and to assign users randomly.” (Emphasis added.) Randomly assigning users during exploration (i.e. randomly providing to entities during an exploratory time period)).
Grosso and The Jaatinen/Kumar Combination are both directed to recommendations in a game or service. In view of the teachings of Grosso, it would have been obvious to one of ordinary skill in the art to apply the teachings of Grosso to The Jaatinen/Kumar Combination before the effective filing date of the claimed invention in order to apply more insightful analytics to the gaming application thereby improving The Jaatinen/Kumar Combination (cf. Grosso [0011]-[0012] recites, in part, “Games and other digital entertainment applications are fundamentally different, and predictive analysis is still crude particularly with respect to lifecycle analysis and churn prediction. Core ideas can be translated to the gaming industry, such as utilizing an increased role for A/B testing to compensate for a current lack of theoretical models or utilizing machine learning (which has progressed a lot in recent years). Utilizing existing “general purpose” infrastructure such as cloud-based technologies can reduce technological complexity. [0012] Digital entertainment applications are generally not exploring this, instead focusing on dealing with technological shifts (e.g., the move to mobile devices), platform shifts (frequent game console and platform releases/updates), behavioral shifts (players tending away from or toward certain genres or gameplay elements), the emergence of virtual reality, smart televisions, and the like, and basic product or sales management concerns. Analytics are often an afterthought at large game companies, or are out of reach of smaller studios or independent developers.”).

Regarding claim 6, 
The Jaatinen/Kumar Combination teaches the computing system of claim 1.
However, The Jaatinen/Kumar Combination does not teach wherein, in addition to the measure of continued use of the computer application by the entity, the objective value is further determined based at least in part on an allocation of resources by the entity within the computer application.
Grosso teaches wherein, in addition to the measure of continued use of the computer application by the entity, the objective value is further determined based at least in part on an allocation of resources by the entity within the computer application (Grosso [0222] and [0224] recite, in part, “The multi-armed bandit problem is derived from the context in which a gambler confronts several slot machines, and must decide how to allocate resources among them when the probability of a payout from each of the machines is unknown. The classic problem in such a context is that a player must tradeoff resources against two goals—exploration (learning about the payoff behavior of each machine) and exploitation (using the knowledge already gained to maximize results based on that limited knowledge). Multi-armed bandit techniques may be useful to a seller of goods or service when the characteristics and preferences and behavior of customers are hidden, or when some aspects of these are known, but their implications for the effect of changes in the seller's offerings are unknown. [0224] In step 1502 data is collected regarding the users of the game and their experiences in playing the game. As previously discussed, such data may include information about the device used, how often the user plays the game, etc.”).  
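Grosso and The Jaatinen/Kumar Combination are both directed to recommendations in a game or service. In view of the teachings of Grosso, it would have been obvious to one of ordinary skill in the art to apply the teachings of Grosso to The Jaatinen/Kumar Combination before the effective filing date of the claimed invention in order to apply more insightful analytics to the gaming application thereby improving The Jaatinen/Kumar Combination (cf. Grosso [0011]-[0012] recites, in part, “Games and other digital entertainment applications are fundamentally different, and predictive analysis is still crude particularly with respect to lifecycle analysis and churn prediction. Core ideas can be translated to the gaming industry, such as utilizing an increased role for A/B testing to compensate for a current lack of theoretical models or utilizing machine learning (which has progressed a lot in recent years). Utilizing existing “general purpose” infrastructure such as cloud-based technologies can reduce technological complexity. [0012] Digital entertainment applications are generally not exploring this, instead focusing on dealing with technological shifts (e.g., the move to mobile devices), platform shifts (frequent game console and platform releases/updates), behavioral shifts (players tending away from or toward certain genres or gameplay elements), the emergence of virtual reality, smart televisions, and the like, and basic product or sales management concerns. Analytics are often an afterthought at large game companies, or are out of reach of smaller studios or independent developers.”).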


Claims 13-14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Jaatinen in view of Grosso.

Regarding claim 13, 
Jaatinen discloses the computer-implemented method of claim 11, wherein the machine-learned intervention selection model is configured to make the selection of the one or more interventions to optimize an objective function (Jaatinen [0015] recites “The methods and systems described herein maximize LTV by optimizing the use of monetization placements (e.g., in time and space) in a game; and also maximizing the total number of monetization placements seen by a player by maximizing the time a player is engaged with a game (e.g., maximizing total play time).”).
However, Jaatinen does not disclose wherein the objective function measures entity churn out of the computer application.
Grosso teaches wherein the objective function measures entity churn out of the computer application (Grosso [0124] recites “Contextualized metrics may then be provided to reporting server 524 for use in reporting operations, such as to provide a user with detailed contextualized information on segments (for example, “this player has played x minutes per day” or “players who spend y per hour of gameplay are less likely to churn”, or other such contextualized metric-based information).”).
Grosso and Jaatinen are both directed to improving performance of a game or service. In view of the teachings of Grosso, it would have been obvious to one of ordinary skill in the art to apply the teachings of Grosso to Jaatinen before the effective filing date of the claimed invention in order to apply more insightful analytics to the gaming application thereby improving Jaatinen (cf. Grosso [0011]-[0012] recites, in part, “Games and other digital entertainment applications are fundamentally different, and predictive analysis is still crude particularly with respect to lifecycle analysis and churn prediction. Core ideas can be translated to the gaming industry, such as utilizing an increased role for A/B testing to compensate for a current lack of theoretical models or utilizing machine learning (which has progressed a lot in recent years). Utilizing existing “general purpose” infrastructure such as cloud-based technologies can reduce technological complexity. [0012] Digital entertainment applications are generally not exploring this, instead focusing on dealing with technological shifts (e.g., the move to mobile devices), platform shifts (frequent game console and platform releases/updates), behavioral shifts (players tending away from or toward certain genres or gameplay elements), the emergence of virtual reality, smart televisions, and the like, and basic product or sales management concerns. Analytics are often an afterthought at large game companies, or are out of reach of smaller studios or independent developers.”).

Regarding claim 14, 
Jaatinen in view of Grosso teaches the computer-implemented method of claim 13, wherein the machine-learned intervention selection model is configured to determine a plurality of respective probabilities with which the plurality of available interventions will improve an objective value provided by the objective function, wherein the selection of the one or more interventions is based at least in part on the plurality of respective probabilities (Jaatinen [0025] recites “In accordance with an embodiment, RNN-2 125 creates an engine action policy (e.g., or just ‘policy’) for the LTV optimization server module 138 (or alternately for the LTV optimization client module 120). The policy includes information that describes a relationship (e.g., a mapping) between player states, game events, engine actions and rewards. The policy can include rules, heuristics, and probabilities for matching a player state (including game events) with one or more engine actions in order to maximize a future reward. The engine action policy is an output of RNN-2 125, and is used as a guide by the LTV optimization server module 138 (or alternately the LTV optimization client module 120) to decide on one or more engine actions to incorporate into a game given a particular user state and a context (e.g., a specific game environment with specific game events).” Policy including probabilities (i.e. respective probabilities), maximize future reward (i.e. improve objective value), and policy used as guide by LTV optimization module (i.e. based on respective probabilities)).
Please see motivation for claim 13 above.

Regarding claim 18, 
Jaatinen discloses the computer-implemented method of claim 11.
However, Jaatinen does not disclose further comprising: performing, by the one or more computing devices, an exploration phase in which, for one or more other entities, one of the plurality of available interventions is selected randomly.  
Grosso teaches further comprising: performing, by the one or more computing devices, an exploration phase in which, for one or more other entities, one of the plurality of available interventions is selected randomly (Grosso [0222] and [0224] recite, in part, “The multi-armed bandit problem is derived from the context in which a gambler confronts several slot machines, and must decide how to allocate resources among them when the probability of a payout from each of the machines is unknown. The classic problem in such a context is that a player must tradeoff resources against two goals—exploration (learning about the payoff behavior of each machine) and exploitation (using the knowledge already gained to maximize results based on that limited knowledge). Multi-armed bandit techniques may be useful to a seller of goods or service when the characteristics and preferences and behavior of customers are hidden, or when some aspects of these are known, but their implications for the effect of changes in the seller's offerings are unknown. [0224] As previously discussed, such data may include information about the device used, how often the user plays the game, etc. In step 1504, a segmenting schema is generated. Segmentation can be based on a variety of schemas. For some optimization processes it will make sense to create a control segment and multiple test segments, and to assign users randomly.” (Emphasis added.) Assigning users randomly (i.e. selected randomly)).
Grosso and Jaatinen are both directed to improving performance of a game or service. In view of the teachings of Grosso, it would have been obvious to one of ordinary skill in the art to apply the teachings of Grosso to Jaatinen before the effective filing date of the claimed invention in order to apply more insightful analytics to the gaming application thereby improving Jaatinen (cf. Grosso [0011]-[0012] recites, in part, “Games and other digital entertainment applications are fundamentally different, and predictive analysis is still crude particularly with respect to lifecycle analysis and churn prediction. Core ideas can be translated to the gaming industry, such as utilizing an increased role for A/B testing to compensate for a current lack of theoretical models or utilizing machine learning (which has progressed a lot in recent years). Utilizing existing “general purpose” infrastructure such as cloud-based technologies can reduce technological complexity. [0012] Digital entertainment applications are generally not exploring this, instead focusing on dealing with technological shifts (e.g., the move to mobile devices), platform shifts (frequent game console and platform releases/updates), behavioral shifts (players tending away from or toward certain genres or gameplay elements), the emergence of virtual reality, smart televisions, and the like, and basic product or sales management concerns. Analytics are often an afterthought at large game companies, or are out of reach of smaller studios or independent developers.”).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Angelopoulos et al. (US 20190189025) similarly describes providing interventions to users based on machine learning models and historical data (See Abstract).
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TEWODROS E MENGISTU whose telephone number is (571)270-7714. The examiner can normally be reached Mon-Fri 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ABDULLAH KAWSAR can be reached on (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TEWODROS E MENGISTU/Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127