DETAILED ACTION
This final rejection is responsive to amendments and remarks filed 05 October 2021. 
Claims 2-6, 8-9, and 17 are amended. Claim 12 is cancelled. Claim 21 is added. No claims have been withdrawn. Therefore, claims 1-11 and 13-21 are presently pending.

Response to Arguments
Applicant's arguments with respect to the claim rejections under 35 U.S.C. § 101 have been fully considered and are persuasive. Accordingly, the rejection of claims 1-2 under 35 U.S.C. § 101 has been withdrawn.
Applicant's arguments with respect to the claim rejections under 35 U.S.C. § 103 have been fully considered but they are not persuasive. 
Regarding the rejection of claim 1, the Applicant argues that the Office Action fails to establish that Dotan-Cohen teaches the step-plus-function limitation of “performing a step for determining recommendations…” (Remarks, p. 12).
While the Examiner agrees with the Applicant that invocation of 35 U.S.C. § 112(f) would give the claim limitation its broadest reasonable interpretation in light of the specification (and all included embodiments), the Examiner also agrees with the previous Examiner that paragraph [0149] and element 1300 of FIG. 13 provide an exemplary embodiment of the claimed limitation “performing a step for determining recommendations….” Therefore, the Examiner believes that the disclosure from the combination of Dotan-Cohen and De Nijs reads on the step-plus-function limitation.
Regarding the newly amended limitations of claim 6, the Applicant argues that Dotan-Cohen “does not suggest determining population dynamics corresponding to user[s] entering and 
The Examiner respectfully disagrees. The updated rejection below cites Dotan-Cohen to disclose that a data collection component is responsible for acquiring, accessing, or receiving data related to population dynamics of a location. This collection of data is then used by other components in the system, which utilize resources such as network bandwidth, processing power, and energy (Dotan-Cohen, PP[0027, 0032, 0046, 0146]).
Regarding the rejection of claim 17, the Applicant argues that Dotan-Cohen fails to teach the newly amended limitations and that secondary reference Hariri fails to remedy the deficiencies of the combination of Dotan-Cohen and De Nijs (Remarks, pp. 15-16). 
The rejection below has been updated to reflect the amendments to the claims.
 
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “performing a step for determining recommendations” in claim 1.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The Applicant is kindly directed to the previous Office Action for any mathematical expressions that are not formatted correctly in the rejection below.

Claims 1-2, 6-9, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Dotan-Cohen et al. (US 20170032248 A1, hereinafter Dotan-Cohen) in view of De Nijs et al. (“Bounding the Probability of Resource Constraint Violations in Multi-Agent MDPs,” hereinafter De Nijs). 
Regarding claim 1, Dotan-Cohen discloses in a digital medium environment in which availability of limited resources is tracked, a computer-implemented method for generating real time point-of-interest recommendations for users, comprising (Dotan-Cohen [0024] recites “In particular, such applications, services, or routines may operate on one or more user devices (such as user device 102a), servers (such as server 106), may be distributed across one or more user devices and servers, or be implemented in the cloud.”): 
analyzing resource data associated with a plurality of points of interest (Dotan-Cohen [0027]-[0028] recite, in part, “Data collection component 215 is generally responsible for acquiring, accessing, or receiving (and in some cases also identifying) user data, venue data, and interpretive data from one or more data sources, such as data sources 104a through 104n of FIG. 1. [0028] Venue data corresponds to data collected in association with one or more venues. A “venue” may refer to a physical location that people can conduct certain activities at in person.”),
generating a sequential history of user actions for each user of a plurality of users based on observing state transitions of each user of the plurality of users (Dotan-Cohen [0030] recites “A routine characteristic can be a characteristic that is determined by the system as being part of a routine model that is detected and tracked by the system (e.g., venue visits, visitation patterns, activity patterns, and/or behavior patterns, or routines). A sporadic characteristic can be a characteristic that is determined by the system, but not as being part of a known routine practiced by a location or user that is detected and tracked by the system (e.g., an event that is not part of a known practiced routine vs. an event that may or may not be part of a known practiced routine).”); and 
performing a step for determining recommendations for the plurality of users by taking into account user types, the sequential history of user actions for each user of the plurality of users, and the current state of each user (Dotan-Cohen [0151] recites “In some embodiments, recommended actions may correspond to one or more conditions, which may be assessed based on sensor(s) on a user device associated with the user, via user history, patterns or routines (e.g. the user drives to work every day between 8:00 and 8:30 AM), other user information (such as online activity of a user, user communication information including missed communications, urgency or staleness of the content (e.g. the content should be presented to the user in the morning, but is no longer relevant after 10 AM), the particular user routine that is diverged from, and/or contextual information corresponding to the out of routine event.”). 
However, Dotan-Cohen does not disclose to determine a plurality of resource constraints; wherein the plurality of resource constraints provide limitations on a capacity of each resource; generating a plurality of expected resource consumptions that provide expected uses of each resource subject to the plurality of resource constraints; determining a current state for each user of the plurality of users; and performing a step for determining recommendations for the plurality of users by taking into account user types, the plurality of resource constraints, the plurality of expected resource consumptions, the sequential history of user actions for each user of the plurality of users, and the current state of each user. 
De Nijs teaches to determine a plurality of resource constraints (De Nijs Pg. 3562, Introduction Section recites, in part, “When policies of agents are allowed to satisfy resource constraints in expectation (such as with accidental overruns of budget, or infrastructural constraints), this weakness of deterministic resource allocations can be overcome resulting in significantly higher expected value.” Satisfy resource constraints in expectation to overcome weakness of deterministic resource allocations (i.e. determine resource constraints)); 
wherein the plurality of resource constraints provide limitations on a capacity of each resource (De Nijs Pg. 3563, MMDPs with Global Resource Constraints Section recites, in part, “Global resource constraints force the agents to coordinate their decisions, which means that the policies used by the agents should maximize the total expected value while staying below global resource limits.” Global resource limits (i.e. limitations on a capacity of resources)); 
generating a plurality of expected resource consumptions that provide expected uses of each resource subject to the plurality of resource constraints (De Nijs recites, in part, “The agents have access to k resource types. For each agent i the consumption of resource type j is defined using a function C_ij: S_i × A_i -> [0, C_max,ij], where C_max,ij denotes the maximum instantaneous consumption of resource type j for agent i. For instantaneous constraints the resource limit at time t is defined by L_jt, where the usage at time t does not affect the limit at t' > t. Budget constraints are defined by a single limit L_j.” Consumption of resource type j for each agent i defined by using function and access to k resource types (i.e. generate expected resource consumptions that provide expected uses of each resource and associated with resource constraints));
determining a current state for each user of the plurality of users (De Nijs Pg. 3563, MMDPs with Global Resource Constraints Section, Para. 2 recites, in part, “The tuple (S_i, A_i, T_i, R_i, h, s_i,1) defines the MDP M_i for agent i. Each agent has its own sets of states S_i and actions A_i. The transition function T_i: S_i × A_i × S_i -> [0, 1] gives the probability of advancing to state s' ∈ S_i from state s ∈ S_i by choosing action a ∈ A_i, thus T_i(s, a, s') = P(s' | s, a). The choice of action a in state s is rewarded through reward function R_i: S_i × A_i -> R. The horizon h specifies the total number of decisions and the initial state of agent i is defined by s_i,1.” State s (i.e. current state)); and 
taking into account the plurality of resource constraints, the plurality of expected resource consumptions (De Nijs Pg. 3563, MMDPs with Global Resource Constraints section, Para. 3 recites, in part, “Global resource constraints force the agents to coordinate their decisions, which means that the policies used by the agents should maximize the total expected value while staying below global resource limits. The agents have access to k resource types. For each agent i the consumption of resource type j is defined using a function C_ij: S_i × A_i -> [0, C_max,ij], where C_max,ij denotes the maximum instantaneous consumption of resource type j for agent i.” Function that includes consumption of resource type j for each agent i and access to k resource types (i.e. taking into account expected resource consumptions and resource constraints)). 
De Nijs and Dotan-Cohen are both directed to machine learning. In view of the teachings of De Nijs, it would have been obvious to one of ordinary skill in the art to apply the teachings of De Nijs to Dotan-Cohen before the effective filing date of the claimed invention in order to coordinate actions on shared, collectively owned resources by supporting multi-unit resource consumption using an algorithm for computing policies satisfying a given violation tolerance (cf. De Nijs Pg. 3562, Introduction Section recites the following: “When decision-making agents share collectively owned resources, their actions need to be coordinated subject to the availability of these resources. In many problem domains it is impractical, inefficient or costly to coordinate resource consumption during execution. For example, load balancing of energy consumption in smart energy grids has instantaneous effects on the stability of the grid, which requires real-time decisions.” “Our method naturally supports multi-unit resource consumption, and requires no communication between agents during execution. It can be seen as an anytime algorithm for computing policies satisfying a given violation tolerance.”).
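For readability, the De Nijs notation quoted in the rejection of claim 1 above may be restated in typeset form. The following is only a sketch assembled from the quoted passages; the subscript placement (for example, writing the maximum consumption as c_{max,i,j} and the time-indexed limit as L_{j,t}) is an assumption made where the quoted text is ambiguous, and the De Nijs reference itself controls.

```latex
% Restatement of the notation quoted from De Nijs, Pg. 3563.
% Subscript placement is assumed; nothing beyond the quoted passages is added.
\begin{align*}
M_i    &= (S_i,\, A_i,\, T_i,\, R_i,\, h,\, s_{i,1})
        && \text{MDP for agent } i \\
T_i    &: S_i \times A_i \times S_i \to [0,1],
        \quad T_i(s,a,s') = P(s' \mid s,a)
        && \text{transition function} \\
R_i    &: S_i \times A_i \to \mathbb{R}
        && \text{reward function} \\
C_{ij} &: S_i \times A_i \to [0,\, c_{\max,i,j}]
        && \text{consumption of resource type } j \text{ by agent } i
\end{align*}
% Limits as quoted: L_{j,t} is the limit on resource type j at time t
% (instantaneous constraints); L_j is a single limit (budget constraints).
```

Here h is the horizon (the total number of decisions), s_{i,1} is the initial state of agent i, and k is the number of resource types, each as described in the quoted passages.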

Regarding claim 2, the Dotan-Cohen/De Nijs Combination teaches the computer-implemented method of claim 1, wherein the current state of each user corresponds to a state of the user that associates a location of the user with respect to one of the plurality of points of interest and a time at which the user was at the location and wherein a state transition of the user tracks a change of the user from a previous state to a subsequent state (Dotan-Cohen [0028]-[0029] and [0032] recite, in part, “As used herein, a venue can correspond to a venue profile, such as one of venue profiles 224, which may be associated with a corresponding venue identifier (ID) and optionally various semantic characteristics (routine characteristics and/or sporadic characteristics) of the venue including a name of the venue, a category of the venue, a location or region of the venue, and the like. [0029] Examples of semantic characteristics that may be utilized to infer event and/or activity instances include routine characteristics of a user and/or location (e.g., a geographic areas or venue). [0032] Examples of sporadic characteristics include a specific concert occurring on a particular day or at a particular time at a location...” Venue data with semantic characteristics of activity, user and location (i.e. state of the user). Additionally, De Nijs Pg. 3563, MMDPs with Global Resource Constraints Section, Para. 2 recites, in part, “The tuple (S_i, A_i, T_i, R_i, h, s_i,1) defines the MDP M_i for agent i. Each agent has its own sets of states S_i and actions A_i. The transition function T_i: S_i × A_i × S_i -> [0, 1] gives the probability of advancing to state s' ∈ S_i from state s ∈ S_i by choosing action a ∈ A_i, thus T_i(s, a, s') = P(s' | s, a). The choice of action a in state s is rewarded through reward function R_i: S_i × A_i -> R. The horizon h specifies the total number of decisions and the initial state of agent i is defined by s_i,1.” Transition function (i.e. state transition), state s' from state s (i.e. subsequent state from previous state)). 
Please see motivation for combination from claim 1 above. 

Regarding claim 6, Dotan-Cohen discloses a non-transitory computer readable storage medium including a set of instructions that, when executed by at least one processor, cause a computing device to (Dotan-Cohen FIG. 1, element 102 (user device) & [0017] and [0180] recite, in part, “For instance, some functions may be carried out by a processor executing instructions stored in memory. [0180] Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data... Computer storage media does not comprise signals per se.”): 
analyze resource data associated with a plurality of points of interest (Dotan-Cohen [0027]-[0028] recite, in part, “Data collection component 215 is generally responsible for acquiring, accessing, or receiving (and in some cases also identifying) user data, venue data, and interpretive data from one or more data sources, such as data sources 104a through 104n of FIG. 1. [0028] Venue data corresponds to data collected in association with one or more venues. A “venue” may refer to a physical location that people can conduct certain activities at in person.”), 
determine population dynamics that correspond to users entering and exiting a structure associated with the plurality of points of interest (Dotan-Cohen, P[0027], “Data collection component 215 is generally responsible for acquiring, accessing, or receiving (and in some cases also identifying) user data, venue data, and interpretive data from one or more data sources.” Dotan-Cohen, P[0032], “Examples of routine characteristics of a location (e.g., a venue) include … aggregate visitor demographics, aggregate visitor characteristics, and many more. Examples of sporadic characteristics include a specific concert occurring on a particular day or at a particular time at a location, an unexpected spike in visitors to a location and/or visitors or traffic (e.g., people or vehicles) near a location, current weather conditions at a location, unusual events or activity occurring at a location, and many more.”); 
generate a plurality of expected resource consumptions that provide expected uses of each resource … based on the population dynamics … (Dotan-Cohen, P[0046], “The data acquired by data collection component 215 and processed by event tracker 216 on aggregate forms a detailed record of patterns of instances of events involving users and venues. These patterns can provide understanding and knowledge to system 200 and can be identified and detected by the various components of system 200 including event tracker 216 and presentation component 298.” Dotan-Cohen, P[0146], “Where presentation component 298 refrains from presenting the content to a user, processing, power, and other resources related to the presentation of the content are conserved. For example, generating content may utilize network bandwidth, processing power, and energy.”);
generate a sequential history of user actions for each user of a plurality of users based on observing state transitions of each of the plurality of users (Dotan-Cohen [0030] recites “A routine characteristic can be a characteristic that is determined by the system as being part of a routine model that is detected and tracked by the system (e.g., venue visits, visitation patterns, activity patterns, and/or behavior patterns, or routines). A sporadic characteristic can be a characteristic that is determined by the system, but not as being part of a known routine practiced by a location or user that is detected and tracked by the system (e.g., an event that is not part of a known practiced routine vs. an event that may or may not be part of a known practiced routine).”); and 
determine recommendations for the plurality of users that takes into account user types and the sequential history of user actions for each user of the plurality of users, and the current state of each user (Dotan-Cohen [0151] recites “In some embodiments, recommended actions may correspond to one or more conditions, which may be assessed based on sensor(s) on a user device associated with the user, via user history, patterns or routines (e.g. the user drives to work every day between 8:00 and 8:30 AM), other user information (such as online activity of a user, user communication information including missed communications, urgency or staleness of the content (e.g. the content should be presented to the user in the morning, but is no longer relevant after 10 AM), the particular user routine that is diverged from, and/or contextual information corresponding to the out of routine event.” User history, routines (i.e. sequential history of user actions), recommended actions (i.e. recommendations)). 
However, Dotan-Cohen does not disclose to determine a plurality of resource constraints; wherein the plurality of resource constraints provide limitations on a capacity of each resource; generating a plurality of expected resource consumptions that provide expected uses of each resource subject to the plurality of resource constraints; determining a current state for each user of the plurality of users; and by solving a constrained linear program that is based on the plurality of resource constraints, the plurality of expected resource consumptions. 
De Nijs teaches to determine a plurality of resource constraints (De Nijs Pg. 3562, Introduction Section recites, in part, “When policies of agents are allowed to satisfy resource constraints in expectation (such as with accidental overruns of budget, or infrastructural constraints), this weakness of deterministic resource allocations can be overcome resulting in significantly higher expected value.” Satisfy resource constraints in expectation to overcome weakness of deterministic resource allocations (i.e. determine resource constraints)); 
wherein the plurality of resource constraints provide limitations on a capacity of each resource (De Nijs Pg. 3563, MMDPs with Global Resource Constraints Section recites, in part, “Global resource constraints force the agents to coordinate their decisions, which means that the policies used by the agents should maximize the total expected value while staying below global resource limits.” Global resource limits (i.e. limitations on a capacity of resources)); 
generating a plurality of expected resource consumptions that provide expected uses of each resource subject to the plurality of resource constraints (De Nijs recites, in part, “The agents have access to k resource types. For each agent i the consumption of resource type j is defined using a function C_ij: S_i × A_i -> [0, C_max,ij], where C_max,ij denotes the maximum instantaneous consumption of resource type j for agent i. For instantaneous constraints the resource limit at time t is defined by L_jt, where the usage at time t does not affect the limit at t' > t. Budget constraints are defined by a single limit L_j.” Consumption of resource type j for each agent i defined by using function and access to k resource types (i.e. generate expected resource consumptions that provide expected uses of each resource and associated with resource constraints)); 
determining a current state for each user of the plurality of users (De Nijs Pg. 3563, MMDPs with Global Resource Constraints Section, Para. 2 recites, in part, “The tuple (S_i, A_i, T_i, R_i, h, s_i,1) defines the MDP M_i for agent i. Each agent has its own sets of states S_i and actions A_i. The transition function T_i: S_i × A_i × S_i -> [0, 1] gives the probability of advancing to state s' ∈ S_i from state s ∈ S_i by choosing action a ∈ A_i, thus T_i(s, a, s') = P(s' | s, a). The choice of action a in state s is rewarded through reward function R_i: S_i × A_i -> R. The horizon h specifies the total number of decisions and the initial state of agent i is defined by s_i,1.” State s (i.e. current state)); and 
by solving a constrained linear program that is based on the plurality of resource constraints, the plurality of expected resource consumptions (De Nijs Pg. 3563, MMDPs with Global Resource Constraints section, Para. 3 recites, in part, “Global resource constraints force the agents to coordinate their decisions, which means that the policies used by the agents should maximize the total expected value while staying below global resource limits. The agents have access to k resource types. For each agent i the consumption of resource type j is defined using a function C_ij: S_i × A_i -> [0, C_max,ij], where C_max,ij denotes the maximum instantaneous consumption of resource type j for agent i.” Function that includes consumption of resource type j for each agent i and access to k resource types (i.e. taking into account expected resource consumptions and resource constraints). Additionally, De Nijs Pg. 3563, Constrained Markov Decision Processes section recites “The Constrained MDP (CMDP, Altman 1999) framework defines a linear program to solve MDPs and can be used to impose additional constraints on the resulting stochastic policies.”). 
De Nijs and Dotan-Cohen are both directed to machine learning. In view of the teachings of De Nijs, it would have been obvious to one of ordinary skill in the art to apply the teachings of De Nijs to Dotan-Cohen before the effective filing date of the claimed invention in order to coordinate actions on shared, collectively owned resources by supporting multi-unit resource consumption using an algorithm for computing policies satisfying a given violation tolerance (cf. De Nijs Pg. 3562, Introduction Section recites the following: “When decision-making agents share collectively owned resources, their actions need to be coordinated subject to the availability of these resources. In many problem domains it is impractical, inefficient or costly to coordinate resource consumption during execution. For example, load balancing of energy consumption in smart energy grids has instantaneous effects on the stability of the grid, which requires real-time decisions.” “Our method naturally supports multi-unit resource consumption, and requires no communication between agents during execution. It can be seen as an anytime algorithm for computing policies satisfying a given violation tolerance.”). 

Regarding claim 7, the Dotan-Cohen/De Nijs Combination teaches the non-transitory computer readable storage medium of claim 6, wherein the sequential history of user actions of each user associates previous locations of the user with respect to one of the plurality of points of interest and a time at which the user was at each of the previous locations (Dotan-Cohen [0028]-[0029] and [0032] recite, in part, “As used herein, a venue can correspond to a venue profile, such as one of venue profiles 224, which may be associated with a corresponding venue identifier (ID) and optionally various semantic characteristics (routine characteristics and/or sporadic characteristics) of the venue including a name of the venue, a category of the venue, a location or region of the venue, and the like. [0029] Examples of semantic characteristics that may be utilized to infer event and/or activity instances include routine characteristics of a user and/or location (e.g., a geographic areas or venue). [0032] Examples of sporadic characteristics include a specific concert occurring on a particular day or at a particular time at a location...” Venue data with semantic characteristics of activity, user and location (i.e. state of the user)). 
Please see motivation for claim 6 above. 

Regarding claim 8, The Dotan-Cohen/De Nijs Combination teaches the non-transitory computer readable storage medium of claim 6, further comprising instructions that, when executed by the at least one processor, cause the computing device to: 
issue a recommendation to each user of the plurality of users based on the recommendations (Dotan-Cohen [0046] recites “The data acquired by data collection component 215 and processed by event tracker 216 on aggregate forms a detailed record of patterns of instances of events involving users and venues.... For example, presentation component 298 may employ at least some of these patterns of instances of events (e.g., using event records) in recommending service content items to users (e.g., associated with user profiles 222).”); 
determine a reaction of each user of the plurality of users to the recommendations (Dotan-Cohen [0059] recites, in part, “...an activity and/or event may have one or more tracked features defined by its corresponding model. Values of one or more of the tracked features may optionally be stored in association with a user... Tracked features can correspond to any of a variety of user data, examples of which have been described above, and include interaction data, or sensor data or readings, which may be sensed by one or more sensors (Such as information associated with a user device regarding location, position, motion/orientation, user - access/touch, connecting/disconnecting a charger, app interaction, user activity on the user device, or other information that may be sensed by one or more sensors, such as sensors found on a mobile device) GPS coordinate samples, and many more.” Interaction (i.e. reaction)); and 
update a user type of each user of the plurality of users based on the reaction (Dotan-Cohen [0045] recites, in part, “Event tracker 216 can store any of the various data employed in tracking routines and/or events of users, venues, and/or activities as user tracking data, venue tracking data, and activity tracking data respectively. Over time, event tracker 216 may update the tracking data as data is periodically analyzed and new events, routines, and activities are discovered, modified, or disassociated with users, venues, and/or geographic tiles.”). 
Please see motivation for claim 6 above. 

Regarding claim 9, The Dotan-Cohen/De Nijs Combination teaches the non-transitory computer readable storage medium of claim 6, further comprising instructions that, when executed by the at least one processor, cause the computing device to: 
issue a recommendation to each user of the plurality of users based on the recommendations (Dotan-Cohen [0046] recites “The data acquired by data collection component 215 and processed by event tracker 216 on aggregate forms a detailed record of patterns of instances of events involving users and venues.... For example, presentation component 298 may employ at least some of these patterns of instances of events (e.g., using event records) in recommending service content items to users (e.g., associated with user profiles 222).”); 
determine a reaction of each user of the plurality of users to the recommendations (Dotan-Cohen [0059] recites, in part, “...an activity and/or event may have one or more tracked features defined by its corresponding model. Values of one or more of the tracked features may optionally be stored in association with a user... Tracked features can correspond to any of a variety of user data, examples of which have been described above, and include interaction data, or sensor data or readings, which may be sensed by one or more sensors (Such as information associated with a user device regarding location, position, motion/orientation, user - access/touch, connecting/disconnecting a charger, app interaction, user activity on the user device, or other information that may be sensed by one or more sensors, such as sensors found on a mobile device) GPS coordinate samples, and many more.” Interaction (i.e. reaction)); and 
update the resource data based on the reaction of each user of the plurality of users (De Nijs Pg. 3565, Dynamic Constraint Relaxation section, Para. 4 recites “Finally, the relaxed resource limits $L_{j,t}^{l+1}$ are computed. When starting iteration $l + 1$, the algorithm computes policies based on the newly obtained resource limits, after which the entire procedure starts again.”). 
Please see motivation for claim 6 above. 

Regarding claim 21, Dotan-Cohen in view of De Nijs teaches the non-transitory computer readable storage medium of claim 6, wherein the instructions, when executed by the at least one processor, cause the computing device to determine the population dynamics that correspond to the users entering and exiting the structure associated with the plurality of points of interest by determining the population dynamics that correspond to the users accessing and exiting an online system hosted by one or more servers (Dotan-Cohen, P[0034], “The data acquired by data collection component 215, including user data, venue data, and interpretative data, can be collected by data collection component 215 from a variety of sources in which the data may be available in a variety of formats. By way of example and no limitation, the user or venue data may include data that is sensed or determined from one or more sensors, such as … user-activity information (for example: app usage, online activity; searches; voice data such as automatic speech recognition; activity logs; communications data including calls, texts, instant messages, and emails; website posts;” etc.).

Claims 3, 10-11, 13 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dotan-Cohen in view of De Nijs and in further view of Hariri et al. (“Context Adaptation in Interactive Recommender Systems,” hereinafter Hariri). 
Regarding claim 3, the Dotan-Cohen/De Nijs Combination teaches the computer-implemented method of claim 1. 
However, The Dotan-Cohen/De Nijs Combination does not teach wherein performing the step for determining recommendations for the plurality of users by taking into the account user types comprises determining a user type for each user of the plurality of users using Thompson sampling. 
Hariri teaches wherein performing the step for determining recommendations for the plurality of users by taking into the account user types comprises determining a user type for each user of the plurality of users using Thompson sampling (Hariri Pg. 43, 4.2 Context-Aware Recommendation Using Thompson Sampling section recites, in part, “We adapt Thompson sampling heuristic as the bandit strategy for generating a recommendation list at each step of interaction with a user. In this setting, θ which characterizes the utility distribution for each item, represents a user’s preference model. It is a k-dimensional random vector drawn from an unknown multivariate distribution. The user model is updated after each interaction” User's preference model (i.e. user type)). 
Hariri and The Dotan-Cohen/De Nijs Combination are both directed to machine learning. In view of the teachings of Hariri, it would have been obvious to one of ordinary skill in the art to apply the teachings of Hariri to The Dotan-Cohen/De Nijs Combination before the effective filing date of the claimed invention in order to have an interactive recommender system that detects, observes and adapts to dynamic context changes based on user’s ongoing behavior by using Thompson sampling (cf. Hariri Pg. 41, Abstract section recites the following: “In many recommendation and personalization applications, particularly in domains where user context changes dynamically, it is difficult to represent and model contextual factors directly, but it is often possible to observe their impact on user preferences during the course of users’ interactions with the system. In this paper, we introduce an interactive recommender system that can detect and adapt to changes in context based on the user’s ongoing behavior. The system, then, dynamically tailors its recommendations to match the user’s most recent preferences. We formulate this problem as a multi-armed bandit problem and use Thompson sampling heuristic to learn a model for the user. Following the Thompson sampling approach, the user model is updated after each interaction as the system observes the corresponding rewards for the recommendations provided during that interaction.”). 
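For illustration only, and not as part of the cited disclosure, the Thompson sampling update quoted from Hariri above may be sketched as follows. The sketch assumes a simplified Beta-Bernoulli bandit with one success/failure count per item, whereas Hariri describes a k-dimensional preference vector θ; the function names `thompson_select` and `run_bandit` and the simulated reward rates are hypothetical.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one utility sample per item from its Beta posterior; return the best index."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_rates, steps, seed=0):
    """Simulate repeated interactions, updating the user model after each one."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(steps):
        item = thompson_select(successes, failures, rng)
        # Assumed feedback model: the user accepts item k with probability true_rates[k].
        if rng.random() < true_rates[item]:
            successes[item] += 1
        else:
            failures[item] += 1
    return successes, failures
```

In this simplified reading, the pair of counts per item plays the role of the user's preference model (i.e. user type) that is updated after each interaction.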

Regarding claim 10, The Dotan-Cohen/De Nijs Combination teaches the non-transitory computer readable storage medium of claim 6, wherein the instructions, when executed by the at least one processor, cause the computing device to determine recommendations for the plurality of users by solving the constrained linear program that takes into account the user types by: 
solving the constrained linear program using column generation to obtain a mix of recommendation policies for the plurality of users based on the user type of each user of the plurality of users (Dotan-Cohen [0140] recites, in part, “A recommendation may be selected from that the few places identified in the values of the tracked feature.” Additionally, De Nijs Pg. 3564, Column Generation for Linear Programming section, Para. 2-3 recites “The LP contains a column for each policy $\pi_i \in Z_i$ for each agent $i$. In order to prevent full policy enumeration, a Column Generation algorithm can be used which incrementally adds columns corresponding to policies. It keeps track of a lower bound $\phi_l$ and an upper bound $\phi_u$ on the optimal objective value of (4) and adds columns until convergence.”); and 
determining a recommendation for each user of the plurality of users based on the mix of recommendation policies (Dotan-Cohen [0151] recites, in part, “...recommended actions may correspond to one or more conditions, which may be assessed based on sensor(s) on a user device associated with the user, via user history, patterns or routines (e.g. the user drives to work every day between 8:00 and 8:30 AM), other user information (such as online activity of a user, user communication information including missed communications, urgency or staleness of the content (e.g. the content should be presented to the user in the morning, but is no longer relevant after 10 AM), the particular user routine that is diverged from, and/or contextual information corresponding to the out of routine event. ”). 
However, The Dotan-Cohen/De Nijs Combination does not teach determining a user type for each user of the plurality of users. 
Hariri teaches determining a user type for each user of the plurality of users (Hariri Pg. 43, 4.2 Context-Aware Recommendation Using Thompson Sampling section recites, in part, “We adapt Thompson sampling heuristic as the bandit strategy for generating a recommendation list at each step of interaction with a user. In this setting, θ which characterizes the utility distribution for each item, represents a user’s preference model. It is a k-dimensional random vector drawn from an unknown multivariate distribution. The user model is updated after each interaction” User's preference model (i.e. user type)). 
Hariri and The Dotan-Cohen/De Nijs Combination are both directed to machine learning. In view of the teachings of Hariri, it would have been obvious to one of ordinary skill in the art to apply the teachings of Hariri to The Dotan-Cohen/De Nijs Combination before the effective filing date of the claimed invention in order to have an interactive recommender system that detects, observes and adapts to dynamic context changes based on user's ongoing behavior by using Thompson sampling (cf. Hariri Pg. 41, Abstract section recites the following: “In many recommendation and personalization applications, particularly in domains where user context changes dynamically, it is difficult to represent and model contextual factors directly, but it is often possible to observe their impact on user preferences during the course of users’ interactions with the system. In this paper, we introduce an interactive recommender system that can detect and adapt to changes in context based on the user's ongoing behavior. The system, then, dynamically tailors its recommendations to match the user’s most recent preferences. We formulate this problem as a multi-armed bandit problem and use Thompson sampling heuristic to learn a model for the user. Following the Thompson sampling approach, the user model is updated after each interaction as the system observes the corresponding rewards for the recommendations provided during that interaction.”). 

Regarding claim 11, The Dotan-Cohen/De Nijs/Hariri Combination teaches the non-transitory computer readable storage medium of claim 10, wherein solving the constrained linear program using column generation comprises: 
solving the constrained linear program to obtain a set of costs (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites, in part, “Column Generation terminates when the dual prices $\lambda_{j,t}$ stabilize, leading to an equal lower and upper bound (Vanderbeck 2005; Liang and Wilhelm 2010).” Dual prices $\lambda_{j,t}$ stabilize (i.e. solving to obtain set of costs)), 
wherein each cost in the set of costs indicates an increase in value for a recommended point of interest if the recommended point of interest acquired a larger capacity (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites “The maximization problem in (5) decouples into n separate subproblems for which a maximizing policy should be found. Such a maximizing policy can be found by solving the MDP using a time-dependent reward function $G_{i,t}(s, a) = R_i(s, a) - \sum_j \lambda_{j,t} C_{i,j}(s, a)$, which can be solved using standard MDP algorithms (e.g., value iteration).”); and 
inputting the set of costs into a planner algorithm to generate a new policy for each user of the plurality of users (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites “The Column Generation algorithm initializes an empty master LP. In each iteration it solves the LP to obtain the multipliers $\lambda_{j,t}$ and a lower bound $\phi_l$. The multipliers are used to solve n separate MDPs $M_i$ using the reward function $G_{i,t}$, also resulting in a new upper bound $\phi_u$.”), 
wherein the new policy is an additional input into the constrained linear program (De Nijs Pg. 3565, Accelerating the Column Generation Method Section recites, in part, “The Column Generation algorithm executes several iterations when computing policies, and in each iteration it adds n columns to the master LP, which correspond to the policies of the agents.” Add n columns which correspond to policies of the agents to master LP (i.e. new policy as additional input to linear program)). 
Please see motivation to combine for claim 10 above. 
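For illustration only, and not as part of the cited disclosure, the per-agent subproblem quoted from De Nijs above (a reward reduced by priced resource consumption, then solved with a standard MDP algorithm) may be sketched as follows. The sketch assumes a small tabular MDP solved by finite-horizon backward induction; the names `priced_reward` and `solve_priced_mdp` and the nested-list data layout are hypothetical, and the master LP that produces the dual prices is not shown.

```python
def priced_reward(R, C, lam, t):
    """G_t(s, a) = R(s, a) - sum_j lam[j][t] * C[j][s][a] for a single agent."""
    return [
        [R[s][a] - sum(lam[j][t] * C[j][s][a] for j in range(len(C)))
         for a in range(len(R[s]))]
        for s in range(len(R))
    ]


def solve_priced_mdp(T, R, C, lam, horizon):
    """Backward induction; T[s][a][s2] is a transition probability.

    Returns (policy, V), where policy[t][s] is the chosen action at step t
    and V[s] is the priced value of starting in state s at step 0.
    """
    n_states = len(R)
    V = [0.0] * n_states
    policy = []
    for t in reversed(range(horizon)):
        G = priced_reward(R, C, lam, t)
        Q = [
            [G[s][a] + sum(T[s][a][s2] * V[s2] for s2 in range(n_states))
             for a in range(len(R[s]))]
            for s in range(n_states)
        ]
        step = [max(range(len(Q[s])), key=Q[s].__getitem__) for s in range(n_states)]
        V = [Q[s][step[s]] for s in range(n_states)]
        policy.insert(0, step)
    return policy, V
```

With one state, a free action worth 1 and a resource-consuming action worth 2, a dual price of 5 at the second step makes the free action preferable there, while a price of 0 at the first step leaves the consuming action preferable.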

Regarding claim 13, The Dotan-Cohen/De Nijs/Hariri Combination teaches the non-transitory computer readable storage medium of claim 10, wherein determining the user type for each user of the plurality of users comprises determining a user type using Thompson sampling (Hariri Pg. 43, 4.2 Context-Aware Recommendation Using Thompson Sampling section recites, in part, “We adapt Thompson sampling heuristic as the bandit strategy for generating a recommendation list at each step of interaction with a user. In this setting, θ which characterizes the utility distribution for each item, represents a user's preference model. It is a k-dimensional random vector drawn from an unknown multivariate distribution. The user model is updated after each interaction” User's preference model (i.e. user type)). 
Please see motivation to combine for claim 10 above. 

Regarding claim 17, Dotan-Cohen discloses a system for generating real-time point-of-interest recommendations to users, comprising (Dotan-Cohen fig. 1 element 100): 
at least one server (Dotan-Cohen fig. 1 element 106); and 
at least one non-transitory computer readable storage medium storing instructions thereon that, when executed by the at least one server, cause the system to (Dotan-Cohen [0020] and [0180] recites “Server 106 can comprise server-side software designed to work in conjunction with client-side software on user devices 102a through 102n so as to implement any combination of the features and functionalities discussed in the present disclosure. [0180] Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data... Computer storage media does not comprise signals per se.”): 
analyze resource data associated with a plurality of points of interest (Dotan-Cohen [0027]-[0028] recite, in part, “Data collection component 215 is generally responsible for acquiring, accessing, or receiving (and in some cases also identifying) user data, venue data, and interpretive data from one or more data sources, such as data sources 104a through 104n of FIG. 1. [0028] Venue data corresponds to data collected in association with one or more venues. A “venue” may refer to a physical location that people can conduct certain activities at in person.”), 
determine population dynamics that correspond to users entering and exiting a structure associated with the plurality of points of interest (Dotan-Cohen, P[0027], “Data collection component 215 is generally responsible for acquiring, accessing, or receiving (and in some cases also identifying) user data, venue data, and interpretive data from one or more data sources.” Dotan-Cohen, P[0032], “Examples of routine characteristics of a location (e.g., a venue) include … aggregate visitor demographics, aggregate visitor characteristics, and many more. Examples of sporadic characteristics include a specific concert occurring on a particular day or at a particular time at a location, an unexpected spike in visitors to a location and/or visitors or traffic (e.g., people or vehicles) near a location, current weather conditions at a location, unusual events or activity occurring at a location, and many more.”); 
generate a plurality of expected resource consumptions that provide expected uses of each resource … based on the population dynamics … (Dotan-Cohen, P[0046], “The data acquired by data collection component 215 and processed by event tracker 216 on aggregate forms a detailed record of patterns of instances of events involving users and venues. These patterns can provide understanding and knowledge to system 200 and can be identified and detected by the various components of system 200 including event tracker 216 and presentation component 298.” Dotan-Cohen, P[0146], “Where presentation component 298 refrains from presenting the content to a user, processing, power, and other resources related to the presentation of the content are conserved. For example, generating content may utilize network bandwidth, processing power, and energy.”);
generate a sequential history of user actions for each user of a plurality of users based on observing state transitions of each of the plurality of users (Dotan-Cohen [0030] recites “A routine characteristic can be a characteristic that is determined by the system as being part of a routine model that is detected and tracked by the system (e.g., venue visits, visitation patterns, activity patterns, and/or behavior patterns, or routines). A sporadic characteristic can be a characteristic that is determined by the system, but not as being part of a known routine practiced by a location or user that is detected and tracked by the system (e.g., an event that is not part of a known practiced routine vs. an event that may or may not be part of a known practiced routine).”); and 
generate a recommendation policy for each user of the plurality of users, wherein the recommendation policy comprises real-time point-of-interest recommendations based on a user type (Dotan-Cohen [0151] recites “In some embodiments, recommended actions may correspond to one or more conditions, which may be assessed based on sensor(s) on a user device associated with the user, via user history, patterns or routines (e.g. the user drives to work every day between 8:00 and 8:30 AM), other user information (such as online activity of a user, user communication information including missed communications, urgency or staleness of the content (e.g. the content should be presented to the user in the morning, but is no longer relevant after 10 AM), the particular user routine that is diverged from, and/or contextual information corresponding to the out of routine event.” User history, routines (i.e. user type), recommended actions (i.e. recommendation)). 
However, Dotan-Cohen does not disclose wherein the plurality of resource constraints provide limitations on a capacity of each resource associated with the plurality of points of interest; generate a plurality of expected resource consumptions that provide expected uses of each resource associated with the plurality of points of interest … subject to the plurality of resource constraints; determine a current state for each user of the plurality of users; and generate policy by: determining a user type for each user of the plurality of users; and solving a linear program that takes into account the user type and is based on the plurality of resource constraints, the plurality of expected resource consumptions, the sequential history of user actions for each user of the plurality of users, and the current state of each user using column generation to obtain a mix of recommendation policies for the plurality of users. 
De Nijs teaches to determine a plurality of resource constraints (De Nijs Pg. 3562, Introduction Section recites, in part, “When policies of agents are allowed to satisfy resource constraints in expectation (such as with accidental overruns of budget, or infrastructural constraints), this weakness of deterministic resource allocations can be overcome resulting in significantly higher expected value.” Satisfy resource constraints in expectation to overcome weakness of deterministic resource allocations (i.e. determine resource constraints)); 
wherein the plurality of resource constraints provide limitations on a capacity of each resource associated with the plurality of points of interest (De Nijs Pg. 3563, MMDPs with Global Resource Constraints Section recites, in part, “Global resource constraints force the agents to coordinate their decisions, which means that the policies used by the agents should maximize the total expected value while staying below global resource limits.” Global resource limits (i.e. limitations on a capacity of resources)); 
generate a plurality of expected resource consumptions that provide expected uses of each resource associated with the plurality of points of interest subject to the plurality of resource constraints (De Nijs recites, in part, “The agents have access to k resource types. For each agent $i$ the consumption of resource type $j$ is defined using a function $C_{i,j} : S_i \times A_i \to [0, C_{\max,i,j}]$, where $C_{\max,i,j}$ denotes the maximum instantaneous consumption of resource type $j$ for agent $i$. For instantaneous constraints the resource limit at time $t$ is defined by $L_{j,t}$, where the usage at time $t$ does not affect the limit at $t' > t$. Budget constraints are defined by a single limit $L_j$.” Consumption of resource type j for each agent i defined by using function and access to k resource types (i.e. generate expected resource consumptions that provide expected uses of each resource and associated with resource constraints)); 
determine a current state for each user of the plurality of users (De Nijs Pg. 3563, MMDPs with Global Resource Constraints Section, Para. 2 recites, in part, “The tuple ($S_i$, $A_i$, $T_i$, $R_i$, $h$, s'') defines the MDP $M_i$ for agent $i$. Each agent has its own sets of states $S_i$ and actions $A_i$. The transition function $T_i : S_i \times A_i \times S_i \to [0, 1]$ gives the probability of advancing to state $s' \in S_i$ from state $s \in S_i$ by choosing action $a \in A_i$, thus $T_i(s, a, s') = P(s' \mid s, a)$. The choice of action $a$ in state $s$ is rewarded through reward function $R_i : S_i \times A_i \to \mathbb{R}$. The horizon $h$ specifies the total number of decisions and the initial state of agent $i$ is defined by s''.” State s (i.e. current state)); and 
generate policy by: 
solving a linear program that takes into account the user type and is based on the plurality of resource constraints, the plurality of expected resource consumptions, the sequential history of user actions for each user of the plurality of users, and the current state of each user using column generation to obtain a mix of recommendation policies for the plurality of users (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites “The maximization problem in (5) decouples into n separate subproblems for which a maximizing policy should be found. Such a maximizing policy can be found by solving the MDP using a time-dependent reward function $G_{i,t}(s, a) = R_i(s, a) - \sum_j \lambda_{j,t} C_{i,j}(s, a)$, which can be solved using standard MDP algorithms (e.g., value iteration).”). 
De Nijs and Dotan-Cohen are both directed to machine learning. In view of the teachings of De Nijs, it would have been obvious to one of ordinary skill in the art to apply the teachings of De Nijs to Dotan-Cohen before the effective filing date of the claimed invention in order to coordinate actions on shared, collectively owned resources by supporting multi-unit resource consumption using an algorithm for computing policies satisfying a given violation tolerance (cf. De Nijs Pg. 3562, Introduction Section recites the following: “When decision-making agents share collectively owned resources, their actions need to be coordinated subject to the availability of these resources. In many problem domains it is impractical, inefficient or costly to coordinate resource consumption during execution. For example, load balancing of energy consumption in smart energy grids has instantaneous effects on the stability of the grid, which requires real-time decisions.” “Our method naturally supports multi-unit resource consumption, and requires no communication between agents during execution. It can be seen as an anytime algorithm for computing policies satisfying a given violation tolerance.” ). 
However, The Dotan-Cohen/De Nijs Combination still does not teach determining a user type for each user of the plurality of users. 
Hariri teaches determining a user type for each user of the plurality of users (Hariri Pg. 43, 4.2 Context-Aware Recommendation Using Thompson Sampling section recites, in part, “We adapt Thompson sampling heuristic as the bandit strategy for generating a recommendation list at each step of interaction with a user. In this setting, θ which characterizes the utility distribution for each item, represents a user’s preference model. It is a k-dimensional random vector drawn from an unknown multivariate distribution. The user model is updated after each interaction” User's preference model (i.e. user type)). 
Hariri and The Dotan-Cohen/De Nijs Combination are both directed to machine learning. In view of the teachings of Hariri, it would have been obvious to one of ordinary skill in the art to apply the teachings of Hariri to The Dotan-Cohen/De Nijs Combination before the effective filing date of the claimed invention in order to have an interactive recommender system that detects, observes and adapts to dynamic context changes based on user's ongoing behavior by using Thompson sampling (cf. Hariri Pg. 41, Abstract section recites the following: “In many recommendation and personalization applications, particularly in domains where user context changes dynamically, it is difficult to represent and model contextual factors directly, but it is often possible to observe their impact on user preferences during the course of users’ interactions with the system. In this paper, we introduce an interactive recommender system that can detect and adapt to changes in context based on the user's ongoing behavior. The system, then, dynamically tailors its recommendations to match the user’s most recent preferences. We formulate this problem as a multi-armed bandit problem and use Thompson sampling heuristic to learn a model for the user. Following the Thompson sampling approach, the user model is updated after each interaction as the system observes the corresponding rewards for the recommendations provided during that interaction.”). 

Regarding claim 18, The Dotan-Cohen/De Nijs/Hariri Combination teaches the system of claim 17, wherein solving the linear program using column generation comprises: 
	solving the linear program to obtain a set of costs (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites, in part, “Column Generation terminates when the dual prices λ_{j,t} stabilize, leading to an equal lower and upper bound (Vanderbeck 2005; Liang and Wilhelm 2010).” Dual prices λ_{j,t} stabilize (i.e., solving to obtain set of costs)), 
wherein each cost in the set of costs indicates an increase in value for a recommended point of interest if the recommended point of interest acquired a larger capacity (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites “The maximization problem in (5) decouples into n separate subproblems for which a maximizing policy should be found. Such a maximizing policy can be found by solving the MDP using a time-dependent reward function G_{i,t}(s, a) = R_i(s, a) − Σ_j λ_{j,t} C_{i,j}(s, a), which can be solved using standard MDP algorithms (e.g., value iteration).”); and 
inputting the set of costs into a planner algorithm to generate a new policy for each of the plurality of users (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites “The Column Generation algorithm initializes an empty master LP. In each iteration it solves the LP to obtain the multipliers λ_{j,t} and a lower bound φ_l. The multipliers are used to solve n separate MDPs M_i using the reward function G_{i,t}, also resulting in a new upper bound φ_u.”), 
wherein the new policy is an additional input into the linear program (De Nijs Pg. 3565, Accelerating the Column Generation Method Section recites, in part, “The Column Generation algorithm executes several iterations when computing policies, and in each iteration it adds n columns to the master LP, which correspond to the policies of the agents.” Add n columns which correspond to policies of the agents to master LP (i.e., new policy as additional input to linear program)). Please see motivation for claim 17 above. 
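For illustration only, the pricing step quoted from De Nijs above (subtracting price-weighted resource consumption from each agent's reward and then planning with a standard MDP algorithm) can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the table layout (`R[s][a]`, `C[s][a][j]`, `P[s][a]`, `lam[t][j]`), the function names, and the one-state example are hypothetical, and the master linear program that would produce the dual prices is not shown.

```python
def priced_reward(R, C, lam_t, s, a):
    """G_{i,t}(s, a) = R_i(s, a) - sum_j lambda_{j,t} * C_{i,j}(s, a)."""
    return R[s][a] - sum(price * use for price, use in zip(lam_t, C[s][a]))


def plan_with_prices(R, C, P, lam):
    """Finite-horizon backward induction for one agent under fixed dual prices.

    R[s][a]  : reward for action a in state s
    C[s][a]  : list of resource consumptions, one entry per resource j
    P[s][a]  : list of (next_state, probability) pairs
    lam[t]   : list of dual prices at time t, one entry per resource j
    Returns (policy, value): policy[t][s] is the chosen action, value[s] the
    priced value at time 0.
    """
    n_states = len(R)
    value = [0.0] * n_states
    policy = []
    for t in reversed(range(len(lam))):
        new_value, decisions = [], []
        for s in range(n_states):
            best_a, best_q = None, float("-inf")
            for a in range(len(R[s])):
                q = priced_reward(R, C, lam[t], s, a)
                q += sum(p * value[s2] for s2, p in P[s][a])
                if q > best_q:
                    best_a, best_q = a, q
            new_value.append(best_q)
            decisions.append(best_a)
        value = new_value
        policy.insert(0, decisions)
    return policy, value


# Hypothetical one-state agent: action 0 stays put, action 1 visits a point of
# interest and consumes one unit of its capacity (resource 0).
R = [[0.0, 1.0]]
C = [[[0.0], [1.0]]]
P = [[[(0, 1.0)], [(0, 1.0)]]]
free_policy, free_value = plan_with_prices(R, C, P, [[0.0], [0.0]])
priced_policy, priced_value = plan_with_prices(R, C, P, [[1.5], [0.5]])
```

In this example, a high price at the first time step makes the agent defer its visit to the second step, which is the sense in which the prices steer each agent's new policy before it is returned to the linear program as an additional column.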

Regarding claim 19, The Dotan-Cohen/De Nijs/Hariri Combination teaches the system of claim 18, wherein solving the linear program using column generation further comprises solving the linear program until the set of costs converges (De Nijs Pg. 3564, Column Generation for Linear Programming Section recites, in part, “The algorithm is anytime, guaranteed to converge and subproblems can be solved in a distributed fashion. Column Generation terminates when the dual prices λ_{j,t} stabilize, leading to an equal lower and upper bound (Vanderbeck 2005; Liang and Wilhelm 2010).” Dual prices λ_{j,t} stabilize and column generation terminates where algorithm is guaranteed to converge (i.e., solving linear program and set of costs converge)). 
Please see motivation for claim 17 above. 

Regarding claim 20, The Dotan-Cohen/De Nijs/Hariri Combination teaches the system of claim 17, further comprising instructions that, when executed by the at least one server, cause the system to: 
issue a recommendation to each user of the plurality of users based on the determined recommendations (Dotan-Cohen [0046] recites “The data acquired by data collection component 215 and processed by event tracker 216 on aggregate forms a detailed record of patterns of instances of events involving users and venues.... For example, presentation component 298 may employ at least some of these patterns of instances of events (e.g., using event records) in recommending service content items to users (e.g., associated with user profiles 222).”); 
determine a reaction of each user of the plurality of users to the recommendations (Dotan-Cohen [0059] recites, in part, “...an activity and/or event may have one or more tracked features defined by its corresponding model. Values of one or more of the tracked features may optionally be stored in association with a user... Tracked features can correspond to any of a variety of user data, examples of which have been described above, and include interaction data, or sensor data or readings, which may be sensed by one or more sensors (such as information associated with a user device regarding location, position, motion/orientation, user access/touch, connecting/disconnecting a charger, app interaction, user activity on the user device, or other information that may be sensed by one or more sensors, such as sensors found on a mobile device) GPS coordinate samples, and many more.” Interaction (i.e., reaction)); and 
update the user type of each user of the plurality of users based on the reactions (Dotan-Cohen [0045] recites, in part, “Event tracker 216 can store any of the various data employed in tracking routines and/or events of users, venues, and/or activities as user tracking data, venue tracking data, and activity tracking data respectively. Over time, event tracker 216 may update the tracking data as data is periodically analyzed and new events, routines, and activities are discovered, modified, or disassociated with users, venues, and/or geographic tiles.”). 
Please see motivation for claim 17 above. 

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Dotan-Cohen in view of De Nijs and in further view of Hariri and in further view of Péron et al. (“Fast-Tracking Stationary MOMDPs for Adaptive Management Problems”, hereinafter Péron).
Regarding claim 4, The Dotan-Cohen/De Nijs Combination teaches the computer-implemented method of claim 1, wherein performing the step for determining recommendations for the plurality of users by taking into account the user types comprises determining a user type of the user types by (Dotan-Cohen [0040]-[0041] recite, in part, “Extracted semantic characteristics of users may be stored in association with one or more user profiles, such as user profiles 222. [0041] Explicit semantic characteristics correspond to explicit information, which may be explicit from a user, or explicit from a data source (e.g., a web page, document, file, yellow pages, maps, indexes, etc.) from which the information is extracted. As an example, explicit information can be extracted from data recording input by a user of likes and dislikes into a user profile associated with one of user profiles 222.”). 
However, The Dotan-Cohen/De Nijs Combination does not teach building a mixed observability Markov decision process; and determining a solution to the mixed observability Markov decision process using a solver. 
Péron teaches building a mixed-observability Markov decision process (Péron Pg. 4532, Mixed Observability Markov Decision Process Section recites, in part, “A partially observable Markov decision process (POMDP) is a mathematical framework to model the impact of sequential decisions on a probabilistic system under imperfect observation of the states (Sigaud and Buffet 2010). MOMDPs are a special case of POMDPs, where the state can be decomposed into a fully observable component and a partially observable component (Ong et al. 2010). Alternatively, they can be seen as MDPs extended with a non-observable component (Fig. 1). MOMDPs can model various decision problems where an agent knows its position but evolves in a partially observable environment, or when the transition matrices or rewards are uncertain.”); and 
determining a solution to the mixed-observability Markov decision process using a solver (Péron Pg. 4532-4533, Mixed Observability Markov Decision Process Section recites, in part, “Initialized with a lower bound of the optimal value function, most MOMDP solvers calculate the policy by updating the sets Γ_x recursively through Bellman’s equation, causing Γ_x to increase until it is close enough to the optimal value function. To apply the policy, since each α-vector is associated with an action, the best action to implement at any time step is found by selecting the α-vector that maximizes b · α in Eq. 3.” Calculating the policy by updating sets until it is close enough to optimal value function done by MOMDP solver (i.e., using a solver to determine a solution to the MOMDP)). 
Péron and The Dotan-Cohen/De Nijs Combination are both directed to problems related to machine learning. In view of the teachings of Péron, it would have been obvious to one of ordinary skill in the art to apply the teachings of Péron to The Dotan-Cohen/De Nijs Combination before the effective filing date of the claimed invention in order to achieve the best trade-off between informative and rewarding actions using an optimal MOMDP (cf. Péron Pg. 4531 Introduction Section recites the following: “The uncertainty about the system dynamics is often modeled by a finite set of scenarios (Walters and Hilborn 1976; Moore and Conroy 2006). Chadès et al. (2012) showed that this problem can be formulated as a mixed observability Markov decision process (MOMDP), a special case of POMDP (partially observable MDP). An optimal MOMDP policy accomplishes the best trade-off between informative and rewarding actions, with regard to a precise management objective (Chadès et al. 2012).”). 

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Dotan-Cohen in view of De Nijs and in further view of Hariri and in further view of Kurniawati et al. (“SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces”, hereinafter Kurniawati). 
Regarding claim 14, The Dotan-Cohen/De Nijs/Hariri Combination teaches the non-transitory computer readable storage medium of claim 10 and determining the user type for each user of the plurality of users (Hariri Pg. 43, 4.2 Context-Aware Recommendation Using Thompson Sampling section recites, in part, “In this setting, θ, which characterizes the utility distribution for each item, represents a user’s preference model. It is a k-dimensional random vector drawn from an unknown multivariate distribution. The user model is updated after each interaction.” User’s preference model (i.e., user type)). 
However, The Dotan-Cohen/De Nijs/Hariri Combination does not teach wherein determining comprises: generating a belief tree comprising a plurality of belief points, wherein a belief point comprises a probability distribution over user types; and determining the user type for each user of the plurality of users based on the probability distribution associated with a current belief point of each user. 
Kurniawati teaches wherein determining comprises: generating a belief tree comprising a plurality of belief points (Kurniawati Pg. 67, III. SARSOP Section and fig. 2 recites, in part, “Like all point-based algorithm, SARSOP samples a set of points from the belief space. The sampled points form a tree (Fig. 2). Each node of represents a sampled point.” Sampling points from a belief space that form a tree (i.e., generating a belief tree comprising belief points)), 
wherein a belief point comprises a probability distribution over user types (Kurniawati Pg. 66, II. Background Section recites, in part, “The solution to a POMDP is an optimal policy that maximizes the expected total reward. Normally, a policy is a mapping from the agent’s state to a prescribed action. However, in a POMDP, the agent’s state is partially observable and not known exactly. So we rely on the concept of beliefs. As described earlier, a belief is a probability distribution over S. A POMDP policy π: B → A maps a belief b ∈ B to a prescribed action a ∈ A.” Belief is a probability distribution over S where S is the set of states (i.e., belief point comprises a probability distribution over user types)); and 
determining the user type for each user of the plurality of users based on the probability distribution associated with a current belief point of each user (Kurniawati Pg. 66, II. Background Section recites “Each α-vector is associated with an action. The policy can be executed by selecting the action corresponding to the best α-vector at the current belief. So a policy can be represented as a set of α-vectors.” Selecting the action corresponding to the best α-vector at the current belief (i.e., based on the probability distribution associated with a current belief point)). 
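For illustration only, the policy execution quoted from Kurniawati (selecting the action of the best α-vector at the current belief) can be sketched as follows. This is a minimal sketch, not the SARSOP algorithm itself: it assumes the belief is a probability distribution over a small set of user types that do not change over time, so the belief update reduces to Bayes' rule with an observation likelihood, and the function names, action labels, and numbers are hypothetical.

```python
def best_action(belief, alpha_vectors):
    """Return the action whose alpha-vector maximizes the dot product b . alpha."""
    def score(entry):
        _, alpha = entry
        return sum(b * v for b, v in zip(belief, alpha))
    action, _ = max(alpha_vectors, key=score)
    return action


def update_belief(belief, likelihoods):
    """Bayes update b'(s) proportional to O(o | s) * b(s), assuming static user types."""
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


def most_likely_type(belief):
    """Index of the user type carrying the most probability mass."""
    return max(range(len(belief)), key=belief.__getitem__)


# Hypothetical policy over two user types, given as (action, alpha-vector) pairs.
alpha_vectors = [
    ("recommend_museum", [1.0, 0.0]),
    ("recommend_cafe", [0.0, 0.8]),
    ("generic_recommendation", [0.6, 0.6]),
]
belief = [0.5, 0.5]
first = best_action(belief, alpha_vectors)
belief = update_belief(belief, [0.9, 0.1])  # observation favoring type 0
second = best_action(belief, alpha_vectors)
```

With an uninformed belief the generic action scores highest, and once the observation shifts the belief toward type 0 the type-specific action is selected instead.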
Kurniawati and The Dotan-Cohen/De Nijs/Hariri Combination are both directed to machine learning. In view of the teachings of Kurniawati, it would have been obvious to one of ordinary skill in the art to apply the teachings of Kurniawati to The Dotan-Cohen/De Nijs/Hariri Combination before the effective filing date of the claimed invention in order to improve computational efficiency by using a point-based algorithm that exploits the notion of optimally reachable belief spaces (cf. Kurniawati Abstract and Introduction Sections recite the following: “To this end, we have developed a new point-based POMDP algorithm that exploits the notion of optimally reachable belief spaces to improve computational efficiency.” “The main idea of our algorithm is to compute successive approximations of R*(b_0) and converge to it iteratively. Since R*(b_0) is unknown in advance, the algorithm relies on heuristic exploration to sample R*(b_0) and improves sampling over time through a simple on-line learning technique. It then uses a bounding technique to avoid sampling in regions that are unlikely to be optimal and focus sampling on the region near R*(b_0), the subset of B most relevant to the POMDP solution. This leads to substantial gain in computational efficiency.”).

Allowable Subject Matter
Claims 5 and 15-16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CATHERINE F LEE whose telephone number is (571)270-7487. The examiner can normally be reached Monday thru Friday, 10:00AM-6:00PM PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/C.F.L./Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124