DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Previous Office Action vacated/withdrawn
The Office Action mailed on 09/17/2021 was inadvertently labeled as a Non-Final Office Action.
This Final Rejection replaces the Office Action mailed on 09/17/2021.

Status of Claims
The present application is being examined under the claims filed on 06/25/2021.
Claims 1, 2, 21, and 27 are amended.
Claims 8, 10, and 13-20 are cancelled.
Claims 1-7, 9, 11, 12, and 21-28 are rejected.
Claims 1-7, 9, 11, 12, and 21-28 are pending.

Drawings
	The Drawings filed on 07/06/2016 are acceptable for examination purposes.

Specification
	The Specification filed on 07/06/2016 is acceptable for examination purposes.

Response to Arguments
In reference to rejections under 35 USC § 112(a) and 35 USC § 112(b)

Examiner notes that new rejections under 35 USC § 112(b) are present below.

In reference to Prior Art
Applicant asserts that all the claims presently pending in the application are patentably distinct over the prior art of record and are in condition for allowance.
Examiner respectfully disagrees. Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant's arguments filed 06/25/2021 have been fully considered but they are not persuasive.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 2 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
In reference to claim 2. The claim recites the limitation of “wherein the user data is monitored and recorded, and includes time spent typing, time spent moussing, and time spent reading, changes in acceleration, motion, speed, anomalous movement, abuse of a device by the user, repeated attempts to change configuration, warnings by supervisors, and changes in skin luminescence”. Examiner notes that the highlighted terms are unclear. In reference to “changes in acceleration”, “motion”, and “anomalous movement”, is this referring to changes in acceleration of the user, such as walking to running? Or of a vehicle? In reference to “abuse of a device by the user”, there is no description in the Instant Specification for the term abuse; what is considered abuse of a device? In reference to “repeated attempts to change configuration”, is this referring to changes in user action (like the step/configuration of actions taken)? Or to the configuration of settings of a device used by the user?

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-7, 9, 11, 12, and 21-28 are rejected under 35 U.S.C. 103 as being unpatentable over Samantha J. Horseman (hereinafter Horseman) US 20130009993 A1 in view of Shen et al. (hereinafter Shen) “Risk-sensitive Reinforcement Learning” in view of Chintakindi et al. (hereinafter Chintakindi) US 20190027038 A1 (relying on provisional application date) in view of Samantha J. Horseman (hereinafter Horseman640) US 8872640 B2.
In reference to claim 1. Horseman teaches a risk management system, comprising:
“a processor” (Horseman in at least Fig. 13 and ¶ [0142]-[0144]);
“a memory, the memory storing instructions to cause the processor to” (Horseman in at least Fig. 13 and ¶ [0142]-[0144]):
“map user data, site data, and equipment data as well as past data from a database to an event on a site” (Horseman Fig. 15 and ¶ [0045] disclose analyzing the user data “the health data can be used to assess various biometric and biomechanic characteristics (e.g., characteristics, conditions and risks) of the employee, such as the employee's body weight, body temperature, body fat percentage, heart rate, blood pressure, blood glucose level, blood oxygenation level, body position/posture, eye fatigue, neural activity, emotions, thoughts, facial movements/expressions, motor skills, physical exertion, and the like”. ¶ [0085] discloses analyzing the site data “monitoring the health of the employee while they work in or travel between various work environments. For example, system 600 may enable the collection of health data while the employee is working in the field (e.g., on worksite such as an oil and gas production platform, a manufacturing plant, a refinery, a construction site, and/or the like), when they are situated in a workstation (e.g., an employee's office employee's office, cubicle, assigned station on an assembly/manufacturing line, or the like), and/or when they are traveling (e.g. traveling between worksites, driving a delivery truck, and/or the like)”. ¶ [0107] discloses analyzing the equipment data “a positioning device is provided in the employee's chair, boots, work gloves, helmet, elbow pads, knee pads, and/or belt, body position data 700f may include signals and/or coordinates indicative of the location of each of the positioning devices such that a location of the employee's hands, health alerts 208 may be based on the collected health data, the health profile for the employee, actions determined to have been taken by the employee, predicted actions expected to be taken by the employee, and corresponding consequences related thereto. For example, where it has been determined that the user has lifted a heavy object based on the health data collected (e.g., based on the force data acquired via force sensors integrated into the employee's work gloves and/or work boots) and the health consequence of a lower back injury is associated with lifting heavy objects […]”. Fig. 15 discloses at least 2 circuits capable of performing the functions claimed, these are the mobile devices 122 and employee computers 630. Figs. 8 and 9 illustrate the components of the mobile devices and employee computers. Fig. 10, ¶ [0113], and ¶ [0138] disclose that the collected health data (user data, site data, and equipment data) is stored together and, as mentioned previously, the health data is used to map behavior types to events. At least ¶ [0152] and ¶ [0188] disclose using past data from the database);
“determine a relationship between the mapped data and the event based on behaviors exhibited by the user and an impact on a performance factor and a risk factor” (Horseman ¶ [0043] “the health alert includes information to encourage the employee to take actions that improve the employee's health and/or to discourage actions that may have a negative impact on the employee's health […] help to prevent the employee from engaging in actions that may have a negative impact on their health” and ¶ [0056] “health information that encourages an employee to engage in actions that have a positive impact on their health”. These cited sections disclose the relationship between the mapped data and the event based on the user behaviors and an impact on performance and risk),
“wherein an action to collectively change an input into a future of the user data, the site data, and the equipment data in concert is recommended […] to achieve the overall site productivity to meet a production outcome by changing an activity pertaining to the user data, the site data, and the equipment data […]” (Horseman in at least ¶ [0004], ¶ [0064], ¶ [0077], and ¶ [0079] discloses feedback to the user to dynamically adjust their actions; dynamically adjusting their actions changes all the data collectively since all the data is related to each other. These sections disclose meeting a production outcome because once the user adjusts their actions based on the feedback, the user achieves the overall site productivity instructed, thereby meeting the production outcome. Examiner notes that the user data, site data, and equipment data are all related to each other; the user data comprises user behavior, the site data comprises interactions between users and equipment, and equipment data comprises the data gathered from the equipment sensors as mentioned above. Examiner notes that by changing the activity pertaining to the user data (i.e. user performing an action), the site data (i.e. user interaction with equipment) and the equipment data (i.e. data from equipment sensors) change too),

Horseman does not explicitly disclose:
“use reinforcement learning via a machine learning algorithm to learn the performance factor to the risk factor ratio to change an overall site productivity, the reinforcement learning determining the change to the performance factor to the risk ratio through equipment operator modelling by modelling: the equipment operation by the user; user profiling behaviors; and actions of users based on the relationships determined”,
“wherein an action […] is recommended based on a result of the reinforcement learning to achieve the overall site productivity […] such that the relationship between the mapped data and the event change the performance factor to risk factor ratio”,
However, Shen discloses:
“use reinforcement learning via a machine learning algorithm to learn the performance factor to the risk factor ratio to change an overall site productivity, the reinforcement learning determining the change to the performance factor to the risk ratio through equipment operator modelling by modelling: the equipment operation by the user; user profiling behaviors; and actions of users based on the relationships determined” (Shen § 1, § 3, and § 4.2 disclose learning an action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity), this is based on the policies selected “Risk arises from the uncertainties associated with future events, and is inevitable since the consequences of actions are uncertain at the time when a decision is made. Hence, risk has to be taken into account by the decision-maker, consciously or unconsciously”, “In the context of sequential or multistage decision-making problems, reinforcement learning (RL, Sutton and Barto, 1998) follows this line of thought. It describes how an agent ought to take actions that maximize expected cumulative rewards in an environment typically described by a Markov decision process (MDP, Puterman, 1994)”, and “The optimal policy within a time horizon T is obtained by maximizing the expectation of the discounted cumulative rewards”. These sections describe multiple types of policies including “An economically rational decision-making rule, which is risk-neutral, is to select the alternative with the highest expected reward”, “Besides risk-neutral policies, risk-averse policies, which accept a choice with a more certain but possibly lower expected reward”, and “risk-seeking policies, which prefer a choice with less certain but possibly high reward, are considered economically irrational”.
Examiner notes that for examination purposes, the 2 terms “user profiling behaviors” and “actions of users” will be interpreted under the broadest reasonable interpretation to be any equipment operation by the user. Given this interpretation, the cited sections above clearly disclose optimizing using reinforcement learning through the actions taken (equipment operation by the user) which result in the optimal policy. Examiner notes that reinforcement learning algorithms are defined to optimize (i.e. change) the reward gained, the reward is based on the ratio between action taken (i.e. performance factor) and risk taken (i.e. risk factor)),
“wherein an action […] is recommended based on a result of the reinforcement learning to achieve the overall site productivity […] such that the relationship between the mapped data and the event change the performance factor to risk factor ratio” (Shen § 3, § 4.2, and § 5 disclose using a Markov process to determine the optimal policy and recommending an action based on the policy “The optimal policy within a time horizon T is obtained by maximizing the expectation of the discounted cumulative rewards …”. Examiner notes that reinforcement learning algorithms are defined to optimize (i.e. change) the reward gained, the reward is based on the ratio between action taken (i.e. performance factor) and risk taken (i.e. risk factor)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman and Shen. Horseman teaches methods for providing feedback of health information to an employee when the employee is engaged in their work duties. Shen teaches a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and proves its convergence. One of ordinary skill would have motivation to modify the system of Horseman by adding the human behavior modeling of Shen to provide the optimal actions that maximize expected overall productivity. MPEP 2143 sets forth the Supreme Court rationales for obviousness.
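For illustration only, the kind of risk-sensitive learning relied upon above can be sketched in a few lines of Python. This sketch is not reproduced from Shen or from the Instant Specification. The piecewise-linear utility applied to the temporal-difference error, the parameter kappa, the two-action task in step, and all reward values are assumptions chosen for the example.

```python
import random


def utility(td_error, kappa):
    """Assumed piecewise-linear utility: kappa > 0 weights losses more (risk-averse)."""
    if td_error > 0:
        return (1.0 - kappa) * td_error
    return (1.0 + kappa) * td_error


def step(action, rng):
    """Hypothetical one-step task: action 0 is safe, action 1 is risky."""
    if action == 0:
        return 1.0  # certain reward
    # Risky action: higher average reward (1.5) but uncertain outcome.
    return 4.0 if rng.random() < 0.5 else -1.0


def learn_q(kappa, episodes=20000, seed=0):
    """Tabular Q-learning in which the utility is applied to the TD error."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    visits = [0, 0]
    for _ in range(episodes):
        action = rng.randrange(2)         # explore both actions uniformly
        reward = step(action, rng)
        visits[action] += 1
        alpha = 1.0 / visits[action]      # decaying step size per action
        td_error = reward - q[action]     # episode ends after one step, so no bootstrap term
        q[action] += alpha * utility(td_error, kappa)
    return q


def greedy_action(q):
    """Return the index of the action with the highest learned value."""
    return max(range(len(q)), key=lambda a: q[a])
```

Under these assumptions, a risk-neutral learner (kappa = 0) comes to prefer the risky action, whose average reward is higher, while a risk-averse learner (kappa = 0.5) comes to prefer the safe action.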

Horseman and Shen do not explicitly disclose:
wherein the equipment data comprises:
“a type of equipment”; and
“a risk associated with a hazard associated with the type of equipment”; and
However, Chintakindi discloses:
wherein the equipment data comprises:
“a type of equipment” (Chintakindi ¶ [0018] and ¶ [0048] “type, amount, and cost of vehicle”);
“a risk associated with a hazard associated with the type of equipment” (Chintakindi “determine which risk factors on road segments can impact a vehicle and occupants included therein. The system may determine (e.g., quantify and/or create) a probability of an adverse event occurring. The probability may be range bound. The server may determine the potential cost ( e.g., in dollars) of an adverse event, such as an accident”);


Horseman, Shen, and Chintakindi do not explicitly disclose:
“wherein the user data comprises a cognitive state of the user including a distraction level and a fatigue level”.
However, Horseman640 discloses:
“wherein the user data comprises a cognitive state of the user including a distraction level and a fatigue level” (Horseman640 in at least Col. 15 lines 5-25 and Col. 24 line 50 to col. 25 line 16 “a plurality of neural sensor (e.g., sixteen neural sensors/channels) may be disposed about the employee's scalp to detect neuro-signals (e.g., including alpha, beta, gamma, and delta waves) that can be used to determine the employee's brain state, including their emotional state (e.g., distracted, angry happy, sad, excited, etc.), thoughts ( e.g., cognitive thoughts, subconscious thoughts, intent, etc.), facial movements (e.g., facial expressions), fatigued/tired (e.g., suffering from sleep deprivation), and/or the like” and “process driver status data 112 to determine whether the driver is distracted, fatigued, has fallen asleep, is suffering a stroke/heart-attack, and/or the like”. Examiner notes that other sections not cited are also relevant to this limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman, Shen, Chintakindi, and Horseman640. Horseman teaches methods for providing feedback of health information to an employee when the employee is engaged in their work duties (Examiner notes that Horseman is incorporated by reference in its entirety in Horseman640). Shen teaches a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and proves its convergence. Chintakindi teaches a vehicle control computer able to self-improve and provide better vehicle responses to future adverse driving events. Horseman640 teaches methods for providing feedback of health information to a driver when driving a vehicle (Examiner notes that Horseman640 is incorporated by reference in its entirety in Horseman). One of ordinary skill would have motivation to combine Horseman, Shen, Chintakindi, and Horseman640 because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 3. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein:
Shen further discloses:
“the reinforcement learning optimizes the performance factor to the risk factor ratio […]” (Shen § 1, § 3, and § 4.2 disclose a reinforcement learning model which learns an action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity)).

Horseman further discloses:
“[…] optimizes the performance factor to the risk factor ratio through the operator modelling, user profiling behaviors, and actions of users based on the relationship determined” (Horseman ¶ [0043] “the health alert includes information to encourage the employee to take actions that improve the employee's health and/or to discourage actions that may have a negative impact on the employee's health […] help to prevent the employee from engaging in actions that may have a negative impact on their health” and ¶ [0056] “health information that encourages an employee to engage in actions that have a positive impact on their health”, these cited sections disclose the relationship between the mapped data and event based on the user behaviors and an impact on a performance and risk. Examiner notes that performance and risk are correlated to each other, for example, ignoring the alerts increases the chances of injury (increases risk) and decreases the chances of performance (performance decreases with injury), the overall productivity of the employee is increased by listening to alerts (reducing risk and increasing performance). Fig. 15 discloses at least 2 circuits capable of performing the functions claimed, these are the mobile devices 122 and employee computers 630. Figs. 8 and 9 illustrate the components of the mobile devices and employee computers).

In reference to claim 4. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein the memory further stores instructions to cause the processor to:
Shen further discloses:
“determine a set of stationary policies regarding an action and an expected productivity increase of the action […], in order to optimize a value function having the performance factor to the risk factor ratio as an output of the value function” (Shen § 1, § 3, and § 4.2 disclose creating a policy (stationary policies) for the agent (user) to follow, particularly see Algorithm 1 in pg. 10. As mentioned above the policy selected optimizes the action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity). Theorem 3.1 discloses optimizing the Q-values (value function)),
“the value function finds a policy that maximizes a return by maintaining a set of estimates of expected returns for either a current policy or an optimal policy, and optimizing the value function relies on a Markov Decision Process (MDPs), where optimality is defined by stronger than the stationary policy” (Shen in at least § 1, § 3, and § 4.2 discloses the Markov Decision Process, which finds a policy that maximizes a return by maintaining a set of estimates of expected returns for either a current policy or an optimal policy).

Horseman further discloses:
“[…] given a real-time relationship of the user data, the site data, and the equipment data […]” (Horseman ¶ [0044] “health information provides real-time feedback to the employee regarding their health”, ¶ [0045] “the health data may be indicative of the employee's health and actions while the employee is engaged in their day-to-day work activities and may enable monitoring of dynamic/real-time changes in the employee's health and actions throughout the workday”, and as mentioned above Fig. 15, ¶ [0045], ¶ [0085], and ¶ [0107] disclose the relationship of the user data, the site data, and the equipment data),

In reference to claim 5. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein the memory further stores instructions to cause the processor to:
Shen further discloses:
“utilize a Markov Decision Process as part of the machine learning algorithm in the reinforcement learning” (Shen § 1, § 3, and § 4.2 disclose the Markov Decision Process in the reinforcement learning. See Figure 3(b) in pg. 13).

In reference to claim 6. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein:
Shen further discloses:
“the performance factor to the risk factor ratio is learned based on the risk factor being greater than a predetermined risk tolerance level” (Shen § 1, § 3, and § 4.2 disclose multiple types of policies. Table 1 in page 14 discloses the predetermined risk tolerance levels for each of the policies. In Table 1 you will find the breakdown of each policy (risk-averse, risk-neutral, and risk-seeking)).

In reference to claim 7. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein:
Shen further discloses:
“the performance factor to the risk factor ratio is dynamically changed […]” (Shen § 1, § 3, and § 4.2 disclose “A risk-sensitive objective was derived and optimized by value iteration or dynamic programming”).

Horseman further discloses:
“[…] a real-time feed” (Horseman ¶ [0044] “health information provides real-time feedback to the employee regarding their health”, ¶ [0045] “the health data may be indicative of the employee's health and actions while the employee is engaged in their day-to-day work activities and may enable monitoring of dynamic/real-time changes in the employee's health and actions throughout the workday”. Fig. 15 discloses at least 2 circuits capable of performing the functions claimed, these are the mobile devices 122 and employee computers 630. Figs. 8 and 9 illustrate the components of the mobile devices and employee computers),

In reference to claim 9. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein the site data comprises:
Horseman further discloses:
“an interaction between a user and the equipment including a positional overlap with other equipment” (Horseman Fig. 15, ¶ [0086], ¶ [0107], and ¶ [0185] disclose the interaction between a user and the equipment “may be determined that the employee is at risk of a back injury, neck injury, rotator cuff injury, and/or physical fatigue may be based on the employee's high level of physical exertion (e.g., lifting above a predetermined threshold of 25 kg (55 lbs.)) using poor posture/body position (e.g., bending at the back as opposed to the knees)”);
“a risk of a positon of the user or the user and the equipment on the site” (Horseman Fig. 15, ¶ [0086], ¶ [0107], and ¶ [0185] disclose the risk of a position of the user or the user and the equipment on the site “may be determined that the employee is at risk of a back injury, neck injury, rotator cuff injury, and/or physical fatigue may be based on the employee's high level of physical exertion (e.g., lifting above a predetermined threshold of 25 kg (55 lbs.)) using poor posture/body position (e.g., bending at the back as opposed to the knees)”);
“a risk created by a site environment change” (¶ [0085] discloses monitoring the risk in different environments “monitoring the health of the employee while they work in or travel between various work environments. For example, system 600 may enable the collection of health data while the employee is working in the field (e.g., on worksite such as an oil and gas production platform, a manufacturing plant, a refinery, a construction site, and/or the like), when they are situated in a workstation (e.g., an employee's office employee's office, cubicle, assigned station on an assembly/manufacturing line, or the like), and/or when they are traveling (e.g. traveling between worksites, driving a delivery truck, and/or the like)”).

In reference to claim 11. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein:
Horseman further discloses:
“the user data, site data, and equipment data comprise a real-time feed” (Horseman ¶ [0044] “health information provides real-time feedback to the employee regarding their health”, ¶ [0045] “the health data may be indicative of the employee's health and actions while the employee is engaged in their day-to-day work activities and may enable monitoring of dynamic/real-time changes in the employee's health and actions throughout the workday”, and as mentioned above Fig. 15, ¶ [0045], ¶ [0085], and ¶ [0107] disclose the user data, the site data, and the equipment data),
“wherein the prior behavior types are stored in a database” (Horseman ¶ [0139] discloses the database, and ¶ [0152] discloses the historical data “the logged health data 700 may be used to generate health profiles and/or reports that are based on current/recent health data 700 (e.g., health data 700 collected within a minute, hour, day, week, month, or the like) and/or historical health data 700 (e.g., health data 700 collected more than a minute, hour, day, week, moth, year, or the like prior) […] health information 609 may include, for each employee, employee personal profile data (e.g., name, age, etc.), historical/current employee health profile data (e.g., health data, characteristics, conditions, plans) and/or employee activity data (e.g., a log of exercises, food consumed, etc.), and so forth”).

In reference to claim 12. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein the prior behavior types comprise:
Horseman does not explicitly disclose:
“a prior accident severity”;
“a prior accident outcome”;
“a type of accident”; 
“a prior performance rate”.
However, Chintakindi discloses:
“a prior accident severity” (Chintakindi ¶ [0049] “the potential evasive maneuvers may be made by comparing and/or matching the above-mentioned data (e.g., direct involvement, peripheral involvement, environmental, and/or vehicle performance and/or operational) to prior incident data stored in historical data source server 230 and/or multi-dimensional risk score generation server 250 to determine a set of maneuvers of known outcomes” and ¶ [0048] “type, amount, and severity of injuries”. These sections disclose maintaining a historical data source server which contains prior incident data, incident data includes the type, amount, and severity of injuries);
“a prior accident outcome” (Chintakindi ¶ [0049] “the potential evasive maneuvers may be made by comparing and/or matching the above-mentioned data (e.g., direct involvement, peripheral involvement, environmental, and/or vehicle performance and/or operational) to prior incident data stored in historical data source server 230 and/or multi-dimensional risk score generation server 250 to determine a set of maneuvers of known outcomes” and ¶ [0048] “type, amount, and severity of injuries”. These sections disclose maintaining a historical data source server which contains prior incident data, incident data includes the outcome of the incident);
“a type of accident” (Chintakindi ¶ [0049] “the potential evasive maneuvers may be made by comparing and/or matching the above-mentioned data (e.g., direct involvement, peripheral involvement, environmental, and/or vehicle performance and/or operational) to prior incident data stored in historical data source server 230 and/or multi-dimensional risk score generation server 250 to determine a set of maneuvers of known outcomes” and ¶ [0048] “type, amount, and severity of injuries”. These sections disclose maintaining a historical data source server which contains prior incident data, incident data includes the type of incident); 
“a prior performance rate” (Chintakindi ¶ [0029] “vehicle control computer 214 may be configured to receive, analyze, and act upon vehicle performance and operational data and environmental surroundings and conditions data provided by vehicle sensors 211”, ¶ [0033] “historical vehicle operation and performance data”, ¶ [0049] “the potential evasive maneuvers may be made by comparing and/or matching the above-mentioned data (e.g., direct involvement, peripheral involvement, environmental, and/or vehicle performance and/or operational) to prior incident data stored in historical data source server 230 and/or multi-dimensional risk score generation server 250”. These sections disclose the prior performance measurements of the equipment (vehicle), as mentioned above performance and risk are correlated to each other, for example, ignoring the alerts increases the chances of injury (increases risk) and decreases the chances of performance (performance decreases with injury), the overall productivity of the employee is increased by listening to alerts (reducing risk and increasing performance)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman, Shen, and Chintakindi. Horseman teaches methods for providing feedback of health information to an employee when the employee is engaged in their work duties. Shen teaches a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and proves its convergence. Chintakindi teaches a vehicle control computer able to self-improve and provide better vehicle responses to future adverse driving events.

In reference to claim 21. Horseman teaches a risk management method, comprising:
“mapping user data, site data, and equipment data as well as past data from the database to an event on a site” (Horseman Fig. 15 and ¶ [0045] disclose analyzing the user data “the health data can be used to assess various biometric and biomechanic characteristics (e.g., characteristics, conditions and risks) of the employee, such as the employee's body weight, body temperature, body fat percentage, heart rate, blood pressure, blood glucose level, blood oxygenation level, body position/posture, eye fatigue, neural activity, emotions, thoughts, facial movements/expressions, motor skills, physical exertion, and the like”. ¶ [0085] discloses analyzing the site data “monitoring the health of the employee while they work in or travel between various work environments. For example, system 600 may enable the collection of health data while the employee is working in the field (e.g., on worksite such as an oil and gas production platform, a manufacturing plant, a refinery, a construction site, and/or the like), when they are situated in a workstation (e.g., an employee's office employee's office, cubicle, assigned station on an assembly/manufacturing line, or the like), and/or when they are traveling (e.g. traveling between worksites, driving a delivery truck, and/or the like)”. ¶ [0107] discloses analyzing the equipment data “a positioning device is provided in the employee's chair, boots, work gloves, helmet, elbow pads, knee pads, and/or belt, body position data 700f may include signals and/or coordinates indicative of the location of each of the positioning devices such that a location of the employee's hands, health alerts 208 may be based on the collected health data, the health profile for the employee, actions determined to have been taken by the employee, predicted actions expected to be taken by the employee, and corresponding consequences related thereto. 
For example, where it has been determined that the user has lifted a heavy object based on the health data collected (e.g., based on the force data acquired via force sensors integrated into the employee's work gloves and/or work boots) and the health consequence of a lower back injury is associated with lifting heavy objects […]”. Fig. 15 discloses at least 2 circuits capable of performing the functions claimed: the mobile devices 122 and employee computers 630. Figs. 8 and 9 illustrate the components of the mobile devices and employee computers. Fig. 10, ¶ [0113], and ¶ [0138] disclose that the collected health data (user data, site data, and equipment data) is stored together and, as mentioned previously, the health data is used to map behavior types to events. At least ¶ [0152] and ¶ [0188] disclose using past data from the database);
“determining a relationship between the mapped data and the event based on behaviors exhibited by the user and an impact on a performance factor and a risk factor” (Horseman ¶ [0043] “the health alert includes information to encourage the employee to take actions that improve the employee's health and/or to discourage actions that may have a negative impact on the employee's health […] help to prevent the employee from engaging in actions that may have a negative impact on their health” and ¶ [0056] “health information that encourages an employee to engage in actions that have a positive impact on their health”, these cited sections disclose the relationship between the mapped data and event based on the user behaviors and an impact on a performance and risk. Examiner notes that performance and risk are correlated to each other, for example ignoring the alerts this 
“recommending an action to collectively change an input into a future of the user data, the site data, and the equipment data in concert […] to achieve the overall site productivity to meet a production outcome by changing an activity pertaining to the user data, the site data, and the equipment data […]” (Horseman in at least ¶ [0004], ¶ [0064], ¶ [0077], and ¶ [0079] discloses feedback to the user to dynamically adjust their actions; dynamically adjusting their actions changes all the data collectively since all the data is related to each other. These sections disclose meeting a production outcome because once the user adjusts their actions based on the feedback, the user achieves the overall site productivity instructed, thereby meeting the production outcome. Examiner notes that the user data, site data, and equipment data are all related to each other; the user data comprises user behavior, the site data comprises interactions between users and equipment, and equipment data comprises the data gathered from the equipment sensors as mentioned above. Examiner notes that by changing the activity pertaining to the user data (i.e. user performing an action), the site data (i.e. user interaction with equipment) and the equipment data (i.e. data from equipment sensors) change too),

Horseman does not explicitly disclose:
“using reinforcement learning via a machine learning algorithm to learn the performance factor to the risk factor ratio to change an overall site productivity, the reinforcement learning determining the change to the performance factor to the risk ratio through equipment operator modelling by modelling: the equipment operation by the user; user profiling behaviors; and actions of users based on the relationships determined”,
“wherein an action […] is recommended based on a result of the reinforcement learning to achieve the overall site productivity […] such that the relationship between the mapped data and the event change the performance factor to risk factor ratio”.
However, Shen discloses:
“using reinforcement learning via a machine learning algorithm to learn the performance factor to the risk factor ratio to change an overall site productivity, the reinforcement learning determining the change to the performance factor to the risk ratio through equipment operator modelling by modelling: the equipment operation by the user; user profiling behaviors; and actions of users based on the relationships determined” (Shen § 1, § 3, and § 4.2 disclose learning an action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity); this is based on the policies selected: “Risk arises from the uncertainties associated with future events, and is inevitable since the consequences of actions are uncertain at the time when a decision is made. Hence, risk has to be taken into account by the decision-maker, consciously or unconsciously”, “In the context of sequential or multistage decision-making problems, reinforcement learning (RL, Sutton and Barto, 1998) follows this line of thought. It describes how an agent ought to take actions that maximize expected cumulative rewards in an environment typically described by a Markov decision process (MDP, Puterman, 1994)”, and “The optimal policy within a time horizon T is obtained by maximizing the expectation of the discounted cumulative rewards”. These sections describe multiple types of policies including “An economically rational decision-making rule, which is risk-neutral, is to select the alternative with the highest expected reward”, “Besides risk-neutral policies, risk-averse policies, which accept a choice with a more certain but possibly lower expected reward”, and “risk-seeking policies, which prefer a choice with less certain but possibly high reward, are considered economically irrational”. 
Examiner notes that for examination purposes, the 2 terms “user profiling behaviors” and “actions of users” will be interpreted under the broadest reasonable interpretation to be any equipment operation by the user. Given this interpretation, the cited sections above clearly disclose optimizing using reinforcement learning through the actions taken (equipment operation by the user), which result in the optimal policy. Examiner notes that reinforcement learning algorithms are defined to optimize (i.e. change) the reward gained; the reward is based on the ratio between action taken (i.e. performance factor) and risk taken (i.e. risk factor)),
“wherein an action […] is recommended based on a result of the reinforcement learning to achieve the overall site productivity […] such that the relationship between the mapped data and the event change the performance factor to risk factor ratio” (Shen § 3, § 4.2, and § 5 disclose using a Markov process to determine the optimal policy and recommending an action based on the policy: “The optimal policy within a time horizon T is obtained by maximizing the expectation of the discounted cumulative rewards …”. Examiner notes that reinforcement learning algorithms are defined to optimize (i.e. change) the reward gained; the reward is based on the ratio between action taken (i.e. performance factor) and risk taken (i.e. risk factor). The relationship between the mapped data and the event changes the reward (i.e. the performance factor to risk factor ratio)).
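
For illustration only (not taken from Shen or from the application), the quoted objective of “maximizing the expectation of the discounted cumulative rewards” can be sketched as follows. The function names, the discount values, and the sample reward sequences are assumptions chosen for this sketch, and the expectation is approximated by a plain average over sample trajectories.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over one finite reward sequence."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


def best_policy(trajectories_by_policy, gamma=0.9):
    """Pick the policy name whose sample trajectories have the highest
    average discounted return (a stand-in for the expectation)."""
    def mean_return(trajectories):
        returns = [discounted_return(t, gamma) for t in trajectories]
        return sum(returns) / len(returns)

    return max(
        trajectories_by_policy,
        key=lambda name: mean_return(trajectories_by_policy[name]),
    )


# Hypothetical reward sequences for two hypothetical policies.
samples = {
    "delayed": [[0.0, 0.0, 10.0]],   # large reward, but late
    "immediate": [[5.0, 0.0, 0.0]],  # smaller reward, but early
}
patient_choice = best_policy(samples, gamma=0.9)
impatient_choice = best_policy(samples, gamma=0.5)
```

Under these assumed inputs, the delayed policy is preferred when the discount factor is close to 1, and the immediate policy is preferred when the discount factor is small.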


Horseman and Shen do not explicitly disclose:
wherein the equipment data comprises:
“a type of equipment”; and
“a risk associated with a hazard associated with the type of equipment”; and
However, Chintakindi discloses:
wherein the equipment data comprises:
“a type of equipment” (Chintakindi ¶ [0018] and ¶ [0048] “type, amount, and cost of vehicle”);
“a risk associated with a hazard associated with the type of equipment” (Chintakindi “determine which risk factors on road segments can impact a vehicle and occupants included therein. The system may determine (e.g., quantify and/or create) a probability of an adverse event occurring. The probability may be range bound. The server may determine the potential cost ( e.g., in dollars) of an adverse event, such as an accident”);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman, Shen, and Chintakindi. Horseman teaches methods for 

Horseman, Shen, and Chintakindi do not explicitly disclose:
“wherein the user data comprises a cognitive state of the user including a distraction level and a fatigue level”.
However, Horseman640 discloses:
“wherein the user data comprises a cognitive state of the user including a distraction level and a fatigue level” (Horseman640 in at least Col. 15 lines 5-25 and Col. 24 line 50 to col. 25 line 16 “a plurality of neural sensor (e.g., sixteen neural sensors/channels) may be disposed about the employee's scalp to detect neuro-signals (e.g., including alpha, beta, gamma, and delta waves) that can be used to determine the employee's brain state, including their emotional state (e.g., distracted, angry happy, sad, excited, etc.), thoughts ( e.g., cognitive thoughts, subconscious thoughts, intent, etc.), facial movements (e.g., facial expressions), motor functions and/or the like. Such data can be used to determine wither the driver is fatigued/tired (e.g., suffering from sleep deprivation), and/or the like” and “process driver distracted, fatigued, has fallen asleep, is suffering a stroke/heart-attack, and/or the like”. Examiner notes that other sections not cited are also relevant to this limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman, Shen, Chintakindi, and Horseman640. Horseman teaches methods for providing feedback of health information to an employee when the employee is engaged in their work duties (examiner notes that Horseman incorporates by reference in its entirety in Horseman640). Shen teaches a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and proves its convergence. Chintakindi teaches a vehicle control computer able to self-improve and provide better vehicle responses to future adverse driving events. Horseman640 teaches methods for providing feedback of health information to a driver when driving a vehicle (examiner notes that Horseman640 incorporates by reference in its entirety in Horseman). One of ordinary skill would have motivation to combine Horseman, Shen, Chintakindi, and Horseman640 because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 22. Horseman, Shen, Chintakindi, and Horseman640 teach the method of claim 21 (as mentioned above), further comprising:
Shen further discloses:
“creating a schedule of actions for the user to follow based on the schedule adhering to the changed performance factor to risk factor ratio learned by the learning” (Shen § 1, § 3, and § 4.2 disclose creating a policy (schedule of actions) for the agent (user) to follow; see particularly Algorithm 1 on pg. 10. As mentioned above, the policy selected optimizes the action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity)).

In reference to claim 23. Horseman, Shen, Chintakindi, and Horseman640 teach the method of claim 21 (as mentioned above), wherein:
Shen further discloses:
“the learning uses the reinforcement learning to optimize the performance factor to the risk factor ratio […]” (Shen § 1, § 3, and § 4.2 disclose a reinforcement learning model which learns an action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity)).

Horseman further discloses:
“[…] change the performance factor to the risk factor ratio through equipment operator modelling, user profiling behaviors, and actions of users based on the relationship determined by the determining” (Horseman ¶ [0043] “the health alert includes information to encourage the employee to take actions that improve the employee's health and/or to discourage actions that may have a negative impact on the employee's health […] help to prevent the employee from engaging in actions that may have a negative impact on their health” and ¶ [0056] “health information that encourages an employee to engage in actions that have a positive impact on their health”, these cited sections disclose the relationship 

In reference to claim 24. Horseman, Shen, Chintakindi, and Horseman640 teach the method of claim 21 (as mentioned above), wherein:
Shen further discloses:
“the learning determines a set of stationary policies regarding an action and an expected productivity increase of the action […], in order to change a value function having the performance factor to the risk factor ratio as an output of the value function” (Shen § 1, § 3, and § 4.2 disclose creating a policy (stationary policies) for the agent (user) to follow, particularly see Algorithm 1 in pg. 10. As mentioned above the policy selected optimizes the action (performance factor) to risk (risk factor) relationship to maximizing the expected 

Horseman further discloses:
“[…] given a real-time relationship of the user data, the site data, and the equipment data […]” (Horseman ¶ [0044] “health information provides real-time feedback to the employee regarding their health”, ¶ [0045] “the health data may be indicative of the employee's health and actions while the employee is engaged in their day-to-day work activities and may enable monitoring of dynamic/real-time changes in the employee's health and actions throughout the workday”, and as mentioned above Fig. 15, ¶ [0045], ¶ [0085], and ¶ [0107] disclose the relationship of the user data, the site data, and the equipment data),

In reference to claim 25. Horseman, Shen, Chintakindi, and Horseman640 teach the method of claim 21 (as mentioned above), wherein:
Shen further discloses:
“the learning utilizes a Markov Decision Process as a part of the machine learning algorithm in the reinforcement learning” (Shen § 1, § 3, and § 4.2 disclose the Markov Decision Process in the reinforcement learning. See Figure 3(b) in pg. 13).
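
A minimal sketch of how a Markov Decision Process can sit inside a reinforcement learning loop is given below for illustration. It uses standard risk-neutral tabular Q-learning, not Shen's risk-sensitive variant or Algorithm 1, and the three-state chain, the rewards, and the hyperparameters are all hypothetical.

```python
import random

# Hypothetical 3-state chain MDP (not from Shen or the application):
# states 0, 1, 2, where state 2 is terminal. Actions: 0 = stay, 1 = advance.
N_STATES, N_ACTIONS, TERMINAL = 3, 2, 2


def step(state, action):
    """Return (next_state, reward) for the toy MDP."""
    if action == 1:
        next_state = state + 1
        reward = 1.0 if next_state == TERMINAL else 0.0
        return next_state, reward
    return state, 0.0


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap the episode length
            if state == TERMINAL:
                break
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[state][a])
            next_state, reward = step(state, action)
            # No bootstrapping from the terminal state.
            future = 0.0 if next_state == TERMINAL else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-valued action for each non-terminal state."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(TERMINAL)]


q_table = q_learning()
policy = greedy_policy(q_table)
```

In this toy setting the learned greedy policy is to advance from both non-terminal states, since only reaching the terminal state yields a reward.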

In reference to claim 26. Horseman, Shen, Chintakindi, and Horseman640 teach the method of claim 21 (as mentioned above), wherein:
Shen further discloses:
“the performance factor to the risk factor ratio is learned based on the risk factor being greater than a predetermined risk tolerance level” (Shen § 1, § 3, and § 4.2 disclose multiple 

In reference to claim 27. Horseman teaches a non-transitory computer-readable recording medium recording a risk management program, the program causing a computer to perform:
“mapping user data, site data, and equipment data as well as past data from the database to an event on a site” (Horseman Fig. 15 and ¶ [0045] disclose analyzing the user data “the health data can be used to assess various biometric and biomechanic characteristics (e.g., characteristics, conditions and risks) of the employee, such as the employee's body weight, body temperature, body fat percentage, heart rate, blood pressure, blood glucose level, blood oxygenation level, body position/posture, eye fatigue, neural activity, emotions, thoughts, facial movements/expressions, motor skills, physical exertion, and the like”. ¶ [0085] discloses analyzing the site data “monitoring the health of the employee while they work in or travel between various work environments. For example, system 600 may enable the collection of health data while the employee is working in the field (e.g., on worksite such as an oil and gas production platform, a manufacturing plant, a refinery, a construction site, and/or the like), when they are situated in a workstation (e.g., an employee's office employee's office, cubicle, assigned station on an assembly/manufacturing line, or the like), and/or when they are traveling (e.g. traveling between worksites, driving a delivery truck, and/or the like)”. ¶ [0107] discloses analyzing the equipment data “a positioning device is provided in the employee's chair, boots, work gloves, helmet, elbow pads, knee pads, and/or belt, body position data 700f may include signals and/or coordinates indicative of the location of each of the positioning devices such that a location of the employee's hands, health alerts 208 may be based on the collected health data, the health profile for the employee, actions determined to have been taken by the employee, predicted actions expected to be taken by the employee, and corresponding consequences related thereto. 
For example, where it has been determined that the user has lifted a heavy object based on the health data collected (e.g., based on the force data acquired via force sensors integrated into the employee's work gloves and/or work boots) and the health consequence of a lower back injury is associated with lifting heavy objects […]”. Fig. 15 discloses at least 2 circuits capable of performing the functions claimed: the mobile devices 122 and employee computers 630. Figs. 8 and 9 illustrate the components of the mobile devices and employee computers. Fig. 10, ¶ [0113], and ¶ [0138] disclose that the collected health data (user data, site data, and equipment data) is stored together and, as mentioned previously, the health data is used to map behavior types to events. At least ¶ [0152] and ¶ [0188] disclose using past data from the database);
“determining a relationship between the mapped data and the event based on behaviors exhibited by the user and an impact on a performance factor and a risk factor” (Horseman ¶ [0043] “the health alert includes information to encourage the employee to take actions that improve the employee's health and/or to discourage actions that may have a negative impact on the employee's health […] help to prevent the employee from engaging in actions that may have a negative impact on their health” and ¶ [0056] “health information that encourages an employee to engage in actions that have a positive impact on their health”, these cited sections disclose the relationship between the mapped data and event based on the user behaviors and an impact on a performance and risk. Examiner notes that performance and risk are correlated to each other, for example ignoring the alerts this 
“recommending an action to collectively change an input into a future of the user data, the site data, and the equipment data in concert […] to achieve the overall site productivity to meet a production outcome by changing an activity pertaining to the user data, the site data, and the equipment data […]” (Horseman in at least ¶ [0004], ¶ [0064], ¶ [0077], and ¶ [0079] discloses feedback to the user to dynamically adjust their actions; dynamically adjusting their actions changes all the data collectively since all the data is related to each other. These sections disclose meeting a production outcome because once the user adjusts their actions based on the feedback, the user achieves the overall site productivity instructed, thereby meeting the production outcome. Examiner notes that the user data, site data, and equipment data are all related to each other; the user data comprises user behavior, the site data comprises interactions between users and equipment, and equipment data comprises the data gathered from the equipment sensors as mentioned above. Examiner notes that by changing the activity pertaining to the user data (i.e. user performing an action), the site data (i.e. user interaction with equipment) and the equipment data (i.e. data from equipment sensors) change too),

Horseman does not explicitly disclose:
“using reinforcement learning via a machine learning algorithm to learn the performance factor to the risk factor ratio to change an overall site productivity, the reinforcement learning determining the change to the performance factor to the risk ratio through equipment operator modelling by modelling: the equipment operation by the user; user profiling behaviors; and actions of users based on the relationships determined”,
“wherein an action […] is recommended based on a result of the reinforcement learning to achieve the overall site productivity […] such that the relationship between the mapped data and the event change the performance factor to risk factor ratio”.
However, Shen discloses:
“using reinforcement learning via a machine learning algorithm to learn the performance factor to the risk factor ratio to change an overall site productivity, the reinforcement learning determining the change to the performance factor to the risk ratio through equipment operator modelling by modelling: the equipment operation by the user; user profiling behaviors; and actions of users based on the relationships determined” (Shen § 1, § 3, and § 4.2 disclose learning an action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity); this is based on the policies selected: “Risk arises from the uncertainties associated with future events, and is inevitable since the consequences of actions are uncertain at the time when a decision is made. Hence, risk has to be taken into account by the decision-maker, consciously or unconsciously”, “In the context of sequential or multistage decision-making problems, reinforcement learning (RL, Sutton and Barto, 1998) follows this line of thought. It describes how an agent ought to take actions that maximize expected cumulative rewards in an environment typically described by a Markov decision process (MDP, Puterman, 1994)”, and “The optimal policy within a time horizon T is obtained by maximizing the expectation of the discounted cumulative rewards”. These sections describe multiple types of policies including “An economically rational decision-making rule, which is risk-neutral, is to select the alternative with the highest expected reward”, “Besides risk-neutral policies, risk-averse policies, which accept a choice with a more certain but possibly lower expected reward”, and “risk-seeking policies, which prefer a choice with less certain but possibly high reward, are considered economically irrational”. 
Examiner notes that for examination purposes, the 2 terms “user profiling behaviors” and “actions of users” will be interpreted under the broadest reasonable interpretation to be any equipment operation by the user. Given this interpretation, the cited sections above clearly disclose optimizing using reinforcement learning through the actions taken (equipment operation by the user), which result in the optimal policy. Examiner notes that reinforcement learning algorithms are defined to optimize (i.e. change) the reward gained; the reward is based on the ratio between action taken (i.e. performance factor) and risk taken (i.e. risk factor)),
“wherein an action […] is recommended based on a result of the reinforcement learning to achieve the overall site productivity […] such that the relationship between the mapped data and the event change the performance factor to risk factor ratio” (Shen § 3, § 4.2, and § 5 disclose using a Markov process to determine the optimal policy and recommending an action based on the policy: “The optimal policy within a time horizon T is obtained by maximizing the expectation of the discounted cumulative rewards …”. Examiner notes that reinforcement learning algorithms are defined to optimize (i.e. change) the reward gained; the reward is based on the ratio between action taken (i.e. performance factor) and risk taken (i.e. risk factor). The relationship between the mapped data and the event changes the reward (i.e. the performance factor to risk factor ratio)).


Horseman and Shen do not explicitly disclose:
wherein the equipment data comprises:
“a type of equipment”; and
“a risk associated with a hazard associated with the type of equipment”; and
However, Chintakindi discloses:
wherein the equipment data comprises:
“a type of equipment” (Chintakindi ¶ [0018] and ¶ [0048] “type, amount, and cost of vehicle”);
“a risk associated with a hazard associated with the type of equipment” (Chintakindi “determine which risk factors on road segments can impact a vehicle and occupants included therein. The system may determine (e.g., quantify and/or create) a probability of an adverse event occurring. The probability may be range bound. The server may determine the potential cost ( e.g., in dollars) of an adverse event, such as an accident”);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman, Shen, and Chintakindi. Horseman teaches methods for 

Horseman, Shen, and Chintakindi do not explicitly disclose:
“wherein the user data comprises a cognitive state of the user including a distraction level and a fatigue level”.
However, Horseman640 discloses:
“wherein the user data comprises a cognitive state of the user including a distraction level and a fatigue level” (Horseman640 in at least Col. 15 lines 5-25 and Col. 24 line 50 to col. 25 line 16 “a plurality of neural sensor (e.g., sixteen neural sensors/channels) may be disposed about the employee's scalp to detect neuro-signals (e.g., including alpha, beta, gamma, and delta waves) that can be used to determine the employee's brain state, including their emotional state (e.g., distracted, angry happy, sad, excited, etc.), thoughts ( e.g., cognitive thoughts, subconscious thoughts, intent, etc.), facial movements (e.g., facial expressions), motor functions and/or the like. Such data can be used to determine wither the driver is fatigued/tired (e.g., suffering from sleep deprivation), and/or the like” and “process driver distracted, fatigued, has fallen asleep, is suffering a stroke/heart-attack, and/or the like”. Examiner notes that other sections not cited are also relevant to this limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman, Shen, Chintakindi, and Horseman640. Horseman teaches methods for providing feedback of health information to an employee when the employee is engaged in their work duties (examiner notes that Horseman incorporates by reference in its entirety in Horseman640). Shen teaches a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and proves its convergence. Chintakindi teaches a vehicle control computer able to self-improve and provide better vehicle responses to future adverse driving events. Horseman640 teaches methods for providing feedback of health information to a driver when driving a vehicle (examiner notes that Horseman640 incorporates by reference in its entirety in Horseman). One of ordinary skill would have motivation to combine Horseman, Shen, Chintakindi, and Horseman640 because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 28. Horseman, Shen, Chintakindi, and Horseman640 teach the non-transitory computer-readable recording medium of claim 27 (as mentioned above), wherein:
Shen further discloses:
“the performance factor to the risk factor ratio is learned based on the risk factor being greater than a predetermined risk tolerance level” (Shen § 1, § 3, and § 4.2 disclose multiple types of policies. Table 1 on page 14 discloses the predetermined risk tolerance levels for each of the policies and provides a breakdown of each policy (risk-averse, risk-neutral, and risk-seeking)).
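
The three policy types named above can be illustrated with a small expected-utility comparison. This is a simplified sketch under assumed utility functions and reward values; it does not reproduce Shen's Table 1 or Shen's algorithm, and every name and number in it is hypothetical.

```python
import math

# Hypothetical options: each is a list of (probability, reward) outcomes.
SAFE = [(1.0, 4.0)]                 # certain reward of 4
RISKY = [(0.5, 0.0), (0.5, 10.0)]   # expected reward of 5, but uncertain

# Assumed utility shapes: concave, linear, and convex.
UTILITIES = {
    "risk-averse": math.sqrt,
    "risk-neutral": lambda reward: reward,
    "risk-seeking": lambda reward: reward ** 2,
}


def expected_utility(option, utility):
    """Probability-weighted utility of an option's outcomes."""
    return sum(prob * utility(reward) for prob, reward in option)


def choose(attitude):
    """Return 'safe' or 'risky' for the given risk attitude."""
    utility = UTILITIES[attitude]
    if expected_utility(SAFE, utility) >= expected_utility(RISKY, utility):
        return "safe"
    return "risky"


choices = {attitude: choose(attitude) for attitude in UTILITIES}
```

With these assumed values, the risk-averse rule accepts the more certain option despite its lower expected reward, while the risk-neutral and risk-seeking rules select the uncertain option.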

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Samantha J. Horseman (hereinafter Horseman) US 20130009993 A1 in view of Shen et al. (hereinafter Shen) “Risk-sensitive Reinforcement Learning” in view of Chintakindi et al. (hereinafter Chintakindi) US 20190027038 A1 (relying on provisional application date) in view of Samantha J. Horseman (hereinafter Horseman640) US 8872640 B2 in view of Stubna et al. (hereinafter Stubna) US 20130054215 A1.
In reference to claim 2. Horseman, Shen, Chintakindi, and Horseman640 teach the system of claim 1 (as mentioned above), wherein the memory further stores instructions to cause the processor to:
Shen further discloses:
“create a schedule of actions for the user to follow based on the schedule adhering to the changed performance factor to risk factor ratio learned […]” (Shen § 1, § 3, and § 4.2 disclose creating a policy (schedule of actions) for the agent (user) to follow; see particularly Algorithm 1 on pg. 10. As mentioned above, the policy selected optimizes the action (performance factor) to risk (risk factor) relationship to maximize the expected reward (optimize an overall productivity)).

Horseman further discloses:
“[…] performance factor to risk factor ratio learned by the reinforcement learning circuit” (Examiner notes that the broadest reasonable interpretation for “a reinforcement learning circuit” is any circuit that performs the functions claimed. Horseman Fig. 15 discloses at least 2 circuits capable of performing the functions claimed: the mobile devices 122 and employee computers 630. Figs. 8 and 9 illustrate the components of the mobile devices and employee computers),
“wherein the user data is monitored and recorded, and includes time spent typing, time spent moussing, and time spent reading, changes in acceleration, motion, speed, anomalous movement, abuse of a device by the user, repeated attempts to change configuration, warnings by supervisors, and changes in skin luminescence” (Horseman in at least ¶ [0110], ¶ [0148], ¶ [0149], and ¶ [0152] “monitoring the health sensors to collect health data includes executing a single measurement by some or all of the sensors 102. For example, some or all of the sensors 102 may be employed to record a single measurement in sequence (e.g., one after the other) or in parallel (e.g., at the same time) and transmit corresponding health data 700 to mobile device 622 and/or employee computer”. Fig. 7 and Fig. 10 show some of the sensors disclosed by Horseman. See also at least ¶ [0106]-[0108], which disclose the video and audio recording. Examiner notes that under the broadest reasonable interpretation, Horseman discloses all of the elements of this limitation because by monitoring and recording a user with at least video and audio (and the many other sensors disclosed) the user data would include all actions performed by the user, including but not limited to time spent typing, time spent moussing, and time spent reading, changes in acceleration, motion, speed, anomalous movement, abuse of a device by the user, repeated attempts to change configuration, warnings by supervisors, and changes in skin luminescence. Examiner notes that the broadest reasonable interpretation 

Horseman, Shen, Chintakindi, and Horseman640 do not explicitly disclose:
“wherein the user data further comprises a user cohort including a disability of the user”.
However, Stubna discloses:
wherein the equipment data comprises:
“wherein the user data further comprises a user cohort including a disability of the user” (Stubna in at least ¶ [0038] discloses measuring neurobehavioral performance including context-relative performance tasks, such as a workplace-specific task. Performance measures for such neurobehavioral tasks may come from direct human observation, measurement instruments, or from embedded systems (e.g., a lane tracking system on a commercial motor vehicle). medical monitoring, screening, diagnosis and treatment settings neurobehavioral assessment may be made based on physician or medical-care-provider 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Horseman, Shen, Chintakindi, Horseman640, and Stubna. Horseman teaches methods for providing feedback of health information to an employee when the employee is engaged in their work duties (Examiner notes that Horseman is incorporated by reference in its entirety in Horseman640). Shen teaches a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and proves its convergence. Chintakindi teaches a vehicle control computer able to self-improve and provide better vehicle responses to future adverse driving events. Horseman640 teaches methods for providing feedback of health information to a driver when driving a vehicle (Examiner notes that Horseman640 is incorporated by reference in its entirety in Horseman). Stubna teaches methods used to measure neurobehavioral performance including context-relative performance tasks, such as a workplace-specific task. One of ordinary skill would have been motivated to combine Horseman, Shen, Chintakindi, Horseman640, and Stubna because MPEP 2143 sets forth the Supreme Court rationales for obviousness, including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Viker A. Lamardo whose telephone number is (571)270-5871.  The examiner can normally be reached on Mon. - Fri. 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo, can be reached on (571)272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact 






/VIKER A LAMARDO/Primary Examiner, Art Unit 2126