DETAILED ACTION
Status of Claims
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is a FINAL office action in response to the Applicant’s response filed 23 August 2022.
Claims 1, 2, 4, 9, 14, 15, 18, and 20 have been amended.
The 103 rejection for claims 1-20 has been overcome by amendments.
The 112b rejection for claims 1-20 has been overcome by amendments.
Claims 1-20 are currently pending and have been examined.

Response to Arguments
Applicant's arguments filed 23 August 2022 with regard to the 101 rejection have been fully considered but they are not persuasive.

With respect to the 101 rejection, the Applicant argues on pages 9 and 10 of their response, “Here, the Examiner characterizes that the method of claim 1 is a performance of commercial activities and the managing of relationships of behaviors of people. Office Action, p.6. Applicant respectfully submits that the amended claim 1 focuses on the constructing of a simulation environment and the training of the simulated vehicle, which are irrelevant to commercial activities and managing relationships and behaviors of people. Further, claim 1 also describes using the simulated environment and vehicle to learn an action policy, and managing vehicles based on the learned action policy. The managing of a vehicle is not managing human behaviors or relationships. Therefore, the claims do not recite certain methods of organizing human activity.”  The Examiner respectfully disagrees with the Applicant’s interpretation of the requirements under 35 USC 101, the bounds of the claimed invention, and the grounds of the previous and current rejection.  First, the Examiner notes that the Applicant’s argument does not refer to any specific elements of the claims, and instead merely refers generally to the newly amended portion of the claims and provides a conclusory assertion that the claims do not recite certain methods of organizing human activity.
Notably, the Applicant’s argument does not address the previous claim elements that remain in the claim, including, “determining… a behavior of the ride-share-enabled vehicle based on a current location of the ride-share-enabled vehicle and the ride-sharing policy algorithm; and causing… the ride-share-enabled vehicle to be operated according to the determined behavior of the ride-share-enabled vehicle,” both of which are elements that were identified in the previous 101 rejection and stated as encompassing, “collecting service region information, identifying business rules/policies and actions to be conducted in accordance with the rules/policies, and conducting sales/service activities; which is the performance of commerical activities (managing sales activities, business relations, marketing) and the managing of relationships and behaviors of people.” (see paragraph 11 of the Non-Final rejection).  As such, the Applicant’s argument that the claims do not include elements that recite an abstract idea is found not persuasive, as these elements do recite an abstract idea.  Second, with regard to the Applicant’s argument that the claims focus on, “the constructing of a simulation environment and the training of the simulated vehicle, which are irrelevant to commercial activities and managing relationships and behaviors of people,” the Examiner is not persuaded.  Notably, constructing a “simulated environment” that includes a “simulated vehicle” and determining the vehicle’s services encompasses the planning and forecasting of services offered by a vehicle in a rideshare environment, and thus would encompass managing business relations, sales activities, and behavior and relationships between people.  Therefore the Examiner maintains that this rejection is proper.
The Applicant continues on page 10 of their response, “The claims do not recite a mental process. The MPEP explains that the grouping of mental processes includes concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Additionally, MPEP § 2106.04(a)(2)(II) clarifies that claims do not recite a mental process when they do not contain limitations that can practically be performed in the human mind, for instance when the human mind is not equipped to perform the claim limitations. See SRI Int’l, Inc. v. Cisco Systems, Inc., 930 F.3d 1295, 1304 (Fed. Cir. 2019) (declining to identify the claimed collection and analysis of network data as abstract because "the human mind is not equipped to detect suspicious activity by using network monitors and analyzing network packets as recited by the claims"). The method of amended claim 1 cannot be performed or implemented in the human mind. For instance, a human mind cannot construct ‘a simulation environment based on the plurality of historical trips by training a simulated vehicle using a Markov decision process’ as recited in claim 1. In addition, amended claim 1 describes a machine learning process, i.e., reinforcement learning, by providing the necessary configurations for the algorithm to work, such as the state, action, reward, and parameter updates. This process only arises in the realm of computer technology and machine learning. Therefore, the claims do not recite a mental process.”  The Examiner respectfully disagrees with the Applicant’s interpretation of the requirements under 35 USC 101, the bounds of the claimed invention, and the grounds of the previous and current rejection.  First, the Examiner notes that the Applicant’s argument that the claimed elements cannot be performed in the human mind is recited in a conclusory manner.  
In particular, the Applicant’s argument that, “a human mind cannot construct ‘a simulation environment based on the plurality of historical trips by training a simulated vehicle using a Markov decision process’ as recited in claim 1,” is recited without any evidence or explanation as to why a human cannot construct a simulation environment, which the Examiner notes is broad enough to cover merely a human being thinking about an environment and actions in the environment.  Additionally, it is noted that the Applicant has failed to explain why a human mind, with or without a physical aid (e.g. a pencil and paper), would not be able to “train” a simulated vehicle using a Markov decision process.  Notably, “training” a vehicle in a simulated environment is merely planning out a vehicle’s actions in an imagined environment, while using the mathematical technique of a Markov decision process to plan out options and actions of the vehicle.  The Applicant has failed to show that constructing the simulation environment cannot practically be performed in the human mind, and as such, the Applicant’s argument is not persuasive.   Second, with regard to the Applicant’s argument that the claim, “describes a machine learning process, i.e., reinforcement learning, by providing the necessary configurations for the algorithm to work, such as the state, action, reward, and parameter updates,” the Examiner is unpersuaded.  Notably, nothing in the claimed invention is directed to “machine-learning,” and thus, any arguments regarding “machine-learning” are beyond the scope of the claimed invention.  Additionally, it is noted that “reinforcement learning” is not something that “only arises in the realm of computer technology and machine learning,” but instead is a fundamental building block of all problem solving and planning of a service.
Particularly, the Applicant states that providing the necessary configurations for an algorithm to work, including the state, action, reward, and parameter updates, only occurs in the realm of computer technology and machine learning; however, this is not the case.  For example, providing the initial state of a model (e.g. a taxi service), some action that would take place in said model (e.g. a taxi picks up a passenger), a reward in said model (e.g. the driver of the taxi is rewarded with pay), and how parameters in the model are changed/updated (e.g. the taxi travels along a route, and the passenger changes locations), does not involve machine-learning, nor does it occur only in a computer system.  It is also noted that the mere use of machine-learning, if claimed, would not remedy this issue, as it would merely narrow the field of use of the abstract idea by reciting a generalized technique used to carry out the actions, would merely apply the abstract idea on a general purpose computer, and would not improve the functioning of a computer, another technology, or technical field.  Therefore the Examiner maintains that this rejection is proper.
The Applicant continues on page 11 of their response, “For instance, amendment claim 1 includes ‘simulation environment,’ ‘simulated vehicle,’ ‘Markov decision process,’ ‘repeating the Markov decision process’ for ‘multiple episodes,’ etc. Application respectfully submits that the elements identified above integrate any alleged abstract idea into a practical application. In particular, these additional elements describe a computer-implemented method to construct a simulation environment based on a plurality of historical trips by training the simulated vehicle using the Markov decision process, and using the simulated vehicle to learn an action policy to provide guidance to the vehicle fleet in the real world. This method leads to technical improvement to the existing technology in the ride-sharing field. As pointed out in the specification, ‘Existing technologies have not developed such systems and methods that can provide a robust mechanism for training policies for vehicle services,’ and ‘the provided simulation environment paves the way for generating automatic vehicle guidance that makes passenger-picking or waiting for decisions as well as carpool routing decisions for real vehicle drivers, which are unattainable by existing technologies.’ Specification, para. [85].”  The Examiner respectfully disagrees with the Applicant’s interpretation of the requirements under 35 USC 101, the bounds of the claimed invention, and the grounds of the previous and current rejection.  First, the Examiner notes that the Applicant’s argument refers to newly amended portions of the claims, which were not addressed in the previous 101 analysis and rejection, but which are analyzed and rejected below.  Second, with regard to the elements that the Applicant has recited (e.g. ‘simulation environment,’ ‘simulated vehicle,’ ‘Markov decision process,’ ‘repeating the Markov decision process’ for ‘multiple episodes,’ etc.), the Examiner notes that these are not solely additional elements; as such, classifying them all as additional elements that integrate the abstract idea into a practical application is found not persuasive.  Third, the Applicant’s argument that the construction of a simulation environment based on a plurality of historical trips by training the simulated vehicle using the Markov decision process, and using the simulated vehicle to learn an action policy to provide guidance to the vehicle fleet in the real world, provides a technical improvement to the existing technology in the ride-sharing field is found not persuasive.  Notably, “ride-sharing” is merely the commercial industry of providing transport to users, which is an abstract idea, and not a specific “technology.”  It is noted that the Applicant has identified paragraph 85 of their specification as identifying and describing the improvement; however, this is found not persuasive.  In particular, paragraph 85 states, “As such, the disclosed environment can be used to train models and/or algorithms for vehicle navigation. Existing technologies have not developed such systems and methods that can provide a robust mechanism for training policies for vehicle services. The environment is a key for providing optimized policies that can guide vehicle driver effortlessly while maximizing their gain and minimizing passenger time cost. 
That is, the above-described recursive performance of the steps (1)-(4) based on historical data of trips taken by historical passenger groups can train a policy that maximizes a cumulative reward for the time period; and the trained policy determines an action for a real vehicle in a real environment when the real vehicle has no passenger, the action for the real vehicle in the real environment being selected from: (action 1) waiting at a current location of the real vehicle, and (action 2) determining the value M to transport M real passenger groups each comprising one or more passengers. For the real vehicle in the real environment, the (action 2) may further comprise: determining the M real passenger groups from available real passenger groups requesting vehicle service; if M is more than 1, determining an order for: picking up each of the M real passenger groups and dropping off each of the M passenger groups; and transporting the determined M real passenger groups according to the determined order. Therefore, the provided simulation environment paves the way for generating automatic vehicle guidance that makes passenger-picking or waiting decisions as well as carpool routing decisions for real vehicle drivers, which are unattainable by existing technologies.”  (Emphasis added).  As shown and emphasized here, the Applicant’s specification describes the alleged improvement in generating optimized policies that can guide vehicle driver effortlessly while maximizing their gain and minimizing passenger time cost, maximizing a cumulative reward for the time period, making passenger-picking or waiting decisions, and carpool routing decisions, which are not technologies, but instead are improvements in the abstract idea of planning business relations and human behavior and interactions.  It is noted that MPEP 2106.05(a)(II) states, “However, it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology. For example, in Trading Technologies Int’l v. IBG, 921 F.3d 1084, 1093-94, 2019 USPQ2d 138290 (Fed. Cir. 2019), the court determined that the claimed user interface simply provided a trader with more information to facilitate market trades, which improved the business process of market trading but did not improve computers or technology.”  (Emphasis added).  As shown and emphasized here, the improvement in the abstract idea is not an improvement in technology.  With respect to the claimed invention and arguments, the Applicant has failed to identify the specific technology that is improved upon, and has failed to show how the claims reflect said improvement.  Thus, the Applicant has failed to show that their specification discloses an improvement in the functioning of a computer, another technology or technical field, and failed to show that the claims reflect said improvement.  Therefore the Examiner maintains that this rejection is proper.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite obtaining a plurality of historical trips collected from a period of time; constructing a simulation environment based on the plurality of historical trips by training a simulated vehicle using a Markov decision process that comprises determining an initial state of the simulated vehicle, the initial state comprising an initial location and an initial time; determining an action for the simulated vehicles based on the plurality of historical trips according to a ride-sharing policy algorithm, wherein the action includes waiting at a current location, picking up a passenger/passenger group, or picking up multiple passengers/passenger groups and transporting them in carpool; determining a reward corresponding to the action based on a travel distance of the simulated vehicle by executing the actions; updating a state of the simulated vehicle based on the action, the initial location, and the initial time; repeating the Markov decision process based on the updated state; and adjusting parameters of the ride-sharing policy algorithm to maximize an accumulative reward for the simulated vehicle;  determining a target location of the ride-share-enabled vehicle; determining a behavior of the ride-share-enabled vehicle based on a current location of the ride-share-enabled vehicle and the determined ride-sharing policy algorithm; and causing the ride-share-enabled vehicle to be operated according to the determined behavior of the ride-share-enabled vehicle.
The limitations of obtaining a plurality of historical trips collected from a period of time; constructing a simulation environment based on the plurality of historical trips by training a simulated vehicle using a Markov decision process that comprises determining an initial state of the simulated vehicle, the initial state comprising an initial location and an initial time; determining an action for the simulated vehicles based on the plurality of historical trips according to a ride-sharing policy algorithm, wherein the action includes waiting at a current location, picking up a passenger/passenger group, or picking up multiple passengers/passenger groups and transporting them in carpool; determining a reward corresponding to the action based on a travel distance of the simulated vehicle by executing the actions; updating a state of the simulated vehicle based on the action, the initial location, and the initial time; repeating the Markov decision process based on the updated state; and adjusting parameters of the ride-sharing policy algorithm to maximize an accumulative reward for the simulated vehicle;  determining a target location of the ride-share-enabled vehicle; determining a behavior of the ride-share-enabled vehicle based on a current location of the ride-share-enabled vehicle and the determined ride-sharing policy algorithm; and causing the ride-share-enabled vehicle to be operated according to the determined behavior of the ride-share-enabled vehicle, as drafted, under the broadest reasonable interpretation, encompass elements that can be performed in the human mind, the management of commercial activities (including managing sales activities, business relations, marketing), the managing of behaviors and interactions between people, and the performance of mathematical concepts, with the use of generic computer elements as tools.
That is, other than reciting the use of generic computer elements (computing system, processors, a server, and memory), the claims recite an abstract idea.  For example, obtaining historical trips and constructing a simulation environment based on the historical trips encompass a user observing and evaluating historical job information to determine a general job performance, which are elements that can be performed in the human mind.  Additionally, training a vehicle by determining an initial state of the vehicle, determining an action of the vehicles based on the historic information, determining a reward corresponding to the action, updating the simulation with the determined parameters, and adjusting parameters in a job policy to maximize an accumulative reward encompasses a human using collected job information and initial state information to evaluate and predict job performance of a vehicle, wherein the evaluation includes judging how a service provider would perform and adjusting the variables used based on the evaluation, which can be performed mentally.  In addition, determining a target location of the ride-share-enabled vehicle encompasses a user observing a target location of a vehicle, which can be performed in the human mind.  In addition, determining a ride-sharing policy algorithm to determine a behavior of the ride-share-enabled vehicle and determining a behavior of the ride-share-enabled vehicle based on a current location of the vehicle and the determined ride-sharing policy algorithm encompasses a user deciding which policy to use for a rideshare program, and deciding what behavior will be conducted under that policy, which are steps that can be performed in the human mind (judgment, evaluation, opinion).  As such, the claims recite limitations that fall into the “Mental Processes” grouping of abstract ideas.
In addition, obtaining a plurality of historical trip information, and constructing a simulation environment based on the plurality of historical trips by training a simulated vehicle using a Markov decision process, encompasses collecting historic job information and modeling job performance, which encompasses managing a commercial interaction (sales activities, business relations, marketing), and managing human behavior and interactions.  In addition, determining an initial state of the simulated vehicle, determining an action for the simulated vehicles based on the plurality of historical trips according to a ride-sharing policy algorithm, determining a reward corresponding to the action based on a travel distance of the simulated vehicle by executing the actions, updating a state of the simulated vehicle based on the determined variables, repeating the Markov decision process based on the updated state, and adjusting parameters of the ride-sharing policy algorithm to maximize an accumulative reward for the simulated vehicle encompasses the modelling and predicting of job performance and payment by and for a service provider, and thus recites managing a commercial interaction (sales activities, business relations, marketing), and managing human behavior and interactions.
In addition, determining a target location of the vehicle, determining a ride-sharing policy algorithm to determine a behavior of the vehicle including whether to accept a multiple shared ride or maintain a single shared ride and a route of the multiple shared ride, determining a behavior of the vehicle based on a current location of the ride-share-enabled vehicle and the determined ride-sharing policy algorithm, and causing the vehicle to be operated according to the determined behavior of the vehicle encompasses collecting service region information, identifying business rules/policies and actions to be conducted in accordance with the rules/policies, and conducting sales/service activities, which is the performance of commercial activities (managing sales activities, business relations, marketing) and the managing of relationships and behaviors of people.  Thus, the claims recite elements that fall into the “Certain Methods of Organizing Human Activity” grouping of abstract ideas.  In addition, using a Markov decision process to train a model encompasses the performance of mathematical concepts and algorithms.  Thus, the claims recite elements that fall into the “Mathematical Concepts” grouping of abstract ideas.  Therefore the claims recite an abstract idea.
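For illustration only, the Markov decision process steps recited above (initial state of location and time, an action of waiting or picking up, a reward based on travel distance, a state update, repetition over multiple episodes, and parameter adjustment to maximize accumulated reward) can be pictured with the following minimal sketch. It is not the Applicant's disclosed implementation: the trip data, the repeating demand pattern, the episode length, and the choice of tabular Q-learning as the parameter-adjustment rule are all assumptions made to keep the example short.

```python
import random

# Hypothetical historical trips: (origin, start_slot, destination, distance, duration).
TRIPS = [("A", 0, "B", 5.0, 2), ("B", 2, "A", 3.0, 1), ("A", 1, "A", 1.0, 1)]
SLOTS = 3      # assumption: the demand pattern repeats every 3 time steps
HORIZON = 6    # assumption: an episode ends once time reaches 6
ACTIONS = ("wait", "pickup")


def step(state, action):
    """Apply one action to a (location, time) state; return (next_state, reward)."""
    location, time = state
    if action == "pickup":
        for origin, slot, destination, distance, duration in TRIPS:
            if origin == location and slot == time % SLOTS:
                # Reward is tied to the distance travelled on the trip.
                return (destination, time + duration), distance
    # Waiting (or finding no matching trip) advances time and earns nothing.
    return (location, time + 1), 0.0


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Repeat the decision process over many episodes, adjusting Q-table entries."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = ("A", 0)  # initial state: initial location and initial time
        while state[1] < HORIZON:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # occasional exploration
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            next_state, reward = step(state, action)
            future = 0.0
            if next_state[1] < HORIZON:
                future = max(q.get((next_state, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            # Parameter update toward reward plus discounted future value.
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = next_state
    return q


def best_action(q, state):
    """Return the action with the highest learned value in the given state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

Calling `best_action(train(), ("A", 0))` returns the action the learned table prefers at the starting state; every name here (`TRIPS`, `step`, `train`, `best_action`) is hypothetical and appears nowhere in the claims or specification.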
This judicial exception is not integrated into a practical application.  The claims do not recite additional elements that improve the functioning of a computer, another technology, or technical field.  The claims do not recite the use of, or apply the abstract idea with, a particular machine, and the claims do not recite the transformation of an article from one state or thing into another.  Finally, the claims do not recite additional elements that apply or use the abstract idea in some other meaningful way beyond generally linking the use of the abstract idea to a particular technological environment.  Instead, the claims recite the use of generic computer elements (computing system, processors, a server, and memory) as tools to carry out the abstract idea.  The claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer elements and machines to perform the steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.  The claims are directed to non-patent eligible subject matter.
The dependent claims 2-13, 15-17, 19, and 20, taken individually and in an ordered combination, do not recite additional elements that integrate the abstract idea into a practical application, or add significantly more to the abstract idea itself.  For example, the claims further recite that the policy algorithm is or is not based on a deep reinforcement learning method of deep Q-networks, which merely narrows the field of use by identifying the type of algorithm used in the policy algorithm, which does not integrate the abstract idea into a practical application, or add significantly more to the abstract idea itself (claims 2, 6, 11, 15, and 20).  In addition, the claims recite determining the date and time, and using this to determine the policy, which encompasses actions that can be performed in the human mind, as well as performing commercial activities, and thus further recites elements that fall into the “Mental Processes” and “Certain Methods of Organizing Human Activity” groupings of abstract ideas (claims 3 and 16).  In addition, the claims further recite determining different rideshare policies for different locations, which encompasses actions that can be performed in the human mind, as well as performing commercial activities, and thus further recites elements that fall into the “Mental Processes” and “Certain Methods of Organizing Human Activity” groupings of abstract ideas (claim 4).  In addition, the claims further recite that the different locations are differently populated and different policies accept different numbers of rides, which is merely a narrowing of the field of use by defining the locations and the policy results, which does not integrate the abstract idea into a practical application, or add significantly more to the abstract idea itself (claim 5).
In addition, the claims further recite determining a ride request density or population density, and basing the policy algorithm on the density, date/time, and location; which encompasses actions that can be performed in the human mind, as well as performing commercial activities, and thus further recites elements that fall into the “Mental Processes” and “Certain Methods of Organizing Human Activity” groupings of abstract ideas; as well as narrowing the field of use by defining the type of information used in deciding a policy, which does not integrate the abstract idea into a practical application, or add significantly more to the abstract idea itself (claims 7, 8, 9, 17).  In addition, the claims further recite that one policy accepts more rides than another, which further recites elements that fall into the “Certain Methods of Organizing Human Activity” grouping of abstract ideas (claim 10).  In addition, the claims further define the target location, which merely narrows the field of use by defining a variable, which does not integrate the abstract idea into a practical application, or add significantly more to the abstract idea itself (claims 12 and 13).  In addition, the claims recite that the vehicle is an autonomous vehicle, which merely further narrows the field of use by defining the type of vehicle used, and the use of generic machines to perform their normal tasks, which does not integrate the abstract idea into a practical application, or add significantly more to the abstract idea itself (claim 20).

Novelty/Non-Obviousness
Claims 1-20 are allowed over the prior art of record, but remain rejected under 35 USC 101.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Dey et al. (2010/0106603 A1) – Which describes the use of a Markov decision process to model possible vehicle routes, wherein actions in the model have costs/penalties and rewards.  In addition, the goal of the decision process is to minimize the cost of travel along a path, wherein actions have costs and are weighted.  In addition, the Markov decision process can be trained, and used to predict future route path taking, and can be tailored to taxis and fleets of vehicles.
Katsuki et al. (US 2018/0018568 A1) – Which describes the creation of an estimation model, which can be a Markov model, in order to analyze historical information and a person’s characteristics to generate scores and rewards for actions conducted by the person.
Kamar et al. (US 2010/0332242 A1) – Which describes creating an agent-based carpooling system, wherein the system is optimized for collaboration goals while keeping individual goals in mind.  In addition, user actions are modeled, with rewards identified for actions, and wherein service plans are optimized for the efficiency of transportation.  In addition, penalties are also identified and utilized in order to further optimize the model.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL P HARRINGTON whose telephone number is (571)270-1365. The examiner can normally be reached Monday-Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey can be reached on (571) 272-4602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Michael Harrington
Primary Patent Examiner
10 November 2022
Art Unit 3628
/MICHAEL P HARRINGTON/Primary Examiner, Art Unit 3628