Detailed Action
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicant
The following is a Final Office action for Application Serial Number 16/674,472, filed on November 5, 2019.  In response to Examiner’s Non-Final Office Action of November 22, 2021, Applicant, on February 14, 2022, amended claims 1, 6, 8, 13, and 15.  Claims 1-20 are pending in this application and have been rejected below.
Response to Amendment
Applicant’s amendments are acknowledged.
Regarding the 35 U.S.C. § 101 rejection, the amended claims have been considered and are insufficient to overcome the rejection. Please refer to the 35 U.S.C. § 101 rejection for further explanation and rationale.
The 35 U.S.C. § 103 rejections are hereby amended pursuant to Applicant’s amendments. Updated 35 U.S.C. § 103 rejections have been applied to the amended claims. Please refer to the § 103 rejection for further explanation and rationale.
Response to Arguments
Applicant’s arguments filed February 14, 2022 have been fully considered but they are not persuasive and/or are moot in view of the revised rejections.  Applicant’s arguments will be addressed herein below in the order in which they appear in the response filed February 14, 2022.
On pages 10-11, regarding the 35 U.S.C. § 101 rejection, Applicant states that claim 1 recites, in detail, how simulated transaction data is generated. Specifically, statistic data representing a group of real customers having similar transaction characteristics is provided as a goal, an action including a plurality of simulated transactions is compared to the goal, feedback about the comparison is provided, and a policy is updated based on the feedback. These steps are repeated until the action is close to the goal (i.e., the simulated transaction data is similar enough to the statistical data). Applicant further states that the claims are similar to Example 39 of the guidance, given that the policy engine, which determines the next action, is iteratively updated until an action is similar enough to the goal.  In response, Examiner respectfully disagrees. Example 39 of the PTO guidance creates training data and iteratively trains and retrains a neural network model on that data. In contrast, Applicant’s amended claims update policies but do not recite the algorithm, the model, or iterative training of the model in the amended claim language.
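For illustration only, the iterative loop summarized in Applicant's argument (goal, action, comparison, feedback, policy update, repeat until a similarity threshold is exceeded) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation, nor an algorithm recited in the claims or the cited references: the function names `similarity` and `simulate`, the use of a mean transaction amount as the statistic data, the similarity formula, and the hill-climbing style policy update are all hypothetical choices made for the example.

```python
import random
from statistics import mean


def similarity(action, goal_mean):
    """Hypothetical metric in (0, 1]; equals 1.0 when the action's mean amount matches the goal."""
    return 1.0 / (1.0 + abs(mean(action) - goal_mean) / goal_mean)


def simulate(goal_mean, threshold=0.95, n_tx=20, max_iter=500, seed=0):
    """Repeat action -> comparison -> feedback -> policy update until similarity > threshold."""
    rng = random.Random(seed)
    policy_mean = goal_mean * 0.2  # assumed starting policy parameter
    step = goal_mean * 0.5         # assumed policy update step size
    direction = 1.0
    prev_score = None
    for iteration in range(1, max_iter + 1):
        # Action: a plurality of simulated transaction amounts drawn under the current policy.
        action = [max(0.01, rng.gauss(policy_mean, goal_mean * 0.02)) for _ in range(n_tx)]
        # Comparison of the action with the goal (the statistic data).
        score = similarity(action, goal_mean)
        if score > threshold:
            return action, score, iteration
        # Feedback: reward (+1) if the action moved closer to the goal, otherwise penalty (-1).
        feedback = 1 if prev_score is None or score > prev_score else -1
        # Policy update: on a penalty, reverse direction and halve the step.
        if feedback < 0:
            direction = -direction
            step *= 0.5
        policy_mean += direction * step
        prev_score = score
    raise RuntimeError("similarity threshold not reached within max_iter iterations")


if __name__ == "__main__":
    last_action, score, iterations = simulate(goal_mean=100.0)
    print(f"stopped after {iterations} iterations, similarity {score:.3f}")
```

In this sketch the policy never reads the goal directly; it only receives the reward or penalty, which mirrors the feedback step described in the argument.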
On pages 15-16, regarding the 35 U.S.C. § 103 rejection, Applicant argues that the cited references, alone or in combination, fail to teach or suggest, "generating, by the processor, an artificial customer profile by combining randomly selected information from a set of real customer profile data; generating, by the processor, simulated transaction data in imitation of real transaction data a group of real customers having similar transaction characteristics, wherein generating the simulated transaction data further comprises: providing, by the processor, statistic data representing the group of real customers having similar transaction characteristics as a goal; performing, by the processor, a plurality of iterations to simulate the real transaction data, ... generating, by the processor, simulated customer data by combining the artificial customer profile with the last action to form simulated customer data," and, "performing, by the processor, a plurality of iterations to simulate the real transaction data, wherein the plurality of iterations is performed until a degree of similarity of simulated transaction data relative to the statistical data is higher than a first predefined threshold." In response, a new ground of rejection, necessitated by amendment, is made (see MPEP § 706.07(a)), in which Harris is now applied to Claims 1, 8 and 15.  Regarding the 35 U.S.C. § 103 rejection, Applicant’s arguments with respect to the claims have been considered but are moot in view of the new grounds of rejection.
Double Patenting
Claims 1-20 of this application are patentably indistinct from claims 1-20 of Application No. 16/674,457. Pursuant to 37 CFR 1.78(f), when two or more applications filed by the same applicant or assignee contain patentably indistinct claims, elimination of such claims from all but one application may be required in the absence of good and sufficient reason for their retention during pendency in more than one application. Applicant is required to either cancel the patentably indistinct claims from all but one application or maintain a clear line of demarcation between the applications. See MPEP § 822.
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159.  See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of copending Application No. 16/674,457 (reference application) as follows. Note that the claims in this application recite limitations that are either recited in the ’457 application or within the scope of its claims.
Application 16/674,472        Application 16/674,457
Claims 1, 8 and 15            Claims 1, 8 and 15
Claims 2, 9 and 16            Claims 2, 9 and 16
Claims 3 and 10               Claims 3 and 10
Claims 4, 11 and 17           Claims 4, 11 and 17
Claims 5, 12 and 19           Claims 5, 12 and 19
Claims 6 and 13               Claims 6 and 13
Claims 7, 14 and 20           Claims 7, 14 and 20
Claim 18                      Claim 18
Although the claims at issue are not identical, they are not patentably distinct from each other because the present claims are drawn to the same invention of simulating customer data using a reinforcement learning model.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claims 1-7 are directed to a method for simulating customer data using a reinforcement learning model, Claims 8-14 are directed to an article of manufacture for simulating customer data using a reinforcement learning model, and Claims 15-20 are directed to a system for simulating customer data using a reinforcement learning model.
Claim 1 recites a method for simulating customer data using a reinforcement learning model, Claim 8 recites an article of manufacture for simulating customer data using a reinforcement learning model, and Claim 15 recites a system for simulating customer data using a reinforcement learning model, each of which includes generating an artificial customer profile by combining randomly selected information from a set of real customer profile data; generating simulated transaction data in imitation of real transaction data a group of real customers having similar transaction characteristics, wherein generating the simulated transaction data further comprises: providing statistic data representing the group of real customers having similar transaction characteristics as a goal; performing a plurality of iterations to simulate the real transaction data, wherein the plurality of iterations is performed until a degree of similarity of simulated transaction data relative to the statistical data is higher than a first predefined threshold; performing a plurality of iterations to simulate the standard customer transaction data, wherein the plurality of iterations is performed until a degree of similarity of simulated customer transaction data relative to the standard customer transaction data is higher than a first predefined threshold; in each iteration: conducting an action including a plurality of simulated transactions; comparing the action with the goal; providing a feedback associated with the action based on a degree of similarity relative to the goal; and updating a policy based on the feedback for determining next action; and generating simulated customer data by combining the artificial customer profile with the last action to form simulated customer data.  As drafted, this is, under its broadest reasonable interpretation, within the abstract idea grouping of “Methods of Organizing Human Activity” (marketing or sales activities).
The recitation of “processor”, “intelligent agent”, “environment”, “policy engine”, “system”, “memory”, “computer program product”, “computer readable storage medium”, and “program instructions” does not take the claims out of the methods of organizing human activity grouping.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. The claims primarily recite the additional element of using computer components to perform each step. The recitation of “processor”, “intelligent agent”, “environment”, “policy engine”, “system”, “memory”, “computer program product”, “computer readable storage medium”, and “program instructions” is at a high level of generality, such that it amounts to no more than mere instructions to apply the exception using a computer component. See MPEP 2106.05(f).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims also fail to recite any improvements to another technology or technical field, improvements to the functioning of the computer itself, use of a particular machine, effecting a transformation or reduction of a particular article to a different state or thing, and/or an additional element that applies or uses the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception.  See 84 Fed. Reg. 55.  In particular, there is a lack of improvement to a computer or technical field in market analysis.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of “processor”, “intelligent agent”, “environment”, “policy engine”, “system”, “memory”, “computer program product”, “computer readable storage medium”, and “program instructions” are insufficient to amount to significantly more. (See MPEP 2106.05(f) – Mere Instructions to Apply an Exception – “Thus, for example, claims that amount to nothing more than an instruction to apply the abstract idea using a computer do not render an abstract idea eligible.” Alice Corp., 134 S. Ct. at 235). Mere instructions to apply an exception using a computer component cannot provide an inventive concept.
The claim fails to recite any improvements to another technology or technical field, improvements to the functioning of the computer itself, use of a particular machine, effecting a transformation or reduction of a particular article to a different state or thing, adding unconventional steps that confine the claim to a particular useful application, and/or meaningful limitations beyond generally linking the use of an abstract idea to a particular environment.  See 84 Fed. Reg. 55. Viewed individually or as a whole, these additional claim element(s) do not provide meaningful limitation(s) to transform the abstract idea into a patent eligible application of the abstract idea such that the claim(s) amounts to significantly more than the abstract idea itself.   With regard to accessing data and step 2B, see MPEP 2106.05(d): receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).
Examiner concludes that the additional elements in combination fail to amount to significantly more than the abstract idea based on findings that each element merely performs the same function(s) in combination as each element performs separately. The claim is not patent eligible. Thus, taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually.
Dependent Claims 2-7, 9-14 and 16-20 recite the following additional limitations: the real customer profile data includes one or more of an address of a customer, a name of a customer, contact information, credit information, and income information; each simulated transaction includes transaction type, transaction amount, transaction time, transaction location, transaction medium, a second party associated with the simulated transaction; the environment includes a set of all previous actions conducted; removing a plurality of previous actions having the degree of similarity lower than a second predefined threshold; adding the action; acquiring the statistic data from raw customer transaction data through an unsupervised clustering approach; and the feedback is a reward or a penalty. These limitations further narrow the abstract idea. The recited limitations in the dependent claims are mere instructions for applying the abstract idea on a computerized system, such that they do not amount to significantly more than the above-identified judicial exception in Claims 1, 8 and 15.  Claims 4-6, 11-13, and 17-18 recite the additional elements of “intelligent agent”, “environment”, “processor”, and “program instructions”; see MPEP 2106.05(d): receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-12, 14-17 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Harris et al., US Publication No. 20210264448 A1, [hereinafter Harris], and further in view of Campos et al., US Publication No. 20180293498 A1, [hereinafter Campos].
Regarding Claim 1, 
Harris teaches
generating, by the processor, an artificial customer profile by combining randomly selected information from a set of real customer profile data; (Harris- Par. 4-“One embodiment of the disclosure is related to a method comprising: a) receiving, by a computer, network data comprising a plurality of transactions conducted by a plurality of actual users and a plurality of actual resource providers; b) generating, by the computer, a plurality of simulated users, each simulated user based upon a set of the plurality of actual users; c) generating, by the computer, a plurality of simulated resource providers, each simulated resource provider based upon at least one actual resource provider; d) executing, by the computer, a simulation using the plurality of simulated users and simulated resource providers; and e) determining, in response to step d), a plurality of simulated transactions conducted by the simulated users and simulated resource providers.”; Par. 5)
generating, by the processor, simulated transaction data in imitation of real transaction data a group of real customers having similar transaction characteristics, wherein generating the simulated transaction data further comprises: (Harris Par. 4; Par. 14; Par. 60-“The additional processing performed by the evaluation computer 120 can include recommending a resource to an actual user associated with an actual consumer agent which performed a simulated transaction for a digital representation of the resource. The evaluation computer 120 can also recommend a resource to one or more actual users based on simulated transactions performed by consumer agents of a similar community group. For example, a simulant consumer agent may be associated with a “high tech” community group and may perform 10 simulated transactions during the simulation. The evaluation computer 120 can determine actual consumers that are also associated with the “high tech” community group and may generate one or more recommendations for actual resources for the actual consumers based on the simulated transactions performed by the simulant consumer agent. Additional processing may also include analyzing the simulated transactions for trends in purchasing habits, adjusting parameters of the simulation based on the resulting simulated transactions, etc.”;)
providing, by the processor, statistic data representing the group of real customers having similar transaction characteristics as a goal; (Harris Par. 14-“ In some embodiments, a model may be a statistical model, which can be used to predict unknown information from known information. For example, a learning module may be a set of instructions for generating a regression line from training data (supervised learning) or a set of instructions for grouping data into clusters of different classifications of data based on similarity, connectivity, and/or distance between data points (unsupervised learning). The regression line or data clusters can then be used as a model for predicting unknown information from known information. Once a model has been built from the learning module, the model may be used to generate a predicted output from a new request. The new request may be a request for a prediction associated with presented data. For example, new request may be a request for classifying an image or for a recommendation for a user.”)
performing, by the processor, a plurality of iterations to simulate the real transaction data, wherein the plurality of iterations is performed until a degree of similarity of simulated transaction data relative to the statistical data is higher than a first predefined threshold (Harris Par. 155-“ For example, in some embodiments, the simulation computer can compare the plurality of actual consumer agents and the plurality of simulant consumer agents. The simulation computer can then remove, based on the comparing, actual consumer agents from the plurality of actual consumer agents which do not exceed a matching threshold when compared to the plurality of simulant consumer agents.”; Par 169-170-“ Several simulation are run out n epochs with consumer agents at a 1 to 1 match. The data can be split between the training set and the validation set. True transactional data can be pulled from an analogous actual time frame within the data. The adversarial AI can build a model using the training set and can then evaluate the model using the validation set. The model attempts to predict where a given consumer will be at a set time. In some embodiments, noise can be introduced to the network data to help obfuscate selected consumers whose actions are predicted too actually based on a predefined threshold. The noise can be in the form of data removal, swapping information between consumers, fuzzifing key elements of a transaction (e.g., amount) or shift a timeline.”; Abstract; Par. 14); 
in each iteration: conducting, by the intelligent agent, an action including a plurality of simulated transactions; (Harris Par. 4-“ One embodiment of the disclosure is related to a method comprising: a) receiving, by a computer, network data comprising a plurality of transactions conducted by a plurality of actual users and a plurality of actual resource providers; b) generating, by the computer, a plurality of simulated users, each simulated user based upon a set of the plurality of actual users; c) generating, by the computer, a plurality of simulated resource providers, each simulated resource provider based upon at least one actual resource provider; d) executing, by the computer, a simulation using the plurality of simulated users and simulated resource providers; and e) determining, in response to step d), a plurality of simulated transactions conducted by the simulated users and simulated resource providers.”; Par. 81-“ The simulation module 208B may comprise code or software, executable by the processor 204, for performing a simulation. The simulation can include an imitation of a situation and/or process. The simulation can include any suitable simulation (e.g., a continuous simulation, a discrete event simulation, a stochastic simulation, a deterministic simulation, etc.). The simulation module 208B, in conjunction with the processor 204, can be capable of implementing a pollinator-plant simulation which may simulate interactions between pollinators (e.g., simulated as consumer agents) and plants (e.g., simulated as resource provider agents). 
For example, the simulation module 208B, in conjunction with the processor 204, may iterate through a list of consumer agents and may determine whether or not each consumer agent can perform a simulated transaction for a recommendation during the current epoch based on data associated with the consumer agent (e.g., constraint data) and on data associated with the resource provider agent (e.g., if a consumer capacity value has not been exceed during the current epoch). The simulation module 208B, in conjunction with the processor 204, can determine whether or not a simulated transaction may be performed based on, for example, if a recommendation is non-zero, if a consumer agent is available, if a resource provider is available, etc. The simulation module 208B, in conjunction with the processor 204, can perform the simulation as described in further detail herein.”)
comparing, by the environment, the action with the goal (Harris Par. 155-“ For example, in some embodiments, the simulation computer can compare the plurality of actual consumer agents and the plurality of simulant consumer agents. The simulation computer can then remove, based on the comparing, actual consumer agents from the plurality of actual consumer agents which do not exceed a matching threshold when compared to the plurality of simulant consumer agents.”); 
providing, by the environment, a feedback associated with the action based on a degree of similarity relative to the goal; (Harris-Par. 135-136-“ At step 328, the simulation computer can reweight the recommendations and update the consumer agent's satisfaction level. In some embodiments, during the first epoch the recommendations and satisfaction level may not yet differ from the initial values. However, it is understood that later in the epoch, and in later epochs, the recommendations and satisfaction levels may be updated based on performed simulated transactions and, in some embodiments, when a simulated transaction was not able to be performed. For example, if a consumer agent performs a transaction for a laptop, then the satisfaction level may be increased by an amount which may be proportional to the consumer agent's community group, the consumer agent's propensity”)
updating, by the policy engine, a policy based on the feedback for determining a next action; (Harris- Par. 135-136-“ At step 328, the simulation computer can reweight the recommendations and update the consumer agent's satisfaction level. In some embodiments, during the first epoch the recommendations and satisfaction level may not yet differ from the initial values. However, it is understood that later in the epoch, and in later epochs, the recommendations and satisfaction levels may be updated based on performed simulated transactions and, in some embodiments, when a simulated transaction was not able to be performed. For example, if a consumer agent performs a transaction for a laptop, then the satisfaction level may be increased by an amount which may be proportional to the consumer agent's community group, the consumer agent's propensity; At step 330, after updating the recommendations and the satisfaction level(s), the simulation computer can determine whether or not a first recommendation of the recommendations for the current epoch and the current consumer agent is non-zero (i.e., determine whether or not there is a recommendation). For example, the first consumer agent can be a simulant consumer agent associated with the “high tech” community group can have a recommendation of “laptop.” If the recommendation is non-zero (i.e., there is a recommendation of, for example, “laptop”), then the simulation computer can proceed to step 332. If the recommendation is zero, or null (i.e., there is no recommendation), then the simulation computer can proceed to step 326 to iterate to the next recommendation as well as update the recommendations and satisfaction level(s) as appropriate.”)
and generating, by the processor, simulated customer data by combining, the artificial customer profile with the simulated customer transaction data to form simulated customer data; (Harris- Par. 60-“ The additional processing performed by the evaluation computer 120 can include recommending a resource to an actual user associated with an actual consumer agent which performed a simulated transaction for a digital representation of the resource. The evaluation computer 120 can also recommend a resource to one or more actual users based on simulated transactions performed by consumer agents of a similar community group. For example, a simulant consumer agent may be associated with a “high tech” community group and may perform 10 simulated transactions during the simulation. The evaluation computer 120 can determine actual consumers that are also associated with the “high tech” community group and may generate one or more recommendations for actual resources for the actual consumers based on the simulated transactions performed by the simulant consumer agent. Additional processing may also include analyzing the simulated transactions for trends in purchasing habits, adjusting parameters of the simulation based on the resulting simulated transactions, etc.”;)

Harris teaches artificial intelligence techniques and the feature is expounded upon by Campos:
A computer implemented method in a data processing system comprising a processor and a memory comprising instructions, which are executed by the processor to cause the processor to implement the method for simulating customer data using a reinforcement learning model including an intelligent agent, a policy engine, and an environment, the method comprising: (Campos- Par.10-“In an embodiment, an AI engine has multiple independent modules to work on one or more computing platforms. The multiple independent modules are configured to have their instructions executed by one or more processors in the one or more computing platforms and any software instructions they may use can be stored in one or more memories of the computing platforms.”; Fig. 5, Par. 27; Par.99-“ For example, in each iteration, the machine learning software makes a decision about the next set of parameters for friction compensation and the next set of parameters for motion. These decisions are made by the modules of the AI engine. It is anticipated that the many iterations involved will require that the optimization process be capable of running autonomously. To achieve this, a software layer is utilized to enable the AI engine software to configure the control with the next iteration's parameterization for friction compensation and its parameterization of the axis motion. The goal for deep reinforcement learning in this example user's case is to explore the potential of the AI engine to improve upon manual or current automatic calibration. Specifically, to eliminate the human expert and make the AI the expert in selecting parameter values, equal or improve upon the degree of precision, reduce the number of iterations of tests needed, and hence the overall time needed to complete the circularity test. The AI engine is coded to understand machine dynamics and develop initial model of machine's dynamics. Development of a simulation model is included based on initial measurements. 
The AI engine's ability to set friction and backlash compensation parameters occurs within the simulation model. After the initial model training occurs, then the training of the simulation model of friction and backlash compensation is extended with the advice from any experts in that field. The training of the simulation model moves from the simulation model world, after the deep reinforcement learning is complete, to a real world environment. The training of the concept takes the learning from the real machine and uses it to improve and tune the simulation model.”;  Par 155)


Harris and Campos are both directed to artificial intelligence modelling and analysis. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have improved upon the artificial intelligence (AI) analysis of Harris by utilizing reinforcement learning techniques, as taught by Campos, with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Harris in order to use a reinforcement learning technique to hierarchically decompose a complex task into multiple smaller, individual sub-tasks making up the complex task, thereby improving the speed and/or accuracy of the resulting AI model (Campos Par. 11 and Par. 90).
Regarding Claim 2, Claim 9 and Claim 16, Harris in view of Campos teaches the method as recited in claim 1…, the computer program product of claim 8…, and the system of claim 15…
wherein the real customer profile data includes one or more of an address of a customer, a name of a customer, contact information, credit information, and income information (Harris Par. 49-“In some embodiments, the simulation computer 112 can query the network data database 110 for network data associated with one or more criteria. For example, one criterion that the simulation computer 112 can include in the query is a time and/or time range. For example, the simulation computer 112 can query for network data that is associated with the past day, past hour, particular date range (e.g., 5/10/2019 to 5/15/2019), etc. As another example, the simulation computer 112 can include a criterion that the retrieved network data include data related to interactions that occurred within a particular geographic area (e.g., North America, California, etc.). Additional example criteria can relate to user demographics, resource provider demographics, spending amount, etc.”).
Regarding Claim 3 and Claim 10, Harris in view of Campos teaches the method as recited in claim 1…, and the computer program product of claim 8…
wherein each simulated transaction includes transaction type, transaction amount, transaction time, transaction location, transaction medium, a second party associated with the simulated transaction. (Harris Par. 52-“At step 11, the configuration database 114 can provide the simulation computer 112 with the queried configuration(s). For example, the simulation computer 112 can query the configuration database 114 for a Bay Area simulation configuration. The Bay Area simulation configuration can include data relating to values which represent the Bay Area (e.g., ZIP codes, addresses, common types of resource providers in the area, income data, spending data, etc.).”).
Regarding Claim 4, Claim 11 and Claim 17, Harris in view of Campos teaches the method as recited in claim 1…, the computer program product of claim 8…, and the system of claim 15…
wherein the environment includes a set of all previous actions conducted by the intelligent agent. (Harris Par. 117-118 “At step 312, the simulation computer can build the simulant consumer agent's transaction history data from community group(s). The community group(s) to which a simulant consumer agent belongs can be predetermined and stored in the simulant consumer agent database or can be determined by an unsupervised learner capable of clustering the simulant consumer agents and, in some embodiments, the actual consumer agents into one or more community groups based on the similarities of the simulant consumer agents and, in some embodiments, the actual consumer agents.; The simulation computer can generate random simulated transactions for a transaction history associated with each simulant consumer agent of the plurality of simulant consumer agents. In some embodiments, the random simulated transactions can be based on a set of the plurality of actual consumers (e.g., a community of actual consumers, a groups of actual consumers which the simulant consumer agent represents, etc.).”).
Regarding Claim 5, Claim 12 and Claim 19, Harris in view of Campos teaches the method as recited in claim 4…, the computer program product of claim 11…, and the system of claim 17…
further comprising: removing, by the processor, a plurality of previous actions having the degree of similarity lower than a second predefined threshold. (Harris Par. 155 “For example, in some embodiments, the simulation computer can compare the plurality of actual consumer agents and the plurality of simulant consumer agents. The simulation computer can then remove, based on the comparing, actual consumer agents from the plurality of actual consumer agents which do not exceed a matching threshold when compared to the plurality of simulant consumer agents.”).
Regarding Claim 7, Claim 14 and Claim 20, Harris in view of Campos teaches the method as recited in claim 1…, the computer program product of claim 8…, and the system of claim 15…
Harris fails to teach the following feature taught by Campos:
wherein the feedback is a reward or a penalty. (Campos Par. 13-“The AI engine decomposing the complex task allows for both i) the first module to use reward functions focused for solving each individual sub-task, and then one or more reward functions focused for the end solution of the complex task, as well as ii) the first module to conduct the training of the AI objects corresponding to the individual sub-tasks in the complex task, in parallel at the same time. The combined parallel training and the use of reward functions focused for solving each individual sub-task speed up an overall training duration for the complex task on the one or more computing platforms, and subsequent deployment of a resulting AI model that is trained, compared to an end-to-end training with a single algorithm for all of the AI objects incorporated into the AI model.”; Par. 76-“ In an embodiment, at each time step (e.g., iteration of learning), the AI concept receives a state in a state space, selects a sub-task from an action space, follows a policy, which controls the AI concept's behavior, i.e., a mapping from a state to sub-tasks, then receives a scalar reward, and then transitions to the next state, according to the environment dynamics, or model, for the reward function. (See FIG. 6 for example.) The AI concept also receives feedback from its selected sub-tasks and performance and then evaluates the feedback to alter its training. Each concept can have different state+action spaces.”).

Regarding Claim 8, 
Harris teaches
generate an artificial customer profile by combining randomly selected information from a set of real customer profile data; (Harris- Par. 4-“One embodiment of the disclosure is related to a method comprising: a) receiving, by a computer, network data comprising a plurality of transactions conducted by a plurality of actual users and a plurality of actual resource providers; b) generating, by the computer, a plurality of simulated users, each simulated user based upon a set of the plurality of actual users; c) generating, by the computer, a plurality of simulated resource providers, each simulated resource provider based upon at least one actual resource provider; d) executing, by the computer, a simulation using the plurality of simulated users and simulated resource providers; and e) determining, in response to step d), a plurality of simulated transactions conducted by the simulated users and simulated resource providers.”; Par. 5)
generate simulated transaction data in imitation of real transaction data a group of real customers having similar transaction characteristics, wherein generating the simulated transaction data further comprises: (Harris Par. 4; Par. 14; Par. 60-“The additional processing performed by the evaluation computer 120 can include recommending a resource to an actual user associated with an actual consumer agent which performed a simulated transaction for a digital representation of the resource. The evaluation computer 120 can also recommend a resource to one or more actual users based on simulated transactions performed by consumer agents of a similar community group. For example, a simulant consumer agent may be associated with a “high tech” community group and may perform 10 simulated transactions during the simulation. The evaluation computer 120 can determine actual consumers that are also associated with the “high tech” community group and may generate one or more recommendations for actual resources for the actual consumers based on the simulated transactions performed by the simulant consumer agent. Additional processing may also include analyzing the simulated transactions for trends in purchasing habits, adjusting parameters of the simulation based on the resulting simulated transactions, etc.”;)
providing statistic data representing the group of real customers having similar transaction characteristics as a goal; (Harris Par. 14-“ In some embodiments, a model may be a statistical model, which can be used to predict unknown information from known information. For example, a learning module may be a set of instructions for generating a regression line from training data (supervised learning) or a set of instructions for grouping data into clusters of different classifications of data based on similarity, connectivity, and/or distance between data points (unsupervised learning). The regression line or data clusters can then be used as a model for predicting unknown information from known information. Once a model has been built from the learning module, the model may be used to generate a predicted output from a new request. The new request may be a request for a prediction associated with presented data. For example, new request may be a request for classifying an image or for a recommendation for a user.”)
performing a plurality of iterations to simulate the real transaction data, wherein the plurality of iterations is performed until a degree of similarity of simulated transaction data relative to the statistical data is higher than a first predefined threshold (Harris Par. 155-“ For example, in some embodiments, the simulation computer can compare the plurality of actual consumer agents and the plurality of simulant consumer agents. The simulation computer can then remove, based on the comparing, actual consumer agents from the plurality of actual consumer agents which do not exceed a matching threshold when compared to the plurality of simulant consumer agents.”; Par 169-170-“ Several simulation are run out n epochs with consumer agents at a 1 to 1 match. The data can be split between the training set and the validation set. True transactional data can be pulled from an analogous actual time frame within the data. The adversarial AI can build a model using the training set and can then evaluate the model using the validation set. The model attempts to predict where a given consumer will be at a set time. In some embodiments, noise can be introduced to the network data to help obfuscate selected consumers whose actions are predicted too actually based on a predefined threshold. The noise can be in the form of data removal, swapping information between consumers, fuzzifing key elements of a transaction (e.g., amount) or shift a timeline.”; Abstract; Par. 14); 
in each iteration: conducting, by the intelligent agent, an action including a plurality of simulated transactions; (Harris Par. 4-“ One embodiment of the disclosure is related to a method comprising: a) receiving, by a computer, network data comprising a plurality of transactions conducted by a plurality of actual users and a plurality of actual resource providers; b) generating, by the computer, a plurality of simulated users, each simulated user based upon a set of the plurality of actual users; c) generating, by the computer, a plurality of simulated resource providers, each simulated resource provider based upon at least one actual resource provider; d) executing, by the computer, a simulation using the plurality of simulated users and simulated resource providers; and e) determining, in response to step d), a plurality of simulated transactions conducted by the simulated users and simulated resource providers.”; Par. 81-“ The simulation module 208B may comprise code or software, executable by the processor 204, for performing a simulation. The simulation can include an imitation of a situation and/or process. The simulation can include any suitable simulation (e.g., a continuous simulation, a discrete event simulation, a stochastic simulation, a deterministic simulation, etc.). The simulation module 208B, in conjunction with the processor 204, can be capable of implementing a pollinator-plant simulation which may simulate interactions between pollinators (e.g., simulated as consumer agents) and plants (e.g., simulated as resource provider agents). 
For example, the simulation module 208B, in conjunction with the processor 204, may iterate through a list of consumer agents and may determine whether or not each consumer agent can perform a simulated transaction for a recommendation during the current epoch based on data associated with the consumer agent (e.g., constraint data) and on data associated with the resource provider agent (e.g., if a consumer capacity value has not been exceed during the current epoch). The simulation module 208B, in conjunction with the processor 204, can determine whether or not a simulated transaction may be performed based on, for example, if a recommendation is non-zero, if a consumer agent is available, if a resource provider is available, etc. The simulation module 208B, in conjunction with the processor 204, can perform the simulation as described in further detail herein.”)
comparing, by the environment, the action with the goal (Harris Par. 155-“ For example, in some embodiments, the simulation computer can compare the plurality of actual consumer agents and the plurality of simulant consumer agents. The simulation computer can then remove, based on the comparing, actual consumer agents from the plurality of actual consumer agents which do not exceed a matching threshold when compared to the plurality of simulant consumer agents.”); 
providing, by the environment, a feedback associated with the action based on a degree of similarity relative to the goal; (Harris-Par. 135-136-“ At step 328, the simulation computer can reweight the recommendations and update the consumer agent's satisfaction level. In some embodiments, during the first epoch the recommendations and satisfaction level may not yet differ from the initial values. However, it is understood that later in the epoch, and in later epochs, the recommendations and satisfaction levels may be updated based on performed simulated transactions and, in some embodiments, when a simulated transaction was not able to be performed. For example, if a consumer agent performs a transaction for a laptop, then the satisfaction level may be increased by an amount which may be proportional to the consumer agent's community group, the consumer agent's propensity”)
updating, by the policy engine, a policy based on the feedback for determining a next action; (Harris- Par.135-136-“At step 328, the simulation computer can reweight the recommendations and update the consumer agent's satisfaction level. In some embodiments, during the first epoch the recommendations and satisfaction level may not yet differ from the initial values. However, it is understood that later in the epoch, and in later epochs, the recommendations and satisfaction levels may be updated based on performed simulated transactions and, in some embodiments, when a simulated transaction was not able to be performed. For example, if a consumer agent performs a transaction for a laptop, then the satisfaction level may be increased by an amount which may be proportional to the consumer agent's community group, the consumer agent's propensity; At step 330, after updating the recommendations and the satisfaction level(s), the simulation computer can determine whether or not a first recommendation of the recommendations for the current epoch and the current consumer agent is non-zero (i.e., determine whether or not there is a recommendation). For example, the first consumer agent can be a simulant consumer agent associated with the “high tech” community group can have a recommendation of “laptop.” If the recommendation is non-zero (i.e., there is a recommendation of, for example, “laptop”), then the simulation computer can proceed to step 332. If the recommendation is zero, or null (i.e., there is no recommendation), then the simulation computer can proceed to step 326 to iterate to the next recommendation as well as update the recommendations and satisfaction level(s) as appropriate.”)
and generating, by the processor, simulated customer data by combining, the artificial customer profile with the simulated customer transaction data to form simulated customer data; (Harris- Par. 60-“ The additional processing performed by the evaluation computer 120 can include recommending a resource to an actual user associated with an actual consumer agent which performed a simulated transaction for a digital representation of the resource. The evaluation computer 120 can also recommend a resource to one or more actual users based on simulated transactions performed by consumer agents of a similar community group. For example, a simulant consumer agent may be associated with a “high tech” community group and may perform 10 simulated transactions during the simulation. The evaluation computer 120 can determine actual consumers that are also associated with the “high tech” community group and may generate one or more recommendations for actual resources for the actual consumers based on the simulated transactions performed by the simulant consumer agent. Additional processing may also include analyzing the simulated transactions for trends in purchasing habits, adjusting parameters of the simulation based on the resulting simulated transactions, etc.”;)


Harris teaches artificial intelligence techniques and the feature is expounded upon by Campos:
A computer program product for simulating customer data using a reinforcement learning model including an intelligent agent, a policy engine, and an environment, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: (Campos- Par.10-“In an embodiment, an AI engine has multiple independent modules to work on one or more computing platforms. The multiple independent modules are configured to have their instructions executed by one or more processors in the one or more computing platforms and any software instructions they may use can be stored in one or more memories of the computing platforms.”; Fig. 5, Par. 27; Par.99-“ For example, in each iteration, the machine learning software makes a decision about the next set of parameters for friction compensation and the next set of parameters for motion. These decisions are made by the modules of the AI engine. It is anticipated that the many iterations involved will require that the optimization process be capable of running autonomously. To achieve this, a software layer is utilized to enable the AI engine software to configure the control with the next iteration's parameterization for friction compensation and its parameterization of the axis motion. The goal for deep reinforcement learning in this example user's case is to explore the potential of the AI engine to improve upon manual or current automatic calibration. Specifically, to eliminate the human expert and make the AI the expert in selecting parameter values, equal or improve upon the degree of precision, reduce the number of iterations of tests needed, and hence the overall time needed to complete the circularity test. The AI engine is coded to understand machine dynamics and develop initial model of machine's dynamics. Development of a simulation model is included based on initial measurements. 
The AI engine's ability to set friction and backlash compensation parameters occurs within the simulation model. After the initial model training occurs, then the training of the simulation model of friction and backlash compensation is extended with the advice from any experts in that field. The training of the simulation model moves from the simulation model world, after the deep reinforcement learning is complete, to a real world environment. The training of the concept takes the learning from the real machine and uses it to improve and tune the simulation model.”;  Par 155)


Harris and Campos are both directed to artificial intelligence modelling and analysis. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have improved upon the artificial intelligence (AI) analysis of Harris by utilizing reinforcement learning techniques, as taught by Campos, with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Harris in order to use a reinforcement learning technique to hierarchically decompose a complex task into multiple smaller, individual sub-tasks making up the complex task, thereby improving the speed and/or accuracy of the resulting AI model (Campos Par. 11 and Par. 90).

Regarding Claim 15, 
Harris teaches
generate an artificial customer profile by combining randomly selected information from a set of real customer profile data; (Harris- Par. 4-“One embodiment of the disclosure is related to a method comprising: a) receiving, by a computer, network data comprising a plurality of transactions conducted by a plurality of actual users and a plurality of actual resource providers; b) generating, by the computer, a plurality of simulated users, each simulated user based upon a set of the plurality of actual users; c) generating, by the computer, a plurality of simulated resource providers, each simulated resource provider based upon at least one actual resource provider; d) executing, by the computer, a simulation using the plurality of simulated users and simulated resource providers; and e) determining, in response to step d), a plurality of simulated transactions conducted by the simulated users and simulated resource providers.”; Par. 5)
generate simulated transaction data in imitation of real transaction data a group of real customers having similar transaction characteristics, wherein generating the simulated transaction data further comprises: (Harris Par. 4; Par. 14; Par. 60-“The additional processing performed by the evaluation computer 120 can include recommending a resource to an actual user associated with an actual consumer agent which performed a simulated transaction for a digital representation of the resource. The evaluation computer 120 can also recommend a resource to one or more actual users based on simulated transactions performed by consumer agents of a similar community group. For example, a simulant consumer agent may be associated with a “high tech” community group and may perform 10 simulated transactions during the simulation. The evaluation computer 120 can determine actual consumers that are also associated with the “high tech” community group and may generate one or more recommendations for actual resources for the actual consumers based on the simulated transactions performed by the simulant consumer agent. Additional processing may also include analyzing the simulated transactions for trends in purchasing habits, adjusting parameters of the simulation based on the resulting simulated transactions, etc.”;)
providing statistic data representing the group of real customers having similar transaction characteristics as a goal; (Harris Par. 14-“ In some embodiments, a model may be a statistical model, which can be used to predict unknown information from known information. For example, a learning module may be a set of instructions for generating a regression line from training data (supervised learning) or a set of instructions for grouping data into clusters of different classifications of data based on similarity, connectivity, and/or distance between data points (unsupervised learning). The regression line or data clusters can then be used as a model for predicting unknown information from known information. Once a model has been built from the learning module, the model may be used to generate a predicted output from a new request. The new request may be a request for a prediction associated with presented data. For example, new request may be a request for classifying an image or for a recommendation for a user.”)
performing a plurality of iterations to simulate the real transaction data, wherein the plurality of iterations is performed until a degree of similarity of simulated transaction data relative to the statistical data is higher than a first predefined threshold (Harris Par. 155-“ For example, in some embodiments, the simulation computer can compare the plurality of actual consumer agents and the plurality of simulant consumer agents. The simulation computer can then remove, based on the comparing, actual consumer agents from the plurality of actual consumer agents which do not exceed a matching threshold when compared to the plurality of simulant consumer agents.”; Par 169-170-“ Several simulation are run out n epochs with consumer agents at a 1 to 1 match. The data can be split between the training set and the validation set. True transactional data can be pulled from an analogous actual time frame within the data. The adversarial AI can build a model using the training set and can then evaluate the model using the validation set. The model attempts to predict where a given consumer will be at a set time. In some embodiments, noise can be introduced to the network data to help obfuscate selected consumers whose actions are predicted too actually based on a predefined threshold. The noise can be in the form of data removal, swapping information between consumers, fuzzifing key elements of a transaction (e.g., amount) or shift a timeline.”; Abstract; Par. 14); 
in each iteration: conducting, by the intelligent agent, an action including a plurality of simulated transactions; (Harris Par. 4-“ One embodiment of the disclosure is related to a method comprising: a) receiving, by a computer, network data comprising a plurality of transactions conducted by a plurality of actual users and a plurality of actual resource providers; b) generating, by the computer, a plurality of simulated users, each simulated user based upon a set of the plurality of actual users; c) generating, by the computer, a plurality of simulated resource providers, each simulated resource provider based upon at least one actual resource provider; d) executing, by the computer, a simulation using the plurality of simulated users and simulated resource providers; and e) determining, in response to step d), a plurality of simulated transactions conducted by the simulated users and simulated resource providers.”; Par. 81-“ The simulation module 208B may comprise code or software, executable by the processor 204, for performing a simulation. The simulation can include an imitation of a situation and/or process. The simulation can include any suitable simulation (e.g., a continuous simulation, a discrete event simulation, a stochastic simulation, a deterministic simulation, etc.). The simulation module 208B, in conjunction with the processor 204, can be capable of implementing a pollinator-plant simulation which may simulate interactions between pollinators (e.g., simulated as consumer agents) and plants (e.g., simulated as resource provider agents). 
For example, the simulation module 208B, in conjunction with the processor 204, may iterate through a list of consumer agents and may determine whether or not each consumer agent can perform a simulated transaction for a recommendation during the current epoch based on data associated with the consumer agent (e.g., constraint data) and on data associated with the resource provider agent (e.g., if a consumer capacity value has not been exceed during the current epoch). The simulation module 208B, in conjunction with the processor 204, can determine whether or not a simulated transaction may be performed based on, for example, if a recommendation is non-zero, if a consumer agent is available, if a resource provider is available, etc. The simulation module 208B, in conjunction with the processor 204, can perform the simulation as described in further detail herein.”)
comparing, by the environment, the action with the goal (Harris Par. 155-“ For example, in some embodiments, the simulation computer can compare the plurality of actual consumer agents and the plurality of simulant consumer agents. The simulation computer can then remove, based on the comparing, actual consumer agents from the plurality of actual consumer agents which do not exceed a matching threshold when compared to the plurality of simulant consumer agents.”); 
providing, by the environment, a feedback associated with the action based on a degree of similarity relative to the goal; (Harris-Par. 135-136-“ At step 328, the simulation computer can reweight the recommendations and update the consumer agent's satisfaction level. In some embodiments, during the first epoch the recommendations and satisfaction level may not yet differ from the initial values. However, it is understood that later in the epoch, and in later epochs, the recommendations and satisfaction levels may be updated based on performed simulated transactions and, in some embodiments, when a simulated transaction was not able to be performed. For example, if a consumer agent performs a transaction for a laptop, then the satisfaction level may be increased by an amount which may be proportional to the consumer agent's community group, the consumer agent's propensity”)
updating, by the policy engine, a policy based on the feedback for determining a next action; (Harris- Par.135-136-“At step 328, the simulation computer can reweight the recommendations and update the consumer agent's satisfaction level. In some embodiments, during the first epoch the recommendations and satisfaction level may not yet differ from the initial values. However, it is understood that later in the epoch, and in later epochs, the recommendations and satisfaction levels may be updated based on performed simulated transactions and, in some embodiments, when a simulated transaction was not able to be performed. For example, if a consumer agent performs a transaction for a laptop, then the satisfaction level may be increased by an amount which may be proportional to the consumer agent's community group, the consumer agent's propensity; At step 330, after updating the recommendations and the satisfaction level(s), the simulation computer can determine whether or not a first recommendation of the recommendations for the current epoch and the current consumer agent is non-zero (i.e., determine whether or not there is a recommendation). For example, the first consumer agent can be a simulant consumer agent associated with the “high tech” community group can have a recommendation of “laptop.” If the recommendation is non-zero (i.e., there is a recommendation of, for example, “laptop”), then the simulation computer can proceed to step 332. If the recommendation is zero, or null (i.e., there is no recommendation), then the simulation computer can proceed to step 326 to iterate to the next recommendation as well as update the recommendations and satisfaction level(s) as appropriate.”)
and generate simulated customer data by combining, the artificial customer profile with the simulated customer transaction data to form simulated customer data; (Harris- Par. 60-“ The additional processing performed by the evaluation computer 120 can include recommending a resource to an actual user associated with an actual consumer agent which performed a simulated transaction for a digital representation of the resource. The evaluation computer 120 can also recommend a resource to one or more actual users based on simulated transactions performed by consumer agents of a similar community group. For example, a simulant consumer agent may be associated with a “high tech” community group and may perform 10 simulated transactions during the simulation. The evaluation computer 120 can determine actual consumers that are also associated with the “high tech” community group and may generate one or more recommendations for actual resources for the actual consumers based on the simulated transactions performed by the simulant consumer agent. Additional processing may also include analyzing the simulated transactions for trends in purchasing habits, adjusting parameters of the simulation based on the resulting simulated transactions, etc.”;)

Harris teaches artificial intelligence techniques and the feature is expounded upon by Campos:
A system for simulating customer data using a reinforcement learning model including an intelligent agent, a policy engine, and an environment, the system comprising: a processor configured to: (Campos- Par.10-“In an embodiment, an AI engine has multiple independent modules to work on one or more computing platforms. The multiple independent modules are configured to have their instructions executed by one or more processors in the one or more computing platforms and any software instructions they may use can be stored in one or more memories of the computing platforms.”; Fig. 5, Par. 27; Par.99-“ For example, in each iteration, the machine learning software makes a decision about the next set of parameters for friction compensation and the next set of parameters for motion. These decisions are made by the modules of the AI engine. It is anticipated that the many iterations involved will require that the optimization process be capable of running autonomously. To achieve this, a software layer is utilized to enable the AI engine software to configure the control with the next iteration's parameterization for friction compensation and its parameterization of the axis motion. The goal for deep reinforcement learning in this example user's case is to explore the potential of the AI engine to improve upon manual or current automatic calibration. Specifically, to eliminate the human expert and make the AI the expert in selecting parameter values, equal or improve upon the degree of precision, reduce the number of iterations of tests needed, and hence the overall time needed to complete the circularity test. The AI engine is coded to understand machine dynamics and develop initial model of machine's dynamics. Development of a simulation model is included based on initial measurements. The AI engine's ability to set friction and backlash compensation parameters occurs within the simulation model. 
After the initial model training occurs, then the training of the simulation model of friction and backlash compensation is extended with the advice from any experts in that field. The training of the simulation model moves from the simulation model world, after the deep reinforcement learning is complete, to a real world environment. The training of the concept takes the learning from the real machine and uses it to improve and tune the simulation model.”;  Par 155)

Harris and Campos are directed to artificial intelligence modelling and analysis. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the artificial intelligence (AI) analysis of Harris, as taught by Campos, by utilizing reinforcement learning techniques with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make the modification to the teachings of Harris with the motivation of using a reinforcement learning technique to hierarchically decompose a complex task into multiple smaller, individual sub-tasks making up the complex task to improve the speed and/or accuracy of the resulting AI model. (Campos Par. 11 and Par. 90).
Regarding Claim 18, Harris in view of Campos teaches the system of claim 17…
wherein the processor is further configured to add the action into the environment. (Harris Par. 60-“ The additional processing performed by the evaluation computer 120 can include recommending a resource to an actual user associated with an actual consumer agent which performed a simulated transaction for a digital representation of the resource. The evaluation computer 120 can also recommend a resource to one or more actual users based on simulated transactions performed by consumer agents of a similar community group. For example, a simulant consumer agent may be associated with a “high tech” community group and may perform 10 simulated transactions during the simulation. The evaluation computer 120 can determine actual consumers that are also associated with the “high tech” community group and may generate one or more recommendations for actual resources for the actual consumers based on the simulated transactions performed by the simulant consumer agent. Additional processing may also include analyzing the simulated transactions for trends in purchasing habits, adjusting parameters of the simulation based on the resulting simulated transactions, etc.”)

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Harris et al., US Publication No. 20210264448 A1, [hereinafter Harris], in view of Campos et al., US Publication No. 20180293498 A1, [hereinafter Campos], and in further view of Schroeder et al., US Publication No. 20190377902 A1, [hereinafter Schroeder].

Regarding Claim 6 and Claim 13, Harris in view of Campos teaches the method as recited in claim 1…, and the computer program product of claim 8…
Harris in view of Campos discloses clustering techniques and the feature is expounded upon by the teachings of Schroeder:
further comprising: acquiring, by the processor, the standard customer transaction data from raw customer transaction data through an unsupervised clustering approach (Schroeder Par. 72-“ In some implementations, artificial profile model 700 may be trained by executing one or more machine-learning algorithms on a data set including non-exhaustive set 610 of FIG. 6. For example, one or more clustering algorithms may be executed on the data set including non-exhaustive set 610 to identify clusters of data privacy elements that relate to each other or patterns of dependencies within the data set. The data protection platform can execute the clustering algorithms to identify patterns within the data set, which can then be used to generate artificial profile model 700. Non-limiting examples of machine-learning algorithms or techniques can include artificial neural networks (including backpropagation, Boltzmann machines, etc.), bayesian statistics (e.g., bayesian networks or knowledge bases), logistical model trees, support vector machines, information fuzzy networks, Hidden Markov models, hierarchical clustering (unsupervised), self-organizing maps, clustering techniques, and other suitable machine-learning techniques (supervised or unsupervised). For example, the data protection platform can retrieve one or more machine-learning algorithms stored in a database (not shown) to generate an artificial neural network in order to identify patterns or correlations within the data set of data privacy elements (i.e., within non-exhaustive set 610). As a further example, the artificial neural network can learn that when data privacy element #1 (in the data set) includes value A and value B, then data privacy element #2 is predicted as relevant data for data privacy element #1. 
Thus, a constrain, relationship and/or dependency can be defined between data privacy element #1 and data privacy element #2, such that any newly created or modified artificial profiles should be consistent with the relationship between data privacy elements #1 and #2. In yet another example, a support vector machine can be used either to generate output data that is used as a prediction, or to identify learned patterns within the data set. The one or more machine-learning algorithms may relate to unsupervised learning techniques, however, the present disclosure is not limited thereto. Supervised learning techniques may also be implemented. In some implementations, executing the one or more machine-learning algorithms may generate a plurality of nodes and one or more correlations between at least two nodes of the plurality of nodes. For example, the one or more machine-learning algorithms in these implementations can include unsupervised learning techniques, such as clustering techniques, artificial neural networks, association rule learning, and so on.”).
Harris, Campos and Schroeder are directed to artificial intelligence modelling and analysis. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the artificial intelligence (AI) analysis of Harris in view of Campos, by utilizing clustering techniques with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make the modification to the teachings of Harris in view of Campos with the motivation of dynamically creating, modifying, and validating artificial profiles using a data protection platform to control data exposure. (Schroeder Par. 2).




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: US Publication No. 20180365674 A1 to Han et al., Abstract: “A device may obtain, for a set of transactions, a set of transaction values associated with a particular industry. The device may determine one or more sample statistical distributions for a probabilistic transaction model by using one or more machine learning techniques. The one or more sample statistical distributions may be similar to one or more actual statistical distributions that are associated with the set of transaction values. The device may generate simulated transaction information using the probabilistic transaction model. The device may perform one or more actions after generating the simulated transaction information.”
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Chesiree Walton, whose telephone number is (571) 272-5219.  The examiner can normally be reached from Monday to Friday between 8 AM and 5 PM.  If any attempt to reach the examiner by telephone is unsuccessful, the examiner’s supervisor, Patricia Munson, can be reached at (571) 270-5396.  The fax telephone numbers for this group are either (571) 273-8300 or (703) 872-9326 (for official communications including After Final communications labeled “Box AF”).
	Another resource that is available to applicants is the Patent Application Information Retrieval (PAIR) system. Information regarding the status of an application can be obtained from the PAIR system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, please feel free to contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
	Applicants are invited to contact the Office to schedule an in-person interview to discuss and resolve the issues set forth in this Office Action.  Although an interview is not required, the Office believes that an interview can be of use to resolve any issues related to a patent application in an efficient and prompt manner.
Sincerely,
/Chesiree Walton/
Examiner, Art Unit 3624
/PATRICIA H MUNSON/
Supervisory Patent Examiner, Art Unit 3624