DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The instant application having application No. 17/853,648 filed on June 29, 2022, presents claims 1-20 for examination.

Examiner Notes
Examiner cites particular columns, paragraphs, figures and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 17-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

	With respect to claims 17-19, line 1 of claim 17 recites “computing one or more metrics.” It is unclear whether this is the same as the “one or more metrics” recited in claim 15, from which claim 17 depends.  For purposes of compact prosecution only, Examiner has interpreted claim 17 as reciting -- computing one or more second metrics --.  Claims 18 and 19 inherit this deficiency and have likewise been interpreted as reciting -- the one or more second metrics --.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-6, 8-11, and 13-15 of U.S. Patent No. 11,409,637. Although the claims at issue are not identical, they are not patentably distinct from each other, as illustrated in the following table (for the sake of brevity, only claim 1 is presented in full-text): 

Instant Application

1. A computer-implemented method comprising:

accessing one or more updated data structures that are to be included in a user interface functionality test, the updated data structures contributing at least partially to a user interface;

accessing a portion of live or snapshotted data captured from one or more services running in a production environment, the live or snapshotted data being used in the user interface functionality test, wherein the services implement queue-based communication that decouples the services to allow computation of data in separate stages;

generating a first simulated user interface instance using the updated data structures and using the accessed live or snapshotted data;

generating a second simulated user interface instance using a different version of the updated data structures and using the accessed live or snapshotted data;

computing one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances;

comparing the first user interface instance to the second user interface instance to identify one or more differences between the first user interface instance and the second user interface instance; and

based on the comparison, determining one or more outcome-defining effects the updated data structures had on the user interface based on the identified differences between the first user interface instance and the second user interface instance.

Claim 15. (Text omitted as being substantially similar to claim 1).

Claim 20. (Text omitted as being substantially similar to claim 1).

Reference Patent No. 11,409,637

A computer-implemented method comprising: 

accessing, by a controller, one or more updated data structures that are to be included in a user interface functionality test, the updated data structures contributing at least partially to a user interface; 

accessing, by the controller, a portion of live or snapshotted data captured from one or more services running in a production environment, the services being initialized at a specified common time identified in a shared contract that is shared among the services, the shared contract providing a common notion of time among the services, wherein the live or snapshotted data is used in the user interface functionality test, and wherein the live or snapshotted data includes service data generated by the services and user input data identifying user inputs received at the user interface, wherein the services implement queue-based communication that decouples the services and provides individual request retries; 

initiating, by the controller, generation of a first simulated user interface instance using the updated data structures and using the accessed live or snapshotted data, the first simulated user interface portraying how the updated data structures would function if exposed to external users when hosted by the production environment; 

initiating, by the controller, generation of a second simulated user interface instance using a different version of the updated data structures and using the accessed live or snapshotted data, the second simulated user interface portraying how the different version of the updated data structures would function if exposed to external users when hosted by the production environment, wherein the first and second simulated user interface instances are generated within the production environment but are inaccessible to external users; 

computing one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances; 

comparing, by the controller, the first user interface instance to the second user interface instance to identify one or more differences between the first user interface instance and the second user interface instance according to the live or snapshotted data and according to the user input data identifying user inputs received at the user interface; and 

based on the comparison, determining, by the controller, one or more outcome-defining effects the updated data structures had on the user interface based on the identified differences between the first user interface instance and the second user interface instance.

Instant Application claim | Reference Patent No. 11,409,637 claim
Claim 2 | Claim 1
Claim 3 | Claim 1
Claim 4 | Claim 1
Claim 5 | Claim 1
Claim 6 | Claim 1
Claim 7 | Claim 2
Claim 8 | Claim 3
Claim 9 | Claim 4
Claim 10 | Claim 5
Claim 11 | Claim 6
Claim 12 | Claim 8
Claim 13 | Claim 9
Claim 14 | Claim 10
Claim 16 | Claim 11
Claim 17 | Claim 13
Claim 18 | Claim 14
Claim 19 | Claim 15


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4, 5, 6, 10, 11, 12, 15, 17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto et al. (20170124487 -- hereinafter Szeto; see IDS dated 9/30/22) in view of Zapella et al. (10977149 -- hereinafter Zapella; see IDS dated 9/30/22) and McHugh et al. (20160342485 -- hereinafter McHugh).

	With respect to claim 1, Szeto discloses A computer-implemented method comprising: 
	accessing one or more updated data structures that are to be included in a [user interface functionality] test, the updated data structures contributing at least partially to a user interface (e.g., Figs. 4-8 and associated text, e.g., [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user [contributing at least partially to a user interface] as a recommendation; [0294], engine variant V3 is generated based on engine variant V1 alone [V3, i.e. updated data structure], and engine variant V4 is generated based on engine variant V2 alone...Evaluator 840 may sort or rank the performances of such multiple engine variants; [0254], An evaluation metric may quantify prediction accuracy with a numerical score [testing]; see also [0285].); 
	accessing a portion of live or snapshotted data captured from one or more services running in a production environment, the live or snapshotted data being used in the [user interface functionality] test, [wherein the services implement queue-based communication that decouples the services to allow computation of data in separate stages] (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0081] The Prediction or machine learning platform additionally leverages existing infrastructure to control model evaluation and deployment such that developers may utilize existing version control tools and interfaces. Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; [0426], the machine learning platform operates within a host organization which provides on-demand cloud computing services to a plurality of tenants; and in which receiving the training data includes receiving the training data as input from one of the plurality of tenants of the host organization; see also [0048] and [0285].); 
	generating a first [simulated user interface] instance using the updated data structures and using the accessed live or snapshotted data (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user as a recommendation; [0294] In some embodiments, engine variant V3 is generated [generating a first instance] based on engine variant V1 alone, and engine variant V4 is generated based on engine variant V2 alone; [0081], Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition; see also [0025], [0285] and [0426].); 
	generating a second [simulated user interface] instance using a different version of the updated data structures and using the accessed live or snapshotted data (Id., particularly, [0294], engine variant V3 is generated [generation of a first instance] based on engine variant V1 alone, and engine variant V4 [generating a second instance] is generated based on engine variant V2 alone and [0081], including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data; see also [0290], a single query may be transferred to more than one deployed engine variants; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition. Comparison testing may additionally be performed upon trained model variants belonging to different customer organizations utilizing the same multi-tenant database system in the event that appropriate access permissions permit the comparisons.); 
	[computing one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances;] 
	comparing the first [user interface] instance to the second [user interface] instance to identify one or more differences between the first [user interface] instance and the second [user interface] instance (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition; [0281] In some embodiments, a PredictionIO or machine learning server generates and logs a unique tracking tag for each user query; see also [0051], [0280-281], [0285], and [0402-406].); and 
	based on the comparison, determining one or more outcome-defining effects the updated data structures had on the [user interface] based on the identified differences between the first [user interface] instance and the second [user interface] instance (Id., particularly [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user as a recommendation; see also [0413], evaluating user interactions and more intuitive means by which to tune the engine and algorithms, for instance, by changing the scenario ad-hoc based on replay results such as making modifications to the email header, and replaying how results will perform for that engine variant, etc.... With live evaluation predictions, something which is shown to the user ultimately affects the outcome of the user; see also [0234] and [0412].).
	Although Szeto discloses testing first and second versions of generated instances of machine learning engines used to display recommendations to a user by comparing the first and second versions to determine the best one (see above), it does not appear to explicitly disclose testing the functionality of simulated presentations, i.e. user interface functionality, simulated user interface, user interface, and computing one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances.  However, this is taught in analogous art, Zapella (e.g., Figs. 1-2 and 4-5 along with associated text, e.g., col. 2:63-64, an "action" is an instruction to display certain content (e.g., a widget, an advertising campaign, an item, a title of an item, an image of an item, a description of an item, a button, a video, an audio player, etc.) [user interface]; col. 6:34-54, the simulation application 104 running on an experiment system 102 can operate in a separate test environment to test different policies before such policies are used by the action delivery system 120 to provide actions in response to action requests [simulated user interface]. .... While there may be little to no data gathered from user devices 112 in relation to use of the new policy, the simulation application 104 can use data associated with the existing policy to simulate how users of user devices 112 may respond if the action delivery system 120 used the new policy rather than the existing policy [first and second simulated interfaces]; col. 7:16-48, For each retrieved action, the simulation application 104 can generate a score [metric] by executing the retrieved prediction model, providing the respective action (or action identifier) and the test context as an input to the prediction model. The scores, however, may be biased given that certain actions are selected more often than others according to the existing policy [second simulated user interface]. To more accurately simulate the new policy [first simulated user interface], the simulation application 104 may unbias the scores [allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances]. Thus, the simulation application 104 can normalize the generated scores using the retrieved action probabilities and test action probabilities included in the new policy being tested (as provided by the experiment operator)....In the context of content page requests, a normalized score [changed metric without regenerating at least a portion of the first and second user interface instances] can represent the expected value of user actions simulated to occur in response to the content associated with the normalized score being selected for display according to the new policy [first simulated user interface]. Similarly, the originally generated scores can represent the values of user behavior that occurred in response to the associated actions being selected according to the existing policy [second simulated user interface]. The simulation application 104 can then compare the expected values of the new policy [first simulated user interface] with the values of the existing policy [second simulated user interface], displaying the results of the comparison (e.g., in a user interface, on a wall via a projection, etc.) and/or storing the results; col. 14:43-45, The simulation application 104 can then simulate how the test policy may affect user behavior if the test policy were to be implemented by the action delivery system 120; col. 16:8-15, existing policies such that these policies can also be compared to the test policy; see also col. 2:59-col. 3:29, col. 3:54-61, col. 4:24-30, col. 7:44-47, col. 8:15-23, col. 10:33-43, col. 14:26-30.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Szeto with the invention of Zapella because “the simulation application is not testing policies and/or prediction models in a production environment, [and thus] the simulation application can test any number of different policies and/or prediction models in less time than the conventional content delivery system.... [And] the possibility of any testing negatively impacting users is significantly reduced,” as suggested by Zapella (see col. 2:49-58).
	Szeto in view of Zapella does not appear to disclose wherein the services implement queue-based communication that decouples the services to allow computation of data in separate stages. However, this is taught in analogous art, McHugh (e.g., [0002], For each service, the online system uses messages to keep track of the service status. For example, messages may inform the online system that an ad was shown, along with the clearing price of the ad. These messages are stored in queues that decouple the various independent services that process the data stream. In other words, the queue allows data to be transferred between independent services without sending back acknowledgements that the data was sent or received.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of McHugh because “queues provide availability,” as suggested by McHugh.

With respect to claim 4, Szeto also discloses wherein the live or snapshotted data includes service data generated by the services and user input data identifying user inputs received at the user interface (e.g., Figs. 4-8 and 23 along with associated text, e.g., [0280-281], Such prediction history tracking may be performed in real-time, with live evaluation results returned as feedback to predictive engines for further engine parameter tuning and prediction accuracy improvement.... a PredictionIO or machine learning server generates and logs a unique tracking tag for each user query. Correspondingly, predicted results generated in response to the current query and parameters of the engine variant deployed are associated with the same tracking tag....Recall that in some embodiments, a query may include identifying information including user ID, product ID, time, and location. Similarly, a tracking tag may be in the form of (user-device ID, user ID, time stamp). Subsequent actual results including user actions and behaviors, and actual correct results revealed after the prediction result has been served, are also logged under the same tracking tag; [0405], PredictionIO Enterprise Edition provides a special data source (data reader) that can use the "tracking data" to replay how a prediction engine performs. This data source is able to reconstruct the complete history of each user that queried the system; see also [0051] and [0402]-[0406].).

With respect to claim 5, Zapella further discloses wherein the first simulated user interface portrays how the updated data structures would function if exposed to external users when hosted by the production environment, and wherein the second simulated user interface portrays how the different version of the data structures would function if exposed to external users when hosted by the production environment (e.g., Figs. 1-2 and 4-5 along with associated text, e.g., col. 2:63-64, an "action" is an instruction to display certain content (e.g., a widget, an advertising campaign, an item, a title of an item, an image of an item, a description of an item, a button, a video, an audio player, etc.) [user interface]; col. 6:34-54, the simulation application 104 running on an experiment system 102 can operate in a separate test environment to test different policies before such policies are used by the action delivery system 120 to provide actions in response to action requests [simulated user interface]. .... the simulation application 104 can use data associated with the existing policy to simulate how users of user devices 112 may respond if the action delivery system 120 used the new policy rather than the existing policy [first and second simulated interfaces]; see also col. 2:59-col. 3:29, col. 3:54-61, col. 4:24-30, col. 7:16-48, col. 8:15-23, col. 10:33-43, col. 14:26-30.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Szeto with the invention of Zapella for the same reason set forth above with respect to claim 1.


With respect to claim 6, Szeto also discloses wherein the first and second [simulated user interface] instances are generated within the production environment but are inaccessible to external users (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user as a recommendation; [0294] In some embodiments, engine variant V3 is generated [generation of a first instance] based on engine variant V1 alone, and engine variant V4 is generated based on engine variant V2 alone; [0081], Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition; see also [0025], [0285] and [0426].) and Zapella discloses simulated user interface (e.g., Figs. 1-2 and 4-5 along with associated text, e.g., col. 2:63-64, an "action" is an instruction to display certain content (e.g., a widget, an advertising campaign, an item, a title of an item, an image of an item, a description of an item, a button, a video, an audio player, etc.) [user interface]; col. 6:34-54, the simulation application 104 running on an experiment system 102 can operate in a separate test environment to test different policies before such policies are used by the action delivery system 120 to provide actions in response to action requests [simulated user interface]. .... While there may be little to no data gathered from user devices 112 in relation to use of the new policy, the simulation application 104 can use data associated with the existing policy to simulate how users of user devices 112 may respond if the action delivery system 120 used the new policy rather than the existing policy [first and second simulated interfaces]; see also col. 2:59-col. 3:29, col. 3:54-61, col. 4:24-30, col. 7:16-48, col. 8:15-23, col. 10:33-43, col. 14:26-30.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Szeto with the invention of Zapella for the same reason set forth above with respect to claim 1.

With respect to claim 10, Szeto also discloses wherein the snapshotted data includes inputs received at the one or more services in addition to outputs generated by the one or more services (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0234], disclosed embodiments can replay the whole prediction scenario, from engine parameters, queries, prediction results, to actual results, user interactions, and evaluation metrics, to help developers understand particular behaviors of engine variants of interest, and to tailor and improve prediction engine design; see also [0412-413]).

With respect to claim 11, Szeto also discloses wherein the one or more services comprise stateful services running in the production environment, at least a portion of state information being stored for each service (e.g., [0073], state rollback capabilities with automated versioning control and tracking; [0287] Parameter set 813 states that variant 820 uses DataSource x2, and Algorithms 4 and 2; [0081] The Prediction or machine learning platform additionally leverages existing infrastructure to control model evaluation and deployment such that developers may utilize existing version control tools and interfaces. Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; see also [0285], [0357], and [0426].).

With respect to claim 12, Szeto also discloses modifying the one or more services to generate one or more modified versions of the services, such that calls for data between the services are routed to the modified versions of the services (e.g., Figs. 1-8, and 23 along with associated text, e.g., [0228] Once a machine learning system is established, it can be deployed as a service, for example, as a web service, to receive dynamic user queries and to respond to such queries by generating and reporting prediction results to the user... As subsequent user actions or actual correct results can be collected and additional data may become available, a deployed machine learning system may be updated [modifies] with new training data, and may be re-configured according to dynamic queries and corresponding event data; see also [0214] and [0240-241].).

With respect to claim 15, Szeto discloses A system comprising: 
at least one physical processor; physical memory comprising computer-executable instructions that, when executed by the physical processor (Fig. 28 and associated text, e.g., [0483], The exemplary computer system 2800 includes a processor 2802, a main memory 2804.), cause the physical processor to: 
access one or more updated data structures that are to be included in a [user interface functionality] test, the updated data structures contributing at least partially to a user interface (e.g., Figs. 4-8 and associated text, e.g., [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user [contributing at least partially to a user interface] as a recommendation; [0294], engine variant V3 is generated based on engine variant V1 alone [V3, i.e. updated data structure], and engine variant V4 is generated based on engine variant V2 alone...Evaluator 840 may sort or rank the performances of such multiple engine variants; [0254], An evaluation metric may quantify prediction accuracy with a numerical score [testing]; see also [0285].); 
access a portion of live or snapshotted data captured from one or more services running in a production environment, the live or snapshotted data being used in the [user interface functionality] test, [wherein the services implement queue-based communication that decouples the services to allow computation of data in separate stages] (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0081] The Prediction or machine learning platform additionally leverages existing infrastructure to control model evaluation and deployment such that developers may utilize existing version control tools and interfaces. Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; [0426], the machine learning platform operates within a host organization which provides on-demand cloud computing services to a plurality of tenants; and in which receiving the training data includes receiving the training data as input from one of the plurality of tenants of the host organization; see also [0048] and [0285].); 
generate a first [simulated user interface] instance using the updated data structures and using the accessed live or snapshotted data (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user as a recommendation; [0294] In some embodiments, engine variant V3 is generated [generating a first instance] based on engine variant V1 alone, and engine variant V4 is generated based on engine variant V2 alone; [0081], Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition; see also [0025], [0285] and [0426].); 
generate a second [simulated user interface] instance using a different version of the updated data structures and using the accessed live or snapshotted data (Id., particularly, [0294], engine variant V3 is generated [generation of a first instance] based on engine variant V1 alone, and engine variant V4 [generating a second instance] is generated based on engine variant V2 alone and [0081], including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data; see also [0290], a single query may be transferred to more than one deployed engine variants; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition. Comparison testing may additionally be performed upon trained model variants belonging to different customer organizations utilizing the same multi-tenant database system in the event that appropriate access permissions permit the comparisons.); 
[compute one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances;] 
compare the first [user interface] instance to the second [user interface] instance to identify one or more differences between the first [user interface] instance and the second [user interface] instance (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition; [0281] In some embodiments, a PredictionIO or machine learning server generates and logs a unique tracking tag for each user query; see also [0051], [0280-281], [0285], and [0402-406].); and 
based on the comparison, determine one or more outcome-defining effects the updated data structures had on the [user interface] based on the identified differences between the first [user interface] instance and the second [user interface] instance (Id., particularly [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user as a recommendation; see also [0413], evaluating user interactions and more intuitive means by which to tune the engine and algorithms, for instance, by changing the scenario ad-hoc based on replay results such as making modifications to the email header, and replaying how results will perform for that engine variant, etc.... With live evaluation predictions, something which is shown to the user ultimately affects the outcome of the user; see also [0234] and [0412].).
Although Szeto discloses testing first and second versions of generated instances of machine learning engines used to display recommendations to a user by comparing the first and second versions to determine the best one (see above), it does not appear to explicitly disclose testing the functionality of simulated presentations, i.e. user interface functionality, simulated user interface, user interface, and compute one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances. However, this is taught in analogous art, Zapella (e.g., Figs. 1-2 and 4-5 along with associated text, e.g., col. 2:63-64, an "action" is an instruction to display certain content (e.g., a widget, an advertising campaign, an item, a title of an item, an image of an item, a description of an item, a button, a video, an audio player, etc.) [user interface]; col. 6:34-54, the simulation application 104 running on an experiment system 102 can operate in a separate test environment to test different policies before such policies are used by the action delivery system 120 to provide actions in response to action requests [simulated user interface].... While there may be little to no data gathered from user devices 112 in relation to use of the new policy, the simulation application 104 can use data associated with the existing policy to simulate how users of user devices 112 may respond if the action delivery system 120 used the new policy rather than the existing policy [first and second simulated interfaces]; col. 7:16-48, For each retrieved action, the simulation application 104 can generate a score [metric] by executing the retrieved prediction model, providing the respective action (or action identifier) and the test context as an input to the prediction model. The scores, however, may be biased given that certain actions are selected more often than others according to the existing policy [second simulated user interface]. To more accurately simulate the new policy [first simulated user interface], the simulation application 104 may unbias the scores [allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances]. Thus, the simulation application 104 can normalize the generated scores using the retrieved action probabilities and test action probabilities included in the new policy being tested (as provided by the experiment operator)....In the context of content page requests, a normalized score [changed metric without regenerating at least a portion of the first and second user interface instances] can represent the expected value of user actions simulated to occur in response to the content associated with the normalized score being selected for display according to the new policy [first simulated user interface]. Similarly, the originally generated scores can represent the values of user behavior that occurred in response to the associated actions being selected according to the existing policy [second simulated user interface]. The simulation application 104 can then compare the expected values of the new policy [first simulated user interface] with the values of the existing policy [second simulated user interface], displaying the results of the comparison (e.g., in a user interface, on a wall via a projection, etc.) and/or storing the results; col. 14:43-45, The simulation application 104 can then simulate how the test policy may affect user behavior if the test policy were to be implemented by the action delivery system 120; col. 16:8-15, existing policies such that these policies can also be compared to the test policy; see also col. 2:59-col. 3:29, col. 3:54-61, col. 4:24-30, col. 7:44-47, col. 8:15-23, col. 10:33-43, col. 14:26-30.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Szeto with the invention of Zapella because “the simulation application is not testing policies and/or prediction models in a production environment, [and thus] the simulation application can test any number of different policies and/or prediction models in less time than the conventional content delivery system.... [And] the possibility of any testing negatively impacting users is significantly reduced,” as suggested by Zapella (see col. 2:49-58).
Szeto in view of Zapella does not appear to disclose wherein the services implement queue-based communication that decouples the services to allow computation of data in separate stages. However, this is taught in analogous art, McHugh (e.g., [0002] For each service, the online system uses messages to keep track of the service status. For example, messages may inform the online system that an ad was shown, along with the clearing price of the ad. These messages are stored in queues that decouple the various independent services that process the data stream. In other words, the queue allows data to be transferred between independent services without sending back acknowledgements that the data was sent or received.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of McHugh because “queues provide availability,” as suggested by McHugh.

With respect to claim 17, Szeto also discloses computing one or more metrics to establish a quality level of simulated behavior indicated by the outcome-defining effects (please note the 35 USC 112(b) rejection and interpretation above; e.g., Figs. 4-8, and 23 along with associated text, e.g., [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0258], Evaluator 450 may receive actual results, including correct values, user actions, or actual user behaviors from a datastore or a user application for computing evaluation metrics; [0225], In addition, embodiments enable the tracking and replaying of queries, events, prediction results, and other necessary metrics for deducing and determining factors that affect the performance of a machine learning system of interest; [0229], For instance, variants of predictive engines and algorithms are evaluated by an evaluator, using one or more metrics with test data; [0264], One or more metrics can be defined to compare predicted results returned from the engine with actual results among the validation data. The goal of engine parameter tuning is to determine an optimal engine parameter set that maximizes evaluation metric scores; see also [0219], [0234], [0257], [0271], [0276], [0285], and [0293].).

With respect to claim 19, Szeto also discloses wherein the metrics give at least partial credit for similarities in outcome-defining effects identified in the comparison between the first user interface instance and the second user interface instance (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0258], Evaluator 450 may receive actual results, including correct values, user actions, or actual user behaviors from a datastore or a user application for computing evaluation metrics; [0225], In addition, embodiments enable the tracking and replaying of queries, events, prediction results, and other necessary metrics for deducing and determining factors that affect the performance of a machine learning system of interest; [0229], For instance, variants of predictive engines and algorithms are evaluated by an evaluator, using one or more metrics with test data; [0264], One or more metrics can be defined to compare predicted results returned from the engine with actual results among the validation data. The goal of engine parameter tuning is to determine an optimal engine parameter set that maximizes evaluation metric scores; see also [0219], [0234], [0257], [0271], [0276], [0285], and [0293].).

With respect to claim 20, Szeto discloses A non-transitory computer-readable medium comprising one or more computer-executable instructions that, when executed by at least one processor of a computing device (e.g., Fig. 28 and associated text, e.g., [0486] The secondary memory 2818 may include a non-transitory machine-readable storage medium or a non-transitory computer readable storage medium or a non-transitory machine-accessible storage medium 2831 on which is stored one or more sets of instructions (e.g., software 2822) embodying any one or more of the methodologies or functions described herein; see also claim 17.), cause the computing device to: 
access one or more updated data structures that are to be included in a [user interface functionality] test, the updated data structures contributing at least partially to a user interface (e.g., Figs. 4-8 and associated text, e.g., [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user [contributing at least partially to a user interface] as a recommendation; [0294], engine variant V3 is generated based on engine variant V1 alone [V3, i.e. updated data structure], and engine variant V4 is generated based on engine variant V2 alone...Evaluator 840 may sort or rank the performances of such multiple engine variants; [0254], An evaluation metric may quantify prediction accuracy with a numerical score [testing]; see also [0285].); 
access a portion of live or snapshotted data captured from one or more services running in a production environment, the live or snapshotted data being used in the [user interface functionality] test, [wherein the services implement queue-based communication that decouples the services to allow computation of data in separate stages] (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0081] The Prediction or machine learning platform additionally leverages existing infrastructure to control model evaluation and deployment such that developers may utilize existing version control tools and interfaces. Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; [0426], the machine learning platform operates within a host organization which provides on-demand cloud computing services to a plurality of tenants; and in which receiving the training data includes receiving the training data as input from one of the plurality of tenants of the host organization; see also [0048] and [0285].); 
generate a first [simulated user interface] instance using the updated data structures and using the accessed live or snapshotted data (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user as a recommendation; [0294] In some embodiments, engine variant V3 is generated [generating a first instance] based on engine variant V1 alone, and engine variant V4 is generated based on engine variant V2 alone; [0081], Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition; see also [0025], [0285] and [0426].); 
generate a second [simulated user interface] instance using a different version of the updated data structures and using the accessed live or snapshotted data (Id., particularly, [0294], engine variant V3 is generated [generation of a first instance] based on engine variant V1 alone, and engine variant V4 [generating a second instance] is generated based on engine variant V2 alone and [0081], including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data; see also [0290], a single query may be transferred to more than one deployed engine variants; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition. Comparison testing may additionally be performed upon trained model variants belonging to different customer organizations utilizing the same multi-tenant database system in the event that appropriate access permissions permit the comparisons.); 
[compute one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances;] 
compare the first [user interface] instance to the second [user interface] instance to identify one or more differences between the first [user interface] instance and the second [user interface] instance (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0219], Additionally supported is A/B testing or comparisons of two or more different trained model variants such that an evaluation or experiment may determine which of the two or more possible trained model variants yield the preferred real-world behaviors according to some specified condition; [0281] In some embodiments, a PredictionIO or machine learning server generates and logs a unique tracking tag for each user query; see also [0051], [0280-281], [0285], and [0402-406].); and 
based on the comparison, determine one or more outcome-defining effects the updated data structures had on the [user interface] based on the identified differences between the first [user interface] instance and the second [user interface] instance (Id., particularly [0294], Evaluator 840 may sort or rank the performances of such multiple engine variants, with pair-wise or multiple-way comparisons, before generating new engine variants for further deployment and evaluation; see also [0208], The trained recommendation model will then output a purchase prediction or a recommendation for what products to display to such a user as a recommendation; see also [0413], evaluating user interactions and more intuitive means by which to tune the engine and algorithms, for instance, by changing the scenario ad-hoc based on replay results such as making modifications to the email header, and replaying how results will perform for that engine variant, etc.... With live evaluation predictions, something which is shown to the user ultimately affects the outcome of the user; see also [0234] and [0412].).
Although Szeto discloses testing first and second versions of generated instances of machine learning engines used to display recommendations to a user by comparing the first and second versions to determine the best one (see above), it does not appear to explicitly disclose testing the functionality of simulated presentations, i.e. user interface functionality, simulated user interface, user interface, and compute one or more metrics related to the first and second user interface instances, wherein the computation is performed in a separate computing stage, the separate computing stage allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances. However, this is taught in analogous art, Zapella (e.g., Figs. 1-2 and 4-5 along with associated text, e.g., col. 2:63-64, an "action" is an instruction to display certain content (e.g., a widget, an advertising campaign, an item, a title of an item, an image of an item, a description of an item, a button, a video, an audio player, etc.) [user interface]; col. 6:34-54, the simulation application 104 running on an experiment system 102 can operate in a separate test environment to test different policies before such policies are used by the action delivery system 120 to provide actions in response to action requests [simulated user interface].... While there may be little to no data gathered from user devices 112 in relation to use of the new policy, the simulation application 104 can use data associated with the existing policy to simulate how users of user devices 112 may respond if the action delivery system 120 used the new policy rather than the existing policy [first and second simulated interfaces]; col. 7:16-48, For each retrieved action, the simulation application 104 can generate a score [metric] by executing the retrieved prediction model, providing the respective action (or action identifier) and the test context as an input to the prediction model. The scores, however, may be biased given that certain actions are selected more often than others according to the existing policy [second simulated user interface]. To more accurately simulate the new policy [first simulated user interface], the simulation application 104 may unbias the scores [allowing the metrics to be changed during computation without regenerating at least a portion of the first and second user interface instances]. Thus, the simulation application 104 can normalize the generated scores using the retrieved action probabilities and test action probabilities included in the new policy being tested (as provided by the experiment operator)....In the context of content page requests, a normalized score [changed metric without regenerating at least a portion of the first and second user interface instances] can represent the expected value of user actions simulated to occur in response to the content associated with the normalized score being selected for display according to the new policy [first simulated user interface]. Similarly, the originally generated scores can represent the values of user behavior that occurred in response to the associated actions being selected according to the existing policy [second simulated user interface]. The simulation application 104 can then compare the expected values of the new policy [first simulated user interface] with the values of the existing policy [second simulated user interface], displaying the results of the comparison (e.g., in a user interface, on a wall via a projection, etc.) and/or storing the results; col. 14:43-45, The simulation application 104 can then simulate how the test policy may affect user behavior if the test policy were to be implemented by the action delivery system 120; col. 16:8-15, existing policies such that these policies can also be compared to the test policy; see also col. 2:59-col. 3:29, col. 3:54-61, col. 4:24-30, col. 7:44-47, col. 8:15-23, col. 10:33-43, col. 14:26-30.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Szeto with the invention of Zapella because “the simulation application is not testing policies and/or prediction models in a production environment, [and thus] the simulation application can test any number of different policies and/or prediction models in less time than the conventional content delivery system.... [And] the possibility of any testing negatively impacting users is significantly reduced,” as suggested by Zapella (see col. 2:49-58).
Szeto in view of Zapella does not appear to disclose wherein the services implement queue-based communication that decouples the services to allow computation of data in separate stages. However, this is taught in analogous art, McHugh (e.g., [0002] For each service, the online system uses messages to keep track of the service status. For example, messages may inform the online system that an ad was shown, along with the clearing price of the ad. These messages are stored in queues that decouple the various independent services that process the data stream. In other words, the queue allows data to be transferred between independent services without sending back acknowledgements that the data was sent or received.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of McHugh because “queues provide availability,” as suggested by McHugh.

Claims 2 and 3 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto in view of Zapella and McHugh, as applied to claim 1 above, and further in view of Constandache (20200349172 -- hereinafter Constandache; see IDS dated 9/30/22).

With respect to claim 2, Szeto in view of Zapella and McHugh does not appear to disclose wherein the services are initialized at a specified common time identified in a shared contract that is shared among the services. However, this is taught in analogous art, Constandache (e.g., Figs. 3, 4, and 6 along with associated text, e.g., The scheduler issues actions to a centralized synchronization service 308 according to a deployment schedule (e.g., deployment schedules 338-344) [shared contract] for code and/or data in the multi-cluster environment; [0044] Each deployment schedule includes a set of fields that define actions to be performed by one or more nodes and/or clusters in the multi-cluster environment. These fields include, but are not limited to, a type of action, a source cluster, a target cluster, a start time, and/or a frequency; [0047] The deployment schedule also specifies a start time, frequency and/or other fields related to timing of the corresponding action; [0048], the same deployment schedule is propagated to all clusters of the database, and different subsets of clusters are assigned different actions in the deployment schedule to configure the clusters for the same...purposes related to the development, deployment, and/or release of the distributed database [the services being initialized at a specified common time identified in a shared contract that is shared among the services]; see also [0057].).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Constandache because it provides “technological improvements in applications, tools, computer systems, and/or environments for managing deployment workflows, updates, errors, and/or failures in multi-cluster environments, distributed databases, and/or other types of distributed systems,” as suggested by Constandache (see [0069]).

With respect to claim 3, Constandache further discloses wherein the shared contract provides a common notion of time among the services (e.g., Figs. 3, 4, and 6 along with associated text, e.g., The scheduler issues actions to a centralized synchronization service 308 according to a deployment schedule (e.g., deployment schedules 338-344) [shared contract] for code and/or data in the multi-cluster environment; [0044] Each deployment schedule includes a set of fields that define actions to be performed by one or more nodes and/or clusters in the multi-cluster environment. These fields include, but are not limited to, a type of action, a source cluster, a target cluster, a start time, and/or a frequency; [0047] The deployment schedule also specifies a start time, frequency and/or other fields related to timing of the corresponding action; [0048], the same deployment schedule is propagated to all clusters of the database, and different subsets of clusters are assigned different actions in the deployment schedule to configure the clusters for the same...purposes related to the development, deployment, and/or release of the distributed database [the shared contract providing a common notion of time among the services]; see also [0057].).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Constandache for the same reason set forth above with respect to claim 2.

Claims 7 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto in view of Zapella and McHugh, as applied to claim 1 above, and further in view of Baruch et al. (10210073 – hereinafter Baruch; 9/30/22).

	With respect to claim 7, Szeto in view of Zapella and McHugh does not appear to explicitly disclose coordinating the one or more services to access the live or snapshotted data starting at a specified point in time.  However, this is taught in analogous art, Baruch (e.g., Figs. 1-4 and associated text, e.g., col. 11:32-43, For example, after a problem or bug is detected in production site 302, test site 312 may be initiated using an enhanced copy snapshot replica 326 from a time before (e.g., earlier than) the time the problem of bug occurred. For example, the most recent enhanced copy snapshot replica 326 may be employed to initiate test site 312. In some embodiments, by employing an enhanced copy snapshot replica, test site 312 may be advantageously initiated as a copy of production site 302, including the application, application configurations, settings and environment (including clocks) at, or just before, the time at which the problem or bug was detected.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Baruch because it “advantageously...create[s] a copy of the application in which the configurations, settings and environment (including clocks) appear to the developer to be moved back to those of the run time environment at the time of interest (e.g., a time at which a problem or bug was detected), even if available services have changed or been removed,” as suggested by Baruch (see col. 10:16-23).

With respect to claim 9, Szeto also discloses establishing a contract between the one or more services to ensure that the services use a same version of metadata and to ensure that the services access a same live or snapshotted data (e.g., Figs. 4-8, and 23 along with associated text, e.g., [0290], a single query may be transferred to more than one deployed engine variants.).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Szeto in view of Zapella, McHugh, and Baruch, as applied to claim 7 above, and further in view of Shazly et al. (20180024916 – hereinafter Shazly; 9/30/22).

	With respect to claim 8, Szeto in view of Zapella, McHugh, and Baruch does not appear to explicitly disclose coordinating the one or more services to access a common clock based on the specified point in time.  However, this is taught in analogous art, Shazly (e.g., [0039], A test clock, which is a "virtual clock", that provides for time synchronization across multiple LPARs (or systems) and is intelligent. The test clock does not simply keep track of time, instead, the test clock keeps track of the code progression and may modify (compress) the time based on the progression of the executing code; [0112] Using the separate DLL allows for the coordination of a set of independently executing distributed test scripts. In particular, this allows for the creation of any configuration of tests across any number of systems (e.g., LPARs) that are all synchronized through a single test clock; [0122], Embodiments allow for multi-system synchronization by providing access to the time of the test clock and allowing resetting of the time from multiple tests across multiple systems.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Shazly because it can “solve the problem of system testing (e.g., ‘production system testing’) by using variable time compression,” as suggested by Shazly (see [0043]).

Claims 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto in view of Zapella and McHugh, as applied to claim 12 above, and further in view of Shazly.

With respect to claim 13, Szeto also discloses wherein the one or more modified versions of the services are instantiated to process a batch of tasks [and are automatically shut down upon completion of the batch of tasks] (e.g., Figs. 1-8, and 23 along with associated text, e.g., [0228] Once a machine learning system is established, it can be deployed as a service, for example, as a web service, to receive dynamic user queries and to respond to such queries [batch of tasks] by generating and reporting prediction results to the user... As subsequent user actions or actual correct results can be collected and additional data may become available, a deployed machine learning system may be updated [modifies] with new training data, and may be re-configured according to dynamic queries and corresponding event data; see also [0214], [0229-230], and [0240-241].).
Szeto in view of Zapella and McHugh does not appear to explicitly disclose that the one or more modified versions of the services are automatically shut down upon completion of the batch of tasks. However, this is taught in analogous art, Shazly (e.g., Figs. 7-8 and associated text, e.g., [0119], In block 708, the test control program 110 executes the one or more programs 130, 151 based on their order and, when a program 130, 151 completes, moves the test clock 118 time forward to a start time of a next program 130, 151 to be executed, until all of the programs 130, 151 have been executed [automatically shut down upon completion of the batch of tasks].).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Shazly because it can “solve the problem of system testing (e.g., ‘production system testing’) by using variable time compression,” as suggested by Shazly (see [0043]).

	With respect to claim 14, Shazly further teaches wherein the modified versions of the services are instantiated to process the batch of tasks are assigned a specified data snapshot to use when processing the batch of tasks (e.g., Figs. 7-8 and associated text, e.g., [0119], In block 704, the test control program 110 reads the identified one or more workloads from the production system workload file 112, where each of the one or more workloads consists of programs 130, 151 that are to be run in a specified sequence starting at a specified time.).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Shazly for the same reason set forth above with respect to claim 13.

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Szeto in view of Zapella and McHugh, as applied to claim 15 above, and further in view of Davis et al. (20140040669 -- hereinafter Davis; 9/30/22).

With respect to claim 16, Szeto also discloses a controller that is configured to [switch] from live data to snapshotted data or [switch] from snapshotted data to live data upon receiving a [switch] indication (e.g., Figs. 1-8, and 23 along with associated text, e.g., [0081], Developers may further execute multiple evaluations, experiments, and specify multiple deployments simultaneously, including deployment of multiple concurrent or simultaneous experiments on live or simulated or historical data depending upon the particular needs of the developer; [0413], Still further provided is support for both off-line and live evaluation prediction types; see also [0400-403]).
To the extent that Szeto in view of Zapella and McHugh does not appear to explicitly disclose the switch, this is taught in analogous art, Davis (e.g., [0062], When the user is finished inspecting the historical state, the debugger switches its data source back to the live process; see also [0027] and [0061].).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Davis because “memory overhead from in-process collection can be reduced,” as suggested by Davis (see [0012]).

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Szeto in view of Zapella and McHugh, as applied to claim 17 above, and further in view of Chan et al. (20160147509 – hereinafter Chan; 9/30/22).

	With respect to claim 18, Szeto in view of Zapella and McHugh does not appear to explicitly disclose wherein computing the one or more metrics includes altering one or more user interface objects shown in the user interface to remove bias in the one or more outcome-defining effects.  However, this is taught by analogous art, Chan (e.g., Figs. 4, 7, 8, 10, and 12 along with associated text, e.g., [0098], In general, since the final UI will be provided to a large set of users, the user assignment unit 1204 may try to avoid assigning a UI version to a subset of users having bias against the UI version. For example, if older users prefer large font size but younger users do not like large font size, the user assignment unit 1204 can assign a UI version with large font size to a group of users including both older and younger users.).	
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the invention of Chan because it overcomes the drawback of “manually testing many variables associated with UI one by one, which is a very tedious process as engineers have to code all different variations of UI,” as suggested by Chan (see [0005-6]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Specifically, Van Rotterdam et al. 20200192470 teaches a web server to provide first and second variants of a web application for A/B testing.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to STEPHEN DAVID BERMAN whose telephone number is (571)272-7206.  The examiner can normally be reached on M-F, 9-6 Eastern.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hyung S. Sough, can be reached on 571-272-6799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/STEPHEN D BERMAN/Examiner, Art Unit 2192


/s. SOUGH/
SPE, AU 2192/2194