DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 08/15/2022 has been entered.
Examiner Notes
Examiner cites particular columns, paragraphs, figures and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
Response to Arguments
Applicant’s arguments with respect to claims 1, 10 and 17 have been considered but are moot in view of the new ground(s) of rejection (see the newly cited references Baruch, Dubois and Stowell).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Baruch et al. (U.S. Patent No. 10,235,090 B1) in view of Dubois et al. (U.S. Pub. No. 2018/0095835 A1), further in view of Stowell et al. (U.S. Pub. No. 2019/0324861 A1).
Regarding claim 1, Baruch teaches a method, comprising: 
accessing, by a data monitoring system, one or more current datasets used by a first live database at a production datacenter, wherein the first live database uses the one or more current datasets to support a production version of a web service for client use (col. 3, line 23-25, Site I 100a corresponds to production site (e.g., facility where one or more hosts run data processing applications that write data to a storage system and read data from the storage system)), and a second live database at a non-production datacenter (col. 3, line 25-33, Site II 100b corresponds to a backup or replica site (e.g., a facility where replicated production site data is stored)).
Baruch does not explicitly disclose: wherein the second live database uses the one or more current datasets to perform analytics on the production version of the web service.
Dubois teaches: wherein the second live database uses the one or more current datasets to perform analytics on the production version of the web service (paragraph [0060], [0062], line 13-20, the resiliency data server is configured to retrieve or receive unencoded data from the active production data storage server; the analytic module is configured to retrieve or receive unencoded backup data from the resiliency data server; the analytic module which is capable of analyzing data to provide answer to business questions in the form of an analytic output).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Dubois, wherein the second live database uses the one or more current datasets to perform analytics on the production version of the web service, into the replica validation of Baruch.
The motivation to do so would be to address the issue that any original sources of data subjected to analytics processing are potentially subject to corruption (Dubois, paragraph [0002], line 31-33).
Baruch as modified by Dubois further teach:
performing, by the data monitoring system, encoding operations on the one or more current datasets to generate encode values corresponding to the one or more current datasets (Baruch, col. 8, line 18, a hash value of production data at one or more points in time);
retrieving, by the data monitoring system, an experimental dataset from an experimental database at the non-production datacenter (Baruch, col. 10, line 57-61, one or more selected data replicas may optionally be retrieved from storage).
Baruch as modified by Dubois do not explicitly disclose: the one or more current datasets and the experimental dataset include data organized into multiple data records having values corresponding to multiple data fields, the one or more current datasets and the experimental dataset have respective dataset schemas, and attributes of the dataset schemas include a number of data fields and formats of data fields.
Stowell teaches: the one or more current datasets and the experimental dataset include data organized into multiple data records having values corresponding to multiple data fields, the one or more current datasets and the experimental dataset have respective dataset schemas, and attributes of the dataset schemas include a number of data fields and formats of data fields (paragraph [0039]-[0040], the verification data can include values of various properties of the data stored by the component, e.g., a count of data entries, rows, columns, or tables in the component; the metadata can indicate the types of data in a component database, and the system can cross-reference the types of data in a restored component database with those in the metadata during validation; the metadata includes the data structural relationships in a component database, and the system can compare the data relationships in a restored component database with those included in the metadata).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Stowell, in which the one or more current datasets and the experimental dataset include data organized into multiple data records having values corresponding to multiple data fields, the datasets have respective dataset schemas, and attributes of the dataset schemas include a number of data fields and formats of data fields, into the replica validation of Baruch.
The motivation to do so would be to provide details of the relational database in the process of data validation.
Baruch as modified by Dubois and Stowell further teach:
performing, by the data monitoring system, validation operations on the experimental dataset, wherein the validation operations include: retrieving the encode values corresponding to the one or more current datasets; and using the encode values to validate one or more characteristics of the experimental dataset (Baruch, col. 11, line 28-31, validating that a data replica is usable by comparing the hash value associated with the data replica to the hash value of the production site);
and in response to a determination of success of the validation operations, generating, by the data monitoring system, a validation output indicating that the experimental dataset should be published to the first and second live databases (Stowell, paragraph [0047], if the metadata matches the restored data in the corresponding components, the system can record an indication that the backup artifact for the component is verified; in conjunction with the teaching of Baruch, col. 8, line 26-29, col. 9, line 1-3, the replica is consistent with the production data at the PIT at which the replica was generated; the replica data is valid and may be relied upon to accurately roll back to the associated PIT; employing a given data replica to recover or roll back data of a production site, which reads on this limitation as claimed).
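For illustration only, the hash comparison relied upon above (a hash value of production data compared against the hash value of a replica) can be sketched as follows. This is a minimal sketch and not the implementation of any cited reference: the function names `hash_dataset` and `validate_replica`, the choice of SHA-256, and the string record format are all assumptions made for the example.

```python
import hashlib


def hash_dataset(records):
    """Return a SHA-256 hex digest over the records, independent of their order.

    Assumption: each record is a string; SHA-256 is chosen only for illustration.
    """
    digest = hashlib.sha256()
    for record in sorted(records):
        digest.update(record.encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()


def validate_replica(production_hash, replica_records):
    """Return True when the replica hashes to the stored production hash."""
    return hash_dataset(replica_records) == production_hash


# Hypothetical data: the hash is computed once from production data and stored.
production = ["id=1,name=a", "id=2,name=b"]
stored_hash = hash_dataset(production)

ok = validate_replica(stored_hash, ["id=2,name=b", "id=1,name=a"])
bad = validate_replica(stored_hash, ["id=1,name=a", "id=2,name=X"])
```

In this sketch a replica with the same records in a different order validates, while a replica with one altered record does not.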
Regarding claim 4, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 1, and further teach wherein the performing the validation operations includes validating the dataset schema associated with the experimental dataset (Stowell, paragraph [0039]-[0040], [0046], the verification data can include values of various properties of the data stored by the component, e.g., a count of data entries, rows, columns, or tables in the component; the metadata can indicate the types of data in a component database, and the system can cross-reference the types of data in a restored component database with those in the metadata during validation; the metadata includes the data structural relationships in a component database, and the system can compare the data relationships in a restored component database with those included in the metadata; the system performs a respective verification process for each component to verify the contents of the restored component using the verification data for the component).
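For illustration only, a schema check of the kind discussed above (comparing the number of data fields and the types of data against stored metadata) might look like the following. This is a minimal sketch under stated assumptions: records are modeled as Python dictionaries, and the helper names `schema_attributes` and `validate_schema` are hypothetical and do not come from the claims or from Stowell.

```python
def schema_attributes(records):
    """Derive the field count and per-field type names from dict records."""
    fields = {}
    for record in records:
        for name, value in record.items():
            fields.setdefault(name, set()).add(type(value).__name__)
    return {
        "field_count": len(fields),
        "field_types": {name: sorted(types) for name, types in fields.items()},
    }


def validate_schema(baseline, candidate_records):
    """Return a list of mismatches between the baseline and candidate schemas."""
    candidate = schema_attributes(candidate_records)
    problems = []
    if candidate["field_count"] != baseline["field_count"]:
        problems.append(
            f"field count: expected {baseline['field_count']}, "
            f"got {candidate['field_count']}"
        )
    for name, types in baseline["field_types"].items():
        got = candidate["field_types"].get(name)
        if got != types:
            problems.append(f"field {name!r}: expected {types}, got {got}")
    return problems


# Hypothetical data: baseline metadata is derived from a current dataset.
baseline = schema_attributes([{"id": 1, "name": "a"}])
no_problems = validate_schema(baseline, [{"id": 2, "name": "b"}])
type_problems = validate_schema(baseline, [{"id": "2", "name": "b"}])
```

An empty list indicates that the candidate dataset has the same number of fields and the same field types as the baseline.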
As per claim 10, this claim is rejected on grounds corresponding to the same rationales given above for rejected claim 1 and is similarly rejected.
Regarding claim 17, Baruch teaches a method, comprising: 
performing validation operations on an experimental dataset from an experimental database at a non-production datacenter, wherein the validation operations include:
retrieving encode values corresponding to one or more current datasets (Baruch, col. 8, line 18, a hash value of production data at one or more points in time), wherein the one or more current datasets are used by a first live database at a production datacenter, wherein the first live database uses the one or more current datasets to support a production version of a web service for client use (col. 3, line 23-25, Site I 100a corresponds to production site (e.g., facility where one or more hosts run data processing applications that write data to a storage system and read data from the storage system)) and a second live database at a non-production center (col. 3, line 25-33, Site II 100b corresponds to a backup or replica site (e.g., a facility where replicated production site data is stored)). Baruch does not explicitly disclose: wherein the second live database uses the one or more current datasets to perform analytics on the production version of the web service.
Dubois teaches: wherein the second live database uses the one or more current datasets to perform analytics on the production version of the web service (paragraph [0060], [0062], line 13-20, the resiliency data server is configured to retrieve or receive unencoded data from the active production data storage server; the analytic module is configured to retrieve or receive unencoded backup data from the resiliency data server; the analytic module which is capable of analyzing data to provide answer to business questions in the form of an analytic output).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Dubois, wherein the second live database uses the one or more current datasets to perform analytics on the production version of the web service, into the replica validation of Baruch.
The motivation to do so would be to address the issue that any original sources of data subjected to analytics processing are potentially subject to corruption (Dubois, paragraph [0002], line 31-33).
Baruch as modified by Dubois do not explicitly disclose: the one or more current datasets and the experimental dataset include data organized into multiple data records having values corresponding to multiple data fields, the one or more current datasets and the experimental dataset have respective dataset schemas, and attributes of the dataset schemas include a number of data fields and formats of data fields.
Stowell teaches: the one or more current datasets and the experimental dataset include data organized into multiple data records having values corresponding to multiple data fields, the one or more current datasets and the experimental dataset have respective dataset schemas, and attributes of the dataset schemas include a number of data fields and formats of data fields (paragraph [0039]-[0040], the verification data can include values of various properties of the data stored by the component, e.g., a count of data entries, rows, columns, or tables in the component; the metadata can indicate the types of data in a component database, and the system can cross-reference the types of data in a restored component database with those in the metadata during validation; the metadata includes the data structural relationships in a component database, and the system can compare the data relationships in a restored component database with those included in the metadata).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Stowell, in which the one or more current datasets and the experimental dataset include data organized into multiple data records having values corresponding to multiple data fields, the datasets have respective dataset schemas, and attributes of the dataset schemas include a number of data fields and formats of data fields, into the replica validation of Baruch.
The motivation to do so would be to provide details of the relational database in the process of data validation.
Baruch as modified by Dubois and Stowell further teach:
using the encode values to validate one or more characteristics of the experimental dataset (Baruch, col. 10, line 57-61, col. 11, line 28-31, one or more selected data replicas may optionally be retrieved from storage; validating that a data replica is usable by comparing the hash value associated with the data replica to the hash value of the production site);
and in response to a determination that the experimental dataset passes the validation operations, storing the experimental dataset in the first and second live databases (Stowell, paragraph [0047], if the metadata matches the restored data in the corresponding components, the system can record an indication that the backup artifact for the component is verified; in conjunction with the teaching of Baruch, col. 8, line 26-29, col. 9, line 1-3, the replica is consistent with the production data at the PIT at which the replica was generated; the replica data is valid and may be relied upon to accurately roll back to the associated PIT; employing a given data replica to recover or roll back data of a production site, which reads on this limitation as claimed).
Claims 2-3, 5-6 and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Baruch et al. (U.S. Patent No. 10,235,090 B1) in view of Dubois et al. (U.S. Pub. No. 2018/0095835 A1) and Stowell et al. (U.S. Pub. No. 2019/0324861 A1), further in view of Naphade et al. (U.S. Pub. No. 2020/0410322 A1).
Regarding claim 2, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 1, but do not explicitly disclose: wherein the encoding operations include: training an autoencoder machine learning model based on one or more current datasets to generate a trained autoencoder.
Naphade teaches: wherein the encoding operations include: training an autoencoder machine learning model based on one or more current datasets to generate a trained autoencoder (learning and applying a data encoding in an unsupervised manner using input data; generating data points derived from a mixture of Gaussian distributions; the probabilistic model provides a notification or indication of an anomaly in the input data because, having been trained, the probabilistic model has learned which events are considered anomalous (e.g., different from normal data), paragraph [0022]-[0024], [0030]; it is noted that the normal data on which the model was previously trained is interpreted as the plurality of datasets).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Naphade, wherein the encoding operations include training an autoencoder machine learning model based on one or more current datasets to generate a trained autoencoder, into the replica validation of Baruch.
The motivation to do so would be to provide the ability to learn normal event behavior all in one network (Naphade, paragraph [0013], line 6).
Regarding claim 3, Baruch as modified by Dubois, Stowell and Naphade teach all claimed limitations as set forth in rejection of claim 2, and further teach wherein the validation operations further include: applying the trained autoencoder to the experimental dataset to detect one or more anomalous data records in the experimental dataset (Naphade teaches learning and applying a data encoding in an unsupervised manner using input data; generating data points derived from a mixture of Gaussian distributions; the probabilistic model provides a notification or indication of an anomaly in the input data because, having been trained, the probabilistic model has learned which events are considered anomalous (e.g., different from normal data), paragraph [0022]-[0024], [0030]; it is noted that the model was previously trained on the normal data; in conjunction with the training data taught by Breck, this teaches the limitation as claimed).
Regarding claim 5, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 4, but do not explicitly disclose: wherein the performing the encoding operations includes training an autoencoder machine learning model using one or more current datasets.
Naphade teaches: wherein the performing the encoding operations includes training an autoencoder machine learning model using one or more current datasets (learning and applying a data encoding in an unsupervised manner using input data; generating data points derived from a mixture of Gaussian distributions; the probabilistic model provides a notification or indication of an anomaly in the input data because, having been trained, the probabilistic model has learned which events are considered anomalous (e.g., different from normal data), paragraph [0022]-[0024], [0030]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Naphade, wherein the performing the encoding operations includes training an autoencoder machine learning model using one or more current datasets, into the replica validation of Baruch.
The motivation to do so would be to provide the ability to learn normal event behavior all in one network (Naphade, paragraph [0013], line 6).
Baruch as modified by Dubois, Stowell and Naphade further teach: wherein the encode values include a schema encode value that indicates one or more baseline attributes that correspond to the dataset schemas of the plurality of datasets (Stowell, paragraph [0039]-[0040], [0047], the verification data can include values of various properties of the data stored by the component, e.g., a count of data entries, rows, columns, or tables in the component; the metadata can indicate the types of data in a component database, and the system can cross-reference the types of data in a restored component database with those in the metadata during validation; the metadata includes the data structural relationships in a component database, and the system can compare the data relationships in a restored component database with those included in the metadata; if the metadata matches the restored data in the corresponding components, the system can record an indication that the backup artifact for the component is verified).
Regarding claim 6, Baruch as modified by Dubois, Stowell and Naphade teach all claimed limitations as set forth in rejection of claim 5, and further teach wherein the validating the schema associated with the updated dataset includes: identifying one or more attributes associated with the dataset schema of the updated dataset; and comparing the one or more attributes associated with the schema of the updated dataset to the one or more baseline attributes associated with the schemas of the plurality of datasets (Stowell, paragraph [0039]-[0040], [0047], the verification data can include values of various properties of the data stored by the component, e.g., a count of data entries, rows, columns, or tables in the component; the metadata can indicate the types of data in a component database, and the system can cross-reference the types of data in a restored component database with those in the metadata during validation; the metadata includes the data structural relationships in a component database, and the system can compare the data relationships in a restored component database with those included in the metadata; if the metadata matches the restored data in the corresponding components, the system can record an indication that the backup artifact for the component is verified).
Regarding claim 11, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 10, but do not explicitly disclose: wherein the performing the validation operations includes validating a value distribution associated with the experimental dataset. 
Naphade teaches: wherein the performing the validation operations includes validating a value distribution associated with the experimental dataset (learning and applying a data encoding in an unsupervised manner using input data; generating data points derived from a mixture of Gaussian distributions; the probabilistic model provides a notification or indication of an anomaly in the input data because, having been trained, the probabilistic model has learned which events are considered anomalous (e.g., different from normal data), paragraph [0022]-[0024], [0030]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Naphade, wherein the performing the validation operations includes validating a value distribution associated with the experimental dataset, into the replica validation of Baruch.
The motivation to do so would be to provide the ability to learn normal event behavior all in one network (Naphade, paragraph [0013], line 6).
Regarding claim 12, Baruch as modified by Dubois, Stowell and Naphade teach all claimed limitations as set forth in rejection of claim 11, and further teach wherein the performing the encoding operations includes: training an autoencoder machine learning model based on the one or more current datasets to generate a trained autoencoder model; and calculating a first latent probability distribution across multiple data record keys corresponding to the one or more current datasets using the trained autoencoder model (Naphade teaches learning and applying a data encoding in an unsupervised manner using input data, paragraph [0018]-[0021]; generating data points derived from a mixture of Gaussian distributions; the probabilistic model provides a notification or indication of an anomaly in the input data because, having been trained, the probabilistic model has learned which events are considered anomalous (e.g., different from normal data), paragraph [0022]-[0024], [0030]).
Regarding claim 13, Baruch as modified by Dubois, Stowell and Naphade teach all claimed limitations as set forth in rejection of claim 12, further teach wherein the autoencoder machine learning model is a Deep Autoencoding Gaussian Mixture Model (DAGMM) (Naphade, paragraph [0013], [0027]-[0030], deep autoencoder with latent space modeling of Gaussian Mixtures is interpreted as Deep Autoencoding Gaussian Mixture Model (DAGMM)). 
Regarding claim 14, Baruch as modified by Dubois, Stowell and Naphade teach all claimed limitations as set forth in rejection of claim 12, and further teach wherein the validating the value distribution associated with the updated dataset includes validating numerical data in the updated dataset, including by: applying the trained autoencoder model to the experimental dataset to calculate a second latent probability distribution across multiple data record keys corresponding to the experimental dataset; and comparing the first and second latent probability distributions (Naphade teaches learning and applying a data encoding in an unsupervised manner using input data, paragraph [0018]-[0021]; generating data points derived from a mixture of Gaussian distributions; the probabilistic model provides a notification or indication of an anomaly in the input data because, having been trained, the probabilistic model has learned which events are considered anomalous (e.g., different from normal data), paragraph [0022]-[0024], [0030]).
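For illustration only, the step of comparing a first and a second probability distribution across data record keys can be sketched as follows. Neither the claim language nor the cited passages of Naphade specify a comparison metric; the Jensen-Shannon divergence, the threshold value, and the names `js_divergence` and `distributions_match` are assumptions chosen for this example, and the distributions are assumed to have already been produced by some trained model.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q map record keys to probabilities; missing keys count as 0.
    The result lies in [0, 1], with 0 meaning identical distributions.
    """
    keys = set(p) | set(q)
    mixture = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(dist):
        return sum(
            prob * math.log2(prob / mixture[key])
            for key, prob in dist.items()
            if prob > 0
        )

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def distributions_match(first, second, threshold=0.1):
    """Return True when the divergence is at or below the (hypothetical) threshold."""
    return js_divergence(first, second) <= threshold


# Hypothetical distributions over record keys.
first = {"key_a": 0.5, "key_b": 0.5}
similar = {"key_a": 0.55, "key_b": 0.45}
disjoint = {"key_c": 1.0}

same_result = distributions_match(first, similar)
different_result = distributions_match(first, disjoint)
```

A symmetric, bounded measure is used here so that a single threshold applies regardless of which distribution is treated as the baseline.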
Claims 7-9 are rejected under 35 U.S.C. 103 as being unpatentable over Baruch et al. (U.S. Patent No. 10,235,090 B1) in view of Dubois et al. (U.S. Pub. No. 2018/0095835 A1) and Stowell et al. (U.S. Pub. No. 2019/0324861 A1), further in view of Simca et al. (U.S. Patent No. 10,642,715 B1).
Regarding claim 7, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 1, but do not explicitly disclose: wherein the performing the validation operations includes validating an update pattern associated with one or more data records in the experimental dataset. 
Simca teaches: wherein the performing the validation operations includes validating an update pattern associated with one or more data records in the experimental dataset (Simca teaches that a testing data environment may maintain actual historical data, artificial testing data or simulated testing data, which may be useful in learning about the processes and their attributes, col. 5, line 56-64; obtaining process information from the testing data environment, col. 7, line 41-42; such activity as modifying a file, downloading a file, etc., col. 17, line 6-20; the testing data environment may include historical data associated with processes running in the live data environment, statistical parameters and dynamic parameters; dynamic parameters may include creating, modifying, or deleting data or a file, col. 6, line 15-19, 31-32 and 60; each process may be stored with a reference or identifier (e.g., file name, keyword, numerical identifier, hash, category, pointer, etc.), col. 5, line 59-63; looking for patterns of known, valid activity (or known, invalid activity) and defining steps in each process as part of each context profile, col. 8, line 18-25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Simca, wherein the performing the validation operations includes validating an update pattern associated with one or more data records in the experimental dataset, into the replica validation of Baruch.
The motivation to do so would be to provide adaptive, customized, and flexible security in networks with dynamically changing software applications (Simca, col. 1, line 63-64).
Regarding claim 8, Baruch as modified by Dubois, Stowell and Simca teach all claimed limitations as set forth in rejection of claim 7, and further teach wherein the updated dataset is an updated version of a first dataset, and wherein the plurality of datasets includes a historical version of the first dataset (Simca teaches obtaining process information from the testing data environment, col. 7, line 41-42; such activity as modifying a file, downloading a file, etc., col. 17, line 6-20; the testing data environment may include historical data associated with processes running in the live data environment, statistical parameters and dynamic parameters; dynamic parameters may include creating, modifying, or deleting data or a file, col. 6, line 15-19, 31-32 and 60; each process may be stored with a reference or identifier (e.g., file name, keyword, numerical identifier, hash, category, pointer, etc.), col. 5, line 59-63); and wherein the performing the encoding operations includes encoding the historical version of the first dataset to generate update pattern encode values associated with the first dataset (Baruch, col. 10, line 8-12, a hash value of a snapshot data replica may be determined based upon a hash value associated with a previous snapshot data replica and a hash value associated with a data difference between the previous snapshot data replica and a current snapshot data replica).
Regarding claim 9, Baruch as modified by Dubois, Stowell and Simca teach all claimed limitations as set forth in rejection of claim 8, and further teach wherein the validating the update pattern includes comparing the one or more data records in the experimental dataset to the update pattern encode values associated with the first dataset (Stowell, paragraph [0039]-[0040], [0047], the verification data can include values of various properties of the data stored by the component, e.g., a count of data entries, rows, columns, or tables in the component; the metadata can indicate the types of data in a component database, and the system can cross-reference the types of data in a restored component database with those in the metadata during validation; the metadata includes the data structural relationships in a component database, and the system can compare the data relationships in a restored component database with those included in the metadata; if the metadata matches the restored data in the corresponding components, the system can record an indication that the backup artifact for the component is verified).
Claims 15 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Baruch et al. (U.S. Patent No. 10,235,090 B1) in view of Dubois et al. (U.S. Pub. No. 2018/0095835 A1) and Stowell et al. (U.S. Pub. No. 2019/0324861 A1), further in view of Bhatnagar et al. (U.S. Pub. No. 2020/0104587 A1).
Regarding claim 15, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 10, but do not explicitly disclose wherein the performing the validation operations includes validating a value format of string-type data included in the experimental dataset. 
Bhatnagar teaches wherein the performing the validation operations includes validating a value format of string-type data included in the experimental dataset (receiving structured data from daily and historic invoices; extracting key data fields from the invoices using the regular expression; validating extracted string, paragraph [0022], [0025]-[0027], [0032]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include wherein the performing the validation operations includes validating a value format of string-type data included in the experimental dataset into validating replication copy of Baruch.
Motivation to do so would be to include wherein the performing the validation operations includes validating a value format of string-type data included in the experimental dataset to overcome the issue that manually examining such variations and volumes of invoices for correctness, genuineness, and duplicates against historic invoices is usually highly subjective and increases the average cost and time to process an invoice (Bhatnagar, paragraph [0003], line 9-13).
Regarding claim 16, Baruch as modified by Dubois, Stowell and Bhatnagar teach all claimed limitations as set forth in rejection of claim 15, further teach wherein the performing the encoding operations includes: generating one or more regular expressions based on string-type data included in at least one of the one or more current datasets; and wherein the validating the value format of string-type data included in the experimental dataset includes parsing data in the experimental dataset using the one or more regular expressions (Bhatnagar teaches receiving structured data from daily and historic invoices; extracting key data fields from the invoices using the regular expression; validating extracted string; comparing corresponding fields across two invoices for similarity measures, paragraph [0022], [0025]-[0027], [0032]-[0034]). 
Claims 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Baruch et al. (U.S. Patent No. 10,235,090 B1) in view of Dubois et al. (U.S. Pub. No. 2018/0095835 A1) and Stowell et al. (U.S. Pub. No. 2019/0324861 A1), further in view of Michelson et al. (U.S. Pub. No. 2020/0402672 A1).
Regarding claim 18, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 17, but do not explicitly disclose: wherein the performing the validation operations includes validating semantic values associated with one or more data records in the experimental dataset. 
Michelson teaches: wherein the performing the validation operations includes validating semantic values associated with one or more data records in the experimental dataset (Fig. 3, paragraph [0054], [0058], determine the feature value similarity between attribute values of first set of attributes 314 and second set of attributes 318; in conjunction with the comparison of batch data against the schema as taught by Breck, it teaches wherein the performing the validation operations includes validating semantic values associated with one or more data records in the updated dataset as claimed).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include wherein the performing the validation operations includes validating semantic values associated with one or more data records in the experimental dataset in training environment of Breck.
Motivation to do so would be to include wherein the performing the validation operations includes validating semantic values associated with one or more data records in the experimental dataset to utilize feature values of grouping features of medical results (Michelson, paragraph [0005], line 2-3).
Regarding claim 19, Baruch as modified by Dubois, Stowell and Michelson teach all claimed limitations as set forth in rejection of claim 18, further teach performing encoding operations using a natural language processing (NLP) model to calculate first vector word-embedding representations of data in the one or more current datasets; and wherein the validating the semantic values includes: using the NLP model to calculate second vector word-embedding representations of data in the experimental dataset; and comparing the first vector word-embedding and second vector word-embedding representations (Michelson, Fig. 3, paragraph [0035], [0043], [0054], [0058], using natural language to determine the feature value of group of values through word-embedding model; determine the feature value similarity between attribute values of first set of attributes 314 and second set of attributes 318; in conjunction with the comparison of batch data against the schema as taught by Breck, it teaches performing encoding operations using a natural language processing (NLP) model to calculate first vector word-embedding representations of data in the plurality of datasets; and wherein the validating the semantic values includes: using the NLP model to calculate second vector word-embedding representations of data in the updated dataset; and comparing the first vector word-embedding and second vector word-embedding representations as claimed).
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Baruch et al. (U.S. Patent No. 10,235,090 B1) in view of Dubois et al. (U.S. Pub. No. 2018/0095835 A1) and Stowell et al. (U.S. Pub. No. 2019/0324861 A1), further in view of Simca et al. (U.S. Patent No. 10,642,715 B1).
Regarding claim 20, Baruch as modified by Dubois and Stowell teach all claimed limitations as set forth in rejection of claim 17, but do not explicitly disclose: wherein the performing the validation operations includes validating an update pattern associated with one or more data records in the updated dataset. 
Simca teaches: wherein the performing the validation operations includes validating an update pattern associated with one or more data records in the updated dataset (Simca teaches testing data environment may maintain actual historical data, artificial testing data or simulated testing data which may be useful in learning about the processes and their attributes, col. 5, line 56-64; obtaining process information from testing data environment, col. 7, line 41-42; such activity as modifying a file, downloading a file, etc., col. 17, line 6-20; testing data environment may include historical data associated with processing running in live data environment, statistics parameters and dynamic parameter; dynamic parameter may include: creating, modifying, or deleting data or file, col. 6, line 15-19, 31-32 and 60; each process may be stored with a reference or identifier (e.g., file name, keyword, numerical identifier, hash, category, pointer, etc.), col. 5, line 59-63; looking for pattern of known, valid activity (or known, invalid activity) and define steps in each process as part of each context profile, col. 8, line 18-25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include wherein the performing the validation operations includes validating an update pattern associated with one or more data records in the updated dataset into syncing in a distributed system of Cantwell.
Motivation to do so would be to include wherein the performing the validation operations includes validating an update pattern associated with one or more data records in the updated dataset to provide adaptive, customized, and flexible security in networks with dynamically changing software applications (Simca, col. 1, line 63-64).
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEN HOANG whose telephone number is (571)272-8401. The examiner can normally be reached M-F 7:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached on (571) 272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/KEN HOANG/Examiner, Art Unit 2168