DETAILED ACTION
This is the response to applicant’s amendment action regarding application number 16/254,033, filed January 22, 2019.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
The amendment filed June 20, 2022 has been entered. Examiner acknowledges receipt of Amendments to Application 16/254,033, which include: Amendments to the Claims, and Remarks containing Applicant’s amendments. 
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner acknowledges Applicant has amended Claims 1, 3, 7-9, 11, 13, 16, 18-19, 21-22, and 25, with Claims 2, 5, 12, 14, and 17 previously cancelled, and Claims 10 and 15 newly cancelled. Examiner acknowledges Applicant has added new Claims 26-27. Claims 1, 3-4, 6-9, 11, 13, 16, and 18-27 remain pending in the application. Examiner notes that Applicant’s amendments have introduced a new claim objection, with the new claim objection identified in the relevant section below.
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner acknowledges Applicant’s amendments have resolved the objection identified in Claim 22, and therefore the respective claim objection previously set forth in the Non-Final Office Action mailed March 29, 2022 for Claim 22 is withdrawn.
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner acknowledges Applicant’s amendments have resolved the lack of written description issue identified in Claims 7-8, and therefore the respective 112(a) rejections previously set forth in the Non-Final Office Action mailed March 29, 2022 for Claims 7-8 are withdrawn.

Response to Arguments
Examiner acknowledges receipt of Arguments to Application 16/254,033, which include: Remarks containing Applicant’s arguments.
Regarding Applicant’s Remarks for Claims 1, 7-9, 13, and 19-23 under 35 U.S.C. 103 as being unpatentable over Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred to as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred to as Dirac '430], in further view of Bianco et al., A Practical and Effective Sampling Selection Strategy for Large Scale Deduplication, September 2015 [hereafter referred to as Bianco]; for Claims 3, 10, and 15 under 35 U.S.C. 103 as being unpatentable over Ghanta in view of Dirac ‘430, in further view of Bianco as applied to Claims 1, 9, and 13; in even further view of Maag et al., U.S. PGPUB 2017/0220403, published 8/3/2017 [hereafter referred to as Maag]; for Claims 4, 11, and 16 under 35 U.S.C. 103 as being unpatentable over Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Maag as applied to Claims 3, 10, and 15; in even further view of Dirac et al., U.S. PGPUB 2015/0379424, published 12/31/2015 [hereafter referred to as Dirac '424]; and for Claims 6, 18, and 24-25 under 35 U.S.C. 103 as being unpatentable over Ghanta in view of Dirac ‘430, in further view of Bianco as applied to Claims 1, 13, and 22; in even further view of Schelter et al., Automating Large-Scale Data Quality Verification, 2018 [hereafter referred to as Schelter], Examiner acknowledges Applicant’s arguments, has considered them, and has found them to be not persuasive. Examiner points out that the Applicant has since introduced considerable amendments to existing independent and dependent claims that were not previously presented, as well as introducing new claims, with the amendments to the existing claims changing the scope of the claims such that these changes necessitate further examination and re-evaluation of the amended, original, and new claims. These updated claim mappings according to the Applicant’s amended claims are provided in the sections indicated below.
Regarding Applicant’s Remarks:
“Applicant respectfully submits that Ghanta, Dirac' 430, and Bianco alone or in hypothetical combination, fail to teach every recitation of independent claims 1, 9, and 13. In particular, Ghanta, Dirac' 430, and Bianco alone or in hypothetical combination, fail to teach determining one or more failure metrics ... wherein the one or more failure metrics are determined based on one or more characteristics of the ... data set, one or more outputs of the plurality of phases of the ML pipeline, or any combination thereof . . . and performing a remediation of the . . . data set to provide additional training data to the target data set, as generally recited by independent claims 1, 9, and 13. In contrast, Ghanta, at most, teaches using a validation data set to validate a machine learning model and analyzing the output of the validation data to determine an error data set. See Ghanta, paragraph 75. Ghanta is silent as to determining failure metrics throughout one or more phases of a machine learning pipeline based on characteristics of a data set or outputs of phases of the ML pipeline. Further, Ghanta is silent to any phases of a machine learning pipeline, or determining failure metrics exceed a threshold value prior a completed build of the ML model, and rather determines error data sets based on running validation data through a completed build of a machine learning model. See Id Therefore, Ghanta is silent to determining one or more failure metrics ... wherein the one or more failure metrics are determined based on one or more characteristics of the ... data set, one or more outputs of the plurality of phases of the ML pipeline, or any combination thereof . . . and performing a remediation of the . . . data set to provide additional training data to the target data set, as recited by independent claims 1, 9, and 13.”
Examiner has considered this argument and finds the argument to be not persuasive. Examiner points out that Applicant’s above argument contains three sub-arguments, each of which is addressed in the following paragraphs.
Regarding Applicant’s sub-argument that the Ghanta reference does not teach a “plurality of phases” in a machine learning pipeline, Examiner finds this sub-argument to be not persuasive. Examiner reminds the Applicant that MPEP 2111 requires that during patent examination, the pending claims must be given their broadest reasonable interpretation consistent with the specification, and an Examiner must construe claim terms in the broadest reasonable manner during prosecution as is reasonably allowed in an effort to establish a clear record of what applicant intends to claim. Under its broadest reasonable interpretation, the term “plurality of phases” broadly indicates a plurality of logical stages or steps in a series of processes, and hence this limitation broadly recites a machine learning related pipeline that performs a logical step for building or generating a model. Ghanta teaches a machine learning service containing multiple logical stages or steps such as a training phase and an inference phase (Ghanta [0040]: “In certain embodiments of machine learning systems 200, there is a training phase, for generating the machine learning model, and an inference phase for analyzing an inference data set using the machine learning model.”). These stages are implemented as machine learning pipelines in a machine learning system (Ghanta [0036]: “… a machine learning system may involve various components, pipelines, data sets, and/or the like – such as training pipelines, orchestration/management pipelines, inference pipelines, and/or the like …”). 
In particular, Ghanta teaches that the training pipeline is used to generate a machine learning model, and hence the training pipeline corresponds to a machine learning model building phase (Ghanta [0056]: “… the machine learning system 200 includes physical and/or logical groupings of the machine learning pipelines 202, 204, 206a-c … the ML management apparatus 104 may select a training pipeline 204 for generating a machine learning model configured for the desired objective and one or more inference pipelines 206a-c that are configured to analyze the desired objective …”). Additionally, Ghanta teaches different pipeline machine learning operations (i.e., steps) such as algorithm training/inference, feature engineering, validations, and scoring, where each of these machine learning operations further represents a different step (“phase”) of a machine learning service (Ghanta Figure 2A, elements 202, 204, 206a-c and [0053]: “… machine learning pipelines 202, 204, 206a-c comprise various machine learning features, components, objects, modules, and/or the like to perform various machine learning operations such as algorithm training/inference, feature engineering, validations, scoring, and/or the like.”). Hence, given the evidence provided above, Applicant’s sub-argument is not persuasive, and the existing prior art rejection is maintained.
Regarding Applicant’s sub-argument that the Ghanta reference does not teach “determining failure metrics throughout one or more phases of a machine learning pipeline based on characteristics of a data set or outputs of phases of the ML pipeline” in a machine learning pipeline, Examiner finds this sub-argument to be not persuasive. Under its broadest reasonable interpretation, the term “failure” as defined in the Merriam-Webster dictionary indicates a lack of success or a falling short (a deficiency), so that the term “failure metrics” broadly recites any set of metrics that are used to determine a deficient or unsuccessful result. As indicated earlier, Ghanta teaches different pipeline machine learning operations, representing different stages (phases) of a machine learning service, where one of the stages includes a validation stage. As indicated in the Non-Final Office Action mailed March 29, 2022, Ghanta teaches a secondary validation module producing suitability metrics based on the predicted output generated by a primary validation module analyzing the first machine learning model (i.e., the error data set), where this generated error data set is based on the first machine learning model analyzing an inference data set. These suitability metrics are also outputs from a validation phase, and include statistical metrics, confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability. Ghanta further teaches an analysis module using these suitability metrics to determine whether the output produced by the first machine learning model satisfies thresholds that indicate whether the model is a good fit for generating accurate predictions. This analysis represents a suitability analysis for validating the performance of a first machine learning model.
Thus, the determination made during a validation phase when these thresholds are not being satisfied represents an unsuitability condition (concluding that the first machine learning model lacks efficacy, accuracy, or effectiveness), with these unsuitability conditions representing a failed or deficient condition of the model. Analyzing and validating the output produced from a first machine learning model (through use of suitability metrics) corresponds to a determination process involving one or more outputs of the plurality of phases of the ML pipeline (Ghanta [0078]-[0081]; [0085]-[0087]: “… the secondary training module 306 may include … statistical signature scores for each sample in the data set, prediction values from the first machine learning algorithm … confidence metrics … parameters that are specific to the first machine learning algorithm/model …”; [0089]: “… the secondary validation module 308 analyzes other statistics, such as training statistics, to determine suitability of the second machine learning algorithm in accurately assessing the effectiveness of the first machine learning algorithm … confidence metrics, accuracy metrics, precision metrics, and/or the like …”; Table 1 and [0091]-[0093]: “… the predictions that the first machine learning algorithm/model generates can be used as input into the training of the second machine learning model, along with the error data … The analysis module 310 … is configured to determine whether the first machine learning algorithm/model is a suitable algorithm/model for generating predictions for the inference data set … may determine whether the various metrics/health scores each satisfy a threshold value … If so, then the analysis module 310 may determine that the first machine learning algorithm/model is generating accurate predictions for the inference data set. … the health scores/values may include prediction confidence values, data deviation values, AB testing values, canary values, and/or the like. 
Table 1 below illustrates an example output data set that the analysis module 310 may analyze to determine whether the first machine learning algorithm/model is a good fit …”; and [0098]-[0099]: “… the analysis module 310 may determine whether the suitability score based on the metrics/health scores in Table 1 satisfies a threshold to determine (1) whether the second machine learning algorithm/module is a good fit for validating the predictive performance of the first machine learning algorithm/model, and if so (2) whether the first machine learning algorithm/model is a good fit for generating accurate predictions for the inference data set.”). As indicated in Ghanta [0091]-[0093] and Table 1, Ghanta teaches that these suitability metrics are based on features received from the first machine learning algorithm, which are received as input into the secondary training/validation modules to be processed by the second machine learning model to generate the suitability metrics. Hence, the outputs from both the primary validation module (i.e., the error data set) and the secondary validation module that produces the suitability metrics (based on the error data set) also correspond to one or more outputs from a ML pipeline building phase.
Thus, the suitability metrics are generated based on one or more of these outputs, and therefore this analysis process involving the suitability metrics corresponds to one or more failure metrics being determined based on one or more outputs of the plurality of phases of the ML pipeline (Ghanta [0079]-[0080]: “… The resulting output of the validation of the first machine learning algorithm/model … comprises an error data set … includes values indicating the prediction error of the first machine learning algorithm/model … includes features that comprise one or more of features of the error data set … and/or one or more parameters specific to the first machine learning model …”; [0087]: “… the secondary training module 306 enhances the error data set by including additional data to supplement the prediction error data … may include data for additional features such as features of the data set itself … prediction values from the first machine learning algorithm (e.g., the predicted values output from analyzing the inference data set using the first machine learning algorithm/model …”; and [0095]: “… As explained above, the second machine learning algorithm receives features (e.g., of the inference data set, the error data set, and/or other features) as input and predicts whether the first machine learning algorithm is suitable for making accurate predictions on the inference data set …”). Hence, given the evidence provided above, Applicant’s sub-argument is not persuasive, and the existing prior art rejection is maintained.
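For illustration only, and not as a statement of how Ghanta implements its analysis module, the threshold comparison discussed above may be sketched as follows. The metric names, the threshold values, and the function names unsatisfied_metrics and model_is_suitable are hypothetical and do not appear in the cited reference; the sketch assumes each metric is a score in [0, 1] with a minimum acceptable value.

```python
from typing import Dict, List


def unsatisfied_metrics(metrics: Dict[str, float],
                        minimums: Dict[str, float]) -> List[str]:
    """Return the names of metrics that fall below their minimum threshold.

    Hypothetical sketch: a metric missing from `metrics` is treated as 0.0,
    i.e. as not satisfying its threshold.
    """
    return sorted(name for name, floor in minimums.items()
                  if metrics.get(name, 0.0) < floor)


def model_is_suitable(metrics: Dict[str, float],
                      minimums: Dict[str, float]) -> bool:
    """A model is treated as suitable only if every threshold is satisfied."""
    return not unsatisfied_metrics(metrics, minimums)


# Hypothetical outputs of a validation phase and their assumed thresholds.
validation_output = {"confidence": 0.91, "accuracy": 0.62, "precision": 0.88}
assumed_minimums = {"confidence": 0.80, "accuracy": 0.75, "precision": 0.80}

failed = unsatisfied_metrics(validation_output, assumed_minimums)
suitable = model_is_suitable(validation_output, assumed_minimums)
```

Under these assumed values, the accuracy score does not satisfy its threshold, which corresponds to the unsuitability (failed or deficient) condition described above.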
Regarding Applicant’s sub-argument that “Ghanta is silent to … determining failure metrics exceed a threshold value prior a completed build of the ML model, and rather determines error data sets based on running validation data through a completed build of a machine learning model”, Examiner points to MPEP 2145(VI), which indicates that “Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.” Examiner also cites the guidelines in MPEP 2111.01(II), which caution against importing written description into a claim limitation that is broader than the cited embodiment: "Though understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim. For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment." Applicant’s amended claim limitations broadly recite that a ML pipeline includes “a plurality of phases”, and that the one or more failure metrics are determined at “one or more of the plurality of phases” of the ML pipeline, and do not specifically restrict or limit the determination of the one or more failure metrics to be performed only during the ML model building phase. This is also the case with the Applicant’s amended limitation “determining that at least one of the more failure metrics exceeds a threshold value”, where the limitation broadly recites performing a determination of the one or more failure metrics exceeding a threshold value, and also does not limit this determination of exceeding a threshold value to a specific phase of the ML pipeline. Hence, Applicant’s sub-argument is not persuasive, and the existing prior art rejection is maintained.
Regarding Applicant’s Remarks:
“Further, Dirac '430 is silent as to determining one or more failure metrics ... wherein the one or more failure metrics are determined based on one or more characteristics of the ... data set, one or more outputs of the plurality of phases of the ML pipeline, or any combination thereof ... and performing a remediation of the ... data set to provide additional training data to the target data set, as generally recited by independent claims 1, 9, and 13. Dirac '430 merely teaches, a technique for determining whether "observation records of a first data set are duplicated in a second set." Dirac '430, abstract. Specifically, Dirac '430 teaches that "the duplicate detector [i.e., 7035] may keep track of the ORs [observation records] that are classified as having non-zero probabilities of being duplicates." Id at paragraph 363 (emphasis added). That is, Dirac '430 appears to teach classifying observation records that have a non-zero possibility of being duplicates. See id Dirac '430 fails to teach determining any failure metrics at one or more phases of a machine learning pipeline. Dirac '430 at most teaches detecting duplicate values, and is silent to performing any remediation measure that include providing additional data to any data sets.”
Examiner has considered this argument and finds the argument to be not persuasive. Examiner points out that part of Applicant’s above argument is based on the Applicant applying the Dirac ‘430 reference to the newly introduced and amended limitations that were not previously presented. Those newly introduced and amended limitations will be addressed in the relevant sections indicated below. Furthermore, as indicated in the Non-Final Office Action mailed March 29, 2022, Dirac ‘430 teaches the limitations “… wherein the ML pipeline includes a data pre-processing phase …” and “… applying the data pre-processing phase of the ML pipeline to the target data set before applying the ML model building phases of the ML pipeline …”, with the details of this claim mapping provided in the same Non-Final Office Action. Examiner also notes that the remainder of Applicant’s above arguments (involving a duplicate detector classifying observation records that have a non-zero possibility of being duplicates) is not relevant to the claim mapping that was provided in the same Non-Final Office Action mailed March 29, 2022. As indicated in the Non-Final Office Action mailed March 29, 2022, the latest claim mappings using the Dirac ‘430 reference are not directed towards the use of a duplicate detector. Examiner points out that these are earlier arguments previously presented by the Applicant with respect to an earlier submission (submitted on August 16, 2021), where the Examiner already addressed those arguments in the proper context in the Response to Arguments section in the Final Office Action mailed November 3, 2021. Hence, Examiner does not find this set of arguments to be relevant to the most recently entered claims, and thus Applicant’s argument is not persuasive, and the prior art rejection is maintained.
Regarding Applicant’s Remarks:
“… Further, Bianco fails to cure the deficiencies of Dirac '430 and Ghanta in regards to determining one or more failure metrics ... wherein the one or more failure metrics are determined based on one or more characteristics of the ... data set, one or more outputs of the plurality of phases of the ML pipeline, or any combination thereof . . . and performing a remediation of the . . . data set to provide additional training data to the target data set, as generally recited by independent claims 1, 9, and 13. Bianco, in contrast, merely discloses estimating an initial threshold, and indexing the threshold based on a desired number of candidate pairs based on computational costs. See, Bianco, page 2309. Bianco is silent as to determining one or more failure metrics, and in contrast suggests pruning candidate pairs based on the threshold values, and is silent to providing additional data to a target data set as part of remediation of the conditioned data set. See Id.”
Examiner has considered this argument and finds the argument to be not persuasive.
Examiner points out that part of Applicant’s above argument is based on the Applicant applying the Bianco reference to the newly introduced and amended limitations that were not previously presented. Those newly introduced and amended limitations will be addressed in the relevant sections indicated below. Furthermore, as indicated in the Non-Final Office Action mailed March 29, 2022, Bianco teaches the limitations “… identifying one or more duplicate data entries within the target data set; …”; “… wherein the conditioned data set includes each entry of the target data set except for the identified duplicate entries; …”; and “… determining that the conditioned data set comprises less than a threshold quantity of unique entries; …”. Regarding the first two limitations, Bianco teaches a T3S process for deduplicating data from datasets. Bianco teaches that a set of Sig-Dedup filters performs an initial pre-processing deduplication stage based on prefix, length, positional, and suffix filtering to identify a set of candidate matching pairs of records from a dataset (to remove initial duplicate records). Bianco additionally teaches a first stage that performs sorting and ranking of these candidate pairs of records according to similarity values, where the rank represents different sample levels of similarity, with the lowest level [0.0-0.1] representing a large number of candidate pairs, and the highest level [0.9-1.0] representing a large number of matching pairs. Bianco additionally teaches a second stage that performs the similarity-based deduplication based on an incremental removal of the non-informative or redundant pairs inside each sample level using a SSAR (selective sampling using association rules) active learning method, where each iteration removes irrelevant features and examples from a training set D.
Hence, the process of identifying a set of duplicate data entries as candidate matching pairs and grouping them into different levels based on similarity corresponds to a process for identifying one or more duplicate data entries within a target data set. Examiner refers to the Non-Final Office Action mailed March 29, 2022 for the detailed claim mappings from the Bianco reference for these first two limitations. 
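For illustration only, the grouping of candidate pairs into similarity levels described above may be sketched as follows. This is a simplified sketch under stated assumptions and is not Bianco's T3S implementation: it assumes ten equal-width levels over [0, 1], assumes the similarity values have already been computed, and uses the hypothetical names Pair, similarity_level, and group_by_level.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (record id, record id, similarity in [0, 1]); a hypothetical representation.
Pair = Tuple[str, str, float]


def similarity_level(similarity: float, levels: int = 10) -> int:
    """Map a similarity value to a level index; 1.0 falls in the top level."""
    return min(int(similarity * levels), levels - 1)


def group_by_level(pairs: Iterable[Pair]) -> Dict[int, List[Pair]]:
    """Group candidate pairs by level and rank each level by similarity."""
    grouped: Dict[int, List[Pair]] = defaultdict(list)
    for pair in pairs:
        grouped[similarity_level(pair[2])].append(pair)
    for bucket in grouped.values():
        bucket.sort(key=lambda p: p[2], reverse=True)
    return dict(grouped)


# Hypothetical candidate pairs with precomputed similarity values.
candidates = [("r1", "r2", 0.95), ("r1", "r3", 0.05),
              ("r2", "r3", 1.0), ("r4", "r5", 0.5)]
levels = group_by_level(candidates)
```

In this sketch, level 0 covers [0.0-0.1) and level 9 covers [0.9-1.0], mirroring the lowest and highest levels referenced above.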
Examiner additionally points out that Applicant’s argument that the Bianco reference “merely discloses estimating an initial threshold, and indexing the threshold based on a desired number of candidate pairs based on computational costs” is directed to Applicant’s own earlier recited limitation (now removed from the latest set of submitted amendments) that required a step of “… determining that the conditioned data set comprises less than a threshold quantity of unique entries; …”, and also questions the associated motivation to combine the Bianco reference with the Ghanta and Dirac ‘430 references. Examiner reminds Applicant of the guidelines provided in MPEP 2142: “During patent examination and reexamination, the concept of prima facie obviousness establishes the framework for the obviousness determination and the burdens the parties face. Under this framework, the patent examiner must first set forth a prima facie case, supported by evidence, showing why the claims at issue would have been obvious in light of the prior art. … 35 U.S.C. 103 authorizes a rejection where, to meet the claim, it is necessary to modify a single reference or to combine it with one or more other references. After indicating that the rejection is under 35 U.S.C. 103, the examiner should set forth in the Office action: (A) the relevant teachings of the prior art relied upon, preferably with reference to the relevant column or page number(s) and line number(s) where appropriate, (B) the difference or differences in the claim over the applied reference(s), (C) the proposed modification of the applied reference(s) necessary to arrive at the claimed subject matter, and (D) an explanation as to why the claimed invention would have been obvious to one of ordinary skill in the art at the time the invention was made.”
As indicated in the Non-Final Office Action mailed March 29, 2022, under its broadest reasonable interpretation, the identified limitation broadly recites a comparison step against a threshold, where the threshold is based on a number of records that is present after a deduplication step. Bianco teaches a step to determine an approximate blocking threshold as a stopping criterion for inspecting additional similarity levels during the first stage, where this blocking threshold is defined as an optimal threshold value that maximizes information recall in the records while avoiding an excessive generation of candidate pairs, where this determination identifies a random subset from a dataset that is matched (representing the pre-processed deduplicated set), and identifies the first threshold that results in the number of candidate pairs being smaller than the number of records from that random subset that were initially deduplicated (Bianco p.2309 col.1 1st-2nd paragraphs: “… A random subset is selected from the dataset that is matched by using a variable threshold which varies in fixed ranges. The stopping criterion specifies that the number of pairs needed to satisfy the Sig-Dedup filters must be lower than the subset size. When compared with the entire dataset, the random subset naturally decreases the number of true matching pairs. … When the threshold value is incrementally increased, fewer tokens in the sorted record are index, thus reducing the number of candidate pairs. On the other hand, a high threshold value selects few tokens in the sorted record and a lot of matching pairs can be pruned out. The stopping criterion produces a threshold that avoids both: a large generation of candidate pairs and recall degradation …”; p.2313 col.2 Section 5.4 Identifying the Initial Threshold 1st paragraph and p.2314 col.1 1st-2nd paragraphs).
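For illustration only, the stopping criterion quoted above (raise the threshold in fixed steps until the number of candidate pairs on a subset is lower than the subset size) may be sketched as follows. This sketch makes simplifying assumptions that are not drawn from Bianco: it substitutes an exhaustive pairwise Jaccard comparison over token sets for the Sig-Dedup prefix, length, positional, and suffix filters, it takes the subset as given rather than sampling it at random, and the names jaccard, count_candidate_pairs, and initial_threshold are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def count_candidate_pairs(records: List[Set[str]], threshold: float) -> int:
    """Count record pairs whose similarity meets the threshold.

    Assumption: exhaustive comparison stands in for Sig-Dedup filtering.
    """
    return sum(1 for a, b in combinations(records, 2)
               if jaccard(a, b) >= threshold)


def initial_threshold(subset: List[Set[str]], steps: int = 10) -> Optional[float]:
    """Return the first threshold at which pairs < subset size, else None."""
    for i in range(1, steps + 1):
        threshold = i / steps
        if count_candidate_pairs(subset, threshold) < len(subset):
            return threshold
    return None


# Hypothetical tokenized records; every record shares the token "a".
subset = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e", "f"}, {"a", "g", "h"}]
chosen = initial_threshold(subset)
```

With these four hypothetical records, thresholds 0.1 and 0.2 each admit six candidate pairs, which is not lower than the subset size of four; at 0.3 only one pair remains, so the sketch stops there.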
Examiner further points out that while Applicant has now removed the above recited limitation from the Applicant’s latest set of amendments, this removed limitation does not change the obviousness rationale and the motivation to combine the Bianco reference with the Ghanta and Dirac ‘430 references. As indicated in the Non-Final Office Action mailed March 29, 2022, both Ghanta in view of Dirac ‘430 and Bianco are analogous art since both teach performing data/feature processing on training data sets. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the data/feature processing steps taught in Ghanta in view of Dirac ‘430 and incorporate the data deduplication steps taught in Bianco as a way to perform further data transformation and data validation on an input dataset. The motivation to combine is taught in Bianco, since detecting and removing duplicates in data sets results in considerable improvements in data quality, where the deduplication will also result in producing a training data set that contains more relevant information in which to perform additional machine learning training and analysis.
Training on this deduplicated data set makes the system more computationally efficient, without sacrificing the effectiveness with respect to other deduplication methods performed on the same training data set (Bianco p.2305 col.1-col.2 Introduction: “… data quality can be degraded mostly due to the presence of duplicate pairs with misspellings, abbreviations, conflicting data, and redundant entities, among other problems … a system designed to collect scientific publications on the Web to create a central repository … may suffer a lot in the quality of its provided services, e.g., search or recommendation may not produce results as expected by the end user due to the large number of replicated or near-replicated publications dispersed on the Web (e.g., a query response composed mostly by duplicates may be considered as having low informative value). The ability to check whether a new collected object already exists in the data repository (or a close version of it) is an essential task to improve data quality. … Considerable improvements in data quality can be obtained by detecting and removing duplicates.”; p.2317 Figure 6 and col.1-col.2 (Section 5.7 T3S vs ALIAS, ALD, and Christen (2008)): “We present experiments with the real-world and one synthetic datasets … It can be seen, that T3S-[NGram and SVM] converge very quickly, producing good effectiveness with only a few manually labeled pairs … Note that T3S clearly outperforms ALIAS with a reduced labeling effort in both real datasets. … T3S-SVM requires only 103 and 31 labeled pairs (a reduction of 21 and 78 percent), reaching a statistically significant gain of 3 percent …”). Hence, Examiner’s claim mapping to the recited claim limitations and provided motivation are proper within the guidelines of the MPEP, and thus Applicant’s argument is not persuasive, and the prior art rejection is maintained.
Regarding Applicant’s Remarks:
“Claims 3, 4, 6, 24, and 25 depend from independent claim 1, claims 10 and 11 depend from independent claim 9, and claims 15, 16, and 18 depend from independent claim 13. As discussed above, Ghanta, Dirac '430, and Bianco either alone or in hypothetical combination, fail to disclose all elements of independent claims 1, 9, and 13. Moreover, Applicant submits that Maag, Dirac '424, and Schelter, do not obviate the deficiencies of Ghanta, Dirac '430, and Bianco with respect to independent claims 1, 9, and 13. Thus, based at least on their dependencies from independent claims 1, 9, and 13, as well as for the elements recited therein, claims 3, 4, 6, 10, 11, 15, 16, 18, 24, and 25 are believed to be in condition for allowance. As such, Applicant respectfully requests withdrawal of the rejection of claims 3, 4, 6, 10, 11, 15, 16, 18, 24, and 25 under 35 U.S.C. § 103 and allowance of the same.”
Examiner has considered this argument and finds the argument to be not persuasive. Examiner notes that Applicant does not provide any additional arguments other than referencing the previous set of arguments. As established in response to the previous set of arguments, Applicant’s arguments based on certain identified limitations in independent Claim 1 (that are also present in independent Claims 9 and 13) were not persuasive, and hence Applicant’s arguments here are also not persuasive, and thus the prior art rejections are maintained.

Claim Objections
Claim 21 is objected to because of the following informality: A typographical error, where the term “target data set” is missing the article “the” to indicate its reference to an earlier recited “a target data set” (“… applying the data pre-processing phase of the ML pipeline to the target data set …”). Appropriate correction is required.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 7-8, 21-23, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred as Dirac '430], in further view of Bianco et al., A Practical and Effective Sampling Selection Strategy for Large Scale Deduplication, September 2015 [hereafter referred as Bianco], in even further view of Baylor et al., TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, KDD’17, August 13-17 2017 [hereafter referred as Baylor].
Regarding amended Claim 1, 
Ghanta teaches
(Currently amended) A system comprising:
a memory (Examiner’s note: Ghanta teaches a machine learning system/ML management apparatus containing a processor, volatile memory, and non-volatile computer readable storage medium, where the computer readable storage medium contains computer readable program instructions (program code) for modules implementing the described functions, as well as operational data for the modules, collected as data sets (Ghanta [0018]-[0021], [0025], [0029], [0033]; and Figure 1, element 104 and [0043]-[0044]).) containing:
a target data set (Examiner’s note: Under its broadest reasonable interpretation, a “target” data set broadly recites any identified data set that is used in a machine learning system. Ghanta teaches a machine learning system (containing training, orchestration/management (policy), and inference pipelines) using training, validation, test, error, and inference data sets, where these data sets are representative of types of a target data set (Ghanta Figure 2, elements 202, 204, 206a-c; Figure 1, element 104 and [0036]-[0037]: “… a machine learning system may involve various components, pipelines, data sets and/or the like – such as training pipelines, orchestration/management pipelines, inference pipelines, and/or the like … the ML management apparatus 104 provides an improvement for machine learning systems by training a first or primary machine learning model for a first/primary machine learning algorithm using a training data set, validating the first machine learning model using a validation data set, the output of which is an error data set that describes the accuracy of the first machine learning model for a second/auxiliary machine learning algorithm using the error data set. …”).); and
instructions defining a software application configured to apply a machine learning (ML) pipeline (Examiner’s note: As indicated earlier, Ghanta teaches a non-volatile computer readable storage medium containing computer readable program instructions for modules implementing the described functions on a machine learning system/ML management apparatus (Ghanta [0018]-[0021], [0025], [0029], [0033]; and Figure 1, element 104 and [0043]-[0044]).), 
wherein the ML pipeline includes a plurality of phases comprising … an ML model building phase (Examiner’s note: Under its broadest reasonable interpretation, the term “plurality of phases” broadly indicates a plurality of logical stages or steps in a series of processes, and hence this limitation broadly recites a machine learning related pipeline that performs a logical step for building or generating a model. Ghanta teaches a machine learning service containing multiple logical stages or steps such as a training phase and an inference phase (Ghanta [0040]: “In certain embodiments of machine learning systems 200, there is a training phase, for generating the machine learning model, and an inference phase for analyzing an inference data set using the machine learning model.”). Ghanta additionally teaches different pipeline machine learning operations (i.e., steps) such as algorithm training/inference, feature engineering, validations, and scoring, where each of these machine learning operations further represents a different step (“phase”) of a machine learning service.
Ghanta additionally teaches that the training operations (performed by a plurality of training pipelines) are used for generating machine learning models, and as such, these training operations correspond to an ML model building phase (Ghanta Figure 2A, elements 202, 204, 206a-c, [0036], [0053]: “… machine learning pipelines 202, 204, 206a-c comprise various machine learning features, components, objects, modules, and/or the like to perform various machine learning operations such as algorithm training/inference, feature engineering, validations, scoring, and/or the like.”; and [0056]: “… the machine learning system 200 includes physical and/or logical groupings of the machine learning pipelines 202, 204, 206a-c … the ML management apparatus 104 may select a training pipeline 204 for generating a machine learning model configured for the desired objective and one or more inference pipelines 206a-c that are configured to analyze the desired objective …”).); and 
a processor configured to execute the instructions to perform operations (Examiner’s note: As indicated earlier, Ghanta teaches a machine learning system/ML management apparatus containing a processor, volatile memory and non-volatile computer readable storage medium, where the computer readable storage medium contains computer readable program instructions (program code) for modules implementing the described functions (Ghanta [0018]-[0021], [0025], [0029], [0033]; and Figure 1, element 104 and [0043]-[0044]).) comprising:
obtaining, from the memory, the target data set (Examiner’s note: Under its broadest reasonable interpretation, a “target” data set broadly recites any identified data set that is used in a machine learning system. As indicated earlier, Ghanta teaches a machine learning system using training, validation, test, error, and inference data sets (“a target data set”), where these data sets are forms of operational data stored in memory (Ghanta [0019]: “… operational data may be identified and illustrated herein within modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations …”, [0036]-[0037]). Ghanta further teaches an ML management apparatus in the machine learning system using these identified data sets, where the ML management apparatus contains various modules including a primary training module, a primary validation module, a secondary training module, and a secondary validation module, with the primary training module training a first machine learning model using a training data set (Ghanta Figure 3 and [0075]-[0077]).);
… generating a conditioned data set based on the target data set (Examiner’s note: Under its broadest reasonable interpretation, the term “conditioned data set” broadly recites a data set that is based on an identified “target” data set (where an identified “target” data set broadly recites any identified data set that is used in a machine learning system). As indicated earlier, Ghanta teaches different pipeline machine learning operations such as algorithm training/inference, feature engineering, validations, and scoring, where each of these operations further represents a different stage (“phase”) of a machine learning service. A person having ordinary skill in the art would understand that these different stages, such as training, inference, and feature engineering, represent processes for analyzing an input data set and converting it to an output data set, such that the output generated by each of these stages (for example, feature engineering) performed on an input data set to produce a subsequent data set for additional processing by a machine learning algorithm/model represents a conditioned data set (Ghanta [0053]: “… machine learning pipelines 202, 204, 206a-c comprise various machine learning features, components, objects, modules, and/or the like to perform various machine learning operations such as algorithm training/inference, feature engineering, validations, scoring, and/or the like. Pipelines 202, 204, 206a-c may analyze or process data 210 … from a static source, streaming …”; [0063]: “… the results generated by one logical machine learning layer 200 be used as input into a different logical machine learning layer 200, e.g., as training data for a machine learning model, as input data 210 to an inference pipeline 206a-c, and/or the like …”).) …
… determining one or more failure metrics at one or more of the plurality of phases of the ML pipeline (Examiner’s note: Under its broadest reasonable interpretation, the term “failure” as defined in the Merriam-Webster dictionary indicates a lack of success or a falling short (a deficiency), such that the term “failure metrics” broadly recites any set of metrics that are used to determine a deficient or unsuccessful result. As indicated earlier, Ghanta teaches different pipeline machine learning operations, representing different stages (“phases”) of a machine learning service, where one of the stages includes a validation stage. Ghanta teaches a secondary validation module producing suitability metrics based on the predicted output generated by the primary validation module analyzing the first machine learning model, where this generated output (i.e., the error data set) is based on a received inference set. These suitability metrics include statistical metrics, confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability. Ghanta further teaches an analysis module using these suitability metrics to determine whether the output produced by the first machine learning model satisfies thresholds that indicate whether the model is a good fit for generating accurate predictions. This analysis represents a suitability analysis for validating the performance of a first machine learning model. Thus, the determination made during a validation phase when these thresholds are not being satisfied represents an unsuitability condition (concluding that the first machine learning model lacks efficacy, accuracy, or effectiveness), with these unsuitability conditions representing a failed or deficient condition of the model.
Analyzing and validating the output produced from a first machine learning model (through use of suitability metrics) corresponds to a determination process involving one or more outputs of the plurality of phases of the ML pipeline (Ghanta [0078]-[0081]; [0085]-[0087]: “… the secondary training module 306 may include … statistical signature scores for each sample in the data set, prediction values from the first machine learning algorithm … confidence metrics … parameters that are specific to the first machine learning algorithm/model …”; [0089]: “… the secondary validation module 308 analyzes other statistics, such as training statistics, to determine suitability of the second machine learning algorithm in accurately assessing the effectiveness of the first machine learning algorithm … confidence metrics, accuracy metrics, precision metrics, and/or the like …”; Table 1 and [0091]-[0093]: “… the predictions that the first machine learning algorithm/model generates can be used as input into the training of the second machine learning model, along with the error data … The analysis module 310 … is configured to determine whether the first machine learning algorithm/model is a suitable algorithm/model for generating predictions for the inference data set … may determine whether the various metrics/health scores each satisfy a threshold value … If so, then the analysis module 310 may determine that the first machine learning algorithm/model is generating accurate predictions for the inference data set. 
… the health scores/values may include prediction confidence values, data deviation values, AB testing values, canary values, and/or the like.”; and [0098]-[0099]: “… the analysis module 310 may determine whether the suitability score based on the metrics/health scores in Table 1 satisfies a threshold to determine (1) whether the second machine learning algorithm/module is a good fit for validating the predictive performance of the first machine learning algorithm/model, and if so (2) whether the first machine learning algorithm/model is a good fit for generating accurate predictions for the inference data set.”).) …
… wherein the one or more failure metrics are determined based on one or more characteristics of the conditioned data set, one or more outputs of the plurality of phases of the ML pipeline, or any combination thereof (Examiner’s note: As indicated earlier, Ghanta teaches that these suitability metrics are based on features received from the first machine learning algorithm, which is received as input into the secondary training/validation modules to be processed by the second machine learning model to generate the suitability metrics. Hence, the outputs from both the primary validation module (i.e., the error data set) and the secondary validation module that produces the suitability metrics (based on the error data set) also correspond to one or more outputs from a ML pipeline building phase. Thus, the suitability metrics are generated based on one or more of these outputs, and therefore this analysis process involving the suitability metrics corresponds to one or more failure metrics being determined based on one or more outputs of the plurality of phases of the ML pipeline (Ghanta [0079]-[0080]: “… The resulting output of the validation of the first machine learning algorithm/model … comprises an error data set … includes values indicating the prediction error of the first machine learning algorithm/model … includes features that comprise one or more of features of the error data set …and/or one or more parameters specific to the first machine learning model …”; [0087]: “… the secondary training module 306 enhances the error data set by including additional data to supplement the prediction error data … may include data for additional features such as features of the data set itself … prediction values from the first machine learning algorithm (e.g., the predicted values output from analyzing the inference data set using the first machine learning algorithm/model …”; Table 1 and [0091]-[0093]; [0095]: “… As explained above, the second machine learning algorithm receives features (e.g., of the inference data set, the error data set, and/or other features) as input and predicts whether the first machine learning algorithm is suitable for making accurate predictions on the inference data set …”; and [0098]-[0099]).) …
… determining that at least one of the one or more failure metrics exceeds a threshold value (Examiner’s note: As indicated earlier, Ghanta teaches a secondary validation module that produces suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability), where these suitability metrics indicate whether the model is a good fit for generating accurate predictions based on a received inference set. Ghanta teaches an analysis module that uses the produced suitability metrics to determine whether these suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability) satisfy thresholds for determining whether the first machine learning model is a good fit (Ghanta [0078]-[0079]; [0085]-[0087]; [0092]: “…The analysis module 310 … is configured to determine whether the first machine learning algorithm/model is a suitable algorithm/model for generating predictions for the inference data set … may determine whether the various metrics/health scores each satisfy a threshold value, if a percentage of the metrics/health scores satisfy threshold values, of if a calculated combination of various health scores (e.g., an average) satisfies a threshold. If so, then the analysis module 310 may determine that the first machine learning algorithm/model is generating accurate predictions for the inference data set …”; and [0098]-[0099]). 
A person having ordinary skill in the art would understand that using and comparing suitability metrics (indicating success) against threshold values involves having a threshold defined as a minimum suitability baseline, and that determining that these suitability metrics do not satisfy (i.e., are below) this minimum suitability baseline threshold (thus indicating failure or deficiency) is functionally equivalent to using and comparing failure metrics (indicating failure) against threshold values representing a minimum failure baseline, and determining that these metrics meet or exceed (i.e., are above) this minimum failure baseline threshold (thus indicating failure or deficiency).) …
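For illustration only, the functional equivalence described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the use of an accuracy percentage as the suitability metric, and the derived error percentage as the failure metric are hypothetical and appear in neither Ghanta nor the claims.

```python
def deficient_by_suitability(accuracy_pct: int, min_accuracy_pct: int) -> bool:
    """Deficient when the suitability metric is below the minimum suitability baseline."""
    return accuracy_pct < min_accuracy_pct


def deficient_by_failure(accuracy_pct: int, min_accuracy_pct: int) -> bool:
    """The same test restated as a failure metric exceeding a failure threshold."""
    # Hypothetical failure metric: the complement of the suitability metric.
    error_pct = 100 - accuracy_pct
    # The minimum suitability baseline maps to a failure threshold.
    error_threshold_pct = 100 - min_accuracy_pct
    return error_pct > error_threshold_pct


# The two formulations agree for every integer percentage pair.
agree = all(
    deficient_by_suitability(a, m) == deficient_by_failure(a, m)
    for a in range(101)
    for m in range(101)
)
```

Under these assumptions, a suitability metric falling below its baseline and the corresponding failure metric rising above its threshold flag exactly the same cases.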
… performing a remediation of the conditioned data set (Examiner’s note: As indicated earlier, Ghanta teaches an analysis module determining whether suitability thresholds are being satisfied based on the corresponding suitability metrics (Ghanta [0078]-[0079]; [0085]-[0087]; Table 1 and [0092]-[0093]; and [0098]-[0099]). Ghanta further teaches that in response to determining that the first machine learning model is not satisfying predetermined suitability thresholds, the analysis module requests an action module to perform various actions associated with the first machine learning algorithm, where these additional actions include retraining the first machine learning model using a different training data set, switching the first machine learning model to a different machine learning model for training, or recommending a different first machine learning algorithm to be used to analyze the inference data set, where these actions correspond to performing remediations based on a conditioned data set (Ghanta Figure 5, elements 514, 516, 518, 520; and Figure 4, element 410 and [0098], [0100]-[0102]; and [0109]: “…the analysis module 310 determines 514 whether the predicted suitability of the first machine learning algorithm/model satisfies a predetermined suitability threshold. If so, the method 500 ends. Otherwise, the action module 312 triggers one or more actions associated with the first machine learning algorithm. For instance, the action module 312 may trigger retraining 516 the first machine learning model with different training data, may trigger switching 518 the first machine learning model to a different machine learning model that is trained using different training data, may recommend 520 different machine learning algorithms for analyzing the inference data set, may update 522 suitability thresholds, and/or the like, and the method 500 ends.”).) …
While Ghanta teaches machine learning pipelines that perform various machine learning operations such as algorithm training/inference and feature engineering, and that are associated with third party analytic engines for performing machine learning numeric computations and analysis (Ghanta [0053], [0055]), Ghanta does not explicitly teach
… wherein the ML pipeline includes … a data pre-processing phase …
… applying the data pre-processing phase of the ML pipeline to the target data set before applying the ML model building phases of the ML pipeline …
Dirac ‘430 teaches
… wherein the ML pipeline includes … a data pre-processing phase (Examiner’s note: Dirac ‘430 teaches a machine learning service processing a model creation request that contains a training data set and recipes/constraint instructions for a feature processing manager, where the feature processing manager uses these recipes and constraints to generate a candidate set of feature processing transformations by scheduling one or more feature processing jobs, and using the processed (and pruned) training data set to execute the scheduled model training/re-training jobs, where this feature processing manager represents a data pre-processing phase that performs various validation checks on the data, and the processed (and pruned) training data set represents a conditioned data set based on the training data set (Dirac ‘430 Figure 9a, element 904 and [0120]: “… a client may want to create an artifact such as model, and may want that same model to be re-trained and/or re-executed for different input data sets … a client may specify a set of model/alias/recipe artifacts … to be used … Such programmatic interfaces may be referred as “pipelining APIs” … a separately-managed data pipelining service implemented at the provider network may be used in conjunction with the MLS …”, [0127]-[0128], [0130]: “… data type checking may have to be performed on the input data set for jobs that involve feature processing, or the MLS may have to verify that the input data set size within acceptable bounds …”; and Figure 42, elements 4210, 4220, 4227, 4080, 4255, 4261; and [0239]-[0240]: “… The model creation request 4210 may indicate … one or more training sets 4220 … one or more feature processing recipes 4226 … a client may also optionally indicate one or more constraints 4227, such as a mandatory feature processing transformation… The FP manager 4080 may generate a candidate set of feature processing transformations … a number of different jobs may be generated and scheduled during this process, including … one or more feature processing jobs 5244, … and/or one or more training or re-training jobs 4261. … The FP manager may consult the MLS’s knowledge base of best practices to identify candidate transformations … based on the problem domain being addresse[d] by the model to be created or trained … once a candidate set of FPTs (feature processing transformations) is identified, some subset of the transformations may be removed or pruned from the set in each of several optimization iterations, and different variants of the model may be trained … using the pruned FPT sets.”).) …
… applying the data pre-processing phase of the ML pipeline to the target data set before applying the ML model building phases of the ML pipeline (Examiner’s note: Under its broadest reasonable interpretation, a “target” data set broadly recites any identified data set that is used in a machine learning system. As indicated earlier, Dirac ‘430 teaches a machine learning service that performs feature processing to generate a processed (and pruned) training set representing a conditioned data set, where this conditioned data set is used to execute the scheduled model training/re-training jobs, such that this flow of performing feature processing to generate a conditioned data set that is used for the scheduled model training/re-training jobs represents a flow of applying the data pre-processing phase of the ML pipeline to a target data set before applying the ML model building phase (Dirac ‘430 Figure 42, elements 4210, 4220, 4227, 4080, 4255, 4261 and [0239]-[0240]).) …
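For illustration only, the ordering recited in this limitation (a pre-processing phase applied to the target data set before a model building phase) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the dropping of missing values as the pre-processing step, and the use of a mean value as a stand-in "model" are hypothetical and are not taken from Dirac ‘430, Ghanta, or the claims.

```python
from typing import List, Optional, Tuple


def preprocess(target: List[Optional[float]]) -> List[float]:
    """Hypothetical pre-processing phase: drop missing entries."""
    return [value for value in target if value is not None]


def build_model(conditioned: List[float]) -> float:
    """Hypothetical model building phase: the 'model' is just the mean."""
    return sum(conditioned) / len(conditioned)


def run_pipeline(target: List[Optional[float]]) -> Tuple[List[float], float]:
    """Apply the pre-processing phase first, then the model building phase."""
    conditioned = preprocess(target)   # phase 1: produces the conditioned data set
    model = build_model(conditioned)   # phase 2: consumes the conditioned data set
    return conditioned, model


conditioned, model = run_pipeline([1.0, None, 3.0])
```

The model building phase here only ever sees the output of the pre-processing phase, which is the ordering the limitation describes.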
	Both Ghanta and Dirac ‘430 are analogous art since both teach performing data/feature processing on training data sets.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the data/feature processing step taught in Ghanta and incorporate the feature processing manager taught in Dirac ‘430 as a way to further define prediction quality metrics and run-time goals to improve the prediction performance of the ML pipeline. The motivation to combine is taught in Dirac ‘430, since the feature processing manager can process the values in the training set and apply the associated feature constraints and feature processing transformations based on corresponding quality metrics and run-time goals to improve the performance of the machine learning pipeline, thus making the ML pipeline more computationally efficient while at the same time satisfying the required training goals in the system (Dirac ‘430 [0232]).
	While Ghanta in view of Dirac ‘430 teaches data/feature processing in a ML pipeline, Ghanta in view of Dirac ‘430 does not explicitly teach
… identifying one or more duplicate data entries within the target data set …
… wherein the conditioned data set includes each entry of the target data set except for the identified duplicate entries …
	Bianco teaches
… identifying one or more duplicate data entries within the target data set (Examiner’s note: Bianco teaches a T3S process for deduplicating data from datasets. Bianco teaches that a set of Sig-Dedup filters performs an initial pre-processing deduplication stage based on prefix, length, positional, and suffix filtering to identify a set of candidate matching pairs of records from a dataset (to remove initial duplicate records). Bianco additionally teaches a first stage that performs sorting and ranking of these candidate pairs of records according to similarity values, where the rank represents different sample levels of similarity, with the lowest level [0.0-0.1] representing a large number of non-matching pairs, and the highest level [0.9-1.0] representing matching pairs only. Hence, the process of identifying a set of duplicate data entries as candidate matching pairs and grouping them into different levels based on similarity corresponds to a process for identifying one or more duplicate data entries within a target data set (Bianco p.2308, Figure 1. T3S steps overview; p.2307 col.1 Section 3.1 Signature-Based Deduplication (Sig-Dedup) 1st-5th paragraphs; p.2308 col.2 Section 4.1 Identifying the Approximate Blocking Threshold 1st-3rd paragraphs; p.2309 col.1 4th paragraph; p.2309 col.2 2nd-3rd paragraphs (Section 4.2 First Stage: Sample Selection Strategy): “The first stage of T3S adopts the concept of levels to allow each sample to have a similar diversity to that of the full set of pairs. The ranking, created by the blocking step, is fragmented into 10 levels … by using the similarity value of each candidate pair. … level [0.0-0.1] is composed of a large number of non-matching pairs (i.e., highly dissimilar records) while level [0.9-1.0] has matching pairs only.”).) …
… wherein the conditioned data set includes each entry of the target data set except for the identified duplicate entries (Examiner’s note: As indicated earlier, Bianco teaches a T3S process for deduplicating data, containing an initial pre-processing deduplication stage based on Sig-Dedup filtering to remove initial duplicate records, and a first stage that sorts and ranks the remaining candidate pairs according to different similarity levels. Bianco further teaches a second stage that performs the similarity-based deduplication based on an incremental removal of the non-informative or redundant pairs inside each sample level using a SSAR (selective sampling using association rules) active learning method, where each iteration removes irrelevant features and examples from a training set D, resulting in a final training set that contains only the most informative pairs required to maximize the training size diversity, where this final training set includes each entry from the target data set except for the non-informative (redundant) entries (Bianco p.2310 Figure 2; p.2309 col.2 Section 4.3 Second Stage: Redundancy Removal, 2nd paragraph: “The second stage of T3S aims at incrementally removing the non-informative or redundant pairs inside each sample level by using the SSAR … active learning method [21]. By redundant, we mean pairs carrying very similar information; the inclusion of a redundant pair does not contribute with useful information for the learning process. … The purpose of SSAR is to select for labeling only the most informative pairs required to maximize the training size diversity …”; p.2310 Algorithm 1 and 2nd paragraph: “Details of SSAR are shown in Algorithm 1. At each round, an unlabeled pair u_i is used as a filter to remove irrelevant features and examples from D. … The objective of this procedure is to select the most dissimilar unlabeled pair by making a comparison with the current training set. … If the most dissimilar pair is not already present in the training set, it is labeled by the user and inserted into the training set D. … The idea is that this pair is the best “representative” of the information contained in the collection.”; and p.2311 col.1 3rd paragraph: “Our sample selection strategy incrementally invokes SSAR by using each level and the current training set as input. … As Fig.2 show, by using the incremental active selection, the redundant pairs at the levels [0.1-0.2, 0.2-0.3, 0.3-0.4, 0.4-0.5, and 0.5-0.6] can be removed …”).) …
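For illustration only, the relationship recited in these two limitations (identifying duplicate entries, then forming a conditioned data set containing every entry except the identified duplicates) can be sketched as follows. This is a minimal sketch under stated assumptions: it uses simple exact matching after case and whitespace normalization, which is a swapped-in simplification and not Bianco's T3S, Sig-Dedup, or SSAR procedure, and the function and variable names are hypothetical.

```python
from typing import List, Tuple


def condition_data_set(target: List[str]) -> Tuple[List[str], List[str]]:
    """Split a target data set into (conditioned entries, identified duplicates).

    An entry is treated as a duplicate when its normalized form has already
    been seen; the first occurrence is always kept.
    """
    seen = set()
    conditioned: List[str] = []
    duplicates: List[str] = []
    for entry in target:
        # Hypothetical normalization: lowercase and collapse whitespace.
        key = " ".join(entry.lower().split())
        if key in seen:
            duplicates.append(entry)
        else:
            seen.add(key)
            conditioned.append(entry)
    return conditioned, duplicates


kept, removed = condition_data_set(["Alpha  Beta", "alpha beta", "Gamma"])
```

Every entry of the input ends up in exactly one of the two returned lists, so the conditioned list is the target data set minus the identified duplicates.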
Both Ghanta in view of Dirac ‘430 and Bianco are analogous art since both teach performing data/feature processing on training data sets.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the data/feature processing steps taught in Ghanta in view of Dirac ‘430 and incorporate the data deduplication steps taught in Bianco as a way to perform further data transformation and data validation on an input dataset. The motivation to combine is taught in Bianco, since detecting and removing duplicates in data sets results in considerable improvements in data quality, where the deduplication will also result in producing a training data set that contains more relevant information in which to perform additional machine learning training and analysis. Thus, a system that trains on this deduplicated data set is more computationally efficient, without sacrificing the effectiveness with respect to other deduplication methods performed on the same training data set (Bianco p.2305 col.1-col.2 Introduction: “… data quality can be degraded mostly due to the presence of duplicate pairs with misspellings, abbreviations, conflicting data, and redundant entities, among other problems … a system designed to collect scientific publications on the Web to create a central repository … may suffer a lot in the quality of its provided services, e.g., search or recommendation may not produce results as expected by the end user due to the large number of replicated or near-replicated publications dispersed on the Web (e.g., a query response composed mostly by duplicates may be considered as having low informative value). The ability to check whether a new collected object already exists in the data repository (or a close version of it) is an essential task to improve data quality. 
… Considerable improvements in data quality can be obtained by detecting and removing duplicates.”; p.2317 Figure 6 and col.1-col.2 (Section 5.7 T3S vs ALIAS, ALD, and Christen (2008)): “We present experiments with the real-world and one synthetic datasets … It can be seen, that T3S-[NGram and SVM] converge very quickly, producing good effectiveness with only a few manually labeled pairs … Note that T3S clearly outperforms ALIAS with a reduced labeling effort in both real datasets. … T3S-SVM requires only 103 and 31 labeled pairs (a reduction of 21 and 78 percent), reaching a statistically significant gain of 3 percent …”).
While Ghanta in view of Dirac ‘430, in further view of Bianco teaches actions that include receiving and analyzing data deviation score information, and selecting a different training data set to retrain a first machine learning model (Ghanta [0099]-[0100]), Ghanta in view of Dirac ‘430, in further view of Bianco does not explicitly teach
… performing a remediation … to provide additional training data to the target data set …
Baylor teaches
… performing a remediation … to provide additional training data to the target data set (Examiner’s note: Under its broadest reasonable interpretation, this limitation broadly recites a process that provides additional data as training data. Baylor teaches a data validation operation (corresponding to one of “a plurality of phases”) which validates the data properties of specific training and serving datasets, flags any deviations as potential anomalies, and provides actionable suggestions to fix the flagged anomaly. Baylor further teaches an existing data property can be encoded in a schema that describes the expected domain or range of a feature, and hence, deviations in the input data that exist outside of the expected domain or range of a feature represent an anomaly that corresponds to “one or more failure metrics at one or more of the plurality of phases of the ML pipeline”. Baylor further identifies an actionable suggestion to fix the identified deviation, which includes the expansion of the existing feature schema to include additional domain values (e.g., include the domain value “EDUCATION” in addition to the existing values “GAMES” and “BUSINESSES” in the feature ‘category’), such that any subsequent input data containing this new feature value will be correctly validated and incorporated as a normal state for a training or serving dataset. This action of expanding a feature schema to incorporate additional values as part of the training data set corresponds to performing a remediation action that provides additional data as training data (Baylor p.1389 Figure 1; p.1390-1391 Figure 2 and Section 3.3 Data Validation: “… To perform validation, the component relies on a schema that provides a versioned, succinct description of the expected properties of the data.
The following are examples of the properties … The expected domain of a feature, i.e., the small universe of values for a string feature, or range for an integer feature … Using the schema, the component can validate the properties of specific (training and serving) datasets, flag any deviations from the schema as potential anomalies, and in most cases, provide actionable suggestions to fix the anomaly. These actions may include … for expected deviations in the data, updating the schema itself to match the data … In some cases the anomalies correspond to a natural evolution of the data, and the appropriate action is to change the schema (rather than fix the data) … our component generates for each anomaly a corresponding schema change that can bring the schema up-to-data (essentially, make the anomaly part of the normal state of the data)…”).) …
Both Ghanta in view of Dirac ‘430, in further view of Bianco and Baylor are analogous art since they both teach machine learning pipelines that perform validation operations.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the validation operations taught in Ghanta in view of Dirac ‘430, in further view of Bianco and enhance them to incorporate the validation operations taught in Baylor as a way to continuously improve the model quality over time. The motivation to combine is taught in Baylor, since identifying errors early during data validation and suggesting ways to resolve the resulting issue helps eliminate bugs that can significantly degrade the model quality over a period of time and in downstream operations, which then helps improve a machine learning platform’s scalability and performance by maintaining the overall model quality (Baylor p.1389 Section 3 Data Analysis, Transformation, and Validation 1st paragraph: “Machine learning models are only as good as their training data, so understanding the data and finding any anomalies early is critical for preventing data errors downstream, which are more subtle and harder to debug. … Faults … can occur at multiple points of this generation process, which makes anomalies in the data not an exception, but more the norm. As a machine learning platform scales to larger data and runs continuously, there is a strong need to a reusable component that enables rigorous checks for data quality and promotes best practices for data management … Small bugs in the data can significantly degrade model quality over a period of time in a way that is hard to detect and diagnose … so constant data vigilance should be a part of any long running development of a machine learning platform.”).
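For illustration only, the schema-based validation relied upon above can be pictured with the following minimal Python sketch. It is not code from Baylor or from the application; the names `FeatureSchema`, `find_anomalies`, and `suggest_schema_update` are hypothetical and are assumed solely for this example. The feature name and domain values are the ones identified in the mapping above.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureSchema:
    """Hypothetical schema entry: the expected domain of one string feature."""
    name: str
    domain: set = field(default_factory=set)


def find_anomalies(schema, values):
    """Return the distinct values that fall outside the expected domain."""
    return sorted({v for v in values if v not in schema.domain})


def suggest_schema_update(schema, anomalies):
    """Return a new schema whose domain also covers the flagged values.

    This models the suggestion of updating the schema to match the data,
    rather than fixing the data. The original schema is left unchanged.
    """
    return FeatureSchema(schema.name, schema.domain | set(anomalies))


# Domain values taken from the example discussed in the mapping above.
category = FeatureSchema("category", {"GAMES", "BUSINESSES"})
batch = ["GAMES", "EDUCATION", "BUSINESSES", "EDUCATION"]

anomalies = find_anomalies(category, batch)           # values outside the domain
updated = suggest_schema_update(category, anomalies)  # expanded schema
```

In this sketch, a later batch containing “EDUCATION” validates cleanly against `updated`, which mirrors the described effect of making the anomaly part of the normal state of the data.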
Regarding amended Claim 7,
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches
(Currently amended) The system of claim 1, 
wherein the operations comprise applying the ML pipeline to generate an ML model, wherein the ML pipeline includes a utility validation phase (Examiner’s note: As indicated earlier, Ghanta teaches a training machine learning pipeline generating machine learning models, and hence using this training machine learning pipeline operation to generate machine learning models corresponds to an operation that applies a ML pipeline to generate an ML model (Ghanta Figure 2A, elements 202, 204, 206a-c, [0036], [0053], and [0056]). As indicated earlier, Ghanta further teaches an ML management apparatus in the machine learning system using these identified data sets, where the ML management apparatus contains various modules including a primary training module, a primary validation module, a secondary training module, and a secondary validation module, where the primary and secondary validation modules represent utility validation phases (Ghanta [0075], [0078]: “… the primary validation module 304 is configured to validate the first machine learning algorithm/model using a validation data set.”; and [0088]: “… the secondary validation module 308 is configured to determine a suitability of the second machine learning algorithm for predicting the suitability of the first machine learning algorithm.”).), 
wherein the utility validation phase comprises causing the processor to: 
generate first and second ML models from the conditioned data set, wherein the first ML model corresponds to the ML model generated during the ML model building phase, and wherein the second ML model has fewer trainable parameters than the first ML model (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation broadly recites validating first and second ML models using a conditioned data set, where the term “conditioned data set” broadly recites a data set based on a target data set (where an identified “target” data set broadly recites any identified data set that is used in a machine learning system). As indicated earlier, Ghanta teaches using training, validation, test, error, and inference data sets (“target data set”) in a machine learning system/ML management apparatus containing training, orchestration/management (policy), and inference pipelines (Ghanta [0019], [0036]-[0037]). As indicated earlier, Ghanta teaches the policy, training, and inference machine learning pipelines implementing various machine learning operations, including feature engineering operations, where a person having ordinary skill in the art would understand that feature engineering analyzes and performs conversions on a data set, where the output result of the feature engineering processing performed on a data set represents a conditioned data set (Ghanta [0053]). Under its broadest reasonable interpretation, the phrase “fewer trainable parameters” broadly indicates training the second ML model with a smaller selected set of parameters based on the training from the first ML model. Ghanta teaches using a primary training module to generate a first machine learning model from an ML training pipeline, and using a primary validation module to validate the generated first machine learning model and produce a resulting error data set. 
Ghanta additionally teaches generating a second machine learning model from an ML training pipeline by a secondary training module using a different machine learning algorithm, and using the error data set as input. Ghanta further teaches this error data set can include one or more parameters specific to the first machine learning model, and the secondary training module can select all or a subset of these feature parameters, where these one or more specific parameters correspond to a smaller set of trainable parameters for training the second ML model (Ghanta [0074]: “… the training pipelines 204a-c generate different machine learning models …”, [0075]-[0077]: “… the primary training module 302 trains the first machine learning model for the first machine learning algorithm on a training data set … the primary training module 302 may receive, read, access, an/or the like a training data set and provide the training data set to a training pipeline 204 to train the machine learning model …”; [0078]-[0080]: “… the primary validation module 304 is configured to validate the first machine learning algorithm/model … The resulting output of the validation … comprises an error data set … includes values indicating the prediction error of the first machine learning algorithm/model … includes feature that comprise one or more of feature of the error data set … and/or one or more parameters specific to the first machine learning model.”; [0085]-[0087]: “… the secondary training module 306 is configured to train a second machine learning model for a second machine learning algorithm using the error data set … The second machine learning algorithm may be configured to predict a suitability of the first machine learning algorithm/model … the second machine learning algorithm is different than the first machine learning algorithm … the secondary training module 306 enhances the error data set by including additional data to supplement the prediction error data … the secondary training module 306 may include data for additional features such as features of the data set itself (e.g., the secondary training module 306 may select all or a subset of the available features of the error data set itself) …”).), 
compare a predictive ability of the first ML model and the second ML model (Examiner’s note: As indicated earlier, Ghanta teaches a secondary validation module that produces suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability), where these suitability metrics indicate whether the model is a good fit for generating accurate predictions based on a received inference set. Ghanta teaches an analysis module that uses the produced suitability metrics to determine whether these suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability) satisfy thresholds for determining whether the first machine learning model is a good fit. Hence, this analysis to determine whether these thresholds are being satisfied represents a suitability analysis for the first machine learning model, and as such, the determination where these thresholds are not being satisfied represents an unsuitability condition that concludes that a first machine learning model lacks efficacy, accuracy, or effectiveness. 
Thus, this suitability analysis involving output produced from the first and second machine learning model represents a process for comparing predictive ability using a first ML model and a second ML model (Ghanta [0078]-[0079]; [0085]-[0087]: “… the secondary training module 306 may include … statistical signature scores for each sample in the data set, prediction values from the first machine learning algorithm … confidence metrics …”; [0089]: “… the secondary validation module 308 analyzes other statistics, such as training statistics, to determine suitability of the second machine learning algorithm in accurately assessing the effectiveness of the first machine learning algorithm … confidence metrics, accuracy metrics, precision metrics, and/or the like …”; Table 1 and [0092]-[0093]: “The analysis module 310 … is configured to determine whether the first machine learning algorithm/model is a suitable algorithm/model for generating predictions for the inference data set based on the predictions that the second machine learning algorithm generates. … the analysis module 310 may determine whether the various metrics/health scores each satisfy a threshold value, if a percentage of the metrics/health scores satisfy threshold values, of if a calculated combination of various health scores (e.g., an average) satisfies a threshold. If so, then the analysis module 310 may determine that the first machine learning algorithm/model is generating accurate predictions for the inference data set.
… the health scores/values may include prediction confidence values, data deviation values, AB testing values, canary values, and/or the like.”; and [0098]-[0099]: “… the analysis module 310 may determine whether the suitability score based on the metrics/health scores in Table 1 satisfies a threshold to determine (1) whether the second machine learning algorithm/module is a good fit for validating the predictive performance of the first machine learning algorithm/model, and if so (2) whether the first machine learning algorithm/model is a good fit for generating accurate predictions for the inference data set.”).).  
Regarding amended Claim 8, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches
(Currently amended) The system of claim 7, wherein applying the utility validation phase to the conditioned data set comprises causing the processor to terminate the ML pipeline in response to determining, based on comparing the predictive ability of the first ML model and the second ML model, that the predictive ability of the first ML model fails to exceed the predictive ability of the second ML model by more than a threshold amount (Examiner’s note: As indicated earlier, Ghanta teaches an analysis module that uses the produced suitability metrics from the second validation module to determine whether these suitability metrics satisfy thresholds for determining whether the first machine learning model is a good fit for generating accurate predictions. This analysis to determine whether these thresholds are being satisfied represents a suitability analysis for the first machine learning model, and as such, the determination where these thresholds are not being satisfied represents an unsuitability condition concluding that a first machine learning model lacks efficacy, accuracy, or effectiveness. Additionally, this suitability analysis involving output produced from the first and second machine learning model represents a process for comparing predictive ability of the first ML model and the second ML model, and hence the request from the analysis module to the action module to trigger additional actions based on the earlier comparison result corresponds to a determination that the thresholds are not satisfied, which represents an indication of inadequacy with regards to the first machine learning model (Ghanta [0078]-[0079]; [0085]-[0087]; [0089]; Table 1 and [0092]-[0093]; and [0098]-[0099]). 
As indicated earlier, Ghanta teaches these additional actions include retraining the first machine learning model using a different training data set, switching the first machine learning model to a different machine learning model for training, or recommending a different first machine learning algorithm to be used to analyze the inference data set. A person having ordinary skill in the art would understand that these actions of changing to different training sets, switching to different machine learning models for training, and recommending different learning algorithms would require the termination of the existing pipeline that is using the previously defined training set or machine learning model/algorithm in order to apply the different training set and/or model/algorithm selected by the action module (Ghanta Figure 5, elements 514, 516, 518, 520; and Figure 4, element 410 and [0100]-[0102]; [0109]: “…the analysis module 310 determines 514 whether the predicted suitability of the first machine learning algorithm/model satisfies a predetermined suitability threshold. If so, the method 500 ends. Otherwise, the action module 312 triggers one or more actions associated with the first machine learning algorithm. For instance, the action module 312 may trigger retraining 516 the first machine learning model with different training data, may trigger switching 518 the first machine learning model to a different machine learning model that is trained using different training data, may recommend 520 different machine learning algorithms for analyzing the inference data set, may update 522 suitability thresholds, and/or the like, and the method 500 ends.”).).
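For illustration only, the termination condition recited in Claim 8 can be pictured with the following minimal Python sketch. It is a reading of the claim language under stated assumptions, not code from Ghanta or from the application: the function names are hypothetical, and the predictive ability of each model is assumed to be summarized as a single numeric score in which a higher value is better.

```python
def should_terminate(first_score, second_score, threshold):
    """Return True when the first model fails to exceed the second model
    by more than `threshold` (assumption: higher score means better)."""
    return (first_score - second_score) <= threshold


def utility_validation(first_score, second_score, threshold):
    """Hypothetical phase wrapper that reports the resulting pipeline action."""
    if should_terminate(first_score, second_score, threshold):
        return "terminate"
    return "continue"
```

Under these assumptions, a first model scoring 0.82 against a simpler second model scoring 0.80 with a threshold of 0.05 yields "terminate", since the margin of roughly 0.02 does not exceed the threshold.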
Regarding amended Claim 21, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches
(Currently amended) The system of claim 1, wherein, the operations comprise, in response to determining that at least one of the one or more failure metrics exceeds an additional threshold value (Examiner’s note: As indicated earlier, Ghanta teaches different pipeline machine learning operations, representing different stages (phases) of a machine learning service, where one of the stages includes a validation stage. Ghanta teaches secondary training and validation modules producing suitability metrics based on the predicted output generated by a first machine learning model, where this generated output is based on a received inference set. These suitability metrics include statistical metrics, confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability (Ghanta [0078]-[0081]; [0085]-[0087]; [0089]; Table 1 and [0091]-[0093]; and [0098]-[0099]). Ghanta further teaches receiving and analyzing data deviation score information and comparing the data deviation score based on a predefined threshold, where this additional data deviation score information corresponds to a determination that at least one of the one or more failure metrics exceeds an additional threshold value, and performing an action that corresponds to a response based on the determination (Ghanta [0099]: “… the analysis module 310 may use additional data (e.g., in addition to the metrics/health scores in Table 1) to determine whether the first machine learning algorithm/model is suitable for the inference data … the analysis module 310 may receive or access data deviation information … If the data deviation scores do not deviate beyond a predefined threshold, then the second machine learning algorithm may be used … Otherwise, if the data deviation scores indicate that the inference data set is not similar enough … the analysis module 310 may trigger one or more the actions described below …”).), 
applying the data pre-processing phase of the ML pipeline to the target data set (Examiner’s note: Under its broadest reasonable interpretation, the ‘or’ identified in this claim limitation is interpreted as an exclusive ‘or’ defining a list of alternatives, indicating that only one of the identified limitations (“apply the ML pipeline to the target data set”, “change a criterion used to select the target data set”, or “both”) is required for the claimed invention. In the context of independent claim 1, this limitation broadly recites applying remediation actions involving data pre-processing to data that is present in the target data set. As indicated earlier, Baylor teaches a data validation operation (corresponding to one of “a plurality of phases”) which validates the data properties of specific training and serving datasets, flags any deviations as potential anomalies, and provides actionable suggestions to fix the flagged anomaly. Baylor further teaches an existing data property can be encoded in a schema that describes the expected domain or range of a feature, and hence, deviations in the input data that exist outside of the expected domain or range of a feature represent an anomaly that corresponds to “one or more failure metrics at one or more of the plurality of phases of the ML pipeline”. Baylor further identifies an actionable suggestion to fix the identified deviation, which includes the expansion of the existing feature schema to include additional domain values (e.g., include the domain value “EDUCATION” in addition to the existing values “GAMES” and “BUSINESSES” in the feature ‘category’), such that any subsequent input data containing this new feature value will be correctly validated and incorporated as a normal state for a training or serving dataset.
This action of expanding a feature schema to incorporate additional values as part of the training data set to update the normal state of the data corresponds to applying remediation actions involving data pre-processing to data that is present in the target data set (Baylor p.1389 Figure 1; p.1390-1391 Figure 2 and Section 3.3 Data Validation: “… To perform validation, the component relies on a schema that provides a versioned, succinct description of the expected properties of the data. The following are examples of the properties … The expected domain of a feature, i.e., the small universe of values for a string feature, or range for an integer feature … Using the schema, the component can validate the properties of specific (training and serving) datasets, flag any deviations from the schema as potential anomalies, and in most cases, provide actionable suggestions to fix the anomaly. These actions may include … for expected deviations in the data, updating the schema itself to match the data … In some cases the anomalies correspond to a natural evolution of the data, and the appropriate action is to change the schema (rather than fix the data) … our component generates for each anomaly a corresponding schema change that can bring the schema up-to-data (essentially, make the anomaly part of the normal state of the data)…”).), 
changing a criterion used to select the target data set, or both.
Regarding amended Claim 22, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches
(Currently amended) The system of claim 1, wherein the operations comprise:
determining that applying the ML model building phase of the ML pipeline would result in a deficient ML model (Examiner’s note: Under its broadest reasonable interpretation, the term “deficient” as defined in the Merriam-Webster dictionary refers to something that is lacking in some necessary quality, or not up to a normal standard, and hence this limitation broadly recites a determination that the generated model (produced by applying the ML model building phase) would result in a failed/unsuccessful model. As indicated earlier, Ghanta teaches the ML management apparatus containing a primary training module that is configured to train a first machine learning model using a training data set (Ghanta Figure 3 and [0075]-[0077]). Ghanta further teaches applying this training data set to generate the first machine learning model, where this first machine learning model is then validated using a validation data set, and generating an error data set resulting from the output of the validation, where the error data set is used in a secondary training module configured to train a second machine learning model to predict a suitability of the first machine learning model, and where the suitability of the first machine learning model is based on a score describing the efficacy, accuracy, or effectiveness of the predictions of the first model. Ghanta additionally teaches this score is further analyzed in a secondary validation module to produce suitability metrics to analyze the suitability of the second machine learning algorithm, where an analysis module further uses these suitability metrics produced from both the first and second machine learning models to determine whether these suitability metrics satisfy thresholds to eventually determine whether the first machine learning model is a good fit for generating accurate predictions for the inference set.
Hence, this analysis to determine whether these thresholds are being satisfied represents a suitability analysis for the first machine learning model, and as such, the determination where these thresholds are not being satisfied represents an unsuitability condition concluding that a first machine learning model lacks efficacy, accuracy, or effectiveness (thus representing a deficient ML model) (Ghanta [0078]-[0079]; [0085]: “… the secondary training module 306 is configured to train a second machine learning model … using the error data set described above. The second machine learning algorithm may be configured to predict a suitability of the first machine learning algorithm/model … the suitability may comprise a value such as a health score that describes the efficacy, accuracy, effectiveness, … of the predictions that the first machine learning algorithm/model generates for the inference data set.”; [0089]; Table 1 and [0092]-[0093]; and [0098]-[0099]: “… the analysis module 310 may determine whether the suitability score based on the metrics/health scores in Table 1 satisfies a threshold to determine (1) whether the second machine learning algorithm/module is a good fit for validating the predictive performance of the first machine learning algorithm/model, and if so (2) whether the first machine learning algorithm/model is a good fit for generating accurate predictions for the inference data set. … the ML management apparatus 104 can predict, in real time, the efficacy of a trained model on generating predictions for an inference data set … and if it determines that the trained model is not generating accurate predictions, the ML management apparatus can react accordingly as described below with reference to the action module 312.”).); and 
terminating the ML pipeline, in response to determining that applying the ML model building phase of the ML pipeline would result in a deficient ML model (Examiner’s note: Under its broadest reasonable interpretation, this limitation broadly recites terminating the ML pipeline after determining that the generated model (produced by applying the ML model building phase) would result in a failed/unsuccessful model. As indicated earlier, Ghanta teaches an analysis module that uses the identified suitability metrics (e.g., health scores, confidence metrics, data deviation scores, accuracy metrics, precision metrics, etc.) to perform additional analysis to determine whether these metrics satisfy thresholds for generating accurate predictions (hence determining whether the first machine learning model is a good fit or a bad fit, where a bad fit indicates a deficiency). Ghanta further teaches that the analysis module triggers an action module to further perform actions based on the result of the threshold comparison that include changing to a different training set, switching to different machine learning models for training, and recommending different learning algorithms for inference. A person having ordinary skill in the art would understand that these actions of changing to different training sets, switching to different machine learning models for training, and recommending different learning algorithms would require the termination of the existing pipeline that is using the previously defined training set or machine learning model/algorithm in order to apply the different training set and/or model/algorithm selected by the action module (Ghanta Figures 4-5, and [0100]-[0102], [0109]).).
Regarding previously presented Claim 23,
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches
(Previously presented) The system of claim 22, wherein determining the one or more failure metrics 
occurs at one or more sub-phases of the ML pipeline (Examiner’s note: Under its broadest reasonable interpretation, the term “sub-phase” broadly recites any decision point or step performed in the ML pipeline, including decisions or steps within defined ML pipeline phases. As indicated earlier, Ghanta teaches an analysis module that uses the identified suitability metrics (e.g., health scores, confidence metrics, data deviation values, accuracy metrics, precision metrics, etc.) produced from both the first and second machine learning models to perform additional analysis to determine whether these suitability metrics satisfy thresholds for determining whether the first machine learning model is a good fit (or deficient) for generating accurate predictions, such that these identified suitability metrics are also a representation of failure metrics, according to whether or not their comparison results against thresholds meet or fall below the threshold amount, with the generation of these scores/metrics done within the primary and secondary validation modules, and the comparison of these scores/metrics against thresholds and resulting actions done in an analysis and action module following the secondary validation module (Ghanta [0078]-[0079]; [0085]-[0087]; Table 1 and [0092]-[0093]; and [0098]-[0099]).).
Regarding new Claim 26, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches
(New) The system of claim 1, wherein determining the one or more failure metrics comprises determining a maximum-entry-number threshold based on a total amount of memory of a host system of the ML pipeline, an amount of free memory of the host system of the ML pipeline, a total number of processor cores of the host system of the ML pipeline, a number of free processor cores of the host system of the ML pipeline, or any combination thereof (Examiner’s note: Under its broadest reasonable interpretation, this limitation broadly recites determining failure metrics related to computer resources such as memory or processor cores. Dirac ‘430 teaches a machine learning service (“host system of the ML pipeline”) performing validation checking related to machine learning system resources, where the goal of performing this type of checking is to inform as soon as possible when a particular request is invalid, and to avoid wasting resources on requests that are unlikely to succeed, such that these types of validation checks also represent failure metrics. Dirac ‘430 teaches a job scheduler performing these validations by applying different quotas or limits on resources, where these quotas are maximum values that are set for each of several different resource types including CPUs/cores, memory, and disk, and verifying that the respective quotas have not been exhausted, where the exhausting of a quota corresponds to a failure condition. 
Thus, these quotas/resource limits associated with resource types such as CPUs/cores and memory correspond to maximum entry number limits based on a respective resource type, with the verification process that determines whether the quotas have been exhausted or not corresponding to a determination of a maximum-entry-number threshold based on a total amount of memory as well as a total number of processor cores of the host system of the ML pipeline (Dirac ‘430 [0120], [0130]: “… the MLS may perform validations at various other stages in some embodiments, e.g., with the general goals of (a) informing clients as soon as possible when a particular request is found to be invalid, and (b) avoiding wastage of MLS resources on requests that are unlikely to succeed. As shown in element 952 of FIG. 9b, one or more types of validation checks may be performed on the job Jk identified in element 951 … each client may have a quota or limit on the resources that can be applied to their jobs … respective quotas may be set for each of several different resource types---e.g., CPUs/cores, memory, disk, network bandwidth and the like … the job scheduler may be responsible for verifying that the quota or quotas of the client on whose behalf the job Jk is to be run have not been exhausted. If a quota has been exhausted, the job's execution may be deferred until at least some of the client's resources are released (e.g., as a result of a completion of other jobs performed on the same client's behalf). Such constraint limits may be helpful in limiting the ability of any given client to monopolize shared MLS resources, and also in minimizing the negative consequences of inadvertent errors or malicious code.”).).
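For illustration only, the quota verification described in the cited passage of Dirac ‘430 might be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the function names `exhausted_quotas` and `schedule_job`, the dictionary layout, and the reading of an "exhausted" quota as "the request would exceed the quota" are all hypothetical.

```python
def exhausted_quotas(quotas, in_use, requested):
    """Return the resource types whose quota the requested job would exceed.

    quotas, in_use, and requested map a resource type (e.g. "cores",
    "memory_gb") to an amount. All names here are hypothetical.
    """
    return sorted(
        resource
        for resource, amount in requested.items()
        if in_use.get(resource, 0) + amount > quotas.get(resource, 0)
    )


def schedule_job(quotas, in_use, requested):
    """Defer the job if any quota would be exceeded, otherwise run it."""
    blocked = exhausted_quotas(quotas, in_use, requested)
    return ("deferred", blocked) if blocked else ("run", [])


quotas = {"cores": 8, "memory_gb": 32}
in_use = {"cores": 6, "memory_gb": 8}
print(schedule_job(quotas, in_use, {"cores": 4, "memory_gb": 8}))
print(schedule_job(quotas, in_use, {"cores": 2, "memory_gb": 8}))
```

In this sketch, a deferred job would be retried once other jobs release resources, which mirrors the deferral behavior the quoted passage describes.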
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over 
Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred as Dirac '430], in further view of Bianco et al., A Practical and Effective Sampling Selection Strategy for Large Scale Deduplication, September 2015 [hereafter referred as Bianco], in even further view of Baylor et al., TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, KDD’17, August 13-17 2017 [hereafter referred as Baylor] as applied to Claim 1; in even further view of Maag et al., U.S. PGPUB 2017/0220403, published 8/3/2017 [hereafter referred as Maag].
Regarding amended Claim 3, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor as applied to Claim 1 teaches
(Currently amended) The system of claim 1, 
… wherein the operations comprise applying the data pre-processing phase of the ML pipeline (Examiner’s note: As indicated earlier, Dirac ‘430 teaches a machine learning service that performs feature processing to generate a processed (and pruned) training set representing a conditioned data set, where this conditioned data set is used to execute the scheduled model training/re-training jobs, such that this flow of performing feature processing to generate a conditioned data set that is used for the scheduled model training/re-training jobs represents a flow of applying the data pre-processing phase of the ML pipeline to a target data set before applying the ML model building phase (Dirac ‘430 Figure 42, elements 4210, 4220, 4227, 4080, 4255, 4261 and [0239]-[0240]).) …
While Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches feature processing and data pre-processing applied to target data sets (represented by training, validation, test, error, inference data sets) to produce conditioned data sets, Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor does not explicitly teach
wherein the conditioned data set is arranged in columns and rows, wherein the columns define fields of the conditioned data set and the rows define entries in the conditioned data set, and
… to determine that the one or more failure metrics based on, for a particular column of the columns of the conditioned data set, at least one of 
(i) the particular column being empty; (ii) more than a threshold amount of the entries in the particular column being empty; (iii) fewer than the threshold amount of the entries in the particular column not being empty; (iv) the particular column containing a single unique value; or (v) values of the particular column being skewed beyond the threshold amount of the entries in the particular column.  
Maag teaches
wherein the conditioned data set is arranged in columns and rows, wherein the columns define fields of the conditioned data set and the rows define entries in the conditioned data set (Examiner’s note: Under its broadest reasonable interpretation, the term “conditioned data set” broadly recites a data set based on an identified “target” data set (where an identified “target” data set broadly recites any identified data set that is used in a machine learning system). Maag teaches data pipelines performing data transformations on data sets obtained from data sources to produce transformed data sets (“conditioned data set”), where these data transformations involve mapping data elements in a source data format to a target data format in a data pipeline before providing the transformed data to a corresponding data sink, with these data sources being any source of data storing one or more datasets, and where the datasets are collections of data that are sent to a data sink that further feed into a machine learning model such as a classifier for training (Maag [0175], [0179]-[0180]). Maag additionally teaches a build service in the pipeline system performing data transformation actions including validation actions on transformed data, where validation errors (faults) are captured on the data, including those faults related to NULL values and missing data (representing empty data). Maag additionally teaches the data in the data source are arranged as a relational database in tabular format that provides rows of data, where rows of data in a relational database represent a record or entry, and columns corresponding to data fields (Maag [0086]; [0089]-[0090]; [0099]: “… Each of the data sources 320 may provide different data, possibly even in different data formats. 
… one data source 320 (e.g., 320A) may be a relational database server that provides rows of data … ”; [0104]: “… As data moves through the data pipeline system 310 from the data sources 320 to the data sinks 330, a number of data transformation steps may be performed on the data to prepare the data obtained from the data sources 320 for consumption …”; [0113]: “The build service 317 leverages the transaction service 318 to provide immutable and/or versioned transformed datasets. A transformed dataset may be defined as a dataset that is generated (built) by applying a transformation program … to one or more datasets …”; [0119]: “… the build service 317 an/or transaction service 318 include logic that perform one or more validation checks on the transformed data. If a fault is detected by either service, that service stores metadata in association with the affected dataset that includes information such as, the time the fault occurred, the dataset(s) involved in the fault, data specifically related to the fault … missing data during transformation … presence of NULL values where none should be …”; and [0164]: “… relational database schemas typically define tables of data, where each table is defined to include a number of columns (or fields), each tied to a specific type of data, such as strings, integers, doubles, floats, bytes, and so forth.”).), and 
… to determine that the one or more failure metrics based on, for a particular column of the columns of the conditioned data set, at least one of 
(i) the particular column being empty (Examiner’s note: Under its broadest reasonable interpretation, the phrase “at least one of … or” is interpreted to define a list of alternatives, indicating that a minimum of one of the listed options is required to be identified for the claimed invention. As indicated earlier, Maag teaches generating transformed datasets from datasets provided from a data source, and the build service in the pipeline system performing data validation actions on transformed data (Maag [0086]; [0089]-[0090]; [0099]; [0104]; [0113], [0119]). Maag further teaches performing these validation tests based on a schema associated with the identified data set, where these tests produce fault/warning indications for cases where non-NULL defined columns contain NULL values (with the presence of NULL values in a column indicating an empty column) (Maag [0164], [0167]: “Configuration points for schema validation tests may include the schema that should be compared against the pre-transformation and/or post-transformation data, the pipeline and/or data sets from which to collect the data, how often the tests 700 should be performed, criteria for determining whether a violation is a "fault" or "potential fault" (or "warning"), valid values for certain columns/fields (e.g. ensuring columns which are defined as non-NULL do not contain NULL values, that non-negative columns do not contain numbers which are negative, etc.) and so forth.”).); 
(ii) more than a threshold amount of the entries in the particular column being empty; 
(iii) fewer than the threshold amount of the entries in the particular column not being empty; 
(iv) the particular column containing a single unique value; or 
(v) values of the particular column being skewed beyond the threshold amount of the entries in the particular column.  
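For illustration only, the five column-level alternatives (i)-(v) recited above might be checked along the following lines. This is a minimal sketch under stated assumptions and is not taken from the claims or from Maag: the helper name `column_failure_reasons`, the default thresholds, the treatment of `None` and empty strings as "empty", and the reading of "skewed" as "one value accounts for more than a threshold share of the non-empty entries" are all hypothetical. Alternative (iii) is the complement of (ii) and is covered by the same fraction test.

```python
from collections import Counter


def column_failure_reasons(values, empty_threshold=0.5, skew_threshold=0.9):
    """Return the failure reasons found for one column (hypothetical helper)."""
    total = len(values)
    non_empty = [v for v in values if v is not None and v != ""]

    # (i) the column has no usable entries at all
    if total == 0 or not non_empty:
        return ["column is empty"]

    reasons = []

    # (ii)/(iii) too large a fraction of the entries is empty
    if (total - len(non_empty)) / total > empty_threshold:
        reasons.append("too many empty entries")

    counts = Counter(non_empty)
    if len(counts) == 1:
        # (iv) every non-empty entry holds the same value
        reasons.append("single unique value")
    elif counts.most_common(1)[0][1] / len(non_empty) > skew_threshold:
        # (v) one value dominates the column beyond the threshold
        reasons.append("values skewed")

    return reasons


print(column_failure_reasons([None, "", None]))
print(column_failure_reasons([1, None, None, None]))
print(column_failure_reasons([0] * 19 + [1]))
```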
Both Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and Maag are analogous art since both teach data pipelines for machine learning systems.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the feature processing performed in the ML pipeline taught in Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and additionally incorporate a data transformation phase taught in Maag as a way to perform data transformation and validation on an identified data set as it moves through a ML pipeline. The motivation to combine is taught in Maag, since a data transformation phase provides functionality to not only transform input data into a particular desired format, but also the functionality to validate the transformed data before it is further processed downstream by another entity (e.g., by a model building phase), ensuring that it meets certain schema requirements, thus improving the quality of the target data set for use in a machine learning pipeline to build more accurate machine learning models (Maag paragraph [0164]: “Schema validation is the process of inspecting data to ensure that the data actually adheres to the format defined by the schema. Schemas in relational database may also define other constructs as well, such as relationships, views, indexes, packages, procedures, functions, queues, triggers, types, sequences, and so forth. However, schemas other than relational database schemas also exist, such as XML schemas. In some embodiments, the schema(s) indicating the format of the data stored by the data sources 320 and the schema representing the data format expected by the data sinks 330 are used to implement the transformations performed by the pipelines 410. For instance, the logic defined by each pipeline may represent the steps or algorithm required to transform data from the data source format into the data sink format. If the transformation is performed properly, the data after transformation should be able to pass validation with respect to the schema of the data sink. 
However, if errors occur during the transformation, the validation might fail if the transformed data is improperly formatted.”).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over 
Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred as Dirac '430], in further view of Bianco et al., A Practical and Effective Sampling Selection Strategy for Large Scale Deduplication, September 2015 [hereafter referred as Bianco], in even further view of Baylor et al., TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, KDD’17, August 13-17 2017 [hereafter referred as Baylor], in even further view of Maag et al., U.S. PGPUB 2017/0220403, published 8/3/2017 [hereafter referred as Maag] as applied to Claim 3; in even further view of Dirac et al., U.S. PGPUB 2015/0379424, published 12/31/2015 [hereafter referred as Dirac '424].
Regarding previously presented Claim 4, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor, in even further view of Maag as applied to Claim 3 teaches
(Previously presented) The system of claim 3.
However, Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor, in even further view of Maag does not teach
… wherein the particular column contains at least one of 
(i) word vectors that describe, in a semantically-encoded vector space, a meaning of respective words, or 
(ii) paragraph vectors that describe, in a semantically-encoded vector space, a meaning of respective multi-word samples of text.  
Dirac ‘424 teaches
wherein the particular column contains at least one of 
(i) word vectors that describe, in a semantically-encoded vector space, a meaning of respective words (Examiner’s note: Under its broadest reasonable interpretation, the phrase “at least one of … or” is interpreted to define a list of alternatives, indicating that a minimum of one of the listed options is required to be identified for the claimed invention. Dirac ‘424 teaches performing data pre-processing operations on input data (“target data set”) containing data records containing variables of data types such as text, and using natural language processing to perform feature processing (Dirac ‘424 [0035]: “Some machine learning workflows, which may correspond to a sequence of API requests from a client 164, may include the extraction and cleansing of input data records from raw data repositories 130 (e.g., repositories indicated in data source definitions 150) by input record handlers 160 of the MLS, as indicated by arrow 114. … The input data may comprise data records that include variables of any of a variety of data types, such as, for example text, … The output produced by the input record handlers may be fed to feature processors 162 (as indicated by arrow 115), where a set of transformation operations may be performed 162 in accordance with recipes 152 using another set of resources from pool 185. Any of a variety of feature processing approaches may be used depending on the problem domain: e.g., … natural language processing …”). 
Dirac ‘424 further teaches performing the text data transformations by determining the root words to be included in an n-gram for use in a machine learning algorithm, where this transformation process represents transforming “word vectors that describe, in a semantically-encoded vector space, the meaning of respective words” (Dirac ‘424 [0079]: “… a recipe language defined by the MLS enables users to easily and concisely specify transformations to be performed on specified sets of data records to prepare the records for use for model training and prediction. … In at least one embodiment, a pipeline of successive transformations to be performed starting with a given input data set may be indicated within a single recipe. In one embodiment, the MLS may perform parameter optimization for one or more recipes---e.g., the MLS may automatically vary such transformation properties as the sizes of quantile bins or the number of root words to be included in an n-gram in an attempt to identify a more useful set of independent variables to be used for a particular machine learning algorithm.”).), or 
(ii) paragraph vectors that describe, in a semantically-encoded vector space, a meaning of respective multi-word samples of text.  
Both Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor, in even further view of Maag and Dirac ‘424 are analogous art since both teach data pre-processing phases in machine learning systems.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the data pre-processing phase of Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor, in even further view of Maag and incorporate the text pre-processing steps of Dirac ‘424 as a way to handle text pre-processing in an input data set. The motivation to combine is taught in Dirac ‘424, as text data transformations that convert the data into root word vector representations/n-grams allow the machine learning system to perform automated parameter explorations for text data, thus improving upon the functionality of the machine learning system (Dirac ‘424 [0093]: “For many types of feature processing transformation operations, such as creating quantile bins for numeric data attributes, generating ngrams, or removing sparse or infrequent words from documents being analyzed, parameters may typically have to be selected, such as the sizes/boundaries of the bins, the lengths of the ngrams, the removal criteria for sparse words, and so on. The values of such parameters (which may also be referred to as hyper-parameters in some environments) may have a significant impact on the predictions that are made using the recipe outputs. Instead of requiring MLS users to manually submit requests for each parameter setting or each combination of parameter settings, in some embodiments the MLS may support automated parameter exploration.”; and [0094]: “Automated parameter exploration may also be used for selection dimensionality values for a vector representation of a text document (e.g., in accordance with the Latent Dirichlet Allocation (LDA) technique) or other natural language processing techniques. In some cases, the client may also indicate the criteria to be used to terminate exploration of the parameter value space, e.g., to arrive at acceptable parameter values. 
In at least some embodiments, the client may be given the option of letting the MLS decide the acceptance criteria to be used-such an option may be particularly useful for non-expert users. In one implementation, the client may indicate limits on resources or execution time for parameter exploration. In at least one implementation, the default setting for an auto-tune setting for at least some output transformations may be "true", e.g., a client may have to explicitly indicate that auto-tuning is not to be performed in order to prevent the MLS from exploring the parameter space for the transformations.”).
Claims 6 and 24-25 are rejected under 35 U.S.C. 103 as being unpatentable over 
Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred as Dirac '430], in further view of Bianco et al., A Practical and Effective Sampling Selection Strategy for Large Scale Deduplication, September 2015 [hereafter referred as Bianco], in even further view of Baylor et al., TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, KDD’17, August 13-17 2017 [hereafter referred as Baylor] as applied to Claims 1 and 22; in even further view of Schelter et al., Automating Large-Scale Data Quality Verification, 2018 [hereafter referred as Schelter].
Regarding previously presented Claim 6, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor as applied to Claim 1 teaches
(Previously presented) The system of claim 1 …
… wherein applying the ML model building phase comprises generating an ML model to predict a particular column of the conditioned data set (Examiner’s note: Under its broadest reasonable interpretation, the term “conditioned data set” broadly recites a data set that is based on an identified “target” data set (where an identified “target” data set broadly recites any identified data set that is used in a machine learning system). Ghanta teaches training pipelines and inference pipelines that train a machine learning model on a training data set and perform inferences on an inference data set, where the training data set contains three columns of feature data (Age, Sex, Height), and the inference data set contains only two columns of data (Age, Height), such that the inference pipeline infers an output corresponding to the information from the third column (Sex; M/F) (Ghanta Figure 2A, elements 200, 206a-c and [0040]: “In certain embodiments of machine learning systems 200, there is a training phase, for generating the machine learning model, and an inference phase for analyzing an inference data set using the machine learning model. The output from the inference phase may be one or more predictive "labels" determined as a function of one or more features of the inference data set. For example, if the training data set comprises three columns of feature data-Age, Sex, and Height-that are used to train the machine learning model, and the inference data comprises two columns of feature data-Age and Height-the output from an inference pipeline 206 using the machine learning model may be a "label" describing the predicted Sex (M/F) based on the given inference data.”).) …
While Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches feature processing and data pre-processing applied to target data sets (represented by training, validation, test, error, inference data sets) to produce conditioned data sets, Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor does not explicitly teach the following limitations, which are taught by Schelter:
… wherein the conditioned data set is arranged in columns and rows, wherein the columns define fields of the conditioned data set and the rows define entries in the conditioned data set (Examiner’s note: Under its broadest reasonable interpretation, the term “conditioned data set” broadly recites a data set that is based on an identified “target” data set (where an identified “target” data set broadly recites any identified data set that is used in a machine learning system). Schelter teaches performing data quality verification tests on datasets that are used in determining predictions for a machine learning model (Schelter p.1787 col.2 2nd paragraph), where the data can be arranged in tables in a relational database system, and where the verification tests are based on computing and applying metrics to identify data constraints on a number of records (“hasSize”, “satisfies”, “satisfiesIf”) as well as applying various other metrics to identify data constraints on a set of numerical or categorical columns (“isComplete”, “hasCompleteness”, “isInRange”, “isUnique”, “hasUniqueness”, “hasDistinctness”) (Schelter p.1783 col.2 1st paragraph: “… errors in the data might cause unexpected errors in downstream systems that consume the data. In many cases these errors may be hard to detect, e.g., they might cause regressions in the prediction quality of a machine learning model, which makes assumptions about the shape of particular features computed from the input data [34].”; p.1782 col.1-col.2 Section 2. Data Quality Dimensions: “… The quality of data can refer to the extension of the data (i.e., data values), or to the intension of the data (i.e., the schema) [4] … Completeness refers to the degree to which an entity includes data required to describe a real-world object. In tables in relational database system, completeness can be measured by the presence of null values, which is usually interpreted as a missing value. 
… Consistency is defined as the degree to which a set of semantic rules are violated. Intra-relation constraints define a range of admissible values, such as a specific data type, an interval for a numerical column, or a set of values for a categorical column. … Inter-relation constraints may involve columns from multiple tables. … Accuracy is the correctness of the data and can be measured in two dimensions: syntactic and semantic. …”; p.1783 Table 1 Constraints available for composing user-defined data quality checks, refer to “isComplete”, “hasCompleteness”, “isUnique”, “hasUniqueness”, “hasDistinctness”, “isInRange”, “satisfies”, “satisfiesIf”, “hasSize”; and p.1784 Table 2 Computable metrics to base constraints on).) … 
… determining that values of the particular column are skewed beyond a threshold amount (Examiner’s note: Under its broadest reasonable interpretation, the phrase “values … skewed beyond a threshold amount” broadly recites values that are outside a specified interval range. As indicated earlier, Schelter teaches performing data quality verification tests on datasets, where the verification tests can involve applying constraints (“isInRange”) to a range of admissible values analyzed over an interval for a numerical column, which requires the use of a corresponding computed “ValueRange” metric to identify the correct values within a range or outside the range, such that the identified values outside a range represent values skewed beyond a threshold amount (Schelter p.1782 col.1-col.2 Section 2. Data Quality Dimensions; p.1783 Table 1 Constraints available for composing user-defined data quality checks, refer to “isInRange”; p.1784 Table 2 Computable metrics to base constraints on; p.1786 col.1 Section 3.3 Constraint Suggestion 1st paragraph: “The benefits of our system to users heavily depend on the richness and specificity of the checks and constraints, which the users define and for which our system will regular compute data quality metrics. … we provide machinery to automatically suggest constraints and identify data types for datasets … Such suggestion functionality can then be integrated into ingestion pipelines and can also be used during exploratory data analysis.”; and p.1786 col.2 2nd paragraph, 6th bullet: “ … If the number of distinct values in a column is below a particular threshold, we interpret the column as categorical and suggest an isInRange constraint that checks whether future values are contained in the set of already observed values.”).).
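For illustration only, the constraint-suggestion rule quoted above from Schelter (a column with few distinct values is treated as categorical, and future values are checked against the set of already observed values) might be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Schelter's library: the function names `suggest_in_range` and `out_of_range` and the default `max_distinct` value are hypothetical.

```python
def suggest_in_range(sample, max_distinct=10):
    """Suggest an allowed-value set if the column looks categorical.

    Returns a frozenset of the observed non-null values when their number is
    below max_distinct (a hypothetical threshold), otherwise None.
    """
    observed = {v for v in sample if v is not None}
    if 0 < len(observed) < max_distinct:
        return frozenset(observed)
    return None


def out_of_range(values, allowed):
    """Return the non-null values that fall outside the allowed set."""
    return [v for v in values if v is not None and v not in allowed]


allowed = suggest_in_range(["M", "F", "M", None])
print(allowed)
print(out_of_range(["M", "X", None, "F"], allowed))
```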
Both Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and Schelter are analogous art since both teach data/feature processing on training data sets.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take data/feature processing steps taught in Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and incorporate the data verification tests taught in Schelter as a way to detect and identify data quality issues (missing values, out-of-range values, etc.) that would impact the performance of a machine learning model. The motivation to combine is taught in Schelter, as a way to provide data quality checks to identify these data quality issues that would potentially result in failures when the data containing these issues are ingested by downstream processes (such as a machine learning model) within an automated machine learning system. Hence, providing a way to generate an alert of these data quality issues when they are detected through automated data verification tests on the data sets allows a machine learning system to take further action to correct these issues on large datasets, thus improving the overall usability, scalability, and robustness of the machine learning system (Schelter p.1781 col.2 (Section 1 Introduction): “… there is a trend across different industries towards more automation of business processes with machine learning (ML) techniques. These techniques are often highly sensitive on input data, as the deployed models rely on strong assumptions about the shape of their inputs [43], and subtle errors introduced by changes in data can be very hard to detect [34]. At the same time, there is ample evidence that the volume of data available for training is often a decisive factor for a model’s performance [17], [44] … Many such data sources do not support integrity con[s]traints and data quality checks … Such issues potentially result in failures of the ingestion process. 
Even if the ingestion process still works, the errors in the data might cause unexpected errors in downstream systems that consume the data. … We therefore postulate that there is a need for increased automation of data validation. We present a system that we built for this task and that meets the demands of production use cases. …”; p.1784 col.2 5th paragraph; and p.1791 col.2 last paragraph – p.1792 col.1 1st paragraph (Section 6 Learnings): “… users highlighted the fact that our data quality library runs on Spark, which they experience as a fast, scalable way to do data processing … Our system helped reduce manual and ad-hoc analysis on their data … Instead such check can now be run in an automated way as part of ingestion pipelines. Additionally, data producers can leverage our system to halt their data publishing pipelines when they encounter cases of data anomalies. By that, they can ensure that downstream data processing, which often includes training ML models, is only working with vetted data.”).  
Regarding previously presented Claim 24,
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor as applied to Claim 22 teaches
(Previously presented) The system of claim 22 …
… determining the one or more failure metrics (Examiner’s note: As indicated earlier, Ghanta teaches different pipeline machine learning operations, representing different stages (phases) of a machine learning service, where one of the stages includes a validation stage. Ghanta teaches secondary training and validation modules producing suitability metrics based on the predicted output generated by a first machine learning model, where this generated output is based on a received inference set. These suitability metrics include statistical metrics, confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability. Ghanta further teaches an analysis module using these suitability metrics to determine whether the output produced by the first machine learning model satisfies thresholds that indicate whether the model is a good fit for generating accurate predictions. This analysis represents a suitability analysis for validating the performance of a first machine learning model. Thus, the determination made during a validation phase when these thresholds are not being satisfied represents an unsuitability condition (concluding that the first machine learning model lacks efficacy, accuracy, or effectiveness), with these unsuitability conditions representing a failed or deficient condition of the model. Hence, this suitability analysis that involves analyzing and validating the output produced from a first machine learning model based on a set of suitability metrics represents a determination process involving one or more failure metrics at one or more of the plurality of phases in the ML pipeline (Ghanta [0078]-[0081]; [0085]-[0087]; [0089]; Table 1 and [0092]-[0093]; and [0098]-[0099]).); …
While Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches the one or more failure metrics include data deviation metrics, Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor does not explicitly teach
… determining that the conditioned data set comprises more than a threshold number of empty values in a particular column, in a particular field, or both.
Schelter teaches
… determining that the conditioned data set comprises more than a threshold number of empty values in a particular column (Examiner’s note: Under its broadest reasonable interpretation, the ‘or’ identified in this claim limitation is interpreted as an exclusive ‘or’ defining a list of alternatives, indicating that only one of the identified limitations (“in a particular column”, “in a particular field”, or “both”) is required for the claimed invention. Furthermore, under its broadest reasonable interpretation, the term “conditioned data set” broadly recites a data set that is based on an identified “target” data set (where an identified “target” data set broadly recites any identified data set that is used in a machine learning system). As indicated earlier, Schelter teaches performing data quality verification tests on datasets, where the verification tests can involve applying constraints (“isComplete”, indicating that a particular column contains all non-NULL values (no empty/missing values), or “hasCompleteness”, indicating that some of the values in a column are NULL based on a custom validation from a user), both of which require the use of a corresponding computed “Completeness” metric to identify whether to apply an “isComplete” or “hasCompleteness” constraint (where the “hasCompleteness” constraint further identifies an associated lower bound start value of the interval for the completeness of the data), such that this determination of “isComplete” and “hasCompleteness” represents a determination that a data set comprises more than a threshold number of empty values in a particular column (Schelter p.1782 col.1-col.2 Section 2 Data Quality Dimensions; p.1783 Table 1 Constraints available for composing user-defined data quality checks, refer to “isComplete” and “hasCompleteness”; p.1784 Table 2 Computable metrics to base constraints on; p.1786 col.1 Section 3.3 Constraint Suggestion 1st paragraph: “The benefits of our system to users heavily depend on the richness and specificity of the checks and constraints, which the users define and for which our system will regular compute data quality metrics. … we provide machinery to automatically suggest constraints and identify data types for datasets … Such suggestion functionality can then be integrated into ingestion pipelines and can also be used during exploratory data analysis.”; and p.1786 col.2 2nd paragraph, 1st-2nd bullets: “ … If the column is complete in the sample at hand, we suggest an isComplete (not null) constraint. … If a column is incomplete in the sample at hand, we suggest a hasCompleteness constraint. We model the fact whether a value is present or not as a Bernoulli-distributed random variable, estimate a confidence interval for the corresponding probability, and return the start value of the interval as lower bound for the completeness in the data.”).), in a particular field, or both.
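For illustration only, the completeness-based determination described in the Examiner’s note above can be sketched as follows. This is a minimal sketch and is not code from Schelter or from the record: the function names (count_empty, completeness, is_complete, has_completeness, exceeds_empty_threshold), the use of None to stand for an empty value, and the sample column are all assumptions introduced here.

```python
from typing import Optional, Sequence


def count_empty(column: Sequence[Optional[object]]) -> int:
    """Count the empty values in a column (None stands in for NULL here)."""
    return sum(1 for value in column if value is None)


def completeness(column: Sequence[Optional[object]]) -> float:
    """Fraction of non-empty values; a zero-length column counts as complete (assumption)."""
    if len(column) == 0:
        return 1.0
    return (len(column) - count_empty(column)) / len(column)


def is_complete(column: Sequence[Optional[object]]) -> bool:
    """Hypothetical analogue of an 'isComplete' style check: no empty values at all."""
    return count_empty(column) == 0


def has_completeness(column: Sequence[Optional[object]], lower_bound: float) -> bool:
    """Hypothetical analogue of a 'hasCompleteness' style check against a lower bound."""
    return completeness(column) >= lower_bound


def exceeds_empty_threshold(column: Sequence[Optional[object]], max_empty: int) -> bool:
    """True when the column holds more than a threshold number of empty values."""
    return count_empty(column) > max_empty


# Hypothetical sample column with two empty values out of five.
age_column = [34, None, 29, None, 41]
```

Under these assumptions, a column that fails a lower-bound completeness check is, equivalently, a column whose count of empty values exceeds a corresponding threshold.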
	Both Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and Schelter are analogous art since both teach data/feature processing on training data sets.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take data/feature processing steps taught in Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and incorporate the data verification tests taught in Schelter as a way to detect and identify data quality issues (missing values, out-of-range values, etc.) that would impact the performance of a machine learning model. The motivation to combine is taught in Schelter, as provided in the prior art claim mapping of Claim 6 recited above.
Regarding amended Claim 25, 
Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor as applied to Claim 22 teaches
 (Currently amended) The system of claim 22 … 
… wherein the operations comprise providing an indication of inadequacy of the conditioned data set (Examiner’s note: Under its broadest reasonable interpretation, the term “inadequacy” as defined in the Merriam-Webster dictionary refers to a state or condition of not being adequate, not enough, or not good enough, and hence this limitation broadly recites providing a notification associated with the inadequacy of a conditioned data set. As indicated earlier, Ghanta teaches the analysis module determining suitability of the first machine learning model for generating accurate predictions, where the determination of unsuitability (e.g., the model is not generating accurate predictions) identifies a condition that the first machine learning model is a deficient ML model. Ghanta further teaches the analysis module, in response to determining that the first machine learning model is deficient based on not satisfying predetermined suitability thresholds, can request the action module to perform various actions associated with the first machine learning algorithm, where these additional actions include retraining the first machine learning model using a different training data set, switching the first machine learning model to a different machine learning model for training, or recommending a different first machine learning algorithm to be used to analyze the inference data set, and where the request from the analysis module to the action module to perform any of the above actions represents an indication of inadequacy with regards to the first machine learning model (Ghanta [0098]; and Figure 4, element 410 and [0100]-[0102]: “… the action module 312 is configured to trigger an action associated with the first machine learning algorithm, dynamically in real time, in response to the predicted suitability of the first machine learning algorithm/model for analyzing the inference data set not satisfying a predetermined suitability threshold. … the action comprises retraining the first machine learning model … using a different training data set … the action comprises switching the first machine learning model to a different machine learning model trained on different training data … the action comprises recommending one or more different first machine learning algorithms for analyzing the inference data set …”; and Figure 5, elements 514, 516, 518, 520).) …
… the indication of inadequacy of the conditioned data set … caused the ML pipeline to be terminated (Examiner’s note: Under its broadest reasonable interpretation, the term “inadequacy” as defined in the Merriam-Webster dictionary refers to a state or condition of not being adequate, not enough, or not good enough. As indicated earlier, Ghanta teaches the analysis module determining suitability of the first machine learning model for generating accurate predictions, where the determination of unsuitability (e.g., the model is not generating accurate predictions) identifies a condition that the first machine learning model is a deficient ML model. Ghanta further teaches the analysis module, in response to determining that the first machine learning model is deficient based on not satisfying predetermined suitability thresholds, can request the action module to perform various actions associated with the first machine learning algorithm, where these additional actions include retraining the first machine learning model using a different training data set, switching the first machine learning model to a different machine learning model for training, or recommending a different first machine learning algorithm to be used to analyze the inference data set, and where the trigger to perform any of the above actions represents an indication of inadequacy with regards to the first machine learning model. A person having ordinary skill in the art would understand that any of these above actions/recommendations (changing to a different training set, switching to different machine learning models for training, recommending different learning algorithms for inference) would require the currently running ML pipeline to be terminated before these additional actions are applied to a machine learning pipeline (Ghanta Figure 5, elements 514, 516, 518, 520; and Figure 4, element 410 and [0098], [0100]-[0102]; and [0109]: “…the analysis module 310 determines 514 whether the predicted suitability of the first machine learning algorithm/model satisfies a predetermined suitability threshold. If so, the method 500 ends. Otherwise, the action module 312 triggers one or more actions associated with the first machine learning algorithm. For instance, the action module 312 may trigger retraining 516 the first machine learning model with different training data, may trigger switching 518 the first machine learning model to a different machine learning model that is trained using different training data, may recommend 520 different machine learning algorithms for analyzing the inference data set, may update 522 suitability thresholds, and/or the like, and the method 500 ends.”).) …
While Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor teaches providing an indication of inadequacy of a conditioned data set through triggering of actions that result in termination of a ML pipeline, Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor does not explicitly teach
… identifies one or more particular failure metrics …
Schelter teaches
… identifies one or more particular failure metrics (Examiner’s note: Schelter teaches that the system performing the data quality verification tests on datasets reports which constraints succeeded and which failed, including additional information on the predicates used for the corresponding metrics and the values that made the constraints fail, where this detailed information represents an identification of the one or more particular failure metrics, and where this detailed information can be used as an indication to halt a machine learning pipeline (to ensure that downstream ML models are not trained with this failed data) (Schelter p.1784 col.2 5th paragraph: “Output. After execution of the data quality verification, our system reports which constraints succeeded and which failed, including information on the predicate applied to the metric into which the constraint was translated, and the value that made a constraint fail.”; p.1785 Listing 2; and p.1792 col.1 1st paragraph (Section 6 Learnings): “… data producers can leverage our system to halt their data publishing pipelines when they encounter cases of data anomalies. By that, they can ensure that downstream data processing, which often includes training ML models, is only working with vetted data.”).) …
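For illustration only, the reporting-and-halting behavior described in the Examiner’s note above can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Schelter or from the record: the names ConstraintResult, PipelineHalted, verify, and halt_if_failed, as well as the metric keys used in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


class PipelineHalted(Exception):
    """Hypothetical signal that downstream processing must not continue."""


@dataclass
class ConstraintResult:
    constraint: str  # human-readable constraint name
    metric: str      # name of the metric the constraint was evaluated on
    value: float     # metric value that was checked
    passed: bool     # outcome of applying the predicate to the value


# A constraint is (name, metric key, predicate over the metric value).
Constraint = Tuple[str, str, Callable[[float], bool]]


def verify(metrics: Dict[str, float], constraints: List[Constraint]) -> List[ConstraintResult]:
    """Evaluate each constraint and record which ones succeeded or failed."""
    results = []
    for name, metric, predicate in constraints:
        value = metrics[metric]
        results.append(ConstraintResult(name, metric, value, predicate(value)))
    return results


def halt_if_failed(results: List[ConstraintResult]) -> None:
    """Raise PipelineHalted, naming each failed constraint and the offending value."""
    failed = [r for r in results if not r.passed]
    if failed:
        details = ", ".join(f"{r.constraint} ({r.metric}={r.value})" for r in failed)
        raise PipelineHalted(f"failed constraints: {details}")
```

Under these assumptions, a caller would pass precomputed metrics (for example, a hypothetical "completeness(age)" value of 0.6) together with a constraint requiring at least 0.8, and the raised exception would identify the particular failed metric and its value before any model training proceeds.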
	Both Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and Schelter are analogous art since both teach data/feature processing on training data sets.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take data/feature processing steps taught in Ghanta in view of Dirac ‘430, in further view of Bianco, in even further view of Baylor and incorporate the data verification tests taught in Schelter as a way to detect and identify data quality issues (missing values, out-of-range values, etc.) that would impact the performance of a machine learning model. The motivation to combine is taught in Schelter, as provided in the prior art claim mapping of Claim 6 recited above.
Claims 9, 13, 19-20 and 27 are rejected under 35 U.S.C. 103 as being unpatentable over 
Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred as Dirac '430], in further view of Baylor et al., TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, KDD’17, August 13-17 2017 [hereafter referred as Baylor].
Regarding amended Claim 9, 
Ghanta teaches
(Currently amended) A method comprising:
obtaining a target data set (Examiner’s note: Under its broadest reasonable interpretation, a “target” data set broadly recites any identified data set that is used in a machine learning system. As indicated earlier, Ghanta teaches a machine learning system using training, validation, test, error, and inference data sets (“a target data set”), where these data sets are forms of operational data stored in memory (Ghanta [0019], [0036]-[0037]). Ghanta further teaches an ML management apparatus in the machine learning system using these identified data sets, where the ML management apparatus contains various modules including a primary training module, a primary validation module, a secondary training module, and a secondary validation module, with the primary training module training a first machine learning model using a training data set (Ghanta Figure 3 and [0075]-[0077]).) …
… determining one or more failure metrics at one or more of the plurality of phases of the ML pipeline (Examiner’s note: Under its broadest reasonable interpretation, the term “failure” as defined in the Merriam-Webster dictionary indicates something that lacks success or is falling short (a deficiency), such that the term “failure metrics” broadly recites any set of metrics that are used to determine a deficient or unsuccessful result. As indicated earlier, Ghanta teaches different pipeline machine learning operations, representing different stages (“phases”) of a machine learning service, where one of the stages includes a validation stage. Ghanta teaches a secondary validation module producing suitability metrics based on the predicted output generated by the primary validation module analyzing the first machine learning model, where this generated output (i.e., the error data set) is based on a received inference set. These suitability metrics include statistical metrics, confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability. Ghanta further teaches an analysis module using these suitability metrics to determine whether the output produced by the first machine learning model satisfies thresholds that indicate whether the model is a good fit for generating accurate predictions. This analysis represents a suitability analysis for validating the performance of a first machine learning model. Thus, the determination made during a validation phase when these thresholds are not being satisfied represents an unsuitability condition (concluding that the first machine learning model lacks efficacy, accuracy, or effectiveness), with these unsuitability conditions representing a failed or deficient condition of the model. Analyzing and validating the output produced from a first machine learning model (through use of suitability metrics) corresponds to a determination process involving one or more outputs of the plurality of phases of the ML pipeline (Ghanta [0078]-[0081]; [0085]-[0087]; [0089]; Table 1 and [0091]-[0093]; and [0098]-[0099]).) …
… wherein the one or more failure metrics are determined based on one or more characteristics of the conditioned data set, one or more outputs of the plurality of phases of the ML pipeline, or any combination thereof (Examiner’s note: As indicated earlier, Ghanta teaches that these suitability metrics are based on features received from the first machine learning algorithm, which is received as input into the secondary training/validation modules to be processed by the second machine learning model to generate the suitability metrics. Hence, the outputs from both the primary validation module (i.e., the error data set) and the secondary validation module that produces the suitability metrics (based on the error data set) also correspond to one or more outputs from a ML pipeline building phase. Thus, the suitability metrics are generated based on one or more of these outputs, and therefore this analysis process involving the suitability metrics corresponds to one or more failure metrics being determined based on one or more outputs of the plurality of phases of the ML pipeline (Ghanta [0079]-[0080]; [0087]; Table 1 and [0091]-[0093]; [0095]; and [0098]-[0099]).) …
… determining that at least one of the one or more failure metrics exceeds a threshold value (Examiner’s note: As indicated earlier, Ghanta teaches a secondary validation module that produces suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability), where these suitability metrics indicate whether the model is a good fit for generating accurate predictions based on a received inference set. Ghanta teaches an analysis module that uses the produced suitability metrics to determine whether these suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability) satisfy thresholds for determining whether the first machine learning model is a good fit (Ghanta [0078]-[0079]; [0085]-[0087]; [0092]; and [0098]-[0099]). A person having ordinary skill in the art would understand that using and comparing suitability metrics (indicating success) against threshold values involves having a threshold defined as a minimum suitability baseline, and that determining that these suitability metrics do not satisfy (i.e., are below) this minimum suitability baseline threshold (thus indicating failure or deficiency) is functionally equivalent to using and comparing failure metrics against threshold values representing a minimum failure baseline, and determining that these metrics meet or exceed (i.e., are above) this minimum failure baseline threshold to determine failure or deficiency.) …
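For illustration only, the functional equivalence asserted in the Examiner’s note above can be checked with a short sketch. This is not taken from Ghanta: it assumes, purely for illustration, that a suitability metric is an integer percentage from 0 to 100 and that the corresponding failure metric is its complement (100 minus the suitability value), and it uses a strict comparison to track the claim term “exceeds”. All function names are hypothetical.

```python
def fails_suitability(suitability_pct: int, min_suitability_pct: int) -> bool:
    """Failure expressed as a suitability metric falling below a minimum baseline."""
    return suitability_pct < min_suitability_pct


def as_failure_pct(suitability_pct: int) -> int:
    """Assumed complement mapping from a suitability percentage to a failure percentage."""
    return 100 - suitability_pct


def exceeds_failure_threshold(failure_pct: int, failure_threshold_pct: int) -> bool:
    """Failure expressed as a failure metric exceeding a threshold value."""
    return failure_pct > failure_threshold_pct


def equivalent_for(suitability_pct: int, min_suitability_pct: int) -> bool:
    """Check that both formulations reach the same conclusion for one metric/threshold pair."""
    by_suitability = fails_suitability(suitability_pct, min_suitability_pct)
    by_failure = exceeds_failure_threshold(
        as_failure_pct(suitability_pct), as_failure_pct(min_suitability_pct)
    )
    return by_suitability == by_failure
```

Under the stated complement assumption, an accuracy of 72 against a minimum of 80 is flagged by the first formulation, and the corresponding error rate of 28 against a threshold of 20 is flagged by the second, so the two comparisons agree.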
… performing a remediation of the target data set (Examiner’s note: As indicated earlier, Ghanta teaches an analysis module determining whether suitability thresholds are being satisfied based on the corresponding suitability metrics (Ghanta [0078]-[0079]; [0085]-[0087]; Table 1 and [0092]-[0093]; and [0098]-[0099]). Ghanta further teaches that in response to determining that the first machine learning model is not satisfying predetermined suitability thresholds, the analysis module triggers the action module to perform various actions associated with the first machine learning algorithm. These additional actions include retraining the first machine learning model using a different training data set, switching the first machine learning model to a different machine learning model for training, or recommending a different first machine learning algorithm to be used to analyze the inference data set, and hence these actions correspond to performing remediations of a target data set (Ghanta Figure 5, elements 514, 516, 518, 520; and Figure 4, element 410 and [0098], [0100]-[0102]; and [0109]).) …
While Ghanta teaches machine learning pipelines performing various machine learning operations such as algorithm training/inference and feature engineering, and are associated with third party analytic engines for performing machine learning numeric computations and analysis (Ghanta [0053], [0055]), Ghanta does not explicitly teach
… applying a pre-processing phase of the ML pipeline to the target data set before applying an ML model building phase of the ML pipeline …
Dirac ‘430 teaches
… applying a pre-processing phase of the ML pipeline to the target data set before applying an ML model building phase of the ML pipeline (Examiner’s note: Under its broadest reasonable interpretation, a “target” data set broadly recites any identified data set that is used in a machine learning system. As indicated earlier, Dirac ‘430 teaches a machine learning service that performs feature processing to generate a processed (and pruned) training set, representing the target data set (Dirac ‘430 Figure 9a, element 904 and [0127]-[0128], [0130]; Figure 42, elements 4210, 4220, 4227, 4080, 4255, 4261 and [0239]-[0240]), where this conditioned data set is used to execute the scheduled model training/re-training jobs, such that this flow of performing feature processing to generate a conditioned data set that is used for the scheduled model training/re-training jobs represents a flow of applying the data pre-processing phase of the ML pipeline to a target data set before applying the ML model building phase (Dirac ‘430 Figure 42, elements 4210, 4220, 4227, 4080, 4255, 4261 and [0239]-[0240]).) …
	Both Ghanta and Dirac ‘430 are analogous art since both teach performing data/feature processing on training data sets.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the data/feature processing step taught in Ghanta and incorporate the feature processing manager taught in Dirac ‘430 as a way to further define prediction quality metrics and run-time goals to improve the prediction performance of the ML pipeline. The motivation to combine is taught in Dirac ‘430, as provided in the prior art claim mapping of Claim 1 recited above.
While Ghanta in view of Dirac ‘430 teaches actions that include receiving and analyzing data deviation score information, and selecting a different training data set to retrain a first machine learning model (Ghanta [0099]-[0100]), Ghanta in view of Dirac ‘430 does not explicitly teach
… performing a remediation … to provide additional training data to the target data set …
Baylor teaches
… performing a remediation … to provide additional training data to the target data set (Examiner’s note: Under its broadest reasonable interpretation, this limitation broadly recites a process that provides additional data as training data. Baylor teaches a data validation operation (corresponding to one of “a plurality of phases”) which validates the data properties of specific training and serving datasets, flags any deviations as potential anomalies, and provides actionable suggestions to fix the flagged anomaly. Baylor further teaches that an existing data property can be encoded in a schema that describes the expected domain or range of a feature, and hence, deviations in the input data that exist outside of the expected domain or range of a feature represent anomalies that correspond to “one or more failure metrics at one or more of the plurality of phases of the ML pipeline”. Baylor further identifies an actionable suggestion to fix the identified deviation, which includes the expansion of the existing feature schema to include additional domain values (e.g., include the domain value “EDUCATION” in addition to the existing values “GAMES” and “BUSINESSES” in the feature ‘category’), such that any subsequent input data containing this new feature value will be correctly validated and incorporated as a normal state for a training or serving dataset. This action of expanding a feature schema to incorporate additional values as part of the training data set corresponds to performing a remediation action that provides additional data as training data (Baylor p.1389 Figure 1; p.1390-1391 Figure 2 and Section 3.3 Data Validation).) …
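For illustration only, the schema-based validation and schema-expansion suggestion described in the Examiner’s note above can be sketched as follows. This is a minimal sketch and is not code from Baylor or TFX: the functions find_anomalies and expand_schema and the dictionary-of-sets schema representation are hypothetical, and only the feature name ‘category’ and the domain values are taken from the note.

```python
from typing import Dict, List, Set


def find_anomalies(schema: Dict[str, Set[str]], rows: List[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Flag feature values that fall outside the domain the schema expects."""
    anomalies: Dict[str, Set[str]] = {}
    for row in rows:
        for feature, value in row.items():
            allowed = schema.get(feature)
            # Features absent from the schema are not checked in this sketch.
            if allowed is not None and value not in allowed:
                anomalies.setdefault(feature, set()).add(value)
    return anomalies


def expand_schema(schema: Dict[str, Set[str]], anomalies: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Apply the suggested fix: add the flagged values to each feature's domain."""
    return {
        feature: allowed | anomalies.get(feature, set())
        for feature, allowed in schema.items()
    }


# Domain values as recited in the Examiner's note; the rows are hypothetical.
schema = {"category": {"GAMES", "BUSINESSES"}}
rows = [{"category": "GAMES"}, {"category": "EDUCATION"}]
```

Under these assumptions, the value “EDUCATION” is first flagged as an anomaly for ‘category’, and after the schema is expanded the same rows validate without anomalies.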
Both Ghanta in view of Dirac ‘430 and Baylor are analogous art since they both teach machine learning pipelines that perform validation operations.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the validation operations taught in Ghanta in view of Dirac ‘430 and enhance them to incorporate the validation operations taught in Baylor as a way to continuously improve the model quality over time. The motivation to combine is taught in Baylor, as provided in the prior art claim mapping of Claim 1 recited above.
Regarding amended Claim 13, 
Claim 13 recites an article of manufacture including a non-transitory computer-readable medium having stored thereon program instructions that, upon execution by a computing system, cause the computing system to perform operations comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 9, and hence is rejected under similar rationale and motivations provided by Ghanta, Dirac ‘430, and Baylor as indicated in Claim 9. As indicated earlier, Ghanta teaches a non-volatile computer readable storage medium containing computer readable program instructions (program code) for modules implementing the described functions on a machine learning system/ML management apparatus (Ghanta [0018]-[0021], [0025], [0029], [0033]; and Figure 1, element 104 and [0043]-[0044]), including examples of CD-ROM, DVD, and memory stick containing these program instructions, where such examples of computer readable storage medium containing computer readable program instructions represent articles of manufacture.
Regarding amended Claim 19, 
Ghanta in view of Dirac ‘430, in further view of Baylor teaches
(Currently amended) The article of manufacture of claim 13, wherein the operations comprise: 
applying the ML model building phase to generate an ML model (Examiner’s note: As indicated earlier, Ghanta teaches a training machine learning pipeline generating machine learning models, and hence using this training machine learning pipeline operation to generate machine learning models corresponds to an operation that applies a ML model building phase to generate an ML model (Ghanta Figure 2A, elements 202, 204, 206a-c, [0036], [0053], and [0056]).);
generating first and second ML models, wherein the first ML model corresponds to the ML model generated during the ML model building phase, and wherein the second ML model has fewer trainable parameters than the first ML model (Examiner’s note: Under its broadest reasonable interpretation, the phrase “fewer trainable parameters” broadly indicates training the second ML model with a smaller selected set of parameters based on the training from the first ML model. As indicated earlier, Ghanta teaches using a primary training module to generate a first machine learning model from an ML training pipeline, and using a primary validation module to validate the generated first machine learning model and produce a resulting error data set. Ghanta additionally teaches generating a second machine learning model from an ML training pipeline by a secondary training module using a different machine learning algorithm, and using the error data set as input. Ghanta further teaches this error data set can include one or more parameters specific to the first machine learning model, and the secondary training model can select all or a subset of these feature parameters, where these one or more specific parameters correspond to a smaller set of trainable parameters for training the second ML model (Ghanta [0074], [0075]-[0077]; [0078]-[0080]; [0085]-[0087]).); and
comparing a predictive ability of the first ML model and the second ML model (Examiner’s note: As indicated earlier, Ghanta teaches a secondary validation module that produces suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability), where these suitability metrics indicate whether the model is a good fit for generating accurate predictions based on a received inference set. Ghanta teaches an analysis module that uses the produced suitability metrics to determine whether these suitability metrics (i.e., confidence metrics, accuracy metrics, precision metrics, etc., where these metrics represent metrics used for determining predictive ability) satisfy thresholds for determining whether the first machine learning model is a good fit. Hence, this analysis to determine whether these thresholds are being satisfied represents a suitability analysis for the first machine learning model, and as such, the determination where these thresholds are not being satisfied represents an unsuitability condition that concludes that a first machine learning model lacks efficacy, accuracy, or effectiveness. Thus, this suitability analysis involving output produced from the first and second machine learning model represents a process for comparing predictive ability using a first ML model and a second ML model (Ghanta [0078]-[0079]; [0085]-[0087]; [0089]; Table 1 and [0092]-[0093]; and [0098]-[0099]).).
Regarding previously presented Claim 20, 
Claim 20 recites the article of manufacture of claim 19, further comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 8, and hence is rejected under similar rationale provided by Ghanta, Dirac ‘430, and Baylor as indicated in Claim 8, in view of the rejections of Claim 19.
Regarding new Claim 27,
Claim 27 recites the article of manufacture of claim 13, further comprising claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 26, and hence is rejected under similar rationale provided by Ghanta, Dirac ‘430, and Baylor as indicated in Claim 26, in view of the rejections of Claim 13.
Claims 11 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over 
Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred as Dirac '430], in further view of Baylor et al., TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, KDD’17, August 13-17 2017 [hereafter referred as Baylor] as applied to Claims 9 and 13; in even further view of Dirac et al., U.S. PGPUB 2015/0379424, published 12/31/2015 [hereafter referred as Dirac '424].
Regarding amended Claim 11, 
Ghanta in view of Dirac ‘430, in further view of Baylor as applied to Claim 9 teaches
 (Currently amended) The method of claim 9.
However, Ghanta in view of Dirac ‘430, in further view of Baylor does not teach
… wherein the particular column of the target data set contains at least one of 
(i) word vectors that describe, in a semantically-encoded vector space, a meaning of respective words, or 
(ii) paragraph vectors that describe, in a semantically-encoded vector space, a meaning of respective multi-word samples of text.  
Dirac ‘424 teaches
wherein the particular column of the target data set contains at least one of 
(i) word vectors that describe, in a semantically-encoded vector space, a meaning of respective words (Examiner’s note: Under its broadest reasonable interpretation, the phrase “at least one of … or” is interpreted to define a list of alternatives, indicating that a minimum of one of the listed options is required to be identified for the claimed invention. Dirac ‘424 teaches performing data pre-processing operations on input data (“target data set”) containing data records containing variables of data types such as text, and using natural language processing to perform feature processing (Dirac ‘424 [0035]). Dirac ‘424 further teaches performing the text data transformations by determining the root words to be included in an n-gram for use in a machine learning algorithm, where this transformation process represents transforming “word vectors that describe, in a semantically-encoded vector space, the meaning of respective words” (Dirac ‘424 [0079]).), or 
(ii) paragraph vectors that describe, in a semantically-encoded vector space, a meaning of respective multi-word samples of text.
Both Ghanta in view of Dirac ‘430, in further view of Baylor and Dirac ‘424 are analogous art since both teach data pre-processing phases in machine learning systems.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the data pre-processing phase of Ghanta in view of Dirac ‘430, in further view of Baylor and incorporate the text pre-processing steps of Dirac ‘424 as a way to handle text pre-processing in an input data set. The motivation to combine is taught in Dirac ‘424, as provided in the prior art claim mapping of Claim 4 recited above.
Regarding amended Claim 16, 
Claim 16 recites the article of manufacture of claim 13, where the article of manufacture further comprises claim limitations that are similar in scope to corresponding claim limitations in Claim 11, and hence is rejected under similar rationale and motivations provided by Ghanta, Dirac ‘430, Baylor, and Dirac ‘424 as indicated in Claim 11, in view of rejections from Claim 13.
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over 
Ghanta et al., U.S. PGPUB 2020/0034665, filed 7/30/2018 [hereafter referred as Ghanta] in view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [hereafter referred as Dirac '430], in further view of Baylor et al., TFX: A TensorFlow-Based Production-Scale Machine Learning Platform, KDD’17, August 13-17 2017 [hereafter referred as Baylor] as applied to Claim 13; in even further view of Schelter et al., Automating Large-Scale Data Quality Verification, 2018 [hereafter referred as Schelter].
Regarding amended Claim 18, 
Ghanta in view of Dirac ‘430, in further view of Baylor as applied to Claim 13 teaches
(Currently amended) The article of manufacture of claim 13 …
… wherein applying the ML model building phase comprises generating an ML model to predict a particular column of the target data set (Examiner’s note: Ghanta teaches training pipelines and inference pipelines in which a machine learning model is trained on a training data set, where the training data set contains three columns of feature data (Age, Sex, Height) (Ghanta Figure 2A, elements 200, 206a-c and [0040]: “In certain embodiments of machine learning systems 200, there is a training phase, for generating the machine learning model … the training data set comprises three columns of feature data-Age, Sex, and Height-that are used to train the machine learning model …”).) …
While Ghanta in view of Dirac ‘430, in further view of Baylor teaches feature processing and data pre-processing applied to target data sets (represented by training, validation, test, error, inference data sets), Ghanta in view of Dirac ‘430, in further view of Baylor does not explicitly teach
… wherein the target data set is arranged in columns and rows, wherein the columns define fields of the target data set and the rows define entries in the target data set (Examiner’s note: As indicated earlier, Schelter teaches performing data quality verification tests on datasets that are used in determining predictions for a machine learning model (Schelter p.1787 col.2 2nd paragraph), where the data can be arranged in tables in a relational database system, and where the verification tests are based on computing and applying metrics to identify data constraints on a number of records (“hasSize”, “satisfies”, “satisfiesIf”), as well as applying various other metrics to identify data constraints on a set of numerical or categorical columns (“isComplete”, “hasCompleteness”, “isInRange”, “isUnique”, “hasUniqueness”, “hasDistinctness”) (Schelter p.1783 col.2 1st paragraph; p.1782 col.1-col.2 Section 2. Data Quality Dimensions; p.1783 Table 1 Constraints available for composing user-defined data quality checks, refer to “isComplete”, “hasCompleteness”, “isUnique”, “hasUniqueness”, “hasDistinctness”, “isInRange”, “satisfies”, “satisfiesIf”, “hasSize”; and p.1784 Table 2 Computable metrics to base constraints on).) … 
… determining that values of the particular column are skewed beyond a threshold amount (Examiner’s note: Under its broadest reasonable interpretation, the phrase “values … skewed beyond a threshold amount” broadly recites values that are outside a specified interval. As indicated earlier, Schelter teaches performing data quality verification tests on datasets, where the verification tests can involve applying constraints (“isInRange”) to a range of admissible values analyzed over an interval for a numerical column, which requires the use of a corresponding computed “ValueRange” metric to identify the values within the range or outside the range, such that the identified values outside the range represent values skewed beyond a threshold amount (Schelter p.1782 col.1-col.2 Section 2. Data Quality Dimensions; p.1783 Table 1 Constraints available for composing user-defined data quality checks, refer to “isInRange”; p.1784 Table 2 Computable metrics to base constraints on; p.1786 col.1 Section 3.3 Constraint Suggestion 1st paragraph; and p.1786 col.2 2nd paragraph, 6th bullet).).
Both Ghanta in view of Dirac ‘430, in further view of Baylor and Schelter are analogous art since both teach data/feature processing on training data sets.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the data/feature processing steps taught in Ghanta in view of Dirac ‘430, in further view of Baylor and incorporate the data verification tests taught in Schelter as a way to detect and identify data quality issues (missing values, out-of-range values, etc.) that would impact the performance of a machine learning model. The motivation to combine is taught in Schelter, as provided in the prior art claim mapping of Claim 6 recited above.
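For illustration only, a range constraint of the kind discussed above, applied to one numerical column of a data set arranged in rows and columns, can be sketched as follows. This sketch is not Schelter’s implementation or API; the function names out_of_range_rows and is_in_range, the row representation, and the sample values are hypothetical and assumed solely for this example. The column names Age and Height follow the feature columns cited from Ghanta.

```python
from typing import Dict, List, Sequence

# Each row is one entry of the data set; keys are column (field) names.
Row = Dict[str, float]


def out_of_range_rows(
    rows: Sequence[Row], column: str, low: float, high: float
) -> List[int]:
    """Return indices of rows whose value in `column` lies outside [low, high]."""
    return [i for i, row in enumerate(rows) if not (low <= row[column] <= high)]


def is_in_range(rows: Sequence[Row], column: str, low: float, high: float) -> bool:
    """True only if every value in `column` lies within [low, high]."""
    return not out_of_range_rows(rows, column, low, high)


# Hypothetical sample data, assumed only for this illustration.
sample_rows = [
    {"Age": 34.0, "Height": 170.0},
    {"Age": 210.0, "Height": 165.0},
    {"Age": 51.0, "Height": 182.0},
]
```

Under these assumptions, is_in_range(sample_rows, "Age", 0, 120) is False because the second row holds an Age value outside the admissible interval, and out_of_range_rows identifies that row by its index.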

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332. The examiner can normally be reached Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen, can be reached on 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WILLIAM WAI YIN KWAN/Examiner, Art Unit 2121


/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121