DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-24 are presented for examination.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 22 February 2022 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 5, 6, 7, 8, 9, 11, 13, 14, 15, 16, 17, 19, 21, 22, 23 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Chaudhuri et al. (U.S. Pub. No.: 20190378028, hereinafter Chaudhuri), in view of Esmaeilzadeh et al. (U.S. Pub. No.: 20190287017, hereinafter Esmaeilzadeh), and further in view of DIRAC et al. (U.S. Pub. No.: 20150379430, hereinafter DIRAC).
For claim 1, Chaudhuri discloses a method comprising: 
obtaining, from a user device and by a query engine that is configured to access one or more databases, a command to execute a user-defined function of the query engine (Chaudhuri: paragraph [0025], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF).” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0192], “The query manager 1602 manages the queries received in the system, such as via the user interface 1612” 
WHERE “user-defined function” is broadly interpreted as “user-defined-function (UDF)”
WHERE “a command to execute a user-defined function” is broadly interpreted as “the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF).”), wherein:
the command is written in a query language (Chaudhuri: paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs. A query may include a plurality of elements, such as those elements that may be defined using a database query format, such as Structured Query Language (SQL).”
WHERE “a query language” is broadly interpreted as “A query…such as Structured Query Language (SQL).”); 
the user-defined function includes an inference call to a trained machine learning model, wherein the command comprises one or more model inputs to the machine learning model (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…”); 
executing, by the query engine, the user-defined function, comprising processing the one or more model inputs using the machine learning model according to the obtained parameter values of the machine learning model to generate respective model outputs (Chaudhuri: paragraph [0041], “During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs. In a supervised learning phase, all of the outputs are provided to the model, and the model is directed to develop a general rule or algorithm that maps the input to the output. In contrast, in an unsupervised learning phase, the desired output is not provided for the inputs so that the model may develop its own rules to discover relationships within the training dataset. In a semi-supervised learning phase, an incompletely labeled training set is provided, with some of the outputs known and some unknown for the training dataset.…” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0063], “Further, queries may also contain grouping, aggregation (e.g., Q2), and joins (e.g., Q4). It is easy to observe that the materialization cost (e.g., time and resources used to execute the machine learning UDFs) will be high in processing these queries. It is also easy to see that materialization is query-specific. While there is some commonality, in general, different queries invoke different feature extractors, regressors, classifiers, etc” 
paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”);
obtaining, by the query engine and from the one or more databases, trained parameter values for the machine learning model (Chaudhuri: paragraph [0041], “During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs. In a supervised learning phase, all of the outputs are provided to the model, and the model is directed to develop a general rule or algorithm that maps the input to the output. In contrast, in an unsupervised learning phase, the desired output is not provided for the inputs so that the model may develop its own rules to discover relationships within the training dataset. In a semi-supervised learning phase, an incompletely labeled training set is provided, with some of the outputs known and some unknown for the training dataset.…” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input”  paragraph [0071], “…binary classifiers are trained, where the binary classifiers group the input blobs into those that disagree and those that may agree with the query predicate. The input blobs that disagree are discarded, and the remainder are passed through to the original query plan. These classifiers are the aforementioned probabilistic predicates, because each PP has associated values for the tuple [data reduction rate, cost, accuracy]”,  paragraph [0089], “…the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. 
The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”); and 
providing, by the query engine, the generated model outputs (Chaudhuri: paragraph [0041], “During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs. In a supervised learning phase, all of the outputs are provided to the model, and the model is directed to develop a general rule or algorithm that maps the input to the output. In contrast, in an unsupervised learning phase, the desired output is not provided for the inputs so that the model may develop its own rules to discover relationships within the training dataset. In a semi-supervised learning phase, an incompletely labeled training set is provided, with some of the outputs known and some unknown for the training dataset…”
paragraph [0081], “…The query 102 may include one or more UDFs, and the query optimizer 402 generates the plan 404 to efficiently access the input 410 data to retrieve the desired results…When executed, the operations 406 of the plan 404 generate the desired results 408.”
paragraph [0171], “…the “+” outputs of PP_q 1212 are added 1204 with the “+” outputs of PP_p 1202, and the result is used as input for the rest of the query 1206. The predicate p∨q is executed 1208 and the result is output 1210. The “−” results of PP_q 1212 are discarded 1214 since they do not meet any of the conditions.” paragraph [0202], “At operation 1712, the database query is executed over the blobs that have not been discarded, the database search utilizing the UDF, and at operation 1714, the results of the database search are provided”).
However, Chaudhuri does not explicitly disclose that the user-defined function has been written and launched onto the query engine by users of the query engine using one or more programming languages that are different from the query language in which the command is written; or
“providing, to the user device” as in “providing, to the user device and by the query engine, the generated model outputs.”
Esmaeilzadeh discloses the user-defined function has been written and launched onto the query engine by users of the query engine using one or more programming languages that are different from the query language in which the command is written (Esmaeilzadeh: paragraph [0024], “generated for one or more user defined functions (UDF), expressed as a part of a query (e.g., an SQL query) using a domain-specific language (e.g., Python).”
WHERE “the user-defined function…using one or more programming languages that are different from the query language in which the command is written” is broadly interpreted as “a domain-specific language (e.g., Python),” and where “the query language” is broadly interpreted as “a query (e.g., an SQL query)”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “METHODS AND SYSTEMS FOR INTEGRATING MACHINE LEARNING/ANALYTICS ACCELERATORS AND RELATIONAL DATABASE SYSTEMS” as taught by Esmaeilzadeh, because it would provide Chaudhuri’s method with the enhanced capability of “an improved method and system for generating an architecture capable of efficiently executing a machine learning algorithm against large databases.” (Esmaeilzadeh: paragraph [0005]).
However, Chaudhuri and Esmaeilzadeh do not explicitly disclose “providing, to the user device” as in “providing, to the user device and by the query engine, the generated model outputs.” 
DIRAC discloses “providing, to the user device” as in “providing, to the user device and by the query engine, the generated model outputs” (DIRAC: paragraph [0094], “…some relatively simple types of client requests 111 may result in the immediate generation, retrieval, storage, or modification of corresponding artifacts within MLS artifact repository 120 by the MLS request handler 180 (as indicated by arrow 141). Thus, the insertion of a job object in job queue 142 may not be required for all types of client requests. For example, a creation or removal of an alias for an existing model may not require the creation of a new job in such embodiments. In the embodiment shown in FIG. 1, clients 164 may be able to view at least a subset of the artifacts stored in repository 120, e.g., by issuing read requests 118 via programmatic interfaces 161…” paragraph [0096], “Clients 164 may be able to search for and retrieve KB entries via programmatic interfaces 161, as indicated by arrow 117, and may use the information contained in the entries to select parameters (such as specific recipes or algorithms to be used) for their request submissions. In at least some embodiments, new APIs may be implemented (or default values for API parameters may be selected) by the MLS on the basis of best practices identified over time for various types of machine learning practices.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “EFFICIENT DUPLICATE DETECTION FOR MACHINE LEARNING DATA SETS” as taught by DIRAC, because it would provide Chaudhuri’s method with the enhanced capability of “clients 164 may be able to view at least a subset of the artifacts stored in repository 120, e.g., by issuing read requests 118 via programmatic interfaces 161…” (DIRAC: paragraph [0094]).
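For illustration only, the arrangement addressed in the rejection of claim 1 (a command written in a query language that invokes a user-defined function written in a different programming language, the function making an inference call using trained parameter values held in a database) may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Chaudhuri, Esmaeilzadeh, or DIRAC: SQLite stands in for the query engine, and the table `model_params`, the function name `predict`, the parameter values, and the toy logistic model are all hypothetical.

```python
import math
import sqlite3

# Assumption: an in-memory SQLite database stands in for the query engine
# and for the "one or more databases" holding trained parameter values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model_params (name TEXT PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO model_params VALUES (?, ?)",
    [("weight", 2.0), ("bias", -1.0)],  # hypothetical, already-trained values
)


def predict(x, weight, bias):
    """User-defined function written in Python: a toy logistic inference call."""
    z = weight * x + bias
    return 1.0 / (1.0 + math.exp(-z))


# Register ("launch") the Python function so that SQL commands can call it.
conn.create_function("predict", 3, predict)

# The command is written in SQL and carries the model input (0.5). The engine
# reads the trained parameter values from the database while executing it.
command = """
    SELECT predict(
        0.5,
        (SELECT value FROM model_params WHERE name = 'weight'),
        (SELECT value FROM model_params WHERE name = 'bias')
    )
"""
(output,) = conn.execute(command).fetchone()
print(output)  # the generated model output, returned to the caller
```

In this sketch the caller of `conn.execute` plays the role of the user device: it submits the SQL command and receives the model output as the query result.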
For claim 3, Chaudhuri, Esmaeilzadeh and DIRAC disclose the method of claim 1, further comprising training the machine learning model, the training comprising: 
obtaining, from a second user device and by the query engine, a second command to execute a second user-defined function of the query engine (Chaudhuri: paragraph [0025], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF).” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0192], “The query manager 1602 manages the queries received in the system, such as via the user interface 1612”), wherein: 
the second command is written in the query language (Chaudhuri: paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs. A query may include a plurality of elements, such as those elements that may be defined using a database query format, such as Structured Query Language (SQL).”); 
the second user-defined function has been written and launched onto the query engine by users of the query engine (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” 
paragraph [0041], “Machine learning techniques train models to accurately make predictions on data fed into the models (e.g., what was said by a user in a given utterance; whether a noun is a person, place, or thing; what the weather will be like tomorrow). During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs. In a supervised learning phase, all of the outputs are provided to the model, and the model is directed to develop a general rule or algorithm that maps the input to the output. In contrast, in an unsupervised learning phase, the desired output is not provided for the inputs so that the model may develop its own rules to discover relationships within the training dataset. In a semi-supervised learning phase, an incompletely labeled training set is provided, with some of the outputs known and some unknown for the training dataset…”
paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…”); and 
the second command comprises data identifying a plurality of training examples stored in the one or more databases (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…”
paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); 
obtaining, by the query engine and from the one or more databases, the plurality of training examples (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); 
executing, by the query engine, the second user-defined function, comprising processing the plurality of training examples using the machine learning model to generate trained parameter values for the machine learning model (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); and 
storing, in the one or more databases, the trained parameter values (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.” Where “storing, in the one or more databases, the trained parameter values” is broadly interpreted as “PP database 1614 stores a plurality of trained PPs” Paragraph [0180], “PP has been trained for some value t”).
However, Chaudhuri does not explicitly disclose the second user-defined function has been written using one or more programming languages that are different from the query language in which the second command is written.
Esmaeilzadeh discloses the second user-defined function has been written using one or more programming languages that are different from the query language in which the second command is written (Esmaeilzadeh: paragraph [0024], “generated for one or more user defined functions (UDF), expressed as a part of a query (e.g., an SQL query) using a domain-specific language (e.g., Python).”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “METHODS AND SYSTEMS FOR INTEGRATING MACHINE LEARNING/ANALYTICS ACCELERATORS AND RELATIONAL DATABASE SYSTEMS” as taught by Esmaeilzadeh, because it would provide Chaudhuri’s method with the enhanced capability of “an improved method and system for generating an architecture capable of efficiently executing a machine learning algorithm against large databases.” (Esmaeilzadeh: paragraph [0005]).
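For illustration only, the training steps addressed in the rejection of claim 3 (a second command in the query language identifying stored training examples, execution of a second user-defined function over those examples, and storage of the resulting trained parameter values) may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the cited references: SQLite stands in for the query engine, and the tables `training_examples` and `model_params`, the aggregate name `fit_slope`, and the one-parameter model y = w * x are all hypothetical.

```python
import sqlite3


class FitSlope:
    """Aggregate UDF written in Python: least-squares slope w for y = w * x."""

    def __init__(self):
        self.sum_xy = 0.0
        self.sum_xx = 0.0

    def step(self, x, y):
        # Called by the engine once per training example.
        self.sum_xy += x * y
        self.sum_xx += x * x

    def finalize(self):
        # Closed-form least-squares solution; None if there is no usable data.
        return self.sum_xy / self.sum_xx if self.sum_xx else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_examples (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO training_examples VALUES (?, ?)",
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],  # hypothetical training data
)
conn.execute("CREATE TABLE model_params (name TEXT PRIMARY KEY, value REAL)")

# Register the Python training function with the engine.
conn.create_aggregate("fit_slope", 2, FitSlope)

# The second command is written in SQL: it identifies the training examples,
# runs the training UDF over them, and stores the trained value in the database.
conn.execute(
    "INSERT INTO model_params "
    "SELECT 'weight', fit_slope(x, y) FROM training_examples"
)
(weight,) = conn.execute(
    "SELECT value FROM model_params WHERE name = 'weight'"
).fetchone()
print(weight)
```

The stored value can later be read back by an inference function, which corresponds to the "obtaining ... trained parameter values" step discussed for claim 1.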
For claim 5, Chaudhuri, Esmaeilzadeh and DIRAC disclose the method of claim 3, wherein executing the second user-defined function further comprises pre-processing, by the query engine, the plurality of training examples before processing the training examples using the machine learning model (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. 
Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.” Paragraph [0180], “PP has been trained for some value t.”).
For claim 6, Chaudhuri, Esmaeilzadeh and DIRAC disclose the method of claim 1, further comprising evaluating the machine learning model, the evaluating comprising: 
obtaining, from a third user device and by the query engine, a third command to execute a third user-defined function of the query engine (Chaudhuri: paragraph [0025], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF).” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0192], “The query manager 1602 manages the queries received in the system, such as via the user interface 1612”), wherein: 
the third command is written in the query language (Chaudhuri: paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs. A query may include a plurality of elements, such as those elements that may be defined using a database query format, such as Structured Query Language (SQL).”); 
the third user-defined function has been written and launched onto the query engine by users of the query engine (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…”); and 
the third command comprises data identifying a plurality of testing examples stored in the one or more databases (Chaudhuri: paragraph [0110], “…this comes with some additional cost during testing, as illustrated in the table 802 of FIG. 8. In particular, applying the KDE PP at test time may require a pass through the entire training set because the densities d.sup.+ and d.sup.− are computed based on the distance between the test point x and each of the training points…”
paragraph [0112], “…The formulas in the table 802 may be used to determine the costs of using PCA during training and test, where n can be either the full training set or the sampled subset…”
paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy,”
paragraphs [0027]-[0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” 
paragraph [0041], paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” 
paragraph [0063], paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…”
paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); 
obtaining, by the query engine and from the one or more databases, the plurality of testing examples (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); 
obtaining, by the query engine and from the one or more databases, the parameter values of the machine learning model (Chaudhuri: paragraph [0110], “…this comes with some additional cost during testing, as illustrated in the table 802 of FIG. 8. In particular, applying the KDE PP at test time may require a pass through the entire training set because the densities d.sup.+ and d.sup.− are computed based on the distance between the test point x and each of the training points…” paragraph [0112], “…The formulas in the table 802 may be used to determine the costs of using PCA during training and test, where n can be either the full training set or the sampled subset…” paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”); 
executing, by the query engine, the third user-defined function, comprising processing the plurality of testing examples using the machine learning model according to the obtained parameter values of the machine learning model to generate a measure of performance of the machine learning model (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;”  paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. 
The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); and 
providing, by the query engine, the generated measure of performance of the machine learning model (Chaudhuri: paragraph [0110], “…this comes with some additional cost during testing, as illustrated in the table 802 of FIG. 8. In particular, applying the KDE PP at test time may require a pass through the entire training set because the densities d.sup.+ and d.sup.− are computed based on the distance between the test point x and each of the training points…”
paragraph [0112], “…The formulas in the table 802 may be used to determine the costs of using PCA during training and test, where n can be either the full training set or the sampled subset…”
paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy”
paragraph [0077], “It is possible to train PPs with different tuple values.”
paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.”
paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”).
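For illustration only, and not as a characterization of Chaudhuri or of the claims, the following minimal sketch shows one way a query engine could obtain testing examples and stored parameter values from a database, execute a user-defined function over them, and return a measure of performance. The table names, the `predict` function, the linear model, and the sample data are all hypothetical assumptions; SQLite merely stands in for a generic query engine.

```python
import sqlite3


def predict(x, w, b):
    """Hypothetical UDF: thresholded linear score, returns 0 or 1."""
    return 1 if w * x + b > 0 else 0


eval_conn = sqlite3.connect(":memory:")
# Register the Python function so that SQL commands can call it by name.
eval_conn.create_function("predict", 3, predict)

# Assumed layout: one table of stored parameter values, one of testing examples.
eval_conn.executescript("""
    CREATE TABLE model_params (w REAL, b REAL);
    CREATE TABLE test_examples (x REAL, label INTEGER);
    INSERT INTO model_params VALUES (1.0, -0.5);
    INSERT INTO test_examples VALUES (0.0, 0), (1.0, 1), (2.0, 1), (0.2, 1);
""")

# The SQL command identifies the testing examples; the engine runs the UDF on
# each row and aggregates the fraction of correct predictions (accuracy).
accuracy = eval_conn.execute("""
    SELECT AVG(predict(t.x, p.w, p.b) = t.label)
    FROM test_examples AS t CROSS JOIN model_params AS p
""").fetchone()[0]

print(accuracy)
```

In this sketch the measure of performance is plain accuracy; any other metric could be substituted without changing the overall flow.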
However, Chaudhuri does not explicitly disclose the third user-defined function using one or more programming languages that are different from the query language in which the third command is written, or providing, to the user device.
Esmaeilzadeh discloses the third user-defined function using one or more programming languages that are different from the query language in which the third command is written (Esmaeilzadeh: paragraph [0024], “generated for one or more user defined functions (UDF), expressed as a part of a query (e.g., an SQL query) using a domain-specific language (e.g., Python).”
WHERE “the user-defined function…using one or more programming languages that are different from the query language in which the command is written” is broadly interpreted as “a domain-specific language (e.g., Python),” and “the query language” is broadly interpreted as “a query (e.g., an SQL query)”).
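As a purely illustrative sketch of the general arrangement discussed above (a command written in SQL that invokes a user-defined function written in Python), the following uses Python's standard `sqlite3` module. It is not taken from Esmaeilzadeh or from the application; the `sigmoid` function, the `scores` table, and the sample rows are hypothetical.

```python
import math
import sqlite3


def sigmoid(z):
    """Hypothetical user-defined function, written in Python rather than SQL."""
    return 1.0 / (1.0 + math.exp(-z))


udf_conn = sqlite3.connect(":memory:")
# Launch the Python function onto the engine under the SQL name "sigmoid".
udf_conn.create_function("sigmoid", 1, sigmoid)

udf_conn.execute("CREATE TABLE scores (id INTEGER, z REAL)")
udf_conn.executemany("INSERT INTO scores VALUES (?, ?)", [(1, 0.0), (2, 2.0)])

# The command itself is written in SQL; only the function body is Python.
rows = udf_conn.execute(
    "SELECT id, sigmoid(z) FROM scores ORDER BY id"
).fetchall()

print(rows)
```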
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “METHODS AND SYSTEMS FOR INTEGRATING MACHINE LEARNING/ANALYTICS ACCELERATORS AND RELATIONAL DATABASE SYSTEMS” as taught by Esmaeilzadeh, because it would provide Chaudhuri’s method with the enhanced capability of “an improved method and system for generating an architecture capable of efficiently executing a machine learning algorithm against large databases.” (Esmaeilzadeh: paragraph [0005]).
However, Chaudhuri and Esmaeilzadeh do not explicitly disclose providing, to the user device.
DIRAC discloses providing, to the user device (DIRAC: paragraph [0094], “…some relatively simple types of client requests 111 may result in the immediate generation, retrieval, storage, or modification of corresponding artifacts within MLS artifact repository 120 by the MLS request handler 180 (as indicated by arrow 141). Thus, the insertion of a job object in job queue 142 may not be required for all types of client requests. For example, a creation or removal of an alias for an existing model may not require the creation of a new job in such embodiments. In the embodiment shown in FIG. 1, clients 164 may be able to view at least a subset of the artifacts stored in repository 120, e.g., by issuing read requests 118 via programmatic interfaces 161…” paragraph [0096], “Clients 164 may be able to search for and retrieve KB entries via programmatic interfaces 161, as indicated by arrow 117, and may use the information contained in the entries to select parameters (such as specific recipes or algorithms to be used) for their request submissions. In at least some embodiments, new APIs may be implemented (or default values for API parameters may be selected) by the MLS on the basis of best practices identified over time for various types of machine learning practices.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “EFFICIENT DUPLICATE DETECTION FOR MACHINE LEARNING DATA SETS” as taught by DIRAC, because it would provide Chaudhuri’s method with the enhanced capability of “clients 164 may be able to view at least a subset of the artifacts stored in repository 120, e.g., by issuing read requests 118 via programmatic interfaces 161…” (DIRAC: paragraph [0094]).
For claim 7, Chaudhuri, Esmaeilzadeh and DIRAC disclose the method of claim 1, further comprising refining the parameter values of the machine learning model, the refining comprising: 
obtaining, from a fourth user device and by the query engine, a fourth command to execute a fourth user-defined function of the query engine (Chaudhuri: paragraph [0025], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF).” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0192], “The query manager 1602 manages the queries received in the system, such as via the user interface 1612”), wherein: 
the fourth command is written in the query language (Chaudhuri: paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs. A query may include a plurality of elements, such as those elements that may be defined using a database query format, such as Structured Query Language (SQL).”); 
the fourth user-defined function has been written and launched onto the query engine by users of the query engine (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0041], “Machine learning techniques train models to accurately make predictions on data fed into the models (e.g., what was said by a user in a given utterance; whether a noun is a person, place, or thing; what the weather will be like tomorrow). During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs. In a supervised learning phase, all of the outputs are provided to the model, and the model is directed to develop a general rule or algorithm that maps the input to the output. In contrast, in an unsupervised learning phase, the desired output is not provided for the inputs so that the model may develop its own rules to discover relationships within the training dataset. 
In a semi-supervised learning phase, an incompletely labeled training set is provided, with some of the outputs known and some unknown for the training dataset…” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…”); and 
the fourth command comprises data identifying a plurality of second training examples stored in the one or more databases (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); 
obtaining, by the query engine and from the one or more databases, the plurality of second training examples (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.”); 
obtaining, by the query engine and from the one or more databases, the parameter values for the machine learning model (Chaudhuri: paragraph [0041], “During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs…” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0071], “…binary classifiers are trained, where the binary classifiers group the input blobs into those that disagree and those that may agree with the query predicate. The input blobs that disagree are discarded, and the remainder are passed through to the original query plan. These classifiers are the aforementioned probabilistic predicates, because each PP has associated values for the tuple [data reduction rate, cost, accuracy]”, paragraph [0089], “…the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. 
Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”); 
executing, by the query engine, the fourth user-defined function, comprising processing the plurality of second training examples using the machine learning model according to the obtained parameter values of the machine learning model to generate refined parameter values of the machine learning model (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. 
Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.” WHERE “refined” is broadly interpreted as “learning”/“training” or “learned”/“trained”); and 
storing, in the one or more databases, the refined parameter values of the machine learning model (Chaudhuri: paragraph [0027], “…receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate, each PP being a binary classifier associated with a respective clause,… ;” paragraph [0028], “…the processing of a query that includes the use of machine learning classifiers, according to some example embodiments.;” paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs.” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input” paragraph [0078], “Processors are typically used to ingest data and perform per-blob ML operations such as feature extraction.” paragraph [0089], “the training process may run contemporaneously with the query execution. That is, at a cold start when no PP is available, the query plans output labeled inputs for relevant clauses. Periodically, or when enough labeled input is available, the PPs are trained 508, and subsequent runs of the query may use query plans that include the trained PPs 510.” paragraph [0100], “…Linear SVMs may be trained efficiently…” paragraph [0194], “…The PP database 1614 stores a plurality of trained PPs for use in processing queries…the ML training data database 1618 stores the data used for training the different classifiers.” Paragraph [0180], “PP has been trained for some value t”
WHERE “storing, in the one or more databases, the refined parameter values of the machine learning model” is broadly interpreted as “subsequent runs of the query may use query plans that include the trained PPs 510” where “trained” indicates the values are stored, and can be used during “subsequent runs…” and “PP database 1614 stores a plurality of trained PPs”).
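For illustration only, the refining steps mapped above (obtain training examples and current parameter values from the database, execute a user-defined function to produce refined values, and store the result) might be sketched as follows. This is not Chaudhuri's implementation: the single-weight model y ≈ w·x, the one-step gradient-descent update, the `RefineWeight` aggregate, the table names, and the sample data are all hypothetical assumptions, with SQLite standing in for a generic query engine.

```python
import sqlite3


class RefineWeight:
    """Hypothetical aggregate UDF: one gradient-descent step for y ~ w * x."""

    LEARNING_RATE = 0.1

    def __init__(self):
        self.grad_sum = 0.0
        self.count = 0
        self.w = None

    def step(self, x, y, w):
        # Accumulate the gradient of the squared error (w*x - y)^2 w.r.t. w.
        self.w = w
        self.grad_sum += 2.0 * (w * x - y) * x
        self.count += 1

    def finalize(self):
        if self.count == 0:
            return self.w  # no training rows seen: nothing to refine
        return self.w - self.LEARNING_RATE * self.grad_sum / self.count


train_conn = sqlite3.connect(":memory:")
train_conn.create_aggregate("refine_weight", 3, RefineWeight)

train_conn.executescript("""
    CREATE TABLE model_params (w REAL);
    CREATE TABLE training_examples (x REAL, y REAL);
    INSERT INTO model_params VALUES (0.0);
    INSERT INTO training_examples VALUES (1.0, 2.0), (2.0, 4.0);
""")

# The SQL command identifies the training examples and the current parameters;
# the engine executes the Python aggregate over them.
new_w = train_conn.execute("""
    SELECT refine_weight(t.x, t.y, p.w)
    FROM training_examples AS t CROSS JOIN model_params AS p
""").fetchone()[0]

# Store the refined parameter value back in the database.
train_conn.execute("UPDATE model_params SET w = ?", (new_w,))
train_conn.commit()

refined_w = train_conn.execute("SELECT w FROM model_params").fetchone()[0]
print(refined_w)
```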
However, Chaudhuri does not explicitly disclose the fourth user-defined function has been written using one or more programming languages that are different from the query language in which the fourth command is written.
Esmaeilzadeh discloses the fourth user-defined function has been written using one or more programming languages that are different from the query language in which the fourth command is written (Esmaeilzadeh: paragraph [0024], “generated for one or more user defined functions (UDF), expressed as a part of a query (e.g., an SQL query) using a domain-specific language (e.g., Python).”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “METHODS AND SYSTEMS FOR INTEGRATING MACHINE LEARNING/ANALYTICS ACCELERATORS AND RELATIONAL DATABASE SYSTEMS” as taught by Esmaeilzadeh, because it would provide Chaudhuri’s method with the enhanced capability of “an improved method and system for generating an architecture capable of efficiently executing a machine learning algorithm against large databases.” (Esmaeilzadeh: paragraph [0005]).
For claim 8, Chaudhuri, Esmaeilzadeh and DIRAC disclose the method of claim 1, wherein the query language is a declarative query language (Chaudhuri: paragraph [0029], “A query 102 in these systems begins by applying user-defined functions (UDFs) to extract relational columns from blobs. A query may include a plurality of elements, such as those elements that may be defined using a database query format, such as Structured Query Language (SQL).”
WHERE “the query language is a declarative query language” is broadly interpreted as “A query…such as Structured Query Language (SQL).”).
However, Chaudhuri does not explicitly disclose the one or more programming languages are imperative programming languages.
Esmaeilzadeh discloses the one or more programming languages are imperative programming languages (Esmaeilzadeh: paragraph [0024], “generated for one or more user defined functions (UDF), expressed as a part of a query (e.g., an SQL query) using a domain-specific language (e.g., Python).”
WHERE “imperative programming languages” is broadly interpreted as “Python”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “METHODS AND SYSTEMS FOR INTEGRATING MACHINE LEARNING/ANALYTICS ACCELERATORS AND RELATIONAL DATABASE SYSTEMS” as taught by Esmaeilzadeh, because it would provide Chaudhuri’s method with the enhanced capability of “an improved method and system for generating an architecture capable of efficiently executing a machine learning algorithm against large databases.” (Esmaeilzadeh: paragraph [0005]).
Claim 9 is a system claim having limitations similar to those cited in claim 1. Thus, claim 9 is also rejected under the same rationale as set forth in the rejection of claim 1.
Claim 11 is a system claim having limitations similar to those cited in claim 3. Thus, claim 11 is also rejected under the same rationale as set forth in the rejection of claim 3.
Claim 13 is a system claim having limitations similar to those cited in claim 5. Thus, claim 13 is also rejected under the same rationale as set forth in the rejection of claim 5.
Claim 14 is a system claim having limitations similar to those cited in claim 6. Thus, claim 14 is also rejected under the same rationale as set forth in the rejection of claim 6.
Claim 15 is a system claim having limitations similar to those cited in claim 7. Thus, claim 15 is also rejected under the same rationale as set forth in the rejection of claim 7.
Claim 16 is a system claim having limitations similar to those cited in claim 8. Thus, claim 16 is also rejected under the same rationale as set forth in the rejection of claim 8.
Claim 17 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 1. Thus, claim 17 is also rejected under the same rationale as set forth in the rejection of claim 1.
Claim 19 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 3. Thus, claim 19 is also rejected under the same rationale as set forth in the rejection of claim 3.
Claim 21 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 5. Thus, claim 21 is also rejected under the same rationale as set forth in the rejection of claim 5.
Claim 22 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 6. Thus, claim 22 is also rejected under the same rationale as set forth in the rejection of claim 6.
Claim 23 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 7. Thus, claim 23 is also rejected under the same rationale as set forth in the rejection of claim 7.
Claim 24 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 8. Thus, claim 24 is also rejected under the same rationale as set forth in the rejection of claim 8.

Claims 2, 4, 10, 12, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chaudhuri et al. (U.S. Pub. No.: 20190378028, hereinafter Chaudhuri), in view of Esmaeilzadeh et al. (U.S. Pub. No.: 20190287017, hereinafter Esmaeilzadeh), and further in view of DIRAC et al. (U.S. Pub. No.: 20150379430, hereinafter DIRAC) as in Claim 1, and further in view of Lawrence et al. (U.S. Patent No.: 6484163, hereinafter Lawrence).
For claim 2, Chaudhuri, Esmaeilzadeh and DIRAC disclose the method of claim 1, wherein:
the command comprises a plurality of model inputs; comprising processing each of the plurality of model inputs using the machine learning model on a respective node of the plurality of nodes (Chaudhuri: paragraph [0026], “when executed by the one or more computer processors, cause the one or more computer processors to perform operations comprising: receiving a query to search a database, the query comprising a predicate for filtering blobs in the database utilizing a user-defined-function (UDF), the filtering requiring analysis of the blobs by the UDF to determine if each blob passes the filtering specified by the predicate; determining a PP sequence of one or more probabilistic predicates (PP) based on the predicate…executing the database query over the blobs that have not been discarded, the database search utilizing the UDF; and providing results of the database search.” paragraph [0041], “During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs…” paragraph [0062], “To answer such queries, multiple machine learning UDFs, such as feature extractors, classifiers, etc., are applied to the input”). 
However, Chaudhuri, Esmaeilzadeh and DIRAC do not explicitly disclose executing the user-defined function further comprises executing the user-defined function on each of a plurality of nodes of the query engine. 
Lawrence discloses executing the user-defined function further comprises executing the user-defined function on each of a plurality of nodes of the query engine (Lawrence: column 5, lines 29-34, “…the UDF in which the data mining operation is embodied is then executed by each of the individual nodes…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “Technique For Data Mining Of Large Scale Relational Databases Using SQL” as taught by Lawrence, because it would provide Chaudhuri’s modified method with the enhanced capability of “a database mining technique which does not degrade performance and simplifies the application of a data mining algorithm relative to a database.” (Lawrence: paragraph [0005]).
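The arrangement addressed for claim 2, in which the same user-defined function is executed on each of a plurality of nodes and each node processes a respective model input, can be sketched as follows. This is an illustrative sketch only: worker threads stand in for the nodes of a query engine, and the function udf, its threshold, and the sample inputs are hypothetical rather than drawn from Chaudhuri or Lawrence.

```python
from concurrent.futures import ThreadPoolExecutor


def udf(model_input):
    """Hypothetical UDF wrapping a toy 'model': flag inputs whose features sum above 1.0."""
    return sum(model_input) > 1.0


# Plurality of model inputs carried by the command (made-up values).
model_inputs = [[0.2, 0.3], [0.9, 0.4], [1.5, 0.0]]

# One worker per input; each worker plays the role of a node (an assumption),
# and every worker runs the same UDF on its own respective input.
with ThreadPoolExecutor(max_workers=len(model_inputs)) as pool:
    outputs = list(pool.map(udf, model_inputs))

print(outputs)
```

Executor.map returns results in the order of the inputs, so each output lines up with the model input that produced it.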
For claim 4, Chaudhuri, Esmaeilzadeh and DIRAC disclose the method of claim 3, wherein executing the second user-defined function further comprises executing the second user-defined function on each of a plurality of nodes of the query engine, comprising: 
processing, by the query engine, the plurality of training examples using the machine learning model according to a respective different set of hyperparameter values (Chaudhuri: paragraph [0041], “Machine learning techniques train models to accurately make predictions on data fed into the models (e.g., what was said by a user in a given utterance; whether a noun is a person, place, or thing; what the weather will be like tomorrow). During a learning phase, the models are developed against a training dataset of inputs to optimize the models to correctly predict the output for a given input. Generally, the learning phase may be supervised, semi-supervised, or unsupervised, indicating a decreasing level to which the “correct” outputs are provided in correspondence to the training inputs. In a supervised learning phase, all of the outputs are provided to the model, and the model is directed to develop a general rule or algorithm that maps the input to the output. In contrast, in an unsupervised learning phase, the desired output is not provided for the inputs so that the model may develop its own rules to discover relationships within the training dataset. In a semi-supervised learning phase, an incompletely labeled training set is provided, with some of the outputs known and some unknown for the training dataset.” paragraph [0067], “To reduce the execution cost and latency of the machine learning queries, suppose that a filter may be applied directly to the raw input which discards input data that will not pass the original query predicate. Cost decreases because the UDFs following the filter only have to process inputs that pass the filter. A higher data reduction rate r of the filter leads to a larger possible performance improvement. The data reduction rate r refers to the percentage of data inputs that may be eliminated by the filter.” paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…” paragraph [0180], “PP has been trained for some value t”); 
determining, for each of the plurality of different sets of hyperparameter values, a measure of performance of the set of hyperparameter values (Chaudhuri: paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”); 
selecting a particular set of hyperparameter values from the plurality of different sets of hyperparameter values according to the determined measures of performance (Chaudhuri: paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0100], “It is to be noted that linear SVMs have pros and cons. Linear SVMs may be trained efficiently (see the table 802 in FIG. 8) and have a small cost of testing. However, linear SVMs yield a poor PP if the input blobs are not linearly separable; i.e., in such case, meeting the desired filtering accuracy results in a small data reduction.” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.”
paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.”
paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”); and 
generating the trained parameter values for the machine learning model according to the selected set of hyperparameter values (Chaudhuri: paragraph [0077], “It is possible to train PPs with different tuple values.” paragraph [0118], “FIG. 6 illustrates a query optimizer 604 that utilizes probabilistic predicates, according to some example embodiments. The query optimizer 604 takes, in addition to the query 102 and the input 410 database, two additional inputs: available trained PPs 510 and a desired accuracy threshold 610 for the query.” paragraph [0149], “FIG. 10 illustrates the generation of threshold values based on accuracy levels, according to some example embodiments. The threshold is the minimum possible value of (x) that provides the required accuracy on the training or the test set. Values larger than the threshold will provide the same or better accuracy.” paragraph [0152], “At training time, an array of thresholds th[a] is calculated, as discussed above with reference to equation (5) for different values of a, the desired accuracy. By calculating this array of thresholds th[a] it is possible to choose PPs based on the accuracy required at query optimization time, that is, based on the accuracy specified with the query…”).
However, Chaudhuri, Esmaeilzadeh and DIRAC do not explicitly disclose processing, by each of the plurality of nodes of the query engine. 
Lawrence discloses processing, by each of the plurality of nodes of the query engine (Lawrence: column 5, lines 29-34, “…the UDF in which the data mining operation is embodied is then executed by each of the individual nodes…”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to improve upon “ACCELERATING MACHINE LEARNING INFERENCE WITH PROBABILISTIC PREDICATES” as taught by Chaudhuri by implementing “Technique For Data Mining Of Large Scale Relational Databases Using SQL” as taught by Lawrence, because it would provide Chaudhuri’s modified method with the enhanced capability of “a database mining technique which does not degrade performance and simplifies the application of a data mining algorithm relative to a database.” (Lawrence: paragraph [0005]).
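The sequence of steps recited in claim 4 (processing the training examples under different sets of hyperparameter values, determining a measure of performance for each set, selecting one set, and generating trained parameter values from it) can be sketched generically. The sketch below is an assumption-laden illustration and not the method of any cited reference: the model is a one-parameter ridge regression, the single hyperparameter is the regularization strength lam, worker threads stand in for nodes, and all data values are made up.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up training and validation examples as (x, y) pairs with y = 2x.
TRAIN = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
VALID = [(4.0, 8.0), (5.0, 10.0)]


def fit(lam):
    """Closed-form ridge fit of y = w * x for one hyperparameter value."""
    sxy = sum(x * y for x, y in TRAIN)
    sxx = sum(x * x for x, _ in TRAIN)
    return sxy / (sxx + lam)


def evaluate(lam):
    """Train under lam, then return (validation MSE, lam) as the performance measure."""
    w = fit(lam)
    mse = sum((w * x - y) ** 2 for x, y in VALID) / len(VALID)
    return mse, lam


# Different sets of hyperparameter values, one per simulated node.
candidates = [0.0, 1.0, 10.0]

with ThreadPoolExecutor(max_workers=len(candidates)) as pool:
    results = list(pool.map(evaluate, candidates))

# Select the set with the lowest validation error.
best_mse, best_lam = min(results)

# Generate the trained parameter value using the selected hyperparameter.
trained_w = fit(best_lam)

print(best_lam, trained_w)
```

Each candidate is trained and scored independently, which is what allows the work to be spread across separate workers before a single selection step.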
Claim 10 is a system claim having limitations similar to those cited in claim 2. Thus, claim 10 is also rejected under the same rationale as set forth in the rejection of claim 2.
Claim 12 is a system claim having limitations similar to those cited in claim 4. Thus, claim 12 is also rejected under the same rationale as set forth in the rejection of claim 4.
Claim 18 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 2. Thus, claim 18 is also rejected under the same rationale as set forth in the rejection of claim 2.
Claim 20 is a computer product (non-transitory computer storage media) claim having limitations similar to those cited in claim 4. Thus, claim 20 is also rejected under the same rationale as set forth in the rejection of claim 4.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YU ZHAO, whose telephone number is (571) 270-3427. The examiner can normally be reached Monday-Friday, 9 AM-5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

YU ZHAO
Primary Examiner
Art Unit 2169



/YU ZHAO/           Primary Examiner, Art Unit 2169