DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to the application filed on 12/30/2019.
Claims 1-20 are pending and have been considered below.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/30/2019 is being considered by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 20070033160 to Santosuosso in view of “A Machine Learning Approach for Predicting Execution Time of Spark Jobs” to Mustafa.

Per claims 1, 14 and 20, Santosuosso teaches a method for monitoring database processes to generate machine learning predictions, the method comprising:
monitoring a plurality of database processes executed on database implementations, wherein the monitoring comprises determining a start time, an end time, and a number of rows impacted by portions of the database processes, and the monitored database processes generate instances of machine learning data comprising at least the number of rows impacted and an associated duration of time (see at least paragraph [0047] “when a query is executed, the start time and stop time are noted to determine the time to execute the query 930. The number of records affected 940 by the query can also be used to estimate future update times…”); and
predicting, using the method then gets the stop time (step 830). The start time and stop time are relative time marks that are typically available to the computer processor. The start time and stop time are then used to compute a time 930 to execute the query, which is stored in the historical data record 900 described above with reference to FIG. 9. The method also retrieves the record count of records that are pending in the staging table (step 840) for the executed query…”).
	
Santosuosso does not explicitly teach using a machine learning component.
However, Mustafa teaches a machine learning model for predicting the execution time of SQL queries (see page 3768, col. 1 “In this paper, we extend Spark by adding to it a new component that accurately predicts the execution time of Spark jobs using a machine learning model. We make sure that our proposed platform is capable of predicting the execution time for Spark SQL queries and iterative machine learning applications…”).
It would have been obvious to a person of ordinary skill in the art, as of the effective filing date of the claimed invention, to modify the teaching of Santosuosso to incorporate the teaching of Mustafa, using a machine learning model to predict the execution time of SQL queries, in order to predict that execution time with high accuracy.

Per claims 2 and 15, Santosuosso further teaches
wherein one or more of the monitored database processes comprise a plurality of database sub-processes (see at least paragraph [0026] “…The expression in the WHERE clause of FIG. 3 is shown in FIG. 4. Where not specifically stated herein, the term "expression" is intended to mean an arbitrary predicate expression, which can be an entire expression in a query a portion of an expression in a query...”), and the monitoring comprises logging an end time and a number of rows impacted upon executing database sub-processes of the database processes (see at least paragraph [0051] “…The start time and stop time are then used to compute a time 930 to execute the query, which is stored in the historical data record 900 described above with reference to FIG. 9…”).

Per claims 3 and 16, Santosuosso further teaches
wherein one or more of the monitored database processes comprise a plurality of database steps, and the monitoring comprises logging an end time and a number of rows impacted upon executing database steps of the database processes (see at least paragraph [0051] “…The start time and stop time are then used to compute a time 930 to execute the query, which is stored in the historical data record 900 described above with reference to FIG. 9. The method also retrieves the record count of records that are pending in the staging table (step 840) for the executed query. The accumulated data is then stored in the historical data record (step 850)…”).  

Per claims 4 and 17, Santosuosso further teaches
wherein one or more of the monitored database processes comprise a plurality of SQL statements, and the monitoring comprises logging an end time and a number of rows impacted upon executing SQL statements of the database processes (see at least paragraph [0051] “The method first gets the start time (step 810), prior to executing the query (step 820). The query is executed in the manner known in the prior art. After completing the execution of the query, the method then gets the stop time (step 830). The start time and stop time are relative time marks that are typically available to the computer processor.  The start time and stop time are then used to compute a time 930 to execute the query, which is stored in the historical data record 900 described above with reference to FIG. 9. The method also retrieves the record count of records that are pending in the staging table (step 840) for the executed query.  The accumulated data is then stored in the historical data record (step 850)…”).
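For illustration only, the sequence quoted from paragraph [0051] (get start time, execute, get stop time, compute the time to execute, retrieve a record count, store the accumulated data) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Santosuosso or from the application: the function name `run_and_record`, the dictionary keys, the `staging` table, and the use of Python's `sqlite3` module are all hypothetical, and the cursor's row count stands in for the record count of step 840.

```python
import sqlite3
import time


def run_and_record(conn, sql, params=()):
    """Execute one statement and return a history record (hypothetical layout)."""
    start = time.perf_counter()        # step 810: get the start time
    cur = conn.execute(sql, params)    # step 820: execute the query
    conn.commit()
    stop = time.perf_counter()         # step 830: get the stop time
    return {
        "query": sql,                          # query or query identifier
        "time_to_execute": stop - start,       # computed from start and stop
        "records_affected": cur.rowcount,      # assumed stand-in for step 840
    }


# Hypothetical in-memory table used only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO staging (val) VALUES (?)", [("a",), ("b",), ("c",)])
conn.commit()

# Step 850: store the accumulated data in a history list.
history = [run_and_record(conn, "UPDATE staging SET val = 'x' WHERE id <= ?", (2,))]
```

Each entry in `history` pairs a statement with its measured duration and affected-row count, which is the kind of pairing the claim language refers to.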

Per claims 5 and 18, Santosuosso further teaches
wherein the generated machine learning data comprises a set of SQL statements, a duration of time for each SQL statement in the set, and a number of database rows impacted for each SQL statement in the set (see at least paragraph [0051] “The method first gets the start time (step 810), prior to executing the query (step 820). The query is executed in the manner known in the prior art. After completing the execution of the query, the method then gets the stop time (step 830). The start time and stop time are relative time marks that are typically available to the computer processor.  The start time and stop time are then used to compute a time 930 to execute the query, which is stored in the historical data record 900 described above with reference to FIG. 9. The method also retrieves the record count of records that are pending in the staging table (step 840) for the executed query.  The accumulated data is then stored in the historical data record (step 850)…”).  

Per claim 6, Santosuosso further teaches
 wherein the candidate database process comprises a plurality of SQL statements processes (see at least paragraph [0051] “…The start time and stop time are then used to compute a time 930 to execute the query, which is stored in the historical data record 900 described above with reference to FIG. 9…”).

Per claim 7, Mustafa further teaches
wherein the machine learning component is trained using the machine learning data generated by the monitoring, and the trained machine learning component is used to predict the duration of time for the candidate database process (see at least page 3769, col.2 “…we estimate the execution time of each stage using the trained prediction models…”). 
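For illustration only, training a model on logged (rows impacted, duration) pairs and then predicting a duration for a candidate process can be sketched as follows. This is a minimal sketch under stated assumptions: ordinary least-squares linear regression is swapped in as a stand-in and is not asserted to be the model used by Mustafa or by the application, and the sample data and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(rows, durations):
    """Fit duration = slope * rows + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(rows) / n
    mean_y = sum(durations) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rows, durations))
    sxx = sum((x - mean_x) ** 2 for x in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, rows_impacted):
    """Predict a duration for a candidate process from its expected row count."""
    slope, intercept = model
    return slope * rows_impacted + intercept


# Hypothetical training data: rows impacted and observed durations in seconds.
logged_rows = [100, 200, 300, 400]
logged_durations = [1.0, 2.0, 3.0, 4.0]

model = fit_line(logged_rows, logged_durations)
estimate = predict(model, 500)
```

The sketch uses a single feature (rows impacted) only because that is the feature named in the claims; a practical model would likely use more.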

Per claim 8, Santosuosso further teaches
The data that is stored in the historical record includes a copy of the query or an appropriate query identifier 910, a time stamp 920, the time for the query to execute 930, the number of records affected by the query 940, a trigger query status 950, and a trigger ID 960”).  
Santosuosso does not explicitly teach “the machine learning component comprises an unsupervised machine learning component.”
However, Mustafa teaches that the machine learning component comprises an unsupervised machine learning component (see page 3768, col. 1 “In this paper, we extend Spark by adding to it a new component that accurately predicts the execution time of Spark jobs using a machine learning model. We make sure that our proposed platform is capable of predicting the execution time for Spark SQL queries and iterative machine learning applications…”).
It would have been obvious to a person of ordinary skill in the art, as of the effective filing date of the claimed invention, to modify the teaching of Santosuosso to incorporate the teaching of Mustafa, using a machine learning model to predict the execution time of SQL queries, in order to predict that execution time with high accuracy.

Per claims 9 and 19, Santosuosso further teaches
wherein, for at least a subset of the executed database processes, the execution of the subset of database processes is performed using a first database connection (see at least paragraph [0026] “…The expression in the WHERE clause of FIG. 3 is shown in FIG. 4. Where not specifically stated herein, the term "expression" is intended to mean an arbitrary predicate expression, which can be an entire expression in a query a portion of an expression in a query...”), and the monitoring of the subset database processes, logging, and generation of the machine learning data is performed using a second database connection that is different from the first database connection (see at least paragraph [0045] “…historical data from previous queries to more accurately estimate the time to update the MQT. FIG. 9 shows an example of a historical data record 900. A historical data record 900 is stored for queries corresponding to an MQT. The data that is stored in the historical record includes a copy of the query or an appropriate query identifier 910, a time stamp 920, the time for the query to execute 930, the number of records affected by the query 940, a trigger query status 950, and a trigger ID 960…”).

Per claim 10, Santosuosso further teaches
wherein the logging of start times, end times, and numbers of rows impacted is achieved by writing data to a log database table using the second database connection (see at least paragraph [0045] “…historical data from previous queries to more accurately estimate the time to update the MQT. FIG. 9 shows an example of a historical data record 900. A historical data record 900 is stored for queries corresponding to an MQT. The data that is stored in the historical record includes a copy of the query or an appropriate query identifier 910, a time stamp 920, the time for the query to execute 930, the number of records affected by the query 940, a trigger query status 950, and a trigger ID 960…”).
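For illustration only, the claim language of executing on one connection while writing start times, end times, and row counts to a log table over a different connection can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Santosuosso or from the application: the table names `orders` and `process_log`, the function `execute_and_log`, and the use of two separate in-memory `sqlite3` databases are hypothetical.

```python
import sqlite3
import time

# First connection: executes the monitored statements (hypothetical schema).
work_conn = sqlite3.connect(":memory:")
work_conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
work_conn.executemany("INSERT INTO orders (status) VALUES (?)", [("new",)] * 4)
work_conn.commit()

# Second, distinct connection: used only for writing the log table.
log_conn = sqlite3.connect(":memory:")
log_conn.execute(
    "CREATE TABLE process_log ("
    "statement TEXT, start_time REAL, end_time REAL, rows_impacted INTEGER)"
)


def execute_and_log(sql):
    """Run sql on the work connection, then log timing and row count separately."""
    start = time.time()
    cur = work_conn.execute(sql)
    work_conn.commit()
    end = time.time()
    log_conn.execute(
        "INSERT INTO process_log VALUES (?, ?, ?, ?)",
        (sql, start, end, cur.rowcount),
    )
    log_conn.commit()  # the log write commits independently of the work connection


execute_and_log("UPDATE orders SET status = 'shipped' WHERE id <= 3")
logged = log_conn.execute(
    "SELECT statement, rows_impacted FROM process_log"
).fetchall()
```

Because the log write is committed on its own connection, it does not depend on the transaction state of the connection that executes the monitored statements.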

Per claim 11, Santosuosso further teaches
wherein at least one of the monitored database processes comprises a plurality of SQL statements, a subset of the plurality of SQL statements are flagged for monitoring, and start times, end times, and numbers of rows impacted are logged for the subset of the plurality of SQL statements flagged for monitoring (see at least FIG. 9).

Per claim 12, Santosuosso further teaches
wherein the generated instances of machine learning data comprise a number of rows impacted and an associated duration of time for executed SQL statements flagged for monitoring (see at least paragraph [0045] “…historical data from previous queries to more accurately estimate the time to update the MQT. FIG. 9 shows an example of a historical data record 900. A historical data record 900 is stored for queries corresponding to an MQT. The data that is stored in the historical record includes a copy of the query or an appropriate query identifier 910, a time stamp 920, the time for the query to execute 930, the number of records affected by the query 940, a trigger query status 950, and a trigger ID 960…”).

Per claim 13, Mustafa further teaches
wherein the machine learning component is configured to predict the duration of time for the candidate database process independent of hardware used to execute the candidate database process (see at least page 3768, col.1 “The main contributions of this paper are as follows: A new platform to predict with high accuracy the execution time of Spark SQL queries and Spark general applications such as machine learning (Section 2).  An implementation of the prediction platform as extension to Apache Spark…”).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
“Predicting Workflow Task Execution Time in the Cloud Using A Two-Stage Machine Learning Approach” to Pham;
“Towards Predicting Query Execution Time for Concurrent and Dynamic Database Workloads” to Wu; and
“Predicting Query Execution Time using Statistical Techniques” to Pasunuru.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHILLIP H NGUYEN whose telephone number is (571)270-1070. The examiner can normally be reached Monday-Friday 9:00AM-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen, can be reached at (571) 272-3708. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/PHILLIP H NGUYEN/
Primary Examiner, Art Unit 2191