DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office action is in response to the amendment filed on 04/08/2022.
As per the instant Examiner's Amendment, Claims 1, 5, 8, 12, 15 and 19 have been amended. Claims 1, 8 and 15 are independent. Claims 1, 3, 5-6, 8, 10, 12-13, 15, 17 and 19 have been examined and are pending in this application. Claims 1, 3, 5-6, 8, 10, 12-13, 15, 17 and 19 are allowed.

Examiner Amendments

In an attempt to accelerate the prosecution process, the Examiner contacted the Applicant’s representative, Mr. Robert J. Kroczynski (Reg. No. 53160), and conducted a telephone interview on 05/06/2022. During the interview, the Examiner and the Applicant's representative discussed proposed amendments to further clarify the claimed invention and to distinguish the claimed invention over the prior art of record. The Examiner suggested rolling up Claim 2 and Claim 7 into Claim 1, mirroring amended Claim 1 in the other independent claims, and cancelling Claims 2, 9 and 16 and Claims 7, 14 and 20 after rolling them up into the independent claims, in order to put the application in condition for allowance. Mr. Kroczynski agreed to and authorized the Examiner’s amendment.
An Examiner's Amendment to the record appears below. Should the changes and/or additions be unacceptable to Applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.


Claims:

Please replace claims 1, 3, 5-6, 8, 10, 12-13, 15, 17 and 19 as follows:

Claim 1.	(Currently Amended)   A computer implemented provenance-based threat detection method, comprising:
building a provenance graph including a plurality of paths using a processor device from provenance data obtained from one or more computer systems and/or networks, wherein the provenance graph is built by collecting the provenance data using hook functions that intercept operating system calls;
sampling the provenance graph to form a plurality of linear sample paths;  
calculating a regularity score for each of the plurality of linear sample paths using a processor device;
selecting a subset of linear sample paths from the plurality of linear sample paths based on the regularity score;
embedding each of the subset of linear sample paths by converting each of the subset of linear sample paths into a numerical vector using a processor device;  
detecting anomalies in the embedded paths to identify malicious process activities, wherein the anomalies in the embedded paths are detected using an anomaly detection model that is configured to identify malicious activity, wherein the anomaly detection model is selected from the group consisting of one-class support vector machine (OC-SVM) and Local Outlier Factor (LOF); and   
terminating a process related to the embedded path having the identified malicious process activities.  

Claim 2.	(Cancelled).     

Claim 3.	(Original)   The method as recited in claim 1, wherein selecting a subset of linear sample paths addresses a dependency explosion problem. 

Claim 4.	(Cancelled).     

Claim 5.	(Currently Amended)   The method as recited in claim [[4]]1, wherein the anomaly detection model is trained using a benign training data set.

Claim 6.	(Original)   The method as recited in claim 5, wherein embedding each of the plurality of paths is done using graph2vec or doc2vec.  

Claim 7.	(Cancelled).    

Claim 8.	(Currently Amended)   A non-transitory computer readable storage medium comprising a computer readable program for a computer implemented provenance-based threat detection tool, wherein the computer readable program when executed on a computer causes the computer to perform the steps of:
building a provenance graph including a plurality of paths using a processor device from provenance data obtained from one or more computer systems and/or networks, wherein the provenance graph is built by collecting the provenance data using hook functions that intercept operating system calls;
sampling the provenance graph to form a plurality of linear sample paths;  
calculating a regularity score for each of the plurality of linear sample paths using a processor device;
selecting a subset of linear sample paths from the plurality of linear sample paths based on the regularity score;
embedding each of the subset of linear sample paths by converting each of the subset of linear sample paths into a numerical vector using a processor device;  
detecting anomalies in the embedded paths to identify malicious process activities, wherein the anomalies in the embedded paths are detected using an anomaly detection model that is configured to identify malicious activity, wherein the anomaly detection model is selected from the group consisting of one-class support vector machine (OC-SVM) and Local Outlier Factor (LOF); and   
terminating a process related to the embedded path having the identified malicious process activities.  

Claim 9.	(Cancelled).     

Claim 10.	(Original)   The method as recited in claim 8, wherein selecting a subset of linear sample paths addresses a dependency explosion problem.

Claim 11.	(Cancelled).   

Claim 12.	(Currently Amended)   The computer readable program as recited in claim [[11]] 8, wherein the anomaly detection model is trained using a benign training data set.

Claim 13.	(Original)   The computer readable program as recited in claim 12, wherein embedding each of the plurality of paths is done using graph2vec or doc2vec.  

Claim 14.	(Cancelled).     

Claim 15. 	(Currently Amended)   A system for provenance-based threat detection, comprising:
a computer system including: 
random access memory configured to store a provenance-based threat detection tool;
one or more processor devices and an operating system having a kernel, wherein one or more hook functions operating in the kernel are configured to collect provenance data using the hook functions that intercept operating system calls; and
a database configured to store the provenance data collected by the one or more hook functions, wherein the provenance-based threat detection tool is configured to:  
build a provenance graph including a plurality of paths using the one or more processor devices from provenance data obtained from the computer systems and/or a network;
sample the provenance graph to form a plurality of linear sample paths;  
calculate a regularity score for each of the plurality of linear sample paths using the one or more processor devices;  
select a subset of linear sample paths from the plurality of linear sample paths based on the regularity score;  
embed each of the subset of linear sample paths by converting each of the subset of linear sample paths into a numerical vector using the one or more processor devices;  
detect anomalies in the embedded paths to identify malicious process activities, wherein the anomalies in the embedded paths are detected using an anomaly detection model that is configured to identify malicious activity, wherein the anomaly detection model is selected from the group consisting of one-class support vector machine (OC-SVM) and Local Outlier Factor (LOF); and   
terminate a process related to the embedded path having the identified malicious process activities.  

Claim 16.	(Cancelled).

Claim 17.	(Original)   The system as recited in claim 15, wherein selecting a subset of linear sample paths addresses a dependency explosion problem.

Claim 18.	(Cancelled).     

Claim 19.	(Currently Amended)   The system as recited in claim [[18]]15, wherein the anomaly detection model is trained using a benign training data set.

Claim 20.	(Cancelled).     
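
For illustration only, the sampling, scoring and selecting steps recited in the independent claims might be sketched as follows. This is a minimal sketch and not the Applicant's implementation: the edge data, the benign edge counts, the function names, and the scoring formula (a mean log of smoothed edge frequency standing in for the claimed regularity score, which is not defined in this action) are all assumptions. The sketch also enumerates every source-to-sink path of a small acyclic graph rather than sampling a large one.

```python
from collections import Counter
from math import log

# Hypothetical provenance edges: (source entity, operation, destination entity).
EDGES = [
    ("winword.exe", "start", "cmd.exe"),
    ("cmd.exe", "start", "powershell.exe"),
    ("powershell.exe", "write", "payload.bin"),
    ("winword.exe", "read", "report.docx"),
]

# Hypothetical counts of how often each edge was seen in benign history.
BENIGN_COUNTS = Counter({
    ("winword.exe", "read", "report.docx"): 500,
    ("winword.exe", "start", "cmd.exe"): 2,
})


def linear_paths(edges):
    """Enumerate source-to-sink paths of an acyclic graph as lists of edges."""
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge[0], []).append(edge)
    destinations = {dst for _, _, dst in edges}
    roots = [src for src in outgoing if src not in destinations]
    paths = []

    def walk(node, prefix):
        if node not in outgoing:      # sink reached: the path is complete
            paths.append(prefix)
            return
        for edge in outgoing[node]:
            walk(edge[2], prefix + [edge])

    for root in roots:
        walk(root, [])
    return paths


def regularity(path, counts, total):
    """Assumed score: mean log of the add-one-smoothed edge frequency."""
    return sum(log((counts[e] + 1) / (total + 1)) for e in path) / len(path)


def select_rarest(paths, counts, k):
    """Keep the k paths with the lowest (least regular) score."""
    total = sum(counts.values())
    return sorted(paths, key=lambda p: regularity(p, counts, total))[:k]
```

Under these assumptions, calling `select_rarest(linear_paths(EDGES), BENIGN_COUNTS, 1)` keeps the path that ends in the write to `payload.bin`, because its edges are rare in the hypothetical benign history, and discards the frequently seen document read.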


Response to Arguments/Remarks
Claims 1, 3, 5-6, 8, 10, 12-13, 15, 17 and 19 are allowed.

Examiner’s Statement of Reasons for Allowance
Claims 1, 3, 5-6, 8, 10, 12-13, 15, 17 and 19 are allowed.
The following is an examiner’s statement of reasons for allowance: 
The present invention is directed to building a provenance graph including a plurality of paths, wherein the provenance graph is built by collecting provenance data using hook functions that intercept operating system calls; sampling the provenance graph to form a plurality of linear sample paths and calculating a regularity score for each of the sample paths; selecting a subset of the sample paths and converting each of the selected sample paths into a numerical vector using a processor device; and detecting anomalies in the embedded paths and terminating a process related to the embedded path having the identified malicious process activities.
The closest prior art of record, as previously cited, is PURI (US 20170324759), Kvasyuk (US 20210006471), Edwards (US 20210004458), KARUS (US 20200285737) and Lee (US 11091020). PURI discloses sampling-based path decomposition and anomaly detection, which may include evaluating computer-generated log file data to generate a master network graph that specifies known events and transitions between the known events, and decomposing the master network graph to generate a representative network graph that includes a reduced number of paths of the master network graph. A source may be monitored to determine a cyber security threat by receiving incoming log file data related to the source. Kvasyuk discloses a directed graph that represents devices in the network. The device simulates traffic for one or more of the devices by performing random walks starting at a particular node on the directed graph to generate a set of trails, each trail representing a sequence of one or more flows. The device clusters the set of trails to form one or more clusters. Edwards discloses identifying a malicious process; constructing a genealogical process tree of the malicious process, the genealogical process tree including both vertical direct inheritance and horizontal indirect inheritance relationships; and terminating the malicious process. KARUS discloses a machine learning model similarity function that measures anomalousness of a candidate sequence relative to a specified history, thus computing an anomaly score. Lee discloses a processor configured to receive, from a compute device associated with an entity, raw data for a current time period. The processor may be configured to process the raw data by removing anomalous data to produce processed data.
However, none of PURI (US 20170324759), Kvasyuk (US 20210006471), Edwards (US 20210004458), KARUS (US 20200285737) and Lee (US 11091020) teaches or suggests, alone or in combination, the particular combination of steps or elements as recited in independent Claim 1 and similarly in Claim 8 and Claim 15. For example, none of the cited prior art teaches or suggests the steps of Claim 1, and similarly of Claim 8 and Claim 15: building a provenance graph including a plurality of paths using a processor device from provenance data obtained from one or more computer systems and/or networks, wherein the provenance graph is built by collecting the provenance data using hook functions that intercept operating system calls; sampling the provenance graph to form a plurality of linear sample paths; calculating a regularity score for each of the plurality of linear sample paths using a processor device; selecting a subset of linear sample paths from the plurality of linear sample paths based on the regularity score; embedding each of the subset of linear sample paths by converting each of the subset of linear sample paths into a numerical vector using a processor device; detecting anomalies in the embedded paths to identify malicious process activities, wherein the anomalies in the embedded paths are detected using an anomaly detection model that is configured to identify malicious activity, wherein the anomaly detection model is selected from the group consisting of one-class support vector machine (OC-SVM) and Local Outlier Factor (LOF); and terminating a process related to the embedded path having the identified malicious process activities.

Therefore the claims are allowable over the cited prior art.
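
For illustration only, the embedding and anomaly detection steps recited above might be sketched as follows. This is a minimal sketch under stated assumptions and not the Applicant's implementation: the operation vocabulary, the benign training vectors and all function names are hypothetical, a simple count of operations stands in for the graph2vec or doc2vec embedding named in the dependent claims, and the Local Outlier Factor is written out in plain Python and scored against a benign training set only. The terminating step is omitted.

```python
from math import dist

VOCABULARY = ["read", "write", "start"]  # hypothetical operation types


def embed(operations, vocabulary=VOCABULARY):
    """Stand-in embedding: count how often each operation occurs in a path."""
    return [operations.count(term) for term in vocabulary]


def k_nearest(point, data, k, skip=None):
    """Return the k smallest (distance, index) pairs, ignoring index `skip`."""
    ranked = sorted((dist(point, other), i)
                    for i, other in enumerate(data) if i != skip)
    return ranked[:k]


def local_outlier_factor(query, train, k=2):
    """LOF of `query` measured against a benign training set."""
    def k_distance(i):
        # Distance from training point i to its k-th nearest other point.
        return k_nearest(train[i], train, k, skip=i)[-1][0]

    def density(neighbours):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(d, k_distance(i)) for d, i in neighbours]
        return len(reach) / max(sum(reach), 1e-12)

    neighbours = k_nearest(query, train, k)
    neighbour_density = [density(k_nearest(train[i], train, k, skip=i))
                         for _, i in neighbours]
    return sum(neighbour_density) / (len(neighbour_density) * density(neighbours))


# Hypothetical benign training vectors (read, write, start counts).
BENIGN = [[3, 1, 0], [2, 1, 0], [3, 2, 0], [2, 2, 0]]

benign_like = embed(["read", "read", "write"])
suspicious = embed(["start"] * 5 + ["write"])
```

Under these assumptions, a score near 1 from `local_outlier_factor(benign_like, BENIGN)` indicates a path that resembles the benign set, while a clearly larger score for `suspicious` marks it as an outlier; the cut-off between the two is a tuning choice that this action does not specify.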

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHAO WANG whose telephone number is (313)446-6644.  The examiner can normally be reached on Monday-Friday 7:30-4:30PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Luu Pham, can be reached at (571)270-5002. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  
For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


	/C.W./Examiner, Art Unit 2439   



/LUU T PHAM/Supervisory Patent Examiner, Art Unit 2439