Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
This application claims the benefit of U.S. provisional patent application No. 62/212,541 filed on Aug. 31, 2015, and titled “Network Security System,” which is incorporated by reference herein in its entirety.
DETAILED ACTION
This Office Action is in response to an amendment received on 08/17/2022. In the amendment, claims 1, 25 and 28 have been amended. Claims 9-10 and 17-18 remain cancelled. Claims 2-8, 11-16, 19-24, 26-27 and 29-31 remain original.
For this Office Action, claims 1-8, 11-16 and 19-31 have been received for consideration and have been examined. 
Response to Arguments
Claim Rejections – 35 USC § 112
	Applicant’s amendments to the independent claims have been reviewed by the examiner and found to be persuasive. Therefore, this rejection has been withdrawn.
Claim Rejections – 35 USC § 103
Applicant’s arguments, filed 08/17/2022, with respect to the rejection(s) of claim(s) under 35 USC § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of new amendments to the independent claims.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-8, 11-16, 19-22 and 24-31 are rejected under 35 U.S.C. 103 as being unpatentable over Engel et al. (US20140165207A1) in view of Raugas et al. (US20150128263A1), in view of the NPL reference titled “Making Real Time Data Analytics Available as a Service,” published May 04-08, 2015 (hereinafter referred to as NPL), and further in view of Kapoor et al. (US20070192863A1).
Regarding claim 1, Engel discloses:
A method comprising: 
in a real-time detection mode: 
inputting, to a real-time (See [0021] i.e. online detection of anomalous network) anomaly decision engine ([0012] FIG. 6 illustrates an anomaly detection module), feature sets of event data (See [0082] i.e. sensors 110), the feature sets of event data produced based on first raw event data ([0088] the anomaly detection module 200 may receive raw data from one or more sensors) originating from a plurality of data sources on a computer network ([0014] The present invention discloses a method for detecting anomalous action within a computer network. The method comprises the steps of: [0021] online or batch detection of anomalous network actions associated with entities based on the statistical models; [0082] the sensors 110 may collect data from several places in the computer network 100 and after analysis of the collected data the sensors 110 may send the data to an anomaly detection module 175; [0088] the anomaly detection module 200 may receive raw data from one or more sensors); 
causing the real-time anomaly decision engine to detect a first network security anomaly (i.e., receiving raw event data in stage 310 of FIG. 3), based on the feature sets of event data, in real time as the feature sets of event data are produced based on the first raw event data ([0051] discloses that an anomaly detection module is used for online or batch detection of anomalies of actions associated with entities; [0110] FIG. 3; Stage 310 discloses receiving raw data from sensors); and 
training, based on the feature sets of event data ([0100] According to some embodiments of the present invention, a training process is performed automatically over multiple time periods, preforming statistical analysis of network actions at each period. The training process continues until a statistically significant stabilization of the statistical model is reached); 
in a batch detection mode: 
inputting, to a batch anomaly decision engine (See [0021] i.e. batch detection of anomalous network actions), stored feature sets of event data, the stored feature sets of event data based on second raw event data originating from the plurality of data sources on the computer network ([0014] The present invention discloses a method for detecting anomalous action within a computer network. The method comprises the steps of: [0021] online or batch detection of anomalous network actions associated with entities based on the statistical models; [0103] anomalies can be detected by finding specific entities that differ in their behavior from the majority of other entities in the computer network which have similar functionality, or finding actions that differ from the majority of actions in their characteristics. This method works on a batch of data and detects the anomalies rather than compare a specific action to a model; [0128] According to some embodiments of the present invention, statistical modeling module may begin with receiving detailed entities actions related data including identity of entity over time from the association module activity (stage 510); [0150] anomalies can be detected by finding specific entities that differ in their behavior from the majority of other entities in the computer network, or finding actions that differ from the majority of actions in their characteristics and their associated entities (stage 630). This method works on a batch of data and detects the anomalies between entities or actions rather than compare a specific action to a model); 
wherein the batch anomaly decision engine [statistical modeling module] sets the variable time slice [generates statistical models over multiple periods of time] parameter associated with the machine learning anomaly model to a time period length that is longer than a single event to enable the machine learning anomaly model process the variable time slice on a grouping stored events ([0133] the statistical modeling module may maintain statistics of protocol and entities usage/pattern behavior over multiple time periods for each entity (stage 525). For example over the last hour, over the last day, last week, last month, or last year. Some changes or anomalies are relevant when something happens in one minute (for example a large number of connections originating from one computer), and other anomalies are relevant in longer timespans (an aggregate number of failed connections to the same server over 1 week). The level of detail can vary between the different time periods to maintain a manageable dataset. For example on a 1-year timespan the average number of connections will be saved for each month and not each specific connection);
causing the batch anomaly decision engine to detect a second network security anomaly (i.e., analyzing logs to extract relevant network action related data; FIG. 3 Stage 320), by using the machine learning anomaly model, based on the stored feature sets of event data ([0051] discloses that an anomaly detection module is used for online or batch detection of anomalies of actions associated with entities; [0099] the anomaly detection module 270 receives information regarding actions in the computer network and identifies anomalous behavior by comparing actual network actions with the statistical model; [0111] the condenser module may analyze logs [which is batch data] to extract relevant computer network action related data (stage 320)); and 
training based on the stored feature sets of event data ([0100] According to some embodiments of the present invention, a training process is performed automatically over multiple time periods, preforming statistical analysis of network actions at each period. The training process continues until a statistically significant stabilization of the statistical model is reached); 
outputting, to a threat decision engine, anomaly data indicative of the first network security anomaly and the second network security anomaly ([0099] The anomalies may be sent to a decision engine 280. The purpose of the decision engine 280 is to aggregate relevant anomalies together and create incidents. The incidents may be reported as notifications 285 regarding anomaly action or an attack activity); 
causing the threat decision engine to detect, based on the anomaly data, a network security threat ([0104] According to other embodiments of the present invention, the decision engine 280, may analyze several anomaly actions and generate incidents/alerts based on identified anomalies according to predefined rules); and 
causing output (i.e., notification), via a user interface, of threat data indicative of the network security threat ([0101] The notifications 285 may be sent to a manual inspection 297. The manual inspection 297 may determine if an action is false positive or not and the feedback (299) of the manual inspection may be sent to the statistical models database 265);
wherein detecting the first network security anomaly, detecting the second network security anomaly or detecting the network security threat is performed by using a task-parallel distributed processing engine (see FIG. 7; [0152] According to some embodiments of the present invention, the decision engine module receives specific information on anomalies in the computer network (stage 710). Next, the decision engine module may be creating incidents by aggregating and clustering related anomalies based on specified parameters (stage 715) and then analyzing and ranking the incidents (stage 720)). 
Engel fails to disclose:
	using a machine learning anomaly model to train; wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a score indicative of a detected anomaly; wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a second score indicative of a detected anomaly and wherein the real-time anomaly decision engine set a variable time slice parameter associated with the machine learning anomaly model to event-by-event; wherein the threat decision engine comprises a plurality of machine learning threat models, each including logic to detect a different type of threat; training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment.
However, Raugas discloses:
	using a machine learning anomaly model to train ([0020] A system, method, medium, or computer based product may provide tools for real-time detection and classification of advanced malware using supervised machine learning; [0022] Using one or more methods of feature extraction, the network samples may be prepared for scoring. Subsequently, a specific machine learning algorithm may use models trained a priori against specific and/or known classes of malware);
	wherein the machine learning anomaly model is configured to process a variable time slice (i.e., time window) of data to produce a score indicative of a detected anomaly (See Abstract: Methods, system, and media for detecting malware are disclosed. A network may be monitored for a configured time interval collecting all of or some of the network traffic or samples of the network traffic; [0022] a specific machine learning algorithm may use models trained a priori against specific and/or known classes of malware to compute rankable scores (for a given time window). The scores may indicate whether particular hosts or network devices exhibit network behavior most similar to a particular class of malware)
wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a second score indicative of a detected anomaly ([0070] Using the generated score 130, a computing device may, for each time interval/window; [0071] Blocks 225, 226, 227, and 228 depict one or more machine learning models from a plurality of machine learning models being applied to the features 210 and generating a set of scores, 235, 236, 237 and 238), and
	wherein the real-time anomaly decision engine set a variable time slice parameter associated with the machine learning anomaly model to event-by-event ([0027] Network traffic may be monitored by monitoring device 135 during a time configurable time interval. The time interval may be specified in a configuration file, by an administrator, or by a user, and may correspond to the window over which feature vectors are constructed; [0028] In one or more embodiments, features may comprise only a subset of monitored or examined network data computed during the configurable time interval (“window”); Raugas further teaches the “event-by-event” claim element as “feature sets,” which are disclosed in paragraphs [0029-0063], in which different features of network data are extracted by machine learning models to produce scores);
wherein the threat decision engine comprises a plurality of machine learning threat models, each including logic to detect a different type of threat ([0068] In one embodiment, a machine learning model 125 may include a set of supervised learning algorithms, such as Boosted Decision Trees, Support Vector Machines, and Gaussian Mixture Models. One or more of a plurality of machine learning models may be specified as part of a predefined configuration or may be specified by a user; [0071] FIG. 2 depicts a block diagram of an exemplary system 200 in accordance with one or more embodiments, in which a plurality of machine learning models are applied to features from system 100 … Block 210 depicts features, such as those discussed with respect to system 100. Blocks 225, 226, 227, and 228 depict one or more machine learning models from a plurality of machine learning models being applied to the features 210).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel reference to use machine learning model techniques to detect anomalies and malware, as disclosed by Raugas.
The motivation is to be able to correctly identify the evolution of malware threats using machine learning techniques.
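For illustration only, the following minimal sketch shows one way a single scoring routine could be driven by a variable time slice parameter, set either to event-by-event or to a longer window over grouped events. It is not taken from Engel or Raugas; the names Event, score_slices, total_failed_logins and the failed_logins field are hypothetical, and the summing score function merely stands in for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Event:
    timestamp: float    # seconds; hypothetical field
    failed_logins: int  # hypothetical feature value


def score_slices(
    events: Sequence[Event],
    slice_seconds: Optional[float],
    score_fn: Callable[[List[Event]], float],
) -> List[float]:
    """Score events per time slice.

    slice_seconds=None means event-by-event (one score per event);
    otherwise events are grouped into windows of that length.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)
    if slice_seconds is None:
        return [score_fn([e]) for e in ordered]

    scores: List[float] = []
    current: List[Event] = []
    slice_start: Optional[float] = None
    for e in ordered:
        if slice_start is None:
            slice_start = e.timestamp
        # Close the current window once an event falls outside it.
        if e.timestamp - slice_start >= slice_seconds:
            scores.append(score_fn(current))
            current = []
            slice_start = e.timestamp
        current.append(e)
    if current:
        scores.append(score_fn(current))
    return scores


def total_failed_logins(batch: List[Event]) -> float:
    # Placeholder score: a real system would apply a trained model here.
    return float(sum(e.failed_logins for e in batch))
```

In this sketch the same score function serves both modes; only the slice parameter changes between the real-time and batch paths.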
The combination of Engel and Raugas fails to disclose:
	wherein a model state of the machine learning anomaly model is shared [updated] between the real-time anomaly decision engine and the batch anomaly decision engine; training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment. 
However, NPL discloses:
	wherein a model state of the machine learning anomaly model is shared [updated] between the real-time anomaly decision engine and the batch anomaly decision engine (Page # 75, section 3.1.1, in light of Figure 1, discloses a training system consisting of a batch processing cluster and a stream [real-time] processing cluster. It further discloses that the training system uses historical data [batch data] and updates the model with additional real time data. Finally, the training system allows data to enter into the system in streams, and then integrates [shares] batch and stream processing on these data by using micro-batch processing. Each data stream is treated as a sequence of small batches so that they can be delivered from the stream processing cluster to the batch processing cluster along with the existing model, and the result of applying scalable machine learning algorithms on the batch processing cluster is delivered back to the stream processing cluster. The interaction between these two clusters requires either cluster resource sharing or a database connection in between; Page # 79, Section 4.1 Backend Training System discloses using specific machine learning algorithms to train on the data, supporting both batch processing and stream processing; Also see Page # 79, Sections 5.1, 5.2 & 5.3 disclosing the case study, the evaluation design, experiment results and analysis for updating the model using batch and real time data).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel and Raugas references to include a machine learning algorithm that trains the anomaly model and makes predictions with it, as disclosed by NPL.
	The motivation to use the machine learning algorithm to train the anomaly model and make predictions is to increase precision in data analytics by integrating historical or batch data with real time data in order to have the most up-to-date anomaly model for improved prediction accuracy.
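For illustration only, the micro-batch integration described above can be sketched as a single model state that is first trained on historical (batch) data and then updated with small batches cut from a stream. This is a simplified assumption, not the NPL reference's implementation: SharedModelState and micro_batches are hypothetical names, and a running mean and variance (Welford's method) stands in for the scalable machine learning algorithms.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class SharedModelState:
    """Running mean/variance shared by the batch and stream paths."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    def update(self, values: Iterable[float]) -> None:
        # Welford's online update: valid for any batch size.
        for x in values:
            self.count += 1
            delta = x - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (x - self.mean)

    def zscore(self, x: float) -> float:
        # Returns 0.0 until there is enough data to estimate spread.
        if self.count < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.count - 1))
        return 0.0 if std == 0 else (x - self.mean) / std


def micro_batches(stream: Iterable[float], size: int) -> Iterator[List[float]]:
    """Cut a stream into small batches of at most `size` items."""
    batch: List[float] = []
    for x in stream:
        batch.append(x)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

A usage pattern under these assumptions: call update once with the historical data, then call update on each micro-batch from the stream, scoring new values with zscore against the same shared state.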
The combination of Engel, Raugas and NPL fails to disclose: 
	training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment.
However, Kapoor discloses:
	training a second model state of the machine learning model, wherein the second model state is an offline version (i.e., [0176] discloses that a data flow can be processed off-line for training purposes) of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment ([0174-0176] disclose a training process in which the system feeds feature vectors to another self-organizing map (SOM) as training data to generate a newer model, which eventually replaces the older model when the feature vectors are considered outdated. This way the system maintains a relatively current view of what is normal and can continuously monitor data flows for anomalies).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel, Raugas and NPL references to include a system which creates new anomaly models based on new feature vectors, as disclosed by Kapoor.
	The motivation to include such a system is so that the system maintains a relatively current view of what is “normal” and can continuously monitor data flows for anomalies.
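For illustration only, the following minimal sketch shows one way an offline model state could be trained and then swapped in as the active state once a readiness criterion is met. It is not Kapoor's implementation: ModelState, LiveSwapper and the min_samples threshold are hypothetical, and "number of samples seen" is merely an assumed stand-in for a degree-of-training criterion.

```python
import threading
from typing import List, Optional


class ModelState:
    """Hypothetical model state that tracks how much data it has seen."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.samples_seen = 0

    def train(self, batch: List[float]) -> None:
        # Placeholder for real training; only the degree of training is tracked.
        self.samples_seen += len(batch)


class LiveSwapper:
    """Holds the active model state and an offline candidate."""

    def __init__(self, active: ModelState, min_samples: int) -> None:
        self._lock = threading.Lock()
        self._active = active
        self._candidate: Optional[ModelState] = None
        self._min_samples = min_samples  # assumed readiness criterion

    @property
    def active(self) -> ModelState:
        with self._lock:
            return self._active

    def train_offline(self, candidate: ModelState, batch: List[float]) -> None:
        # Training happens outside the lock so readers are not blocked.
        candidate.train(batch)
        with self._lock:
            self._candidate = candidate

    def swap_if_ready(self) -> bool:
        """Promote the candidate if it meets the readiness criterion."""
        with self._lock:
            c = self._candidate
            if c is not None and c.samples_seen >= self._min_samples:
                self._active = c
                self._candidate = None
                return True
            return False
```

In this sketch, detection code always reads the active property, so the replacement takes effect on the next read without stopping the detector.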
Regarding claim 2, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, further comprising, prior to inputting the feature sets of event data to the real-time anomaly decision engine: receiving the first raw event data; and processing the first raw event data through an extract-transform-load (ETL) process to produce the feature sets of event data (Engel: [0074], [0086], [0091-0092] & [0095]).
Regarding claim 3, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, further comprising, prior to inputting the feature sets of event data to the real-time anomaly decision engine: receiving the first raw event data; and processing the first raw event data through an extract-transform-load (ETL) process to produce the feature sets of event data; wherein the detecting of the first network security anomaly is performed in real time as the raw event data are received by the ETL process (Engel: [0074], [0086], [0091-0092] & [0095]).
Regarding claim 4, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, further comprising continuously training the threat decision engine based on the anomaly data, concurrently with detecting the first network security anomaly or the second network security anomaly (Engel: [0095]).
Regarding claim 5, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the first raw event data or second raw event data comprise machine data (Engel: [0124] & [0132]).
Regarding claim 6, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the first raw event data or second raw event data comprise timestamped machine data (Engel: [0132-0133]).
Regarding claim 7, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, further comprising: outputting at least a portion of the anomaly data via the user interface (Engel: [0101]).
Regarding claim 8, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the threat decision engine comprises a plurality of threat models (Engel: [0124] & [0132]).
Regarding claim 11, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the machine learning anomaly model includes processing logic defining a process for assigning an anomaly score based on the processing of the feature sets of event data or stored feature sets of event data, the anomaly score indicative of a particular category of anomalous activity on the computer network, the processing logic for the machine learning anomaly model configured based on the state of the machine learning anomaly model (Raugas: [0071-0072]).
Regarding claim 12, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein detecting of the network security threat is performed in real time as the raw event data are received from the plurality of data sources (Engel: [0110]).
Regarding claim 13, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the stored feature sets of event data are stored in a persistent storage system (Raugas: [0080] & [0082]).
Regarding claim 14, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, the method further comprising: in the batch processing mode: detecting an additional network security anomaly based on the stored feature sets of event data; and detecting an additional network security threat based on the additional network security anomaly (Engel: [0103]).
Regarding claim 15, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the machine learning anomaly model includes processing logic configured to detect lateral movement, communication by blacklisted entities, malware communications, and/or beacon activity (Raugas: [0028] Features 120 may be extracted from the sampled network traffic in block 140; see Feature Set A through Feature Set Q for teachings of lateral movement, communication by blacklisted entities, etc.).
Regarding claim 16, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein: in the real-time processing mode: the machine learning anomaly model is continually trained as additional feature sets of event data are produced based on additional raw event data originating from the plurality of data sources on the computer network; and in the batch processing mode: the machine learning anomaly model is continually trained as additional feature sets of event data are stored (Engel: [0100] & [0103]).
Regarding claim 19, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the detecting of the first network security anomaly, the detecting of the second network security anomaly, and/or the detecting of the network security threat are performed by using Apache Storm or Apache Spark Streaming (NPL; Page # 78; Section 4.1).
Regarding claim 20, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the detecting of the first network security anomaly, the detecting of the second network security anomaly, or the detecting of the network security threat are performed by using a data-parallel distributed processing engine (Engel: [0152-0153]).
Regarding claim 21, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the detecting of the first network security anomaly, the detecting of the second network security anomaly, and/or the detecting of the network security threat are performed by using Apache Spark (NPL; Page # 78; Section 4.1).
Regarding claim 22, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the stored feature sets of event data are stored in a distributed file system (Engel: [0098] Data of the statistical models may be stored in a statistical models database 265).
Regarding claim 24, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the detecting of the first network security anomaly and/or second network security anomaly comprises performing at least one of: automated user behavior analysis or automated user-entity behavior analysis (Engel: [0098-0100]).
Regarding claim 25, Engel discloses:
A computer system comprising: 
a processor; and a memory having instructions stored thereon, which when executed by the processor cause the system to:
in a real-time detection mode: 
inputting, to a real-time (See [0021] i.e. online detection of anomalous network) anomaly decision engine ([0012] FIG. 6 illustrates an anomaly detection module), feature sets of event data (See [0082] i.e. sensors 110), the feature sets of event data produced based on first raw event data ([0088] the anomaly detection module 200 may receive raw data from one or more sensors) originating from a plurality of data sources on a computer network ([0014] The present invention discloses a method for detecting anomalous action within a computer network. The method comprises the steps of: [0021] online or batch detection of anomalous network actions associated with entities based on the statistical models; [0082] the sensors 110 may collect data from several places in the computer network 100 and after analysis of the collected data the sensors 110 may send the data to an anomaly detection module 175; [0088] the anomaly detection module 200 may receive raw data from one or more sensors); 
causing the real-time anomaly decision engine to detect a first network security anomaly (i.e., receiving raw event data in stage 310 of FIG. 3), based on the feature sets of event data, in real time as the feature sets of event data are produced based on the first raw event data ([0051] discloses that an anomaly detection module is used for online or batch detection of anomalies of actions associated with entities; [0110] FIG. 3; Stage 310 discloses receiving raw data from sensors); and 
training, based on the feature sets of event data ([0100] According to some embodiments of the present invention, a training process is performed automatically over multiple time periods, preforming statistical analysis of network actions at each period. The training process continues until a statistically significant stabilization of the statistical model is reached); 
in a batch detection mode: 
inputting, to a batch anomaly decision engine (See [0021] i.e. batch detection of anomalous network actions), stored feature sets of event data, the stored feature sets of event data based on second raw event data originating from the plurality of data sources on the computer network ([0014] The present invention discloses a method for detecting anomalous action within a computer network. The method comprises the steps of: [0021] online or batch detection of anomalous network actions associated with entities based on the statistical models; [0103] anomalies can be detected by finding specific entities that differ in their behavior from the majority of other entities in the computer network which have similar functionality, or finding actions that differ from the majority of actions in their characteristics. This method works on a batch of data and detects the anomalies rather than compare a specific action to a model; [0128] According to some embodiments of the present invention, statistical modeling module may begin with receiving detailed entities actions related data including identity of entity over time from the association module activity (stage 510); [0150] anomalies can be detected by finding specific entities that differ in their behavior from the majority of other entities in the computer network, or finding actions that differ from the majority of actions in their characteristics and their associated entities (stage 630). This method works on a batch of data and detects the anomalies between entities or actions rather than compare a specific action to a model); 
wherein the batch anomaly decision engine [statistical modeling module] sets the variable time slice [generates statistical models over multiple periods of time] parameter associated with the machine learning anomaly model to a time period length that is longer than a single event to enable the machine learning anomaly model process the variable time slice on a grouping stored events ([0133] the statistical modeling module may maintain statistics of protocol and entities usage/pattern behavior over multiple time periods for each entity (stage 525). For example over the last hour, over the last day, last week, last month, or last year. Some changes or anomalies are relevant when something happens in one minute (for example a large number of connections originating from one computer), and other anomalies are relevant in longer timespans (an aggregate number of failed connections to the same server over 1 week). The level of detail can vary between the different time periods to maintain a manageable dataset. For example on a 1-year timespan the average number of connections will be saved for each month and not each specific connection);
causing the batch anomaly decision engine to detect a second network security anomaly (i.e., analyzing logs to extract relevant network action related data; FIG. 3 Stage 320), by using the machine learning anomaly model, based on the stored feature sets of event data ([0051] discloses that an anomaly detection module is used for online or batch detection of anomalies of actions associated with entities; [0099] the anomaly detection module 270 receives information regarding actions in the computer network and identifies anomalous behavior by comparing actual network actions with the statistical model; [0111] the condenser module may analyze logs [which is batch data] to extract relevant computer network action related data (stage 320)); and 
training based on the stored feature sets of event data ([0100] According to some embodiments of the present invention, a training process is performed automatically over multiple time periods, preforming statistical analysis of network actions at each period. The training process continues until a statistically significant stabilization of the statistical model is reached); 
outputting, to a threat decision engine, anomaly data indicative of the first network security anomaly and the second network security anomaly ([0099] The anomalies may be sent to a decision engine 280. The purpose of the decision engine 280 is to aggregate relevant anomalies together and create incidents. The incidents may be reported as notifications 285 regarding anomaly action or an attack activity); 
causing the threat decision engine to detect, based on the anomaly data, a network security threat ([0104] According to other embodiments of the present invention, the decision engine 280, may analyze several anomaly actions and generate incidents/alerts based on identified anomalies according to predefined rules); and 
causing output (i.e., notification), via a user interface, of threat data indicative of the network security threat ([0101] The notifications 285 may be sent to a manual inspection 297. The manual inspection 297 may determine if an action is false positive or not and the feedback (299) of the manual inspection may be sent to the statistical models database 265);
wherein detecting the first network security anomaly, detecting the second network security anomaly or detecting the network security threat is performed by using a task-parallel distributed processing engine (see FIG. 7; [0152] According to some embodiments of the present invention, the decision engine module receives specific information on anomalies in the computer network (stage 710). Next, the decision engine module may be creating incidents by aggregating and clustering related anomalies based on specified parameters (stage 715) and then analyzing and ranking the incidents (stage 720)). 
Engel fails to disclose:
	using a machine learning anomaly model to train; wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a score indicative of a detected anomaly; wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a second score indicative of a detected anomaly and wherein the real-time anomaly decision engine set a variable time slice parameter associated with the machine learning anomaly model to event-by-event; wherein the threat decision engine comprises a plurality of machine learning threat models, each including logic to detect a different type of threat; training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment.
However, Raugas discloses:
	using a machine learning anomaly model to train ([0020] A system, method, medium, or computer based product may provide tools for real-time detection and classification of advanced malware using supervised machine learning; [0022] Using one or more methods of feature extraction, the network samples may be prepared for scoring. Subsequently, a specific machine learning algorithm may use models trained a priori against specific and/or known classes of malware);
	wherein the machine learning anomaly model is configured to process a variable time slice (i.e., time window) of data to produce a score indicative of a detected anomaly (See Abstract: Methods, system, and media for detecting malware are disclosed. A network may be monitored for a configured time interval collecting all of or some of the network traffic or samples of the network traffic; [0022] a specific machine learning algorithm may use models trained a priori against specific and/or known classes of malware to compute rankable scores (for a given time window). The scores may indicate whether particular hosts or network devices exhibit network behavior most similar to a particular class of malware)
wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a second score indicative of a detected anomaly ([0070] Using the generated score 130, a computing device may, for each time interval/window; [0071] Blocks 225, 226, 227, and 228 depict one or more machine learning models from a plurality of machine learning models being applied to the features 210 and generating a set of scores, 235, 236, 237 and 238), and
	wherein the real-time anomaly decision engine set a variable time slice parameter associated with the machine learning anomaly model to event-by-event ([0027] Network traffic may be monitored by monitoring device 135 during a time configurable time interval. The time interval may be specified in a configuration file, by an administrator, or by a user, and may correspond to the window over which feature vectors are constructed; [0028] In one or more embodiments, features may comprise only a subset of monitored or examined network data computed during the configurable time interval (“window”); Raugas further teaches the “event-by-event” claim element as the “feature set” disclosed in paragraphs [0029-0063], in which different features of network data are extracted by machine learning models to produce scores);
wherein the threat decision engine comprises a plurality of machine learning threat models, each including logic to detect a different type of threat ([0068] In one embodiment, a machine learning model 125 may include a set of supervised learning algorithms, such as Boosted Decision Trees, Support Vector Machines, and Gaussian Mixture Models. One or more of a plurality of machine learning models may be specified as part of a predefined configuration or may be specified by a user; [0071] FIG. 2 depicts a block diagram of an exemplary system 200 in accordance with one or more embodiments, in which a plurality of machine learning models are applied to features from system 100 … Block 210 depicts features, such as those discussed with respect to system 100. Blocks 225, 226, 227, and 228 depict one or more machine learning models from a plurality of machine learning models being applied to the features 210).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel reference to use machine learning model techniques to detect anomalies and malware, as disclosed by Raugas.
The motivation is to be able to correctly identify the evolution of malware threats using machine learning techniques.
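By way of illustration only, and not as a characterization of the claims or of any cited reference, the distinction between an event-by-event time slice and a longer time slice applied to a grouping of stored events may be sketched as follows. The function names, the use of a z-score, and the sample data are hypothetical assumptions chosen for brevity; neither Engel nor Raugas is asserted to use this computation.

```python
from collections import defaultdict
from statistics import mean, pstdev


def slice_events(events, window_seconds=None):
    """Partition (timestamp, value) events into time slices.

    window_seconds=None models an event-by-event slice (one event per slice);
    a positive number groups events into fixed windows of that length.
    """
    if window_seconds is None:
        return [[event] for event in events]
    groups = defaultdict(list)
    for timestamp, value in events:
        groups[int(timestamp // window_seconds)].append((timestamp, value))
    return [groups[key] for key in sorted(groups)]


def score_slices(slices):
    """Assign each slice a z-score of its summed value (hypothetical scoring rule)."""
    totals = [sum(value for _, value in s) for s in slices]
    mu = mean(totals)
    sigma = pstdev(totals)
    if sigma == 0:
        return [0.0 for _ in totals]
    return [(total - mu) / sigma for total in totals]


# Hypothetical sample: four events, the last with an unusually large value.
events = [(0, 1), (10, 1), (70, 1), (80, 10)]
real_time_scores = score_slices(slice_events(events))    # one score per event
batch_scores = score_slices(slice_events(events, 60))    # one score per 60 s window
```

In this sketch the same scoring routine serves both modes, and only the slice-length parameter differs between the real-time call and the batch call.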
The combination of Engel and Raugas fails to disclose:
	wherein a model state of the machine learning anomaly model is shared [updated] between the real-time anomaly decision engine and the batch anomaly decision engine; training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment. 
However, NPL discloses:
	wherein a model state of the machine learning anomaly model is shared [updated] between the real-time anomaly decision engine and the batch anomaly decision engine (Page # 75, Section 3.1.1, in light of Figure 1, discloses a training system consisting of a batch processing cluster and a stream [real-time] processing cluster. It further discloses that the training system uses historical data [batch data] and updates the model with additional real time data. Finally, the training system allows data to enter into the system in streams, and then integrates [shares] batch and stream processing on these data by using micro-batch processing. Each data stream is treated as a sequence of small batches so that they can be delivered from stream processing cluster to batch processing cluster along with the existing model, and the result after being applied scalable machine learning algorithms on the batch processing cluster will be delivered back to the stream processing cluster. The interaction between these two clusters requires either cluster resource sharing or a database connection in between; Page # 79, Section 4.1, Backend Training System, discloses using specific machine learning algorithms to train on the data, supporting both batch processing and stream processing; see also Page # 79, Sections 5.1, 5.2 & 5.3, disclosing the case study, the evaluation design, and the experiment results and analysis for updating the model using batch and real time data).
	It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel and Raugas references to include a machine learning algorithm that trains the anomaly model and makes predictions with it, as disclosed by NPL.
	The motivation to use the machine learning algorithm to train the anomaly model and make predictions is to increase precision in data analytics by integrating historical or batch data with real time data in order to have the most up-to-date anomaly model for improved prediction accuracy.
The combination of Engel, Raugas and NPL fails to disclose: 
	training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment.
However, Kapoor discloses:
	training a second model state of the machine learning model, wherein the second model state is an offline version (i.e., [0176] discloses that a data flow can be processed off-line for training purposes) of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment ([0174-0176] disclose a training process in which the system feeds feature vectors to another self-organizing map (SOM) as training data to generate a newer model, which eventually replaces the older model when the feature vectors are considered outdated. This way the system maintains a relatively current view of what is normal and can continuously monitor data flows for anomalies).
	It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel, Raugas and NPL references to include a system which creates new anomaly models based on new feature vectors, as disclosed by Kapoor.
	The motivation to include such a system is so that the system maintains a relatively current view of what is “normal” and can continuously monitor data flows for anomalies.
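For illustration only, the general notion of training a second model state offline and swapping it in as the active state once a readiness criterion is met may be sketched as follows. The class name, the sample-count readiness criterion, and the use of plain callables as models are hypothetical assumptions; Kapoor describes replacing an older SOM when its feature vectors are considered outdated and is not asserted to use this mechanism.

```python
import threading


class ModelSlot:
    """Holds the active model state and allows it to be replaced while scoring continues."""

    def __init__(self, active_model):
        self._lock = threading.Lock()
        self._active = active_model

    def score(self, features):
        # Read the active reference under the lock, then score outside it
        # so that a swap never waits on a long-running scoring call.
        with self._lock:
            model = self._active
        return model(features)

    def swap_if_ready(self, candidate, samples_seen, min_samples):
        """Promote the offline candidate only if it has seen enough training samples.

        The sample-count threshold is a hypothetical stand-in for a
        "degree of training" readiness criterion.
        """
        if samples_seen < min_samples:
            return False
        with self._lock:
            self._active = candidate
        return True


# Hypothetical usage: models are plain callables returning a score.
slot = ModelSlot(lambda x: x)
too_early = slot.swap_if_ready(lambda x: x * 2, samples_seen=5, min_samples=10)
before = slot.score(3)
swapped = slot.swap_if_ready(lambda x: x * 2, samples_seen=10, min_samples=10)
after = slot.score(3)
```

Because callers obtain the active model through a single guarded reference, the replacement takes effect for the next scoring call without pausing detection.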
Regarding claim 26, the combination of Engel, Raugas, NPL and Kapoor discloses:
The computer system as recited in claim 25, wherein the threat decision engine comprises a plurality of threat models (Engel: [0124] & [0132]). 
Regarding claim 27, the combination of Engel, Raugas, NPL and Kapoor discloses:
The computer system as recited in claim 25, wherein the machine learning anomaly model includes processing logic configured to detect lateral movement, communication by blacklisted entities, malware communications, and/or beacon activity (Raugas: [0028] Features 120 may be extracted from the sampled network traffic in block 140; see Feature set A through feature Q for teachings of lateral movement, communication by blacklisted entities, etc.).
Regarding claim 28, Engel discloses:
A non-transitory machine-readable storage medium for use in a processing system, the non-transitory machine-readable storage medium storing instructions, an execution of which in the processing system causes the processing system to perform operations comprising:
in a real-time detection mode: 
inputting, to a real-time (See [0021] i.e. online detection of anomalous network) anomaly decision engine ([0012] FIG. 6 illustrates an anomaly detection module), feature sets of event data (See [0082] i.e. sensors 110), the feature sets of event data produced based on first raw event data ([0088] the anomaly detection module 200 may receive raw data from one or more sensors) originating from a plurality of data sources on a computer network ([0014] The present invention discloses a method for detecting anomalous action within a computer network. The method comprises the steps of: [0021] online or batch detection of anomalous network actions associated with entities based on the statistical models; [0082] the sensors 110 may collect data from several places in the computer network 100 and after analysis of the collected data the sensors 110 may send the data to an anomaly detection module 175; [0088] the anomaly detection module 200 may receive raw data from one or more sensors); 
causing the real-time anomaly decision engine to detect a first network security anomaly (i.e., receiving raw event data in stage 310 of FIG. 3), based on the feature sets of event data, in real time as the feature sets of event data are produced based on the first raw event data ([0051] discloses that an anomaly detection module is used for online or batch detection of anomalies of actions associated with entities; [0110] FIG. 3; Stage 310 discloses receiving raw data from sensors); and 
training, based on the feature sets of event data ([0100] According to some embodiments of the present invention, a training process is performed automatically over multiple time periods, preforming statistical analysis of network actions at each period. The training process continues until a statistically significant stabilization of the statistical model is reached); 
in a batch detection mode: 
inputting, to a batch anomaly decision engine (See [0021] i.e. batch detection of anomalous network actions), stored feature sets of event data, the stored feature sets of event data based on second raw event data originating from the plurality of data sources on the computer network ([0014] The present invention discloses a method for detecting anomalous action within a computer network. The method comprises the steps of: [0021] online or batch detection of anomalous network actions associated with entities based on the statistical models; [0103] anomalies can be detected by finding specific entities that differ in their behavior from the majority of other entities in the computer network which have similar functionality, or finding actions that differ from the majority of actions in their characteristics. This method works on a batch of data and detects the anomalies rather than compare a specific action to a model; [0128] According to some embodiments of the present invention, statistical modeling module may begin with receiving detailed entities actions related data including identity of entity over time from the association module activity (stage 510); [0150] anomalies can be detected by finding specific entities that differ in their behavior from the majority of other entities in the computer network, or finding actions that differ from the majority of actions in their characteristics and their associated entities (stage 630). This method works on a batch of data and detects the anomalies between entities or actions rather than compare a specific action to a model); 
wherein the batch anomaly decision engine [statistical modeling module] sets the variable time slice [generates statistical models over multiple periods of time] parameter associated with the machine learning anomaly model to a time period length that is longer than a single event to enable the machine learning anomaly model process the variable time slice on a grouping stored events ([0133] the statistical modeling module may maintain statistics of protocol and entities usage/pattern behavior over multiple time periods for each entity (stage 525). For example over the last hour, over the last day, last week, last month, or last year. Some changes or anomalies are relevant when something happens in one minute (for example a large number of connections originating from one computer), and other anomalies are relevant in longer timespans (an aggregate number of failed connections to the same server over 1 week). The level of detail can vary between the different time periods to maintain a manageable dataset. For example on a 1-year timespan the average number of connections will be saved for each month and not each specific connection);
causing the batch anomaly decision engine to detect a second network security anomaly (i.e., analyzing logs to extract relevant network action related data; FIG. 3 Stage 320), by using the machine learning anomaly model, based on the stored feature sets of event data ([0051] discloses that an anomaly detection module is used for online or batch detection of anomalies of actions associated with entities; [0099] the anomaly detection module 270 receives information regarding actions in the computer network and identifies anomalous behavior by comparing actual network actions with the statistical model; [0111] the condenser module may analyze logs [which is batch data] to extract relevant computer network action related data (stage 320)); and 
training based on the stored feature sets of event data ([0100] According to some embodiments of the present invention, a training process is performed automatically over multiple time periods, preforming statistical analysis of network actions at each period. The training process continues until a statistically significant stabilization of the statistical model is reached); 
outputting, to a threat decision engine, anomaly data indicative of the first network security anomaly and the second network security anomaly ([0099] The anomalies may be sent to a decision engine 280. The purpose of the decision engine 280 is to aggregate relevant anomalies together and create incidents. The incidents may be reported as notifications 285 regarding anomaly action or an attack activity); 
causing the threat decision engine to detect, based on the anomaly data, a network security threat ([0104] According to other embodiments of the present invention, the decision engine 280, may analyze several anomaly actions and generate incidents/alerts based on identified anomalies according to predefined rules); and 
causing output (i.e., notification), via a user interface, of threat data indicative of the network security threat ([0101] The notifications 285 may be sent to a manual inspection 297. The manual inspection 297 may determine if an action is false positive or not and the feedback (299) of the manual inspection may be sent to the statistical models database 265);
wherein detecting the first network security anomaly, detecting the second network security anomaly or detecting the network security threat is performed by using a task-parallel distributed processing engine (see FIG. 7; [0152] According to some embodiments of the present invention, the decision engine module receives specific information on anomalies in the computer network (stage 710). Next, the decision engine module may be creating incidents by aggregating and clustering related anomalies based on specified parameters (stage 715) and then analyzing and ranking the incidents (stage 720)). 
Engel fails to disclose:
	using a machine learning anomaly model to train; wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a score indicative of a detected anomaly; wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a second score indicative of a detected anomaly and wherein the real-time anomaly decision engine set a variable time slice parameter associated with the machine learning anomaly model to event-by-event; wherein the threat decision engine comprises a plurality of machine learning threat models, each including logic to detect a different type of threat; training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment.
However, Raugas discloses:
	using a machine learning anomaly model to train ([0020] A system, method, medium, or computer based product may provide tools for real-time detection and classification of advanced malware using supervised machine learning; [0022] Using one or more methods of feature extraction, the network samples may be prepared for scoring. Subsequently, a specific machine learning algorithm may use models trained a priori against specific and/or known classes of malware);
	wherein the machine learning anomaly model is configured to process a variable time slice (i.e., time window) of data to produce a score indicative of a detected anomaly (See Abstract: Methods, system, and media for detecting malware are disclosed. A network may be monitored for a configured time interval collecting all of or some of the network traffic or samples of the network traffic; [0022] a specific machine learning algorithm may use models trained a priori against specific and/or known classes of malware to compute rankable scores (for a given time window). The scores may indicate whether particular hosts or network devices exhibit network behavior most similar to a particular class of malware)
wherein the machine learning anomaly model is configured to process a variable time slice of data to produce a second score indicative of a detected anomaly ([0070] Using the generated score 130, a computing device may, for each time interval/window; [0071] Blocks 225, 226, 227, and 228 depict one or more machine learning models from a plurality of machine learning models being applied to the features 210 and generating a set of scores, 235, 236, 237 and 238), and
	wherein the real-time anomaly decision engine set a variable time slice parameter associated with the machine learning anomaly model to event-by-event ([0027] Network traffic may be monitored by monitoring device 135 during a time configurable time interval. The time interval may be specified in a configuration file, by an administrator, or by a user, and may correspond to the window over which feature vectors are constructed; [0028] In one or more embodiments, features may comprise only a subset of monitored or examined network data computed during the configurable time interval (“window”); Raugas further teaches the “event-by-event” claim element as the “feature set” disclosed in paragraphs [0029-0063], in which different features of network data are extracted by machine learning models to produce scores);
wherein the threat decision engine comprises a plurality of machine learning threat models, each including logic to detect a different type of threat ([0068] In one embodiment, a machine learning model 125 may include a set of supervised learning algorithms, such as Boosted Decision Trees, Support Vector Machines, and Gaussian Mixture Models. One or more of a plurality of machine learning models may be specified as part of a predefined configuration or may be specified by a user; [0071] FIG. 2 depicts a block diagram of an exemplary system 200 in accordance with one or more embodiments, in which a plurality of machine learning models are applied to features from system 100 … Block 210 depicts features, such as those discussed with respect to system 100. Blocks 225, 226, 227, and 228 depict one or more machine learning models from a plurality of machine learning models being applied to the features 210).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel reference to use machine learning model techniques to detect anomalies and malware, as disclosed by Raugas.
The motivation is to be able to correctly identify the evolution of malware threats using machine learning techniques.
The combination of Engel and Raugas fails to disclose:
	wherein a model state of the machine learning anomaly model is shared [updated] between the real-time anomaly decision engine and the batch anomaly decision engine; training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment. 
However, NPL discloses:
	wherein a model state of the machine learning anomaly model is shared [updated] between the real-time anomaly decision engine and the batch anomaly decision engine (Page # 75, Section 3.1.1, in light of Figure 1, discloses a training system consisting of a batch processing cluster and a stream [real-time] processing cluster. It further discloses that the training system uses historical data [batch data] and updates the model with additional real time data. Finally, the training system allows data to enter into the system in streams, and then integrates [shares] batch and stream processing on these data by using micro-batch processing. Each data stream is treated as a sequence of small batches so that they can be delivered from stream processing cluster to batch processing cluster along with the existing model, and the result after being applied scalable machine learning algorithms on the batch processing cluster will be delivered back to the stream processing cluster. The interaction between these two clusters requires either cluster resource sharing or a database connection in between; Page # 79, Section 4.1, Backend Training System, discloses using specific machine learning algorithms to train on the data, supporting both batch processing and stream processing; see also Page # 79, Sections 5.1, 5.2 & 5.3, disclosing the case study, the evaluation design, and the experiment results and analysis for updating the model using batch and real time data).
	It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel and Raugas references to include a machine learning algorithm that trains the anomaly model and makes predictions with it, as disclosed by NPL.
	The motivation to use the machine learning algorithm to train the anomaly model and make predictions is to increase precision in data analytics by integrating historical or batch data with real time data in order to have the most up-to-date anomaly model for improved prediction accuracy.
The combination of Engel, Raugas and NPL fails to disclose: 
	training a second model state of the machine learning model, wherein the second model state is an offline version of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment.
However, Kapoor discloses:
	training a second model state of the machine learning model, wherein the second model state is an offline version (i.e., [0176] discloses that a data flow can be processed off-line for training purposes) of model state of the machine learning model: determining, for the second model state, a criterion for readiness for active deployment, based on a degree of training of the second model state; and live-swapping the second model state to replace the first model state as the active version of model state, based on the criterion for readiness for active deployment ([0174-0176] disclose a training process in which the system feeds feature vectors to another self-organizing map (SOM) as training data to generate a newer model, which eventually replaces the older model when the feature vectors are considered outdated. This way the system maintains a relatively current view of what is normal and can continuously monitor data flows for anomalies).
	It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the Engel, Raugas and NPL references to include a system which creates new anomaly models based on new feature vectors, as disclosed by Kapoor.
	The motivation to include such a system is so that the system maintains a relatively current view of what is “normal” and can continuously monitor data flows for anomalies.
Regarding claim 29, the combination of Engel, Raugas, NPL and Kapoor discloses:
The non-transitory machine-readable storage medium as recited in claim 28, wherein the threat decision engine comprises a plurality of threat models (Engel: [0124] & [0132]).
Regarding claim 30, the combination of Engel, Raugas, NPL and Kapoor discloses:
The non-transitory machine-readable storage medium as recited in claim 28, wherein the machine learning anomaly model includes processing logic configured to detect lateral movement, communication by blacklisted entities, malware communications, and/or beacon activity (Raugas: [0028] Features 120 may be extracted from the sampled network traffic in block 140; see Feature set A through feature Q for teachings of lateral movement, communication by blacklisted entities, etc.).
Regarding claim 31, the combination of Engel, Raugas, NPL and Kapoor discloses:
The method as recited in claim 1, wherein the stored feature sets of event data are stored in a persistent storage system (Raugas: [0080] & [0082]).

Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over Engel et al. (US20140165207A1) in view of Raugas et al. (US20150128263A1), in view of the NPL reference titled “Making Real Time Data Analytics Available as a Service,” published May 04-08, 2015 (hereinafter referred to as NPL), in view of Kapoor et al. (US20070192863A1), and further in view of Cohen et al. (US20150273693A1).
Regarding claim 23, the combination of Engel, Raugas, NPL and Kapoor fails to disclose:
	The method as recited in claim 1, wherein the event feature set is stored feature sets of event data are stored in a Hadoop Distributed File System (HDFS).
However, Cohen discloses:
wherein the event feature set is stored feature sets of event data are stored in a Hadoop Distributed File System (HDFS) ([0052] various embodiments one or more databases 340 may comprise a relational database system using a structured query language (SQL), while others may comprise an alternative data storage technology such as those referred to in the art as “NoSQL” (for example, Hadoop/HDFS, Apache Spark, hBase, MongoDB, Cassandra, Google BIGTABLE™, and so forth)).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the invention, to modify the system and method of Engel, Raugas, NPL and Kapoor to utilize Apache Spark or Hadoop Distributed File System (HDFS) to achieve faster database access in order to detect and prevent anomalies and threats in a computer network, as taught by Cohen.
The motivation is to have a system that provides faster data storage and database access, so that various network behavioral models can be compared with real-time data to detect an anomaly and protect the computer network.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 


Any inquiry concerning this communication or earlier communications from the examiner should be directed to SYED M AHSAN whose telephone number is (571)272-5018. The examiner can normally be reached 8:30 AM - 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey L. Nickerson, can be reached at 469-295-9235. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/S.M.A./Patent Examiner, Art Unit 2432
/Jeffrey Nickerson/Supervisory Patent Examiner, Art Unit 2432