DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed on 11/15/2021 have been fully considered:
Regarding the 101 rejection, Applicant's arguments have been considered, and the examiner agrees that the amendments overcome the 101 abstract idea issue. Therefore, the 101 rejection has been withdrawn.
Regarding the art rejection (Remarks, pages 10-12):
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references. (FP 7-37-11). For clarity, the examiner responds below with respect to the limitations the applicant pointed out.
Applicant argues that “Muddu does not disclose the use of unsupervised machine learning that operates without referencing pre-generated behavior profiles or other similar profiles. Additionally, Muddu does not disclose restricting access to, or operations by, the computing environment in response to the detection of the anomaly”. Remarks pages 10-12 do not discuss the reference applied against claims 1, 11 and 16, or explain how the claims avoid the reference or distinguish from it.
Muddu teaches unsupervised machine learning (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models”). Regarding the limitation that the use of the unsupervised machine learning occurs without using pre-determined or pre-configured metrics for assessing the data instances of the input stream (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models, it can identify entity behaviors and event patterns that are not previously known to security experts”; the ML-based engine utilizes unsupervised machine learning and therefore does not use pre-determined or pre-configured metrics for assessing the data instances): operating without pre-determined or pre-configured metrics is, by definition, a characteristic of unsupervised machine learning. A definition of unsupervised learning can be found in at least (https://www.ibm.com/cloud/learn/unsupervised-learning) or (https://towardsdatascience.com/concise-guide-to-unsupervised-learning-with-clustering-4924cdbb27cb), where unsupervised machine learning is a type of machine learning that looks for previously undetected patterns in a data set with no pre-existing labels or pre-trained results. Therefore, Muddu teaches that the use of the unsupervised machine learning occurs without using pre-determined or pre-configured metrics for assessing the data instances of the input stream.
Furthermore, Muddu teaches restricting access to, or operations by, the computing environment in response to the detection of the anomaly (see ¶ 151, “The anomalies and threats detected by the real-time processing path may be employed to automatically trigger an action, such as stopping the intrusion, shutting down network access, locking out users, preventing information theft or information transfer, shutting down software and or hardware processes, and the like”).
Regarding the limitation “categorizing the data instances of the unstructured input stream of data instances, the data instances comprising at least one principal value and at least one categorical attribute, each of the at least one categorical attribute being an equivalence class determined through unsupervised machine learning based on a weighted edit distance”:
The specification gives some examples of categorical attributes, such as users, login location, system ID, etc., wherein the principal value is logon time (see specification ¶ 31). In the prior art, Muddu ¶ 216-217 and figures 9A-B show how logon information (900), including a user, IP address and website location, is categorized as shown in figure 9B, element 902. Furthermore, ¶ 568-573 show a login graph which utilizes a machine learning model 6300 to generate classification metadata 6320 for each network device and user. The login event corresponds to the principal value; therefore, the nodes and edges showing the relationship between the users and devices correspond to the categorical attributes. The unsupervised machine learning model 6300 assigns usage similarity scores 6360 to the network devices represented by the device nodes such that any given set of network devices that are accessed by the same or similar group of users are assigned similarity scores that are closer in value to each other than the similarity scores of any other set of network devices that are not accessed by the same or similar group of users, which corresponds to an equivalence class determined through machine learning based on a weighted edit distance, because an equivalence class consists of elements that are equivalent; therefore users and devices that are accessed by the same group is
The same arguments and response apply to independent claims 1, 11 and 16.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1, 4-11 and 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Muddu et al. (US 2017/0063910 A1).

Regarding claim 1, 
Muddu teaches a computer-implemented method for detecting anomalous activity in a computing environment (see ¶ 137, “anomalous activity detection in a networked environment”, also see ¶ 139, “The environment 10 may represent a networked computing environment of one or multiple companies or organizations”), the method comprising: 
receiving an unstructured input stream of data instances from the computing environment (see ¶ 143, “incoming data is processed using machine learning/data science techniques to extract knowledge from large volumes of data that are structured or unstructured”, i.e. unstructured input stream of data instances), the unstructured input stream being time stamped (see ¶ 174, “a time series database 370 that represents the database for storing time stamped data”, also see ¶ 222, “even if the events arrive in an order that is not the same as how they actually took place, as long as the events have timestamps”, teaches unstructured input being time stamped); 
categorizing the data instances of the unstructured input stream of data instances (see figure 9A showing raw event data 900 received by the data intake and figure 9B showing categorized/parsed data showing entities involved in the event and their relationship, also ¶ 216-217 categorizing the event data to a user, IP address, visits etc., also see ¶ 568, “Based on the event data 6310 (e.g., the login graph), the machine learning model 6300 generates classification metadata 6320 for each of the network devices and for each of the users”, i.e. event data interpreted as data instances), the data instances comprising at least one principal value (see ¶ 570, “if the event data 6310 includes a login graph having information that relates to the login events, the machine learning model 6300 can identify the usage relationships 6330 as login events indicative ) and at least one categorical attribute (see ¶ 570, “the machine learning model 6300 can identify the usage relationships 6330 as login events indicative of the users logging into the network devices… the usage relationship 6330 can be presented as a graph having nodes and edges interconnecting the nodes, as illustrated in FIG. 63.  The nodes represent network entities such as users and network devices, and the edges represent the login events that the users log into the network devices” also see ¶ 571-572, i.e. the usage relationship of users and devices (nodes and edges) is interpreted as categorical attributes), each of the at least one categorical attribute being an equivalence class determined through unsupervised machine learning (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models”) based on a weighted edit distance (see ¶ 572, “the machine learning model 6300 assigns usage similarity scores 6360 (also referred to as "similarity scores") to the network devices represented by the device nodes”, also see ¶ 573, “The similarity scores are assigned such that any given set of network devices that are accessed by the same or similar group of users are assigned similarity scores that are closer in value to each other than the similarity scores of any other set of network devices that are not accessed by the same or similar group of users.”, where the similarity scores [i.e. weighted edit distance] assigned by the machine learning to the usage relationship [i.e. equivalence class] of users and devices (nodes and edges) correspond to a categorical attribute being an equivalence class determined through machine learning based on weighted edit distance), the use of the unsupervised machine learning occurs without using pre-determined or pre-configured metrics for assessing the data instances of the input stream (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models, it can identify entity behaviors and event patterns that are not previously known to security experts”, ML utilizes unsupervised machine learning therefore it does not use pre-determined or pre-configured metrics for assessing the data instances); 
generating anomaly scores for each of the data instances collected over a period of time (see ¶ 567, “the processes of generating the classification metadata and/or assigning usage similarity scores are performed in real-time as the event data are received”, also see ¶¶ 572-577, the machine learning model 6300 determines the similarity scores of the particular network devices that are accessed by the same or similar group of users, also see figures 63-64); 
detecting a change in the at least one categorical attribute that is indicative of an anomaly (see ¶ 577, “Once the user 6412 logs into device 6424 (as represented by the dashed line in FIG. 64), the machine learning model 6300 determines the similarity score of the particular device 6424 (i.e., 0.06 for device 6424) fails to satisfy a specific closeness criterion relative to similarity scores of network devices with which the particular user usually interacts (i.e., 0.30 for device 6422 and 0.33 for device 6423)…The machine learning model 6300 then detects an anomaly because the difference of similarity scores exceeds the threshold value”); 
and restricting access to, or operations by, the computing environment in response to the detection of the anomaly (see ¶ 151, “The anomalies and threats detected by the real-time processing path may be employed to automatically trigger an action, such as stopping the intrusion, shutting down network access, locking out users, shutting down software and or hardware processes, and the like”).
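
For illustration only, the closeness criterion quoted from ¶ 577 can be sketched as follows. The similarity scores 0.06, 0.30 and 0.33 are taken from the cited paragraph; the function name and the threshold value of 0.2 are assumptions made for this sketch and are not stated in Muddu.

```python
def fails_closeness_criterion(new_score, usual_scores, threshold):
    """Return True when the newly accessed device's similarity score differs
    from the score of every device the user usually accesses by more than
    `threshold` (hypothetical helper, not Muddu's implementation)."""
    if not usual_scores:
        # No history to compare against, so no anomaly is reported here.
        return False
    return all(abs(new_score - score) > threshold for score in usual_scores)


# Scores quoted in paragraph 577: devices 6422 and 6423 are the usual devices.
usual_device_scores = [0.30, 0.33]

# Device 6424 has score 0.06; the threshold of 0.2 is an assumed value.
print(fails_closeness_criterion(0.06, usual_device_scores, threshold=0.2))  # True
```

Under the assumed threshold, the differences 0.24 and 0.27 both exceed 0.2, so the login to device 6424 would be flagged, which is consistent with the anomaly described in the quoted passage.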

Regarding claim 4, 
Muddu teaches the method of claim 1,
Muddu further teaches wherein categorizing the data instances of the unstructured input stream of data instances comprises: tokenizing segments within the unstructured input stream (see ¶ 206, “the received data representing an event, which field represents a token that may correspond to a timestamp, an entity, an action, an IP address, an event identifier (ID), a process ID, a type of the event, a type of machine that generates the event, and so forth” [i.e. tokenizing is substituting identifiers; therefore, the event data being tokenized into entities, IP addresses, etc. corresponds to tokenizing segments of the input stream], also see ¶ 216-217 and figures 9A-B); 
filtering or removing a portion of the tokenized segments based on a set of filtering criteria (see ¶ 164, “The data receivers 310 may also optionally filter some of the event data”, also see ¶ 167, “An optional filter attribution block 322 in the semantic processor 316 removes certain pre-defined events.  The attribution filter 322 in the semantic processor 316 may further remove events that need not be processed by the security platform”); 
applying a weight to one or more of the filtered, tokenized segments (see ¶ 235, “Depending on the model, other criteria for an event to be considered relevant for model training and/or updating purposes may include, for example, when a new event includes a particular machine identifier, a particular user identifier, and/or the recency of 
comparing the filtered, tokenized segments to one another to determine if a match exists therebetween (see ¶ 262, “two sessions is determined based on comparing three items: "from-session-link-context", "to-session-link-context", and "Link-Event time"… Two existing sessions should be linked or correlated if the newly added session (1) matches a link event time range, (2a) has a match in one of its from-session-link-context or to-session-link-context with those of one existing session, and (2b) has at least a partial match in one of its from-session-link-context or to-session-link-context with those of another existing session”, also see ¶ 634, “assigning an anomaly score indicating a confidence level that the entity identifier matches a particular entry in the external data source based on the comparing.”); 
and categorizing the filtered, tokenized segments based on the comparison (see ¶ 249, “ the fields can be used by a machine learning model to identify which subset of the event data (e.g., serverIP, sourceIP, sourcePort, etc.) is the information that the model wants to receive.”, see figure 9A showing raw event data 900 received by the data intake and figure 9B showing categorized/parsed data showing entities involved in the event and their relationship, also ¶ 216-217 categorizing the event data to a user, IP address, visits etc., also see ¶ 568, “Based on the event data 6310 (e.g., the login graph), the machine learning model 6300 generates classification metadata 6320 for each of the network devices and for each of the users”, i.e. event data interpreted as data instances).

Regarding claim 5, 
Muddu teaches the method of claim 4,
Muddu further teaches wherein when a match does not exist, a new category is created and attributed to one or more of the filtered, tokenized segments (see ¶ 204, “after the data connectors 802 obtain/receive the data, if the data format of the data is unknown (e.g., the administrator has not specified how to parse the data), then the format detector 804 can be used to detect the data format of the input data.  For example, the format detector 804 can perform pattern matching for all known formats to determine the most likely format of a particular event data”, also see ¶ 476, “as shown in FIG. 44A, the GUI provides a bubble 4400 prompting the user to tag the threat with "Threat Watchlist," "False Positive," "Important," "Reviewed," "Save for Later," or to define a new category for tagging (via the "New Threat Watchlist" selection)”). 

Regarding claim 6, 
Muddu teaches the method of claim 5,
Muddu further teaches further comprising applying tokenization rules to exclude at least a portion of the segments of the unstructured log file (see ¶ 147, “the real-time processing path excludes historical data (i.e., stored data pertaining to past events) from its evaluation”, also see ¶ 164, “The data receivers 310 may also optionally filter some of the event data. For example, to reduce the workload of the security platform”, also see ¶ 218, “if the network administrator wishes to receive data in a new data format, he can edit the configuration file to create rules (e.g., in the form of functions or macros) for the particular data format including, for example, identifying how to tokenize the data”, also see ¶ 704, “The semantic processor 316 (FIG. 3) can process the ).

Regarding claim 7, 
Muddu teaches the method of claim 6,
Muddu further teaches further comprising applying filtering rules to remove tokenized segments corresponding to only numerical values or tokenized segments corresponding to date-related words, or tokenized segments corresponding to numbers with trailing units (see ¶ 167, “The attribution filter 322 in the semantic processor 316 may further remove events that need not be processed by the security platform. An example of such an event is an internal data transfer that occurs between two IP addresses as part of a regular file backup. In some embodiments, the functions of semantic processor 316 are configurable by a configuration file to permit easy updating or adjusting”, i.e. IP addresses filtering corresponds to tokenized segments corresponding to only numerical values or the tokenized segments corresponding to date-related words because IP addresses can include date-related words and also can be tokenized segments corresponding to numbers with trailing units). 

Regarding claim 8, 
Muddu teaches the method of claim 7,
Muddu further teaches wherein comparing the filtered, tokenized segments to one another further comprises determining distances between the filtered, tokenized segments based on deletions and replacements (see ¶ 167, “filter attribution block 322 in the semantic processor 316 removes certain pre-defined events. The attribution filter 322 in the semantic processor 316 may further remove events that need not be processed by the security platform.”, also see ¶ 427, “the computer system can further identify events that have timestamps satisfying a specific closeness criterion (e.g., the timestamps having differences less than a threshold value)”), 
further wherein the distances between the filtered, tokenized segments indicates if a weighted number of operations relative to a total weight of the tokenized segments is within 15% (see ¶ 573, “The similarity scores are assigned such that any given set of network devices that are accessed by the same or similar group of users are assigned similarity scores that are closer in value to each other than the similarity scores of any other set of network devices that are not accessed by the same or similar group of users”, [i.e. these paragraphs teach the use of, for example, Euclidean distance or similarity scores to indicate whether the values are close enough to match], also see ¶ 588).
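
For illustration only, the claim 8 limitation (a distance based on deletions and replacements, compared against 15% of a total weight) can be sketched as below. This is neither Applicant's nor Muddu's implementation. The function names, the uniform default token weight, the cost of a replacement (the heavier of the two token weights) and the choice of the heavier segment as the "total weight" are all assumptions made for this sketch.

```python
def weighted_edit_distance(a, b, weight=lambda token: 1.0):
    """Weighted edit distance between token lists `a` and `b`, using only
    deletions (from either list) and replacements."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + weight(a[i - 1])   # delete all of a
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + weight(b[j - 1])   # delete all of b
    for i in range(1, rows):
        for j in range(1, cols):
            wa, wb = weight(a[i - 1]), weight(b[j - 1])
            # Replacement is free when tokens match (assumed cost otherwise).
            replace_cost = 0.0 if a[i - 1] == b[j - 1] else max(wa, wb)
            dist[i][j] = min(
                dist[i - 1][j] + wa,                 # delete token from a
                dist[i][j - 1] + wb,                 # delete token from b
                dist[i - 1][j - 1] + replace_cost,   # replace or keep
            )
    return dist[-1][-1]


def within_threshold(a, b, threshold=0.15, weight=lambda token: 1.0):
    """True when weighted operations / total weight is within `threshold`."""
    total = max(sum(map(weight, a)), sum(map(weight, b)))
    if total == 0:
        return True  # two empty segments are treated as matching
    return weighted_edit_distance(a, b, weight) / total <= threshold


seg_a = "connection from host1 closed by remote peer after timeout".split()
seg_b = "connection from host2 closed by remote peer after timeout".split()
seg_c = "disk usage exceeded quota on volume data".split()
print(within_threshold(seg_a, seg_b))  # one replacement out of nine tokens
print(within_threshold(seg_a, seg_c))  # entirely different messages
```

In this sketch the first pair differs by one replacement over a total weight of nine (about 11%), so it falls within the 15% limit, while the second pair does not.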

Regarding claim 9, 
Muddu teaches the method of claim 1,
Muddu further teaches wherein generating the anomaly scores comprises: creating features for a current group of the data instances (see ¶ 270, “the input data of the ML-based CEP engine includes event feature sets, where each event feature set corresponds to an observable event in the target computer network”, also see ¶ 292, “if the model type topology 1714 specifies users as the entity type, the ML-based CEP ); 
applying an anomaly detection algorithm that takes as inputs the features for the current group, and group features calculated using set functions for groups earlier than the current group (see ¶ 341, “a particular machine learning model can be configured to process a time slice of data to produce a score for detecting a network security-related issue, and with model state sharing, the size of the time slice can be controlled by whichever event processing engine currently utilizes the particular machine learning model… the time slice can be set by the batch processing engine to whichever time period length is suitable for grouping the historic events”, also see ¶ 345, “the combination of the behavioral baseline establishment technique and the model state sharing technique can be particularly useful to detect a specific entity's anomalous behavior when historical data of that specific entity is not available (e.g., a new employee joins the enterprise)”); 
and generating the anomaly scores, the anomaly scores being indicative of how anomalous are the features for the current group (see ¶ 359, “the resulting anomaly score may be a value between 0 and 10, with 0 being the least anomalous and 10 being the most anomalous”, also see ¶ 577, “Once the user 6412 logs into device 6424 (as represented by the dashed line in FIG. 64), the machine learning model 6300 determines the similarity score of the particular device 6424 (i.e., 0.06 for device 6424) fails to satisfy a specific closeness criterion relative to similarity scores of network devices )

Regarding claim 10, 
Muddu teaches the method of claim 1,
Muddu further teaches enacting changes in the computing environment relative to at least a portion of the at least one categorical attribute to prevent future instances of the anomalous activity (see ¶ 151, “The anomalies and threats detected by the real-time processing path may be employed to automatically trigger an action, such as stopping the intrusion, shutting down network access, locking out users, preventing information theft or information transfer, shutting down software and or hardware processes, and the like.”). 

Regarding claim 11, 
Muddu teaches a computer-implemented method for detecting anomalous activity in a computing environment (see ¶ 137, “anomalous activity detection in a networked environment”, also see ¶ 139, “The environment 10 may represent a networked computing environment of one or multiple companies or organizations, and can be implemented across multiple geographic regions”), the method comprising: 
receiving an unstructured input stream of data instances (see ¶ 143, “incoming data is processed using machine learning/data science techniques to extract knowledge from large volumes of data that are structured or unstructured”, i.e. unstructured input stream of data instances), the data instances in the unstructured input stream being time stamped (see ¶ 174, “a time series database 370 that represents the database for storing time stamped data”, also see ¶ 222, “even if the events arrive in an order that is not the same as how they actually took place, as long as the events have timestamps”, teaches unstructured input being time stamped, also see ¶ 567, “The event data 6310 can be, e.g., timestamped machine data”); 
categorizing the data instances of the unstructured input stream of data instances (see figure 9A showing raw event data 900 received by the data intake and figure 9B showing categorized/parsed data showing entities involved in the event and their relationship, also ¶ 216-217 categorizing the event data to a user, IP address, visits etc., also see ¶ 568, “Based on the event data 6310 (e.g., the login graph), the machine learning model 6300 generates classification metadata 6320 for each of the network devices and for each of the users”, i.e. event data interpreted as data instances), the data instances comprising at least one principal value (see ¶ 570, “if the event data 6310 includes a login graph having information that relates to the login events, the machine learning model 6300 can identify the usage relationships 6330 as login events indicative of the users logging into the network devices”, i.e. login events interpreted as principal value) and a set of categorical attributes (see ¶ 570, “the machine learning model 6300 can identify the usage relationships 6330 as login events indicative of the users logging into the network devices… the usage relationship 6330 can be presented as a graph having nodes and edges interconnecting the nodes, as illustrated in FIG. 63.  The nodes represent network entities such as users and network devices, and the edges represent the login events that the users log into the network devices” also see ¶ 571-572, i.e. usage ), each of the categorical attributes being an equivalence class determined through unsupervised machine learning (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models”) based on a weighted edit distance (see ¶ 572, “the machine learning model 6300 assigns usage similarity scores 6360 (also referred to as "similarity scores") to the network devices represented by the device nodes”, also see ¶ 573, “The similarity scores are assigned such that any given set of network devices that are accessed by the same or similar group of users are assigned similarity scores that are closer in value to each other than the similarity scores of any other set of network devices that are not accessed by the same or similar group of users.”, where the similarity scores [i.e. weighted edit distance] assigned by the machine learning to the usage relationship [i.e. equivalence class] of users and devices (nodes and edges) correspond to a categorical attribute being an equivalence class determined through machine learning based on weighted edit distance); 
the use of the unsupervised machine learning occurs without using pre-determined or pre-configured metrics for assessing the data instances of the input stream (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models, it can identify entity behaviors and event patterns that are not previously known to security experts”, ML utilizes unsupervised machine learning therefore it does not use pre-determined or pre-configured metrics for assessing the data instances); 
grouping the data instances into groups based on continuous time intervals and at least one principal value (see ¶ 567, “the processes of generating the classification metadata and/or assigning usage similarity scores are performed in real-same or similar group of users [i.e. grouping data instances based on continuous time], also see figures 63-64), each of the continuous time intervals having a length (see ¶ 341, “if the batch processing engine is utilizing the model, the time slice can be set by the batch processing engine to whichever time period length is suitable for grouping the historic events (i.e., events that are already stored as opposed to being currently streamed) into batches for processing”);
applying set functions to each of the groups (see figures 63-66, ¶ 571, “As shown in FIG. 63, the usage relationships 6330 between the users and the network devices can be captured in a bipartite graph including a first set of nodes representing users (nodes 6341, 6342, 6343 and 6344) and a second set of nodes representing network devices (nodes 6351, 6352, 6353 and 6354). The first and second sets are disjoint sets. Every edge in the bipartite graph connects a user node in the first set to a device node in the second set. In addition, the relationships 6330 between the user nodes and the device nodes also represent a time series of events in which the users have interacted (e.g., logged in) with the network devices” [i.e. set functions of groups], also see ¶ 575-590); 
and generating an anomaly score for each of the groups using the set functions (see ¶ 567, “the processes of generating the classification metadata and/or assigning usage similarity scores are performed in real-time as the event data are received”, also see ¶¶ 572-577, the machine learning model 6300 determines the );
restricting access to, or operations by, the computing environment in response to the detection of an anomaly based on the anomaly scores (see ¶ 151, “The anomalies and threats detected by the real-time processing path may be employed to automatically trigger an action, such as stopping the intrusion, shutting down network access, locking out users, preventing information theft or information transfer, shutting down software and or hardware processes, and the like”).
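
For illustration only, the sequence recited in claim 11 (grouping time-stamped data instances into fixed-length intervals, applying set functions to each group, and scoring each group) can be sketched as below. This is not Muddu's method or Applicant's implementation. The function names, the choice of count and mean as set functions, and the z-score against earlier groups as the anomaly score are assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_by_interval(instances, interval_seconds):
    """Group (timestamp_seconds, principal_value) pairs into consecutive
    fixed-length time buckets. Buckets with no instances are omitted."""
    groups = defaultdict(list)
    for timestamp, value in instances:
        groups[int(timestamp // interval_seconds)].append(value)
    return dict(sorted(groups.items()))


def set_functions(values):
    """Example set functions applied to one group (assumed choices)."""
    return {"count": len(values), "mean": mean(values)}


def anomaly_scores(groups, feature="count"):
    """Score each group by how far its feature deviates from earlier groups,
    measured in standard deviations (an assumed scoring rule)."""
    scores, history = {}, []
    for bucket, values in groups.items():
        current = set_functions(values)[feature]
        spread = pstdev(history) if len(history) >= 2 else 0.0
        scores[bucket] = abs(current - mean(history)) / spread if spread > 0 else 0.0
        history.append(current)
    return scores


# Hypothetical login events: (timestamp in seconds, principal value).
events = [(0, 1.0), (30, 1.0), (60, 1.0), (70, 1.0), (110, 1.0),
          (120, 1.0), (150, 1.0)] + [(180 + i, 1.0) for i in range(10)]
scores = anomaly_scores(group_by_interval(events, interval_seconds=60))
print(max(scores, key=scores.get))  # bucket with the burst of ten events
```

In this sketch the fourth interval, which contains ten events after intervals of two or three, receives the highest score; a restriction step such as the one quoted from ¶ 151 would then be triggered when a score exceeds a chosen limit.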

Regarding claim 15, 
Muddu teaches the method of claim 11,
Muddu further teaches wherein the regularity analysis further comprises identifying when a categorical attribute of the set of categorical attributes influences the anomaly score for the set of categorical attributes if an output of an anomaly detection algorithm is within 30% to alternative instances in which the set of categorical attributes exists (see ¶ 315, “the model deliberation process thread processes the most recent time slice from the group-specific data stream to compute a score associated with the most recent time slice.  The most recent time slice can correspond to an event or a sequence of event observed at the target computer network”, also see ¶ 371 “a threat indicator score can be assigned based on the processing of the anomaly data with a threat indicator being identified if the threat indicator score satisfies a specified criterion.  For example, the 20 entities associated with a particular anomaly ).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Muddu et al. (US 2017/0063910 A1) in view of Green et al. (US 2013/0238476 A1).

Regarding claim 2, 
Muddu teaches the method of claim 1, 
Muddu does not teach wherein the change in anomaly scores is performed using a counterfactual analysis. 
Green teaches the change in anomaly scores is performed using a counterfactual analysis (see ¶ 7, “an electronic device that performs counterfactual testing…Based on the comparison and a testing metric, the electronic device determines a result of the counterfactual testing”, also see ¶ 111, “Counterfactual module 938 may compare the financial output values 964 for the modified and the original functional 
Muddu and Green both pertain to the problem of analyzing behavioral patterns and are thus analogous art. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Muddu and Green to perform the change of anomaly scores using counterfactual analysis. The motivation for doing so would be to reduce false positive classifications, such as the possibility of fraud when sharing functional representations, and also to identify outliers that deviate from normal behavior (see Green ¶ 75).

Regarding claim 12, 
Muddu teaches the method of claim 11, 
Muddu further teaches to identify which of the set of categorical attributes for a group is influencing one or more anomalies in the groups that are indicative of the anomalous activity in the computing environment (see ¶ 493, “each Anomalies Detailed View provides different boxes and graphics to illustrate parameters that correspond to the type of anomaly in the view.”, also see ¶ 723, “the rarity criterion for determining whether an event is anomalous can include additional parameters, such as a minimum number of features and/or feature pairs in the event to be anomalous, a list of features and/or feature pairs in the event to be anomalous.”), 
Muddu does not teach applying a counterfactual analysis or a regularity analysis. 
Green teaches applying a counterfactual analysis or a regularity analysis (see ¶ 7, “an electronic device that performs counterfactual testing…Based on the comparison and a testing metric, the electronic device determines a result of the counterfactual testing”, also see ¶ 111, “Counterfactual module 938 may compare the financial output values 964 for the modified and the original functional representations, and may determine one or more results 966 based on one or more testing metrics 974 and/or one of performance metrics 972”).
Muddu and Green both pertain to the problem of analyzing behavioral patterns and are thus analogous art. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Muddu and Green to perform the change of anomaly scores using counterfactual analysis. The motivation for doing so would be to reduce false positive classifications, such as the possibility of fraud when sharing functional representations, and also to identify outliers that deviate from normal behavior (see Green ¶ 75).

Regarding claim 13, 
Muddu and Green teach the method of claim 12, 
Muddu further teaches wherein generating the anomaly score further comprises applying an anomaly detection algorithm to values generated using the set function to detect changes in the groups over the continuous time intervals (see ¶ 14 and figure 6, “FIG. 6 shows an example representation of the process of building adaptive behavioral baselines and evaluating against such baselines to support the detection of anomalies”, also see ¶ 379, “Anomalies may be detected over a period of .

Claims 3 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Muddu et al. (US 2017/0063910 A1) in view of Green et al. (US 2013/0238476 A1) and further in view of Flanders et al. (US 2016/0055654 A1).

Regarding claim 3, 
Muddu and Green teach the method of claim 2,
Green further teaches wherein the counterfactual analysis (see ¶ 7, “an electronic device that performs counterfactual testing…Based on the comparison and a testing metric, the electronic device determines a result of the counterfactual testing”, also see ¶ 111, “Counterfactual module 938 may compare the financial output values 964 for the modified and the original functional representations, and may determine one or more results 966 based on one or more testing metrics 974 and/or one of performance metrics 972”).
Muddu and Green pertain to the problem of behavioral patterns, thus being analogous. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Muddu and Green to perform the change of anomaly scores using counterfactual analysis. The motivation for doing so would be for the purpose of reducing false positive classification such as possibility of fraud when 
Muddu and Green do not teach removing at least a portion of the data instances; regenerating the anomaly scores for each of the data instances over the continuous time intervals; and wherein if the regenerated anomaly scores are improved compared to the anomaly scores, at least a portion of the categorical attributes are identified as anomalous categorical attributes and a cause of the anomalous activity. 
Flanders teaches removing at least a portion of the data instances (see ¶ 22, “eliminate scene pixels with low anomaly scores from further processing”); 
regenerating the anomaly scores for each of the data instances over the continuous time intervals (see ¶ 2, “for each scene pixel remaining in the plurality of scene pixels, computing an updated intermediate anomaly score using the updated anomaly equation”); 
and wherein if the regenerated anomaly scores are improved compared to the anomaly scores, at least a portion of the categorical attributes are identified as anomalous categorical attributes and a cause of the anomalous activity (see ¶ 2, “comparing the updated intermediate anomaly scores to an updated threshold, the updated threshold being greater in value than the previous threshold…The method further includes declaring which scene pixels include anomalies based on comparisons of the computed full dimension anomaly scores to a full dimension anomaly score threshold, the full dimension anomaly score threshold being greater in value than the threshold and the updated threshold.”). 


Regarding claim 14, 
Muddu and Green teach the method of claim 13,
Green further teaches the counterfactual analysis (see ¶ 7, “an electronic device that performs counterfactual testing…Based on the comparison and a testing metric, the electronic device determines a result of the counterfactual testing”, also see ¶ 111, “Counterfactual module 938 may compare the financial output values 964 for the modified and the original functional representations, and may determine one or more results 966 based on one or more testing metrics 974 and/or one of performance metrics 972”).
Muddu and Green pertain to the problem of behavioral patterns, thus being analogous. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Muddu and Green to perform the change of anomaly scores using counterfactual analysis. The motivation for doing so would be for 
Muddu and Green do not teach determining a change in the anomaly score; removing at least a portion of the data instances; regenerating the anomaly score for each of the data instances which remain after the removing; and comparing the regenerated anomaly score to the anomaly score to identify if at least a portion of the categorical attributes caused the change in the anomaly score. 
Flanders teaches determining a change in the anomaly score (see ¶ 2, “for each scene pixel remaining in the plurality of scene pixels, computing an updated intermediate anomaly score using the updated anomaly equation”); 
removing at least a portion of the data instances (see ¶ 22, “eliminate scene pixels with low anomaly scores from further processing”); 
regenerating the anomaly score for each of the data instances which remain after the removing (see ¶ 2, “for each scene pixel remaining in the plurality of scene pixels, computing an updated intermediate anomaly score using the updated anomaly equation”); 
and comparing the regenerated anomaly score to the anomaly score to identify if at least a portion of the categorical attributes caused the change in the anomaly score (see ¶ 2, “comparing the updated intermediate anomaly scores to an updated threshold, the updated threshold being greater in value than the previous threshold…The method further includes declaring which scene pixels include anomalies based on comparisons of the computed full dimension anomaly scores to a full dimension 
Muddu, Green and Flanders pertain to the problem of behavioral patterns and anomaly detection and are thus analogous art. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Muddu, Green and Flanders to perform a removal of data instances and a regeneration of anomaly scores. The motivation for doing so would be to process a few terms for all scene pixels, eliminate most scene pixels, and calculate more terms on high anomaly scoring scene pixels as needed to form an updated anomaly equation if the number of eliminated scene pixels is less than a specified fraction of a total number of scene pixels and a number of computed intermediate anomaly scores is less than a counter. In this way, the algorithm can run faster (see Flanders abstract and ¶ 41).
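
For illustration only, and not as a characterization of the disclosure of Muddu, Green or Flanders or of applicant's implementation, the remove, regenerate and compare sequence recited in claim 14 can be sketched as follows. The scoring function (absolute z-score of per-interval event counts), the function names, and the sample categories "login" and "scan" are all assumptions chosen for the sketch.

```python
from statistics import mean, pstdev

def anomaly_scores(instances, buckets):
    """Score each time bucket by the |z-score| of its event count.

    `instances` is a list of (bucket, category) pairs; every bucket that
    appears in `instances` is assumed to be listed in `buckets`.
    """
    counts = {b: 0 for b in buckets}
    for bucket, _category in instances:
        counts[bucket] += 1
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    return {b: (abs(c - mu) / sigma if sigma else 0.0) for b, c in counts.items()}

def score_drop_by_category(instances, buckets):
    """Remove one category at a time, regenerate the scores, and compare."""
    baseline = max(anomaly_scores(instances, buckets).values())
    drops = {}
    for category in sorted({c for _, c in instances}):
        # Remove the data instances carrying this categorical attribute.
        remaining = [(b, c) for b, c in instances if c != category]
        # Regenerate the anomaly score on the instances that remain.
        regenerated = max(anomaly_scores(remaining, buckets).values())
        # A large drop suggests the removed category influenced the anomaly.
        drops[category] = baseline - regenerated
    return drops

# Hypothetical data: steady "login" activity plus a burst of "scan" events.
buckets = range(5)
instances = [(b, "login") for b in buckets for _ in range(2)]
instances += [(3, "scan")] * 10
drops = score_drop_by_category(instances, buckets)
```

In this hypothetical data set, removing the "scan" instances brings the peak score down to zero, while removing the "login" instances leaves the peak score unchanged, so only "scan" would be flagged as influencing the anomaly.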

Claims 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Muddu et al. (US 2017/0063910 A1) in view of Dubey et al. (US 2018/0173698 A1).

Regarding claim 16, 
Muddu teaches a computer-implemented method for anomaly detection, comprising: receiving an unstructured log file of a computing environment (see ¶ 143, “incoming data is processed using machine learning/data science techniques to extract knowledge from large volumes of data that are structured or unstructured”, i.e. unstructured input stream of data instances), the unstructured log file comprising temporal data (see ¶ 174, “a time series database 370 that represents the database for storing time stamped data”, also see ¶ 222, “even if the events arrive in an order that is not the same as how they actually took place, as long as the events have timestamps”, teaches unstructured input being time stamped); 
tokenizing segments within the unstructured log file (see ¶ 190, “‘extracting a token from an event’ will be understood as extracting a token from the event data that represents the event.”, also see ¶ 206, “the received data representing an event, which field represents a token that may correspond to a timestamp, an entity, an action, an IP address, an event identifier (ID), a process ID, a type of the event, a type of machine that generates the event, and so forth”); 
filtering or removing a portion of the tokenized segments based on a set of filtering criteria (see ¶ 164, “The data receivers 310 may also optionally filter some of the event data”, also see ¶ 165, “the semantic processor 316 may perform parsing of the incoming event data, enrichment (also called decoration or annotation) of the event data with certain information, and optionally, filtering the event data”, also see ¶ 167, “An optional filter attribution block 322 in the semantic processor 316 removes certain pre-defined events. The attribution filter 322 in the semantic processor 316 may further remove events that need not be processed by the security platform”); 
applying a weight to one or more of the filtered, tokenized segments (see ¶ 235, “Depending on the model, other criteria for an event to be considered relevant for model training and/or updating purposes may include, for example, when a new event includes a particular machine identifier, a particular user identifier, and/or the recency of ; 
comparing the filtered, tokenized segments to one another to determine if a match exists therebetween (see ¶ 262, “two sessions is determined based on comparing three items: "from-session-link-context", "to-session-link-context", and "Link-Event time"… Two existing sessions should be linked or correlated if the newly added session (1) matches a link event time range, (2a) has a match in one of its from-session-link-context or to-session-link-context with those of one existing session, and (2b) has at least a partial match in one of its from-session-link-context or to-session-link-context with those of another existing session”, also see ¶ 634, “assigning an anomaly score indicating a confidence level that the entity identifier matches a particular entry in the external data source based on the comparing.”); 
and categorizing the filtered, tokenized segments based on the comparison (see figure 9A showing raw event data 900 received by the data intake and figure 9B showing categorized/parsed data showing entities involved in the event and their relationship, also ¶¶ 216-217 categorizing the event data to a user, IP address, visits etc., also see ¶ 568, “Based on the event data 6310 (e.g., the login graph), the machine learning model 6300 generates classification metadata 6320 for each of the network devices and for each of the users”, i.e. event data interpreted as data instances, also see ¶¶ 564-566, “The method further generates or identifies classification metadata of the user and the device, based on event data about the login event, to further explain the relevance of the user and the device in a security context.”, see also ¶¶ 249 and 445), the categorizing comprising at least one categorical attribute (see ¶ 570, “the machine nodes and edges interconnecting the nodes, as illustrated in FIG. 63.  The nodes represent network entities such as users and network devices, and the edges represent the login events that the users log into the network devices”, also see ¶¶ 571-572, i.e., the usage relationships of users and devices (nodes and edges) are interpreted as categorical attributes), each of the at least one categorical attribute being an equivalence class determined through unsupervised machine learning (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models”) based on a weighted edit distance (see ¶ 572, “the machine learning model 6300 assigns usage similarity scores 6360 (also referred to as "similarity scores") to the network devices represented by the device nodes”, also see ¶ 573, “The similarity scores are assigned such that any given set of network devices that are accessed by the same or similar group of users are assigned similarity scores that are closer in value to each other than the similarity scores of any other set of network devices that are not accessed by the same or similar group of users.”, where the similarity scores [i.e. weighted edit distance] assigned by the machine learning to the usage relationship [i.e. equivalence class] of users and devices (nodes and edges) correspond to a categorical attribute being an equivalence class determined through machine learning based on weighted edit distance), the use of the unsupervised machine learning occurs without using pre-determined or pre-configured metrics for assessing the data instances of the tokenized segments (see ¶ 275, “the ML-based CEP engine can utilize unsupervised machine learning models, it can identify entity behaviors and event patterns that are not ).

Muddu does not teach wherein the filtering excludes tokens comprising a particular character set and tokens less than a particular character count, the particular character set and the particular character count being based on a data type of the set of filtering criteria.
Dubey teaches wherein the filtering excludes tokens comprising a particular character set and tokens less than a particular character count, the particular character set and the particular character count being based on a data type of the set of filtering criteria (see ¶ 125, “the phrases can include n-grams of varying lengths, e.g., unigrams, bigrams, etc., up to arbitrary lengths. Phrases can then be removed from further consideration using one or more filters.” [i.e., filtering phrases of varying lengths, e.g., unigrams (one word or character) and bigrams (two words or a two-character sequence), where a specific length corresponds to a specific character set], also see ¶ 127, “relatively long repeated phrases, e.g., over a length of 20 words, can be filtered out regardless of how many times the phrases appear in the documents 618.”, also see ¶ 128, “relatively low-length phrases that appear with relatively low frequency can be filtered out. Examples of relatively low frequencies can include, e.g., phrases occurring in less than a certain percentage (e.g., 1%) of the documents 618 or occurring fewer than a selected number of times (e.g., <10 occurrences in 1000 documents 618).”, also see ¶ 29, “examples of );
Both Muddu and Dubey pertain to the problem of data analysis and are thus analogous art. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Muddu and Dubey to filter out words/phrases whose length exceeds a specific number of characters/words. The motivation for doing so would be to eliminate portions of unwanted data and thereby minimize the size of the data to be processed (see Dubey ¶¶ 69 and 121).
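
For illustration only, the filtering limitation discussed above (excluding tokens made up of a particular character set and tokens shorter than a particular character count, both selected by data type) can be sketched as follows. The `FILTER_CRITERIA` table, the "log" data type, the excluded character set and the minimum length of 3 are hypothetical values chosen for the sketch and are not taken from Muddu, Dubey or the application.

```python
import re

# Hypothetical filtering criteria keyed by data type (illustrative values only).
FILTER_CRITERIA = {
    "log": {"excluded_chars": r"[0-9.:/\-]+", "min_length": 3},
}

def filter_tokens(tokens, data_type="log"):
    """Drop tokens that are too short or consist only of excluded characters."""
    criteria = FILTER_CRITERIA[data_type]
    excluded = re.compile(criteria["excluded_chars"])
    return [
        t for t in tokens
        if len(t) >= criteria["min_length"] and not excluded.fullmatch(t)
    ]

tokens = ["2021-11-15", "ERROR", "db", "connection", "42", "timeout"]
kept = filter_tokens(tokens)
```

Under these assumed criteria, the date-like and purely numeric tokens and the two-character token are excluded, and the remaining tokens are passed on for comparison.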

Regarding claim 17, 
Muddu and Dubey teach the method of claim 16, 
Muddu further teaches wherein when a match does not exist, a new category is created and attributed to one or more of the filtered, tokenized segments (see ¶ 204, “after the data connectors 802 obtain/receive the data, if the data format of the data is unknown (e.g., the administrator has not specified how to parse the data), then the format detector 804 can be used to detect the data format of the input data.  For example, the format detector 804 can perform pattern matching for all known formats to determine the most likely format of a particular event data”, also see ¶ 476, “as shown in FIG. 44A, the GUI provides a bubble 4400 prompting the user to tag the threat with "Threat Watchlist," "False Positive," "Important," "Reviewed," "Save for Later," or to define a new category for tagging (via the "New Threat Watchlist" selection)”).

Regarding claim 18, 

Muddu further teaches further comprising applying tokenization rules to exclude at least a portion of the segments of the unstructured log file (see ¶ 147, “To operate in real-time, the evaluation is performed primarily or exclusively on event data pertaining to current events contemporaneously with the data being generated by and/or received from the data source(s).  In certain embodiments, the real-time processing path excludes historical data (i.e., stored data pertaining to past events) from its evaluation.  Alternatively in an embodiment, the real-time processing path excludes third-party data from the evaluation in the real-time processing path.  These example types of data that are excluded from the real-time path can be evaluated in the batch processing path”, also see ¶ 164, “The data receivers 310 may also optionally filter some of the event data. For example, to reduce the workload of the security platform”, also see ¶ 218, “if the network administrator wishes to receive data in a new data format, he can edit the configuration file to create rules (e.g., in the form of functions or macros) for the particular data format including, for example, identifying how to tokenize the data”, also see ¶ 704, “The semantic processor 316 (FIG. 3) can process the event data to remove, add or modify at least some of the information and generate the traffic log 8050 in a condition that is suitable for further processing by the system 8025 efficiently”).

Regarding claim 19, 
Muddu and Dubey teach the method of claim 16, 	
Muddu further teaches applying filtering rules to remove tokenized segments corresponding to only numerical values or tokenized segments corresponding to date-related words, or tokenized segments corresponding to numbers with trailing units (see ¶ 164, “the data receivers 310 may also optionally filter some of the event data”, also see ¶ 167, “The attribution filter 322 in the semantic processor 316 may further remove events that need not be processed by the security platform.  An example of such an event is an internal data transfer that occurs between two IP addresses as part of a regular file backup.  In some embodiments, the functions of semantic processor 316 are configurable by a configuration file to permit easy updating or adjusting”, i.e., IP address filtering corresponds to removing tokenized segments corresponding to only numerical values or tokenized segments corresponding to date-related words, because IP addresses can include date-related words and can also be tokenized segments corresponding to numbers with trailing units).

Regarding claim 20, 
Muddu and Dubey teach the method of claim 16, 	
Muddu further teaches wherein comparing the filtered, tokenized segments to one another further comprises determining distances between the filtered, tokenized segments based on deletions and replacements, further wherein the distances between the filtered, tokenized segments indicates if they are within two operations to match (see ¶ 167, “filter attribution block 322 in the semantic processor 316 removes certain pre-defined events.  The attribution filter 322 in the semantic processor 316 may further remove events that need not be processed by the security platform.”, also see ¶ 407, “the method can identify an insider who poses a security threat based on a group of anomalies being close to each other in time and their confidence  
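
For illustration only, the comparison recited in claim 20 (a distance between tokenized segments based on deletions and replacements, with a match when the segments are within two operations of one another) can be sketched as follows. The sketch assumes that a deletion may be applied to either segment, which makes the measure equivalent to a standard token-level Levenshtein distance; the function names and sample segments are hypothetical and are not drawn from Muddu or the application.

```python
def token_edit_distance(a, b):
    """Minimum number of token deletions (from either segment) and replacements."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(
                prev[j] + 1,         # delete a token from segment a
                curr[j - 1] + 1,     # delete a token from segment b
                prev[j - 1] + cost,  # replace (or keep) a token
            ))
        prev = curr
    return prev[-1]

def is_match(a, b, max_operations=2):
    """Treat two segments as matching if they are within `max_operations` edits."""
    return token_edit_distance(a, b) <= max_operations

seg_a = "user alice logged in from host".split()
seg_b = "user bob logged in from server".split()
seg_c = "disk failure on node".split()
```

Here `seg_a` and `seg_b` differ by two replacements and would be placed in the same category, while `seg_c` shares no tokens with `seg_a` and would not.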

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the 



Any inquiry concerning this communication or earlier communications from the examiner should be directed to IMAD M KASSIM whose telephone number is (571)272-2958. The examiner can normally be reached Monday through Friday, 7:30-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J. Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 



/I.K./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129