DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
The following claims is/are pending in this office action: 1-14
Claim(s) 1-14 is/are rejected.

Drawings
The drawings received on 02/26/2019 are accepted.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 12/06/2019 and 08/29/2019 have been accepted.  The submissions are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner. Initialed and dated copies of Applicant’s IDS forms 1449 are attached to the Office Action.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
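(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.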

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 

Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: an input unit, a storage unit, and a processing unit in claim 1, and in the dependent claims 2, 3, 4, 5, 6, and 7.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-14 are rejected under 35 U.S.C. 103 as being unpatentable over Bhuyan et al. (Network anomaly detection: methods, systems and tools; hereinafter “Bhuyan”) in view of Ankur et al. (WO2017037443A1; hereinafter “Ankur”).

Regarding claim 1, Bhuyan teaches a system of training behavior labeling model (Section 1B Para 1: “Consequently, an event or an object is detected as anomalous if its degree of deviation with respect to the profile or behavior of the system, specified by the normality model, is high enough.”), comprising:
an input unit, receiving a labeled data set, wherein the labeled data set comprises a training data set and a verification data set and each data of the training data set and each data of the verification data set respectively comprise first labeling information (Page 316 Section C: “The architecture of MINDS is given in Figure 10. It accepts NetFlow data collected through flow tools as input.” Page 326 Second Last Para: “The KDD training dataset consists of approximately 4,900,000 single connection vectors, each of which contains 41 features and is labeled as either normal or attack with a specific attack type. The test dataset contains about 300,000 samples with 24 training attack types, with an additional 14 attack types in the test set only.” Labeled training and test/verification data are used in the analysis.)
a storage unit, storing a plurality of learning modules (Fig. 10 shows a storage unit to store data.)
a processing unit, connected to the input unit and the storage unit and respectively inputting each data of the training data set to the learning modules to establish a plurality of labeling models (Page 312 Para 1: “The statistical processor maintains a reference model of typical network activities, compares reports from the event preprocessor with the reference models, and forms a stimulus vector to feed into the neural network classifier. The neural network classifier analyzes the stimulus vector from the statistical model to decide whether the network traffic is normal.” Page 316 Section C Para 3: “For example, MINDS (Minnesota Intrusion Detection System) [34] is a data mining-based system… Before data is fed into the anomaly detection module, a data filtering step is executed to remove network traffic in which the analyst is not interested.” The processor maintains the learning models/modules (also called anomaly detection models/modules) and feeds data to them. The model/classifier classifies the data as normal or not normal.)
wherein the processing unit further obtains a plurality of second labeling information corresponding to each data of the verification data set respectively according to the labeling models (Page 309 Section 4: “The typical approach used in such techniques is to build a model for the class corresponding to normal behavior, and use the model to identify anomalies in the test data.” Page 310 Section 6: “Typically, the outputs produced by anomaly detection techniques are of two types: (a) a score..(b) a label, which is a value (normal or anomalous) given to each test instance.” The second labeling information is the output of the supervised model. Because supervised machine learning is used, all data is already labeled, so the first labeling information of the test data is already known. The test data is fed into the model to obtain the model's labels, which can then be compared with the already known first labels to assist in model verification.)
and respectively generates a behavior labeling result corresponding to each data of the verification data set according to the second labeling information corresponding to each data of the verification data set (Page 309 Section 4: “The typical approach used in such techniques is to build a model for the class corresponding to normal behavior, and use the model to identify anomalies in the test data.” Each instance in the test/verification data is classified as normal or abnormal behavior.).
Bhuyan does not explicitly teach wherein the processing unit further obtains a labeling change value according to the behavior labeling result and the first labeling information corresponding to each data of the verification data set and determines whether the labeling change value is greater than a change threshold, and in response to that the labeling change value is greater than the change threshold, the processing unit updates the first labeling information corresponding to each data of the verification data set according to the behavior labeling results, exchanges the training data set and the verification data set and reestablishes the labeling models according to the exchanged training data set.
Ankur, however, teaches wherein the processing unit further obtains a labeling change value according to the behavior labeling result and the first labeling information corresponding to each data of the verification data set and determines whether the labeling change value is greater than a change threshold (Page 38 lines 29-34: “The tests may be used to produce a score which may be compared against a number of thresholds in order to classify an event or series or events, as mentioned. The first test uses the anomaly detection algorithm and aims to find divergence between the tested event(s) and expected behavior.” Page 45 lines 18-21: “The outputs 30 of the analysis engine are compared against confidence thresholds. A plurality of thresholds may be provided for different excluded actions or behavioural change points, and a plurality of thresholds may be provided per excluded action or behavioural change points.” Output or score from a model can be compared with threshold for behavior change point (the point reflects a user behavior may change). The comparison will show if the score is higher than the change threshold.)
and in response to that the labeling change value is greater than the change threshold, the processing unit updates the first labeling information corresponding to each data of the verification data set according to the behavior labeling results (Page 15 lines 2-6: “Optionally, the method further comprises classifying calculated scores using one or more predetermined or dynamically calculated thresholds.” Based on the threshold comparison above, the behavior can be reclassified/relabeled. The first classification label was already done/known before threshold comparison when test data was entered in the model.) 
exchanges the training data set and the verification data set and reestablishes the labeling models according to the exchanged training data set (Page 8 lines 26-30: “Optionally, the method further comprises receiving feedback related to changes in behaviour traits. Receiving feedback related to output accuracy can allow for the probabilistic model to adapt in response to feedback, which can improve the accuracy of future outputs.” Feedback is provided to the model to readjust the first label based on the label change done in previous limitation. This will become a training set to train the model further to improve its accuracy.).
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the user behavior labeling models of Bhuyan with the threshold and labeling change mechanism of Ankur in order to predict a user behavior change that can cause an abnormal event, so that preventive action can be taken (Ankur, Page 45, last para).
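For illustration only, the sequence of claim 1 limitations as mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Bhuyan, Ankur, or the specification: the names `refine`, `learners`, and `max_rounds` are hypothetical, the behavior labeling result is assumed to be a simple vote over the models, and the labeling change value is assumed to be the fraction of verification data whose behavior labeling result differs from the first labeling information.

```python
from typing import Callable, List, Tuple

Label = str                                   # "normal" or "abnormal"
Sample = Tuple[dict, Label]                   # (data, first labeling information)
Model = Callable[[dict], Label]               # a trained labeling model
Learner = Callable[[List[Sample]], Model]     # a learning module


def vote(labels: List[Label]) -> Label:
    """Assumed combination rule: strict majority of 'normal', else 'abnormal'."""
    normal = sum(1 for label in labels if label == "normal")
    return "normal" if normal > len(labels) - normal else "abnormal"


def refine(train: List[Sample], verify: List[Sample],
           learners: List[Learner], change_threshold: float,
           max_rounds: int = 10):
    """Hypothetical loop over the claim 1 steps; max_rounds is an added safeguard."""
    if not train or not verify:
        raise ValueError("both data sets must be non-empty")
    models: List[Model] = []
    for _ in range(max_rounds):
        # Establish one labeling model per learning module.
        models = [learn(train) for learn in learners]
        # Second labeling information -> behavior labeling result per datum.
        results = [vote([m(x) for m in models]) for x, _ in verify]
        # Assumed labeling change value: share of labels that would change.
        changed = sum(1 for (_, first), r in zip(verify, results) if r != first)
        change_value = changed / len(verify)
        if change_value <= change_threshold:
            break
        # Update the first labeling information, then exchange the two sets.
        verify = [(x, r) for (x, _), r in zip(verify, results)]
        train, verify = verify, train
    return models, train, verify
```

In this sketch the loop stops, and the current models are kept, once the change value no longer exceeds the threshold.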

Regarding claim 2, Bhuyan and Ankur teach the system of claim 1.
Ankur also teaches the further limitation of claim 2, which concerns storing the labeling models when the labeling change value is not greater than the change threshold (Page 46 Lines 5-10: “If the threat or predicted behavioural change meets one or more thresholds, the analysis engine 230 may assess a pre-emptive action to be taken. The action selected may, for example, be determined in dependence on the confidence that an excluded action or predicted behavioural change will occur…” Paragraph 0044 and Fig. 3 of the specification indicate that further storing the labeling models when the labeling change value is not greater than the change threshold means that no feedback or update is applied to the models. Likewise, Ankur takes action only when the behavior change meets the threshold; if it does not, no action is taken.).
Same motivation to combine the teachings of Bhuyan and Ankur as claim 1.

Regarding claim 3, Bhuyan and Ankur teach the system of claim 1.
Bhuyan also teaches wherein the second labeling information corresponds to normal labels and abnormal labels (Page 310 Section Reporting Anomalies: “(b) a label, which is a value (normal or anomalous) given to each test instance.”) and the processing unit is further configured to determine the number belonging to the normal labels and the number belonging to the abnormal labels in the second labeling information corresponding to each data of the verification data set (Page 310 Section Reporting Anomalies: “Typically, the outputs produced by anomaly detection techniques are of two types: (a) a score… (b) a label, which is a value (normal or anomalous) given to each test instance.” The output labels given by multiple models on the test or verification data can be counted as normal or anomalous/abnormal.) and generate (Page 310 Section Reporting Anomalies: “(iii) majority voting based on the outputs given by multiple indices.” After the output labels given by the multiple models are counted, the final label is determined by majority voting, i.e., whichever label is in the majority.).
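For illustration only, the counting and majority-voting step discussed above can be sketched as follows. The function name `count_and_vote` is hypothetical, and the tie-breaking rule (ties resolve to "abnormal") is an assumption that appears in neither reference.

```python
from collections import Counter
from typing import Sequence, Tuple


def count_and_vote(second_labels: Sequence[str]) -> Tuple[int, int, str]:
    """Count normal/abnormal labels from several models and pick the majority."""
    counts = Counter(second_labels)
    normal = counts["normal"]        # number belonging to the normal labels
    abnormal = counts["abnormal"]    # number belonging to the abnormal labels
    # Assumption: a tie is resolved conservatively as "abnormal".
    result = "normal" if normal > abnormal else "abnormal"
    return normal, abnormal, result
```

For example, labels of normal, abnormal, and normal from three models give counts of 2 and 1 and a result of normal.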

Regarding claim 4, Bhuyan and Ankur teach the system of claim 1.
Bhuyan also teaches wherein the processing unit further execute operations of: obtaining a first number, wherein the first number is a number of the data which the behavior labeling result and the first labeling information are normal in the verification data set (Page 310 Section 6: “(b) a label, which is a value (normal or anomalous) given to each test instance.” Each instance in the test/verification data is labeled as normal or anomalous/abnormal. The normal labels given by multiple models can be counted, and that count can be called the first number.)
obtaining a second number, wherein the second number is a number of the data which the behavior labeling result and the first labeling information are abnormal in the verification data set (Page 310 Section 6: “(b) a label, which is a value (normal or anomalous) given to each test instance.” Each instance in the test/verification data is labeled as normal or anomalous/abnormal. The abnormal labels given by multiple models can be counted, and that count can be called the second number.)
obtaining a measurement value of accuracy according to a ratio of a sum of the first number and the second number to a data amount of the verification data set (Page 328 Section Accuracy: “Accuracy is a metric that measures how correctly an IDS works, measuring the percentage of detection and failure as well as the number of false alarms that the system produces [223], [224]. If a system has 80% accuracy, it means that it correctly classifies 80 instances out of 100 to their actual classes.” Also shown in Fig. 15 accuracy is calculated as a ratio of sum of first and second number VS the total numbers of actual positives and negatives in a verification or test dataset.)  
obtaining a measurement value of specificity according to a ratio of the first number to the number of the first labeling information which is normal (Page 328 Section Sensitivity and Specificity: “On the other hand, if a normal (n) test instance is predicted as normal (N) it is known as true negative (TN)… The true negative rate (TNR) is also called specificity.” First number which belongs to normal label is used to calculate specificity rate. Specificity rate calculation is also shown in Fig. 15.)
obtaining a measurement value of sensitivity according to a ratio of the second number to the number of the first labeling information which is abnormal (Page 328 Section Sensitivity and Specificity: “The true positive rate (TPR) is the proportion of anomalous instances classified correctly over the total number of anomalous instances present in the test data. TPR is also known as sensitivity.” Second number belongs to abnormal or anomalous label which is used to calculate sensitivity.)
respectively determining difference values between the measurement value of accuracy, the measurement value of specificity and the measurement value of sensitivity and a historic measurement value of accuracy, a historic measurement value of specificity and a historic measurement value of sensitivity, so as to obtain the labeling change value (Page 326 Section 2: “This dataset was prepared by Stolfo et al. [206] and is built on the data captured in the DARPA98 IDS evaluation program. The KDD training dataset consists of approximately 4,900,000 single connection vectors, each of which contains 41 features and is labeled as either normal or attack with a specific attack type.” Page 315 Para 2: “In traditional classification, new information can be incorporated by re-training with the entire dataset.” A historical dataset is available that is already labeled and whose statistics (including accuracy, specificity, and sensitivity) will be known from the formulas discussed in the previous limitations, so the difference or change in these values can be found by comparison with the new information to determine whether the labeling change value requires re-training the model.).
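For illustration only, the three measurement values recited in claim 4 and their comparison against historic values can be sketched as follows. The function names are hypothetical, and reducing the three difference values to a single labeling change value by taking the largest absolute difference is an assumption, since neither the claim language quoted above nor the references fix that step. The sketch also assumes the verification data contains at least one normal and one abnormal first label.

```python
from typing import Dict, Sequence


def measurement_values(first: Sequence[str],
                       result: Sequence[str]) -> Dict[str, float]:
    """Accuracy, specificity, and sensitivity from first labels vs. results."""
    pairs = list(zip(first, result))
    # First number: data where both the result and the first label are normal.
    first_number = sum(1 for f, r in pairs if f == r == "normal")
    # Second number: data where both the result and the first label are abnormal.
    second_number = sum(1 for f, r in pairs if f == r == "abnormal")
    normal_total = sum(1 for f in first if f == "normal")
    abnormal_total = sum(1 for f in first if f == "abnormal")
    return {
        "accuracy": (first_number + second_number) / len(pairs),
        "specificity": first_number / normal_total,
        "sensitivity": second_number / abnormal_total,
    }


def labeling_change_value(current: Dict[str, float],
                          historic: Dict[str, float]) -> float:
    """Assumed aggregation: the largest absolute difference across the metrics."""
    return max(abs(current[name] - historic[name]) for name in current)
```

With first labels (normal, normal, abnormal, abnormal) and results (normal, abnormal, abnormal, abnormal), the sketch yields an accuracy of 0.75, a specificity of 0.5, and a sensitivity of 1.0.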

Regarding claim 5, Bhuyan and Ankur teach the system of claim 1.
Bhuyan also teaches wherein the input unit further receives a historic data set, wherein the historic data set comprises a first data set and a second data set (Page 326 Section 2: “This dataset was prepared by Stolfo et al. [206] and is built on the data captured in the DARPA98 IDS evaluation program. The KDD training dataset consists of approximately 4,900,000 single connection vectors, each of which contains 41 features and is labeled as either normal or attack with a specific attack type. The test dataset contains about 300,000 samples with 24 training attack types, with an additional 14 attack types in the test set only.” The historic data contains a first dataset (the training dataset) and a second dataset (the test or verification dataset).)
 each data of the first data set comprises third labeling information (Page 308 Section 3: “The label associated with a data instance denotes if that instance is normal or anomalous. It should be noted that obtaining accurate labeled data of both normal or anomalous types is often prohibitively expensive. Labeling is often done manually by human experts and hence substantial effort is required to obtain the labeled training dataset.” Third label information is defined in spec para 0047 as: “Each data of the first data set has third labeling information, and the third labeling information is manually labeled on the behavior of the users, wherein types of the labels are, for example, normal and abnormal.” This means third labeling information is manual. In Bhuyan, labeling is also done manually in first/training dataset which can be named third labeling information.)
wherein the processing unit further inputs the first data set to the initial learning module to obtain an initial labeling model (Page 309 Section 4: “In supervised mode, one assumes the availability of a training dataset which has labeled instances for the normal as well as the anomaly class. The typical approach in such cases is to build a predictive model for normal vs. anomaly classes.” “Rough sets have two useful features [149]: (i) enabling learning with small size training datasets.” The initial labeling or classification model is built by training it with the first dataset or training dataset.) and labels the second data set according to the initial labeling model to generate the first labeling information of the second data set, and the labeled data set is the second data set with the first labeling information (Page 309 Section 4: “In supervised mode, one assumes the availability of a training dataset which has labeled instances for the normal as well as the anomaly class. The typical approach in such cases is to build a predictive model for normal vs. anomaly classes. Any unseen data instance is compared against the model to determine which class it belongs to.” The second dataset or test dataset is unseen data for the model; the model uses what it learned during training to label the test data.).

Regarding claim 6, Bhuyan and Ankur teach the system of claim 5.
Ankur also teaches wherein the initial labeling model and each of the labeling models respectively comprise behavior features corresponding to a plurality of time zones (Page 21 lines 16-21: “The log-ingesting server 210 acts to aggregate received log files 10, which originate from the client system 100 and typically the log files 10 will originate from the variety of devices 4, 6, 8 within the client system 100 and so can have a wide variety of formats and parameters. The log-ingesting server 210 then exports the received log files 10 to the data store 220, where they are processed into normalised log files 20.” Page 33 lines 27-28: “All log files 10 used by the behavioural analysis system 200 should therefore contain timestamp information…” The log files contain user login behavior with timestamps (which also indicate time zones). As shown in Fig. 4, the log files are passed to the index database and then to the analysis engine for use in the labeling model.)
wherein the processing unit further finds out the corresponding behavior feature respectively according to a login time corresponding to each data of the second data set and the time zones in the initial labeling model and labels each data of the second data set according to the corresponding behavior feature to generate the labeled data set (Page 22 last Para: “The analysis engine 230 compares the normalised log files 20 (providing a measure of present user interactions) to data previously saved in the data store (providing a measure of historic user interactions) and evaluates whether the normalised log files 20 show or indicate that the present user interactions are normal or abnormal, along with comparing recent user interactions against known patterns of user interactions.” Login time and time zone are contained in the log files as discussed above. The log file information is used to label user interaction or behavior as normal or abnormal.)
wherein the processing unit further finds out the corresponding behavior feature respectively according to a login time corresponding to each data of the verification data set and the time zones in the labeling model and labels each data of the verification data set according to the behavior feature to obtain the plurality of the second labeling information corresponding to each data of the verification data set (Page 11 lines 16-18: “Optionally, testing each of said plurality of user interactions with the monitored computer networks against one or more probabilistic models further comprises identifying abnormal user interactions.” Page 2 lines 23-26: “The use of metadata related to user interactions (as encapsulated in log files, for example, which are typically already generated by devices and/or applications).” User interaction data in the form of log files, which contain user activity with login time and time zone, can be used as a test set and labeled as normal or abnormal activity using the models.).
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the user behavior labeling models of Bhuyan with the user log and timing information of Ankur in order to identify when user behavior is abnormal, so that an alert can be generated (Ankur, Page 23, first para).

Regarding claim 7, Bhuyan and Ankur teach the system of claim 1.
Ankur also teaches wherein the input unit further receives a historic data set, wherein the historic data set comprises a user data set corresponding to each of a plurality of users (Page 3 lines 8-14: “Optionally, the historic user activity comprises one or more of: individual user historic activity; user peers historic activity and/or interactions with an individual user; historic user group behavior learned from a plurality of organisations…The use of real historic data from within the one or more monitored computer networks allows for scenarios leading up to excluded events to be more accurately identified.”) wherein the processing unit further executes operations of: determining an abnormal degree of usage amount of each of the users in the time zones respectively according to each of the user data sets and determining an abnormal rate of each user in the time zones according to each of the historic data sets and the labeling models (Page 4 lines 22-33: “Optionally, the contextual data comprises data related to any one or more of: identity data, job roles, psychological profiles, risk ratings, working or usage patterns, action permissibilities, and/or times and dates of events… The use of heuristics, for example predetermined heuristics based on psychological principles or insights, can allow for factors that may not be easily quantifiable to be taken into greater account, which can improve recognition of scenarios that may indicate that an excluded event may occur.” Usage patterns together with log files are included as input data to the model to identify whether the event is “excluded”/abnormal.)
respectively determining the abnormal degrees of usage amount and the abnormal rates of the users in the time zones according to the abnormal degree of usage amount and the abnormal rate of each of the users and obtaining a comprehensive abnormal indicator corresponding to each of the users (Page 43 lines 9-21: “A notable indicator in this regard is stress, as previously described. Other detectable behaviours which may indicate that 'something is wrong' include, for example, agitation, distraction (as detected by the user multitasking or spending only short amounts of time on any activity, for example), erratic behaviour, negligence, or idleness (as detected by a very low number of interactions with documents, for example)… for example, this might include that users working with valuable information in a highly competitive industry may be strongly tempted to perform an excluded action such as stealing and selling information.” Page 42 Lines 11-18: “Other assumptions produced from contextual data 40 (such as a user's job) may include that, generally, certain employees (i.e. users) do not work regular hours… A mix of generalisations can be compiled per job type (i.e. user groups), thus allowing for sudden changes of behaviour as compared to colleagues with the same job type to be easily detected.” User interaction patterns, working hours, and long or short user activity are some examples of indicators that may be associated with abnormal behavior. A comprehensive list of indicators can be maintained.)
obtaining a sorting order of abnormality of each user according to the comprehensive abnormal indicator corresponding to each user (Page 34 lines 17-19: “This contextual data 40 may be associated with a given user, device, application or activity, producing a 'profile' which is saved in the data store 220.” Page 42 lines 26-34, Page 43 lines 1-4: “For example, if a contextual data 40 such as a psychological profile is inputted, a user may be characterized as an extrovert. Alternatively, a user may be automatically classified as an extrovert based on factors relating to their outgoing communications to other users, for example. This may then change certain parameter limits for determining whether an activity is suspicious. The behavioural analysis system 200 may then be able to detect whether a user is behaving out of character…specific user” A user profile is maintained based on the user's activities and characteristics. A list of comprehensive abnormal indicators is maintained as discussed above. A user's activity can be rated or judged against the comprehensive indicator list. In this example, if a user was historically characterized as an extrovert but recent activities show otherwise, the user can be ranked as a higher-risk employee compared to others. A sorted list of users can thus be made based on user activities versus their historical profiles.).
Same motivation to combine the teaching of Bhuyan and Ankur as claim 6.
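For illustration only, the ordering step of claim 7 can be sketched as follows. This is a hypothetical sketch and is not taken from Ankur or the specification: the name `rank_users`, the `weight` parameter, and the formula for the comprehensive abnormal indicator (a weighted sum of the abnormal degree of usage amount and the abnormal rate, averaged over the time zones) are all assumptions.

```python
from typing import Dict, List, Tuple


def rank_users(usage_degree: Dict[str, List[float]],
               abnormal_rate: Dict[str, List[float]],
               weight: float = 0.5) -> List[Tuple[str, float]]:
    """Sort users from most to least abnormal by an assumed combined indicator.

    Each dictionary maps a user to one value per time zone.
    """
    indicators: Dict[str, float] = {}
    for user, degrees in usage_degree.items():
        rates = abnormal_rate[user]
        # Assumed per-time-zone score: weighted sum of the two quantities.
        per_zone = [weight * d + (1 - weight) * r
                    for d, r in zip(degrees, rates)]
        # Assumed comprehensive abnormal indicator: mean over the time zones.
        indicators[user] = sum(per_zone) / len(per_zone)
    return sorted(indicators.items(), key=lambda item: item[1], reverse=True)
```

A different weighting or aggregation would change the scores but not the general approach of sorting users by a single per-user indicator.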

Regarding claims 8-14, they are substantially similar to claims 1-7 and are rejected in the same manner, with the same art and reasoning applying.

Conclusion
An inquiry concerning this communication or earlier communication from the examiner should be directed to QAMAR IQBAL, whose telephone number is 571-272-2563. The examiner can normally be reached M-F 10am-6pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached on 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

/Q.I/ 
Examiner 
Art unit 2123
04/15/2021

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123