Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 01/29/2019 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Note Regarding 35 USC 101
Claims 15-20 recite “A computer program product, comprising a tangible machine-readable storage medium having encoded therein executable code of one or more software programs when executed by at least one processing device perform…”. Because page 15, lines 18-30, of the application discloses “ … Articles of manufacture comprising such processor-readable storage media are considered illustrative embodiments. A given such article of manufacture may comprise, for example, a storage array, a storage disk or an integrated circuit containing RAM, ROM or other electronic memory, or any of a wide variety of other types of computer program products. The term "article of manufacture" as used herein should be understood to exclude transitory, propagating signals …”, claims 15-20 are directed to statutory subject matter and do not leave open the possibility that the media could be transitory. Therefore, a rejection under 35 USC 101 is not required.



Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wang et al. (U.S. 20150373043 A1; hereinafter “Wang”).

Regarding claims 1, 9 and 15, Wang discloses a method, comprising: 
[Claim 9: A system, comprising: a memory (Fig.12, memory 1210); and at least one processing device (Fig.12, microprocessor(s) 1205), coupled to the memory, operative to implement (Paragraph 123: “The data processing system 1200 includes memory 1210, which is coupled to the microprocessor(s) 1205. The memory 1210 may be used for storing data, metadata, and programs for execution by the microprocessor(s) 1205.”) the following steps: ]
[Claim 15: A computer program product, comprising a tangible machine-readable storage medium having encoded therein executable code of one or more software programs, (Paragraph 24: “the sensor (or logic or engine) may be software in the form of one or more software images or software modules, such as executable code in the form of an executable application, an application programming interface (API), a routine or subroutine, a script, a procedure, an applet, a servlet, source code, object code, a shared library/dynamic load library, or one or more instructions. The software module(s) may be stored in any type of a suitable non-transitory storage medium, or transitory storage medium (e.g., electrical, optical, acoustical or other form of propagated signals such as carrier waves, infrared signals, or digital signals).”) wherein the one or more software programs when executed by at least one processing device perform (Paragraph 123: “The data processing system 1200 includes memory 1210, which is coupled to the microprocessor(s) 1205. The memory 1210 may be used for storing data, metadata, and programs for execution by the microprocessor(s) 1205.”) the following steps: ]

obtaining anomaly analysis data (Fig.7, Domain threat model 720A-720B) integrated from a plurality of data sources (Fig.7, Data analysis engine 220A-220B) of an organization, (Fig.7, Customer 300A-300B) wherein the plurality of data sources comprises at least one set of labeled anomaly data related to anomalous transactions; (Fig.7, see the result at operation 755, indicating a virus or suspicious traffic at source 220A) (Paragraph 101: “FIG. 7 illustrates customer 300A that typically receives and transmits traffic from language family 1 (e.g., English) and customer 300B that typically receives and transmits traffic from language family 2 (e.g., Chinese). The data analysis engines 220A-B of the customers 300A-B include the domain threat models 720A-B respectively. In one embodiment, the domain threat models 720A-B are the same (although working on different data). The domain threat models 720A-B may include DGA detection.”)
extracting features from the integrated anomaly analysis data (Fig.7, Domain threat model 720A-720B) that correlate with an indication of an anomaly, based on predefined correlation criteria; (Fig.7, operations 740 and 750; Paragraph 101: “At operation 740, the data analysis engine 220B detects traffic that is sent to domain 1 of language family 2. … At operation 750, the data analysis engine 220A detects traffic that is sent to domain 1 of language family 2.”)
training, using at least one processing device, (Fig.7, operation 760; Paragraph 101: “At operation 760, the data analysis engine 220A transmits the model result to the centralized controller 240 that indicates that suspicious traffic is being sent to domain 1.”) a plurality of machine learning models using the extracted features, (Fig.7, operations 755 and 745) wherein each of the plurality of machine learning models is trained using different combinations of the extracted features; (Fig.7, operations 755 and 745; Paragraph 101: “The data analysis engine 220B generates a threat score for domain 1 at operation 745 and deems that the domain 1 is not suspicious … At operation 755 the data analysis engine 220A trains the domain threat model 720A using the detected traffic information and generates a threat score for domain 1 and deems that domain 1 is receiving suspicious traffic.”)
evaluating, (Fig.7, operation 765; Paragraph 101: “At operation 730 (corrected: operation 765), the centralized controller 240 trains the domain threat model 730 using the received model result (and potentially other information related to the domain) to generate a threat score for domain 1.”) using the at least one processing device, (Fig.7, centralized controller 240; operation 760) a performance of the plurality of trained machine learning models; (Fig.7, operations 755 and 745; Paragraph 101: “The data analysis engine 220B generates a threat score for domain 1 at operation 745 and deems that the domain 1 is not suspicious … At operation 755 the data analysis engine 220A trains the domain threat model 720A using the detected traffic information and generates a threat score for domain 1 and deems that domain 1 is receiving suspicious traffic.”) and
extracting one or more rules (Fig.7, operations 765, 770, 775 and 780; Paragraph 101: “At operation 730, the centralized controller 240 trains the domain threat model 730 using the received model result (and potentially other information related to the domain) to generate a threat score for domain 1. The centralized controller 240 transmits the intermediate result (the threat score for domain 1) to the data analysis engine 220B at operation 770. The data analysis engine 220B refines the domain threat model 720B with the threat score for domain 1 received from the centralized controller 240 at operation 775 and generates a threat score for domain 2 at operation 780.”) from one or more of the trained machine learning models based on the performance, wherein the extracted one or more rules (Fig.7, operations 765, 770, 775 and 780; Paragraph 101: “The centralized controller 240 transmits the intermediate result (the threat score for domain 1) to the data analysis engine 220B at operation 770. The data analysis engine 220B refines the domain threat model 720B with the threat score for domain 1 received from the centralized controller 240 at operation 775 and generates a threat score for domain 2 at operation 780.”) are used to classify transactions as anomalous. (Fig.7, operations 775 and 780; Paragraph 101: “The centralized controller 240 transmits the intermediate result (the threat score for domain 1) to the data analysis engine 220B at operation 770. The data analysis engine 220B refines the domain threat model 720B with the threat score for domain 1 received from the centralized controller 240 at operation 775 and generates a threat score for domain 2 at operation 780.”; generating, at operation 780, a threat score for what is deemed suspicious is interpreted as “classify transactions as anomalous.”)
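
For illustration only, and not as a characterization of Wang or of the application as filed, the sequence of steps recited in claims 1, 9 and 15 (training several models on different combinations of features, evaluating their performance, and extracting a rule from the best performer) can be sketched as follows. Everything in the sketch is a hypothetical assumption: the feature names order_value and logins_per_hour, the toy transactions, and the very simple threshold-conjunction "model" that stands in for a machine learning model.

```python
from itertools import combinations

# Hypothetical labeled transactions (assumption: not taken from any cited document).
TRAIN_ROWS = [
    {"order_value": 900, "logins_per_hour": 14, "anomalous": True},
    {"order_value": 980, "logins_per_hour": 2, "anomalous": False},
    {"order_value": 100, "logins_per_hour": 16, "anomalous": False},
    {"order_value": 120, "logins_per_hour": 3, "anomalous": False},
    {"order_value": 950, "logins_per_hour": 15, "anomalous": True},
    {"order_value": 80, "logins_per_hour": 1, "anomalous": False},
]
TEST_ROWS = [
    {"order_value": 910, "logins_per_hour": 14, "anomalous": True},
    {"order_value": 990, "logins_per_hour": 1, "anomalous": False},
    {"order_value": 90, "logins_per_hour": 17, "anomalous": False},
    {"order_value": 60, "logins_per_hour": 2, "anomalous": False},
]


def fit_threshold(rows, feature):
    """Return the threshold t that maximizes accuracy of the test `feature >= t`."""
    best_t, best_acc = None, -1.0
    for t in sorted({r[feature] for r in rows}):
        acc = sum((r[feature] >= t) == r["anomalous"] for r in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def train(rows, features):
    """A toy model: one threshold per feature; a row is flagged only if all fire."""
    return {f: fit_threshold(rows, f) for f in features}


def predict(model, row):
    return all(row[f] >= t for f, t in model.items())


def accuracy(model, rows):
    return sum(predict(model, r) == r["anomalous"] for r in rows) / len(rows)


def extract_rule(model):
    """Render the model as a human-readable rule string."""
    return " AND ".join(f"{f} >= {t}" for f, t in sorted(model.items()))


def best_rule(train_rows, test_rows, features):
    """Train one model per feature combination, evaluate each, keep the best."""
    candidates = []
    for k in range(1, len(features) + 1):
        for combo in combinations(features, k):
            model = train(train_rows, combo)
            candidates.append((accuracy(model, test_rows), model))
    score, model = max(candidates, key=lambda c: c[0])
    return extract_rule(model), score


rule, score = best_rule(TRAIN_ROWS, TEST_ROWS, ["order_value", "logins_per_hour"])
print(rule, score)
```

In this toy data neither feature alone separates the held-out transactions, so the model trained on both features scores highest and the rule is extracted from it.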

Regarding claims 2, 10 and 16, Wang discloses that an integration of the anomaly analysis data from the plurality of data sources of an organization comprises one or more of merging tables, removing irrelevant information and removing redundant information. (Paragraph 103: “entity group behavior modeling is performed where multiple entities in a group are profiled and monitored for anomalous behavior (e.g., abnormal group behavior, abnormal individual entity as compared to other entities of the group). An entity group includes entities that typically share at least a same characteristic such as: type, location, purpose, organization, or the like.”; Paragraph 43: “as an optional feature, the metadata may be anonymized to remove sensitive or personalized information for the enterprise network 140. For instance, the metadata may be anonymized by substituting a user name associated with the input information being analyzed with a generic identifier.”)

Regarding claims 3, 11 and 17, Wang discloses that the extracted features comprise at least one engineered feature relevant to anomaly classification based on domain knowledge. (Paragraph 75: “the local threat intelligence module 345 may adapt local model(s) by modifying feature(s) of the local model(s) (e.g., adding features, removing features, and/or prioritizing features), updating input to the local model(s) based on the result of the global model(s), and/or modifying the algorithm of the local model(s) based on the result of the global model(s).”) (Paragraph 88: “a result of training an entity risk model for a particular entity (e.g., a user risk model for a particular user) may be that the resulting entity risk score exceeds a certain compromise threshold, which is an indication that the entity has been compromised. A compromised entity may perform malicious actions resulting from the compromise (e.g., downloading unauthorized files, attempting to login to secure servers, uploading information to servers, attempting to compromise other entities, etc.).”)

Regarding claim 4, Wang discloses that the extracted features comprise one or more of contact information features, online activity features and order processing features. (Paragraph 85: “Each dimension profiled can take as input one or more features. For example, with respect to the visited domain dimension, there may be a number of features as input such as the total number of visited domains, the frequency distribution of visited domains, time range of each visited domain, etc.”; Paragraph 83: “the results of training the local model(s) may be displayed in a user interface such as a dashboard for the customer. The data analysis engine 220 may also support interactive customer queries over the stored data including the results of training the local model(s). Example local models that may be trained include: a local model for destination IP address, a local model for destination domains, a local model for filenames or file hashes, a local model for entity risk, etc.”)

Regarding claims 5, 12 and 18, Wang discloses that one or more of the trained machine learning models comprise a decision tree comprising paths to an anomaly classification having a predefined significance. (Fig.7, which illustrates an example of collaborative and adaptive threat intelligence for domain threat modeling according to one embodiment, shows a decision tree; Paragraph 42: “The flow records 282 allow the data analysis engine 220 (or network sensor engine 200₁ itself) to formulate a threat exposure mapping (e.g., display of communication paths undertaken by network devices within the enterprise network 140), which may be used to detect anomalous communication patterns through deviations in normal communications by one or more of the network devices, such as an endpoint device (e.g., client device or server)”)

Regarding claims 6, 13 and 19, Wang discloses that one or more properties of the extracted one or more rules are tunable by a user. (Fig.7 shows that the centralized controller 240 can be used to extract one or more rules tunable by a user; see Paragraph 101 and Abstract: “Result data is received from the centralized controller that is a result of one or more global models trained on the centralized controller using data collected on multiple customer networks including the first customer network. The one or more local models are adjusted using the received result data and the one or more adjusted local models are trained.”; Paragraph 43: “Besides receipt and processing of input information as described above, the first network sensor engine 200₁ may be adapted to generate metadata in a normalized format that is readable by the data analysis engine 220. Some or all of the input information received by first network sensor engine 200₁ is used to generate the metadata. Herein, as an optional feature, the metadata may be anonymized to remove sensitive or personalized information for the enterprise network 140. For instance, the metadata may be anonymized by substituting a user name associated with the input information being analyzed with a generic identifier. Additionally or in the alternative, the file name assigned to the input information or other properties may be substituted for corresponding generic identifiers, where these generic identifiers may be re-mapped by the first network sensor engine 200₁ or another network device to recover the user name, file name and/or removed properties.”)

Regarding claim 7, Wang discloses the step of adjusting a distribution of instances of at least one label in the anomaly analysis data to address an imbalance of the at least one label. (Paragraph 94: “the data analysis engine 220 receives the transmitted result data from the centralized controller 240 and at operation 490 adjusts the local modeling using the result data such as adapting one or more local model(s) and/or input(s) into the local model(s) using the transmitted result data. For example, in the case of a feature modification of a local model (e.g., remove, prioritize, or add feature(s), the data analysis engine 220 modifies that local model accordingly. In the case of an intermediate result (e.g., a threat probability score), the data analysis engine 220 may use that intermediate result as an input to the local model(s) or modify that intermediate result based on the analysis engine data prior to using it as input to the local model(s) to adapt the intermediate result to the local intelligence. For example, if the intermediate results from the centralized controller 240 indicate that an IP address has a 30% probability threat score, the data analysis engine 220 may adjust the probability threat score based on its local intelligence to reflect the threat of that IP address experienced by that particular data analysis engine”)
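
For illustration only, one common technique for adjusting a label distribution to address an imbalance is random oversampling, in which instances of the less frequent label are duplicated until the labels are equally represented. The sketch below is not drawn from Wang or from the application as filed; the function name oversample, the "label" key, and the sample transactions are hypothetical assumptions.

```python
import random
from collections import Counter


def oversample(rows, label_key="label", seed=0):
    """Random oversampling: duplicate randomly chosen rows of each rarer label
    until every label is as frequent as the most common one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Top up the group with random duplicates of its own members.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Hypothetical data: 8 normal transactions and 2 anomalous ones.
transactions = [{"id": i, "label": "normal"} for i in range(8)] + [
    {"id": 100, "label": "anomalous"},
    {"id": 101, "label": "anomalous"},
]
balanced = oversample(transactions)
counts = Counter(row["label"] for row in balanced)
print(counts)
```

Oversampling leaves the majority-label instances untouched; undersampling the majority label would be an alternative way to adjust the same distribution.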

Regarding claims 8, 14 and 20, Wang discloses that one or more of the extracted rules are in a human-readable format for one or more of configuration and modification by a user. (Fig.7 shows that the centralized controller 240 can be used to extract one or more rules tunable by a user; see Paragraph 101, Abstract and Paragraph 43) (Paragraph 83: “the results of training the local model(s) may be displayed in a user interface such as a dashboard for the customer. The data analysis engine 220 may also support interactive customer queries over the stored data including the results of training the local model(s). Example local models that may be trained include: a local model for destination IP address, a local model for destination domains, a local model for filenames or file hashes, a local model for entity risk, etc.”)

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Fogel (U.S. 20060036560 A1), “Intelligently Interactive Profiling System And Method”, teaches an intelligently interactive system for identifying at least one property of data. It also teaches a programmed product comprising a signal-bearing medium or signal-bearing media tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform a method for identifying at least one property of data.
Eberhardt, III et al. (U.S. 20130198119 A1), “Application of Machine Learned Bayesian Networks to Detection of Anomalies in Complex Systems”, teaches the use of machine-learned Bayesian networks to detect anomalies in a complex system. It also teaches that machine learning is a field of computer science that uses intelligent algorithms to allow a computer to mimic the process of human learning. The machine learning algorithm allows the computer to learn information structure dynamically from the data that resides in the data warehouse. The machine learning algorithms automatically detect and promote significant relationships between variables, without the need for human interaction. This allows for the processing of vast amounts of complex data quickly and easily.
Gupta et al. (U.S. 20160055426 A1), “Predictive model for anomaly detection and feedback-based scheduling”, teaches detecting anomalies in computer server clusters. It also teaches that the system has two major components: a predictive model and a scheduler. The predictive model collects data from a time-series database. The nodes in the cluster generate the data at every periodic interval. The predictive model may run machine-learning based algorithms and output a “score” for each node at every time interval. The score indicates whether any node in the cluster has a possibility of becoming anomalous in the near future.
Yan et al. (U.S. 20180365696 A1), “Financial fraud detection using user group behavior analysis”, teaches fraud detection, and more particularly financial fraud detection using user group behavior analysis. It also teaches that the system includes an account holder cluster generator for clustering account holders into groups by jointly considering account activities as features in a clustering algorithm such that account holders in each group have similar behavior according to analysis of the features in the clustering algorithm. A suspicious behavior detection system is used for detecting, in each group, a list of suspicious transactions by determining outlier transactions for a transaction type of interest relative to transactions of each account holder in a group. A fraud suspicion response system automatically alerts users to the suspicious transactions.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Duy A Tran whose telephone number is (571)272-4887. The examiner can normally be reached Monday-Friday 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward F Urban, can be reached at (571)272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DUY TRAN/            Examiner, Art Unit 2665                   
                                                                                                                                                                         
/BOBBAK SAFAIPOUR/            Primary Examiner, Art Unit 2665