Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 09/09/2022 has been entered.
DETAILED ACTION
	This Office Action is in response to a Request for Continued Examination (RCE) application received on 09/09/2022. In the RCE, Applicant has amended independent claims 1, 5, 9 and 17. Claims 2-4, 6-8, 10-16 and 18-24 remain original. 
	For this Office Action, claims 1-24 have been received for consideration and have been examined. 
Response to Arguments
Claim Rejection under 35 USC § 112
	Applicant’s remarks regarding claim rejection under 35 USC § 112(a) have been reviewed and found to be persuasive. Therefore the rejection has been withdrawn.  
Claim Rejection under 35 USC § 103
	Applicant’s remarks regarding claim rejection under 35 USC § 103 have been reviewed and summarized as follows:
Stupak fails to teach "providing a set of high-dimensional flow representations of network traffic by processing historical flow data through a deep learning (DL) model, each high-dimensional flow representation in the set of high-dimensional flow representation being a high-dimensional vector representing each host within the historical flow data" as recited in each of claims 1, 9, and 17. The final Office action interprets the "high-dimensional flow representations" as "traffic from uninfected computing devices" and interprets "low-dimensional representations" as "traffic from infected computing devices." These interpretations, however, do not make sense as they contradict the plain language of the claims. More particularly, the claims recite "each high-dimensional flow representation in the set of high-dimensional flow representations comprising a high-dimensional vector representing each host within the historical flow data" (emphasis added). In contrast, "traffic from uninfected computing devices" of Stupak only accounts for uninfected computing devices. More plainly stated, while the high-dimensional flow representations of the claims accounts for "each host within the historical flow data" (regardless of whether malicious), "traffic from uninfected computing devices" of Stupak only accounts for computing devices already known to not be infected (Page # 14-15).
Further, the claims recite "providing a set of low-dimensional flow representations of the network traffic based on the set of high-dimensional flow representations" (emphasis added). In contrast, "traffic from infected computing devices" of Stupak only accounts for infected computing devices and cannot be provided based on Stupak's "traffic from uninfected computing devices." That is, the "uninfected computing devices" and the "infected computing devices" of Stupak are mutually exclusive sub-sets of the computing devices of Stupak. With regard to "labeling," the final Office action asserts that Stupak teaches: labeling at least a portion of the set of low-dimensional flow representations to provide a sub-set of labeled (See [0047] i.e. known malware data based on determined suspicious activity pattern) low-dimensional flow representations and a sub-set of unlabeled (See [0051] i.e. new suspicious activity is not similar to attacks previously known) low-dimensional flow representations ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware [labeled low-dimensional flow] based on the determined suspicious activity pattern; [0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar [unlabeled low-dimensional flow] to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114); final Office action, p. 6 (emphasis original). Here, the interpretation is non-sensical and directly contradictory to Stupak as well as the final Office action's own interpretations of "high-dimensional representations" and "low-dimensional representations." More particularly, the final Office action already interprets "low-dimensional representations" as "traffic from infected computing devices” (Page # 15-16).
Examiner’s Response
	Regarding remark # 1, that Stupak fails to teach “providing a set of high-dimensional flow representations of network traffic by processing historical flow data through a deep learning (DL) model, each high-dimensional flow representation in the set of high-dimensional flow representation being a high-dimensional vector representing each host within the historical flow data", Examiner respectfully disagrees. Stupak clearly discloses a cloud AI engine [claimed deep learning (DL) model] configured to collect and analyze the information received from the computing devices [claimed hosts]. The cloud AI engine analyzes historical data [claimed historical flow data] for each computing device in the system, which provides a detailed historical record of past behavior in each of the computing devices 104, as well as aggregated statistics representing the behavior of the computing devices (see Stupak [0041], [0046] & [0051]). Examiner stresses that the historical data contains all types of network traffic from all of the computing devices, including traffic from uninfected computing devices, as mentioned in the Final Office Action. Therefore, Applicant’s remark that Examiner’s interpretations contradict the plain language of the claims is not correct.
	Regarding remark # 2, Applicant writes that “the "uninfected computing devices" and the "infected computing devices" of Stupak are mutually exclusive sub-sets of the computing devices of Stupak” (page # 16). Examiner agrees, and has therefore mapped the low-dimensional flow to “known traffic from infected computing devices” on pages 5-6 of the Final Office Action. Regarding the “labeling” of the "low-dimensional representations", Examiner respectfully disagrees with Applicant’s remark that Examiner’s mapping is contradictory to the interpretations of "high-dimensional representations" and "low-dimensional representations."
	Examiner notes that the claim mapping in the Final Office Action for “labeled” and “unlabeled” is consistent with Applicant’s disclosure, which describes a ‘labeled’ flow as malicious (in other words, ‘known malicious’) and an ‘unlabeled’ flow as ‘potentially malicious’ (in other words, suspected to be malicious) (see instant disclosure [0042] & [0051]). Examiner’s mapping therefore coincides with how the instant disclosure describes unlabeled and labeled low-dimensional flows.
	Applicant’s amended language “at least one low-dimensional flow representation representing the known malicious host within the network traffic and at least one low-dimensional flow representation representing an unknown [malicious] host within the network traffic” is still taught by the primary reference, Stupak, which discloses a low-dimensional flow representation representing the known malicious host ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern) and an unknown [malicious] host ([0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114).
Regarding the amended second limitation “determining that at least one blacklisted Internet protocol (IP) address is present in the flow data, the at least one blacklisted IP address representing a known malicious host”, the previously relied-upon reference, Sanghavi et al., teaches this limitation in light of FIG. 6 and [0110-0111]. See the Office Action for details.


Claim Objections
Independent claims 1, 9 and 17 are objected to because of the following informalities:  
Claims 1, 9 and 17 recite “at least one low-dimensional flow representation representing the known malicious host within the network traffic and at least one low-dimensional flow representation representing an unknown host within the network traffic”.  
Examiner notes that the second recitation, “at least one low-dimensional flow representation representing an unknown host within the network traffic”, is missing the word ‘malicious’ between ‘unknown’ and ‘host’. For the purpose of examination, Examiner will interpret the clause as “at least one low-dimensional flow representation representing an unknown [malicious] host within the network traffic”. Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-24 are rejected under 35 U.S.C. 103 as being unpatentable over Stupak et al. (US20190058736A1) in view of Sanghavi et al. (US20200007548A1), in view of Ding et al. (US20190087692A1), and further in view of Miller et al. (US20190188212A1).
Regarding claim 1, Stupak discloses:
A computer-implemented method for identifying and remediating zero-day attacks on a network, the method being executed by one or more processors and comprising:
receiving flow data representative of communication traffic of the network ([0043] an agent of the cloud AI engine 112 may monitor file and network activity and may determine how, through which protocols and applications each file has appeared on the system; [0041] the security cloud 102 is configured to collect and analyze the information received from the computing devices 104; [0045] The method 200 begins at step 201, in which the cloud AI engine 112 collects from a plurality of agents 106 executing on a respective computing device 104, analysis data corresponding to executables 108 on the respective computing device);
determining that a suspicious Internet protocol (IP) address is present in the flow data ([0049] the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. In some aspects, the cloud AI engine 112 may detect specific internet activity, access to specific IP addresses), and in response:
providing a set of high-dimensional flow (See FIG. 3A; i.e. traffic from uninfected computing devices such as 304) representations of network traffic by processing historical flow data (i.e. historical data stored in cloud storage) through a deep learning (DL) model ([0051] the cloud AI engine 112 detects that the new suspicious activity is not similar to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114), 
each high-dimensional flow representation in the set of high-dimensional flow representations comprising a high-dimensional vector representing each host within the historical flow data ([0041] Using the collected data, in one aspect the cloud storage 114 may provide both a detailed historical record of past behavior in each of the computing devices 104, as well as aggregated statistics representing the behavior of the computing devices 104),
providing a set of low-dimensional flow (See FIG. 3A; i.e. known traffic from infected computing devices such as 302A & 302B) representations of the network traffic based on the set of high-dimensional flow representations ([0046] In some aspects, the cloud AI engine 112 may detect a malware epidemic based on previously known distribution methods. In another aspect, the cloud AI engine 112 may detect a security flaw in a user application that can be used for malware based on previously known distribution methods; [0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern),
at least one low-dimensional flow representation representing the known malicious host within the network traffic ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern) and at least one low-dimensional flow representation representing an unknown [malicious] host ([0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114), and
labeling at least a portion of the set of low-dimensional flow representations to provide a sub-set of labeled (See [0047] i.e. known malware data based on determined suspicious activity pattern) low-dimensional flow representations and a sub-set of unlabeled (See [0051] i.e. new suspicious activity is not similar to attacks previously known) low-dimensional flow representations ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware [labeled low-dimensional flow] based on the determined suspicious activity pattern; [0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar [unlabeled low-dimensional flow] to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114); and
identifying a host associated with an unlabeled (i.e. identification of zero-day malware) low-dimensional flow representation as a potentially malicious host ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern. In some aspects, the cloud AI engine 112 may identify a zero-day attack by malware not found in a database of previously known malware based on the suspicious activity pattern. In some aspects, the cloud AI engine may identify a zero-day attack by malware identified as modified from a previously known version based on the suspicious activity pattern and using a database of previously known malware), and 
in response, automatically executing a remedial action with respect to the potentially malicious host (See FIG. 2; [0048] At step 204, the cloud AI engine 112 may determine a remedial action and distribute to the plurality of agents 106 one or more commands to protect the respective computing device 104 from the malware … In an exemplary aspect, the cloud AI engine 112 may analyze the code and generate a set of new remedial steps to remedy or disinfect a client device of malware).
Stupak fails to disclose:
	determining that the suspicious flow contains at least one blacklisted Internet protocol (IP) address in the flow data, the at least one blacklisted IP address representing a known malicious host; reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations; clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Sanghavi discloses:
	determining that at least one blacklisted Internet protocol (IP) address is present in the flow data, the at least one blacklisted IP address representing a known malicious host (See FIG. 6; [0110-0111]; steps 630 and 640 disclose receiving DNS flow data containing a destination domain identifier that corresponds to a blacklisted domain identifier of the blacklisted domain identifiers).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak reference to include a network device which obtains a set of rules that specify match criteria, associated with the blacklisted domains, that include source network addresses and/or destination network addresses for comparison to packet source network addresses and/or packet destination network addresses associated with incoming packets, as disclosed by Sanghavi.
The motivation to include such a network device is to capture and block malicious network traffic from spreading into the enterprise network.
The combination of Stupak and Sanghavi fails to disclose:
	reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations; clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Ding discloses:
	reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations ([0006] In accordance with an embodiment, there is provided a system for determining a reliability score indicative of a level of fidelity between high dimensional (HD) data and corresponding dimension-reduced (LD) data; See [0047-0048] discloses various machine learning algorithms for dimensionality reduction techniques).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak and Sanghavi references to include machine learning mathematical techniques, as disclosed by Ding.
	The motivation to include machine learning mathematical techniques is to create data visualization and, more particularly, a system and method to determine fidelity of visualizations of multi-dimensional data sets.
The combination of Stupak, Sanghavi and Ding fails to disclose:
	clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Miller discloses:
	clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations ([0006] It is an object of the present invention to provide a system for and a means of prioritized detection of clusters of anomalous samples in an unlabeled data batch, each characterized by its samples being anomalous on the same (in general low-dimensional) subset of the full (high-dimensional) feature set; [0030] The present invention relates generally to the detection of groups/clusters of anomalous samples, with, in general, very high feature dimensionality, that may represent unknown classes (i.e., either classes of phenomena that have not been previously identified or which are known to exist but which are not expected to be observed in a given environment). Moreover, jointly, while detecting these groups/clusters of anomalies, the present invention identifies salient, low-dimensional feature subsets on which the groups manifest their atypicality).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak, Sanghavi and Ding references to include a method and system that detects clusters of anomalous samples on a low-dimensional feature subset of a high-dimensional dataset, as disclosed by Miller.
	The motivation to include Miller’s method and system is to be able to detect and present anomalous clusters in large and complicated datasets.
Regarding claim 2, the combination of Stupak, Sanghavi, Ding and Miller discloses:
	The method of claim 1, wherein providing a set of low-dimensional flow representations of the network traffic based on the set of high-dimensional flow representations comprises processing the set of high-dimensional flow representations using one of t-distributed stochastic neighbor embedding (t-SNE) and principal component analysis (PCA) to provide the set of low-dimensional flow representations (Ding: See Figures 2 & 3; [0047] For linear dimensionality reduction (LDR) mapping methods (e.g., PCA) information loss can be characterized by the null-space, whose components are all mapped to {0}. Knowing this limitation of the linear methods, many nonlinear dimensionality reduction (NLDR) methods were developed, each of which applied different methods to attempt to preserve relevant information; [0048] These include distance preservation methods, for example multidimensional scaling (MDS), Sammon mapping, Isomap, curvilinear component analysis, kernel PCA; topology preservation methods including local linear embedding, Laplacian eigenmaps; neighborhood preservation methods including stochastic neighborhood embedding (SNE), and t-SNE).
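For illustration only, the dimension reduction recited in claim 2 can be sketched as follows. This is a minimal PCA sketch under stated assumptions: it is not taken from Ding or from the instant disclosure, the function name pca_reduce and the sample host vectors are hypothetical, and power iteration with deflation is used only to keep the example free of third-party libraries.

```python
import math
import random


def pca_reduce(vectors, n_components=3, iterations=200, seed=0):
    """Project row vectors onto their leading principal components.

    Hypothetical helper: uses power iteration with deflation so that no
    third-party library is needed.
    """
    rng = random.Random(seed)
    n, d = len(vectors), len(vectors[0])

    # Center each feature so components describe variance about the mean.
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - means[j] for j in range(d)] for v in vectors]

    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / max(n - 1, 1)
            for b in range(d)] for a in range(d)]

    def matvec(m, w):
        return [sum(m[a][b] * w[b] for b in range(d)) for a in range(d)]

    components = []
    for _ in range(n_components):
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iterations):
            w = matvec(cov, w)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                # No variance left to explain: use an all-zero component.
                w = [0.0] * d
                break
            w = [x / norm for x in w]
        components.append(w)
        # Deflate: subtract the variance captured by this component.
        eigenvalue = sum(x * y for x, y in zip(w, matvec(cov, w)))
        cov = [[cov[a][b] - eigenvalue * w[a] * w[b] for b in range(d)]
               for a in range(d)]

    # Each output row is one host's low-dimensional representation.
    return [[sum(r[j] * c[j] for j in range(d)) for c in components]
            for r in centered]


# Hypothetical 5-dimensional host vectors (illustrative values only).
host_vectors = [[float(i), float(i % 2), 0.0, 0.0, 0.0] for i in range(6)]
low_dim = pca_reduce(host_vectors, n_components=3)
```

Each row of low_dim is a three-component representation of one host, ordered so that the first component carries the most variance.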
Regarding claim 3, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The method of claim 1, wherein identifying a host associated with an unlabeled low-dimensional flow representation as a potentially malicious host comprises:
executing k-nearest neighbor (k-NN) clustering over the sub-set of labeled low-dimensional flow representations and the sub-set of unlabeled low-dimensional flow representations; and classifying the unlabeled low-dimensional flow representation as potentially malicious in response to the unlabeled low-dimensional flow representation being clustered with one or more labeled low-dimensional flow representations (Ding: Correlation Between Many-to-One and K-Nearest Neighbor Classification [0105] In some embodiments, assuming kNN has a high accuracy before dimension reduction, for kNN to be accurate in low dimensional space on point y, it is important that y's neighbors correspond to f−1(y)'s neighbors, which means y needs to have a low degree of many-to-one; [0106] As shown in FIG. 6, 602, 606, and 610 respectively show many-to-one vs kNN accuracy under t-SNE of NewsAgg with different perplexities, t-SNE of MNIST with different perplexities and dimensionality reduction of MNIST with different methods, including PCA, MDS, LLE, Isomap and t-SNE. Similarly 604, 608, and 610 show results on 1—Precision vs kNN accuracy. As depicted, many-to-one has strong negative correlation with kNN accuracy. While for precision the relationship is either non-existent, in 604, or weak, in 608 and 612).
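For illustration only, the k-NN step recited in claim 3 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the host names, the 3D coordinates, and the rule "flag an unlabeled point when at least one of its k nearest neighbors is labeled" are all hypothetical choices made for the example, not the method of any cited reference or of the instant disclosure.

```python
import math


def flag_potentially_malicious(labeled, unlabeled, k=3, min_labeled_neighbors=1):
    """Return the unlabeled hosts whose k nearest neighbors include labeled points.

    labeled / unlabeled: dict mapping a host name to a low-dimensional point.
    Labeled points stand for known malicious hosts (assumption for this sketch).
    """
    points = [(h, p, True) for h, p in labeled.items()]
    points += [(h, p, False) for h, p in unlabeled.items()]

    flagged = set()
    for host, point in unlabeled.items():
        # Distance from this unlabeled point to every other point.
        neighbors = sorted(
            (math.dist(point, p), is_labeled)
            for h, p, is_labeled in points
            if h != host
        )
        # Count how many of the k closest points carry a label.
        hits = sum(1 for _, is_labeled in neighbors[:k] if is_labeled)
        if hits >= min_labeled_neighbors:
            flagged.add(host)
    return flagged


# Hypothetical 3D representations (illustrative values only).
labeled = {"host-a": (0.0, 0.0, 0.0), "host-b": (0.1, 0.0, 0.0)}
unlabeled = {
    "host-c": (0.2, 0.1, 0.0),   # sits next to the labeled points
    "host-d": (5.0, 5.0, 5.0),
    "host-e": (5.1, 5.0, 5.0),
    "host-f": (5.0, 5.1, 5.0),
    "host-g": (5.0, 5.0, 5.1),
}
flagged = flag_potentially_malicious(labeled, unlabeled, k=3)
```

In this example only host-c lands among the labeled points, so it is the only host flagged as potentially malicious.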
Regarding claim 4, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The method of claim 1, wherein labeling at least a portion of the set of low-dimensional flow representations comprises determining that a low-dimensional flow representation is associated with a known malicious host and, in response, labeling the low-dimensional flow representation to provide a labeled low-dimensional flow representation included in the sub-set of labeled low-dimensional flow representations (Stupak: [0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware [labeled low-dimensional flow] based on the determined suspicious activity pattern; [0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar [unlabeled low-dimensional flow] to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114).
Regarding claim 5, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The method of claim 1, wherein determining that at least one blacklisted IP address is present in the flow data comprises:
receiving threat information from one or more threat information (TI) feeds (Sanghavi: [0019] the information obtained by the routing device may include, for example, a list of blacklisted domain identifiers associated with blacklisted domains, network addresses (i.e., Internet protocol (IP) addresses) for one or more sinkhole servers associated with the blacklisted domain identifiers, network address ranges (i.e., IP ranges) of blacklisted devices, network address prefixes (i.e., IP prefixes) of blacklisted devices, and/or the like. In some implementations, the network addresses for the one or more sinkhole servers associated with the blacklisted domain identifiers include sinkhole server identifiers, which may correspond to IPv4 addresses and/or IPv6 addresses. In some implementations, the routing device may obtain the information from the security platform via downloading, fetching, subscribing to receive a feed from the security platform, streaming the information in real-time or near real-time, receiving push notifications, and/or the like);
comparing blacklisted IP addresses in a set of blacklisted IP addresses provided in the threat information to IP addresses included in the flow data (Sanghavi: [0018] receiving DNS requests, comparing domain names received in the DNS requests to blacklisted domain identifiers stored in the DNS sinkhole data structure, and responding to the DNS requests with a network address of a sinkhole server that is associated with the blacklisted domain identifier); and 
determining that an IP address included in the flow data matches a blacklisted IP address (Sanghavi: [0018] In some implementations, the blacklisted domains may be associated with an attacker or an attacker's website, which a customer (e.g., a country, a network service provider, a network operator, etc.) determines should be blocked).
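For illustration only, the comparison recited in claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions: the record fields src_ip and dst_ip, the function name, and the sample addresses (drawn from documentation-reserved ranges) are hypothetical, and the blacklist is assumed to hold single addresses or prefixes, in line with the address ranges and prefixes mentioned in Sanghavi [0019].

```python
import ipaddress


def find_blacklisted(flow_records, blacklist_entries):
    """Return the IP addresses in the flow data that match a blacklist entry.

    blacklist_entries may be single addresses ("203.0.113.7") or
    prefixes ("198.51.100.0/24"); both are parsed as networks.
    """
    networks = [ipaddress.ip_network(e, strict=False) for e in blacklist_entries]

    matches = set()
    for record in flow_records:
        for field in ("src_ip", "dst_ip"):  # hypothetical field names
            address = ipaddress.ip_address(record[field])
            # An address never matches a network of the other IP version.
            if any(address in network for network in networks):
                matches.add(str(address))
    return matches


# Hypothetical threat-feed entries and flow records (illustrative values only).
blacklist = ["203.0.113.7", "198.51.100.0/24"]
flows = [
    {"src_ip": "192.0.2.10", "dst_ip": "203.0.113.7"},
    {"src_ip": "192.0.2.11", "dst_ip": "198.51.100.200"},
    {"src_ip": "192.0.2.12", "dst_ip": "192.0.2.13"},
]
matches = find_blacklisted(flows, blacklist)
```

A non-empty result corresponds to the determination that at least one blacklisted IP address is present in the flow data.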
Regarding claim 6, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The method of claim 1, further comprising extracting the historical flow data in response to determining that the at least one blacklisted IP address is present in the flow data (Stupak: [0041-0054] In some aspects, the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. For example, the cloud AI engine 112 detect specific internet activity, access to specific IP addresses, specific distribution traces, specific code patterns, or specific behavioral patterns; [0049] the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. In some aspects, the cloud AI engine 112 may detect specific internet activity, access to specific IP addresses, specific distribution traces, specific code patterns, or specific behavioral patterns).
Regarding claim 7, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The method of claim 1, wherein automatically executing a remedial action with respect to the potentially malicious host comprises configuring a firewall system to at least partially block communication with the potentially malicious host (Sanghavi: [0017] The routing device may, in some implementations, include a firewall module as described herein, by which the forwarding component may be configured or instructed to forward traffic (i.e., packets) to the security device for inspection; [0018] As shown in FIG. 1A, and by reference number 102, the routing device may receive or obtain information from a security platform, including information stored in the DNS sinkhole data structure or database (DB) of the security platform. The security platform may include a local or cloud-based security solution (e.g., a firewall, etc.) configured to detect threats and implement DNS sinkhole functionality. The DNS sinkhole functionality may include, for example, receiving DNS requests, comparing domain names received in the DNS requests to blacklisted domain identifiers stored in the DNS sinkhole data structure, and responding to the DNS requests with a network address of a sinkhole server that is associated with the blacklisted domain identifier).
Regarding claim 8, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The method of claim 1, wherein each low-dimensional flow representation comprises a three-dimensional (3D) flow representation (Ding: [0096] Referring now to FIG. 4, an illustration of t-SNE of 3D “linked rings” with different perplexities and their average Wasserstein costs may be provided (402-416); [0097] Referring now to FIG. 5, an illustration of Wasserstein cost distinguish misleading visualization: 502 one 3D Gaussian blob; 504 t-SNE of one blob with cost 0.060; 506 t-SNE of one blob with cost 0.046; 508 two 3D Gaussian blobs; 510 t-SNE of two blobs with cost 0.045; and, 512 t-SNE of two blobs with cost 0.021).
Regarding claim 9, Stupak discloses:
A non-transitory computer-readable storage medium coupled to one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations for identifying and remediating zero-day attacks on a network, the operations comprising:
receiving flow data representative of communication traffic of the network ([0043] an agent of the cloud AI engine 112 may monitor file and network activity and may determine how, through which protocols and applications each file has appeared on the system; [0041] the security cloud 102 is configured to collect and analyze the information received from the computing devices 104; [0045] The method 200 begins at step 201, in which the cloud AI engine 112 collects from a plurality of agents 106 executing on a respective computing device 104, analysis data corresponding to executables 108 on the respective computing device);
determining that a suspicious Internet protocol (IP) address is present in the flow data ([0049] the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. In some aspects, the cloud AI engine 112 may detect specific internet activity, access to specific IP addresses), and in response:
providing a set of high-dimensional flow (See FIG. 3A; i.e. traffic from uninfected computing devices such as 304) representations of network traffic by processing historical flow data (i.e. historical data stored in cloud storage) through a deep learning (DL) model ([0051] the cloud AI engine 112 detects that the new suspicious activity is not similar to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114), 
each high-dimensional flow representation in the set of high-dimensional flow representations comprising a high-dimensional vector representing each host within the historical flow data ([0041] Using the collected data, in one aspect the cloud storage 114 may provide both a detailed historical record of past behavior in each of the computing devices 104, as well as aggregated statistics representing the behavior of the computing devices 104),
providing a set of low-dimensional flow (See FIG. 3A; i.e. known traffic from infected computing devices such as 302A & 302B) representations of the network traffic based on the set of high-dimensional flow representations ([0046] In some aspects, the cloud AI engine 112 may detect a malware epidemic based on previously known distribution methods. In another aspect, the cloud AI engine 112 may detect a security flaw in a user application that can be used for malware based on previously known distribution methods; [0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern),
at least one low-dimensional flow representation representing the known malicious host within the network traffic ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern) and at least one low-dimensional flow representation representing an unknown [malicious] host ([0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114), and
labeling at least a portion of the set of low-dimensional flow representations to provide a sub-set of labeled (See [0047] i.e. known malware data based on determined suspicious activity pattern) low-dimensional flow representations and a sub-set of unlabeled (See [0051] i.e. new suspicious activity is not similar to attacks previously known) low-dimensional flow representations ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware [labeled low-dimensional flow] based on the determined suspicious activity pattern; [0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar [unlabeled low-dimensional flow] to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114); and
identifying a host associated with an unlabeled (i.e. identification of zero-day malware) low-dimensional flow representation as a potentially malicious host ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern. In some aspects, the cloud AI engine 112 may identify a zero-day attack by malware not found in a database of previously known malware based on the suspicious activity pattern. In some aspects, the cloud AI engine may identify a zero-day attack by malware identified as modified from a previously known version based on the suspicious activity pattern and using a database of previously known malware), and 
in response, automatically executing a remedial action with respect to the potentially malicious host (See FIG. 2; [0048] At step 204, the cloud AI engine 112 may determine a remedial action and distribute to the plurality of agents 106 one or more commands to protect the respective computing device 104 from the malware … In an exemplary aspect, the cloud AI engine 112 may analyze the code and generate a set of new remedial steps to remedy or disinfect a client device of malware).
Stupak fails to disclose:
	determining that at least one blacklisted Internet protocol (IP) address is present in the flow data, the at least one blacklisted IP address representing a known malicious host; reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations; clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Sanghavi discloses:
	determining that at least one blacklisted Internet protocol (IP) address is present in the flow data, the at least one blacklisted IP address representing a known malicious host (See FIG. 6; [0110-0111]; steps 630 and 640 disclose receiving DNS flow data containing a destination domain identifier that corresponds to a blacklisted domain identifier of the blacklisted domain identifiers).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak reference to include a network device which obtains a set of rules that specify match criteria, associated with the blacklisted domains, that include source network addresses and/or destination network addresses for comparison to packet source network addresses and/or packet destination network addresses associated with incoming packets, as disclosed by Sanghavi.
The motivation to include such a network device is to capture and block malicious network traffic from spreading into the enterprise network.
The combination of Stupak and Sanghavi fails to disclose:
	reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations; clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Ding discloses:
	reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations ([0006] In accordance with an embodiment, there is provided a system for determining a reliability score indicative of a level of fidelity between high dimensional (HD) data and corresponding dimension-reduced (LD) data; see also [0047-0048], which disclose various machine learning algorithms for dimensionality reduction).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak and Sanghavi references to include machine learning mathematical techniques, as disclosed by Ding.
	The motivation to include machine learning mathematical techniques is to create data visualizations and, more particularly, to determine the fidelity of visualizations of multi-dimensional data sets.
The combination of Stupak, Sanghavi and Ding fails to disclose:
	clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Miller discloses:
	clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations ([0006] It is an object of the present invention to provide a system for and a means of prioritized detection of clusters of anomalous samples in an unlabeled data batch, each characterized by its samples being anomalous on the same (in general low-dimensional) subset of the full (high-dimensional) feature set; [0030] The present invention relates generally to the detection of groups/clusters of anomalous samples, with, in general, very high feature dimensionality, that may represent unknown classes (i.e., either classes of phenomena that have not been previously identified or which are known to exist but which are not expected to be observed in a given environment). Moreover, jointly, while detecting these groups/clusters of anomalies, the present invention identifies salient, low-dimensional feature subsets on which the groups manifest their atypicality).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak, Sanghavi and Ding references to include a method and system that detects clusters of anomalous samples on a low-dimensional feature subset of the high-dimensional feature set, as disclosed by Miller.
	The motivation to include Miller’s method and system is to be able to detect and present anomalous clusters in large and complicated datasets.
Regarding claim 10, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The computer-readable storage medium of claim 9, wherein providing a set of low-dimensional flow representations of the network traffic based on the set of high-dimensional flow representations comprises processing the set of high-dimensional flow representations using one of t-distributed stochastic neighbor embedding (t-SNE) and principal component analysis (PCA) to provide the set of low-dimensional flow representations (Ding: See Figures 2 & 3; [0047] For linear dimensionality reduction (LDR) mapping methods (e.g., PCA) information loss can be characterized by the null-space, whose components are all mapped to {0}. Knowing this limitation of the linear methods, many nonlinear dimensionality reduction (NLDR) methods were developed, each of which applied different methods to attempt to preserve relevant information; [0048] These include distance preservation methods, for example multidimensional scaling (MDS), Sammon mapping, Isomap, curvilinear component analysis, kernel PCA; topology preservation methods including local linear embedding, Laplacian eigenmaps; neighborhood preservation methods including stochastic neighborhood embedding (SNE), and t-SNE).
Regarding claim 11, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The computer-readable storage medium of claim 9, wherein identifying a host associated with an unlabeled low-dimensional flow representation as a potentially malicious host comprises: 
executing k-nearest neighbor (k-NN) clustering over the sub-set of labeled low-dimensional flow representations and the sub-set of unlabeled low-dimensional flow representations; and classifying the unlabeled low-dimensional flow representation as potentially malicious in response to the unlabeled low-dimensional flow representation being clustered with one or more labeled low-dimensional flow representations (Ding: Correlation Between Many-to-One and K-Nearest Neighbor Classification [0105] In some embodiments, assuming kNN has a high accuracy before dimension reduction, for kNN to be accurate in low dimensional space on point y, it is important that y's neighbors correspond to f−1(y)'s neighbors, which means y needs to have a low degree of many-to-one; [0106] As shown in FIG. 6, 602, 606, and 610 respectively show many-to-one vs kNN accuracy under t-SNE of NewsAgg with different perplexities, t-SNE of MNIST with different perplexities and dimensionality reduction of MNIST with different methods, including PCA, MDS, LLE, Isomap and t-SNE. Similarly 604, 608, and 610 show results on 1 − Precision vs kNN accuracy. As depicted, many-to-one has strong negative correlation with kNN accuracy. While for precision the relationship is either non-existent, in 604, or weak, in 608 and 612).
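As a non-limiting illustration of the k-NN step recited above, the following sketch assigns an unlabeled three-dimensional representation the majority label of its nearest labeled neighbors. The function name knn_label, the "malicious"/"benign" label names, the choice of k = 3, and all coordinates are assumptions made for the example and do not appear in the cited references.

```python
import math
from collections import Counter


def knn_label(point, labeled, k=3):
    """Return the majority label among the k labeled points nearest to `point`."""
    # Sort labeled representations by Euclidean distance to the query point.
    nearest = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled 3D flow representations.
labeled_points = [
    ((0.0, 0.0, 0.0), "malicious"),
    ((0.2, 0.1, 0.0), "malicious"),
    ((0.1, 0.3, 0.2), "malicious"),
    ((5.0, 5.0, 5.0), "benign"),
    ((5.2, 4.9, 5.1), "benign"),
    ((4.8, 5.1, 5.0), "benign"),
]

# An unlabeled representation that falls near the malicious group.
verdict = knn_label((0.0, 0.1, 0.1), labeled_points)
```

An odd k is used with two labels so that the vote cannot tie.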
Regarding claim 12, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The computer-readable storage medium of claim 9, wherein labeling at least a portion of the set of low-dimensional flow representations comprises determining that a low-dimensional flow representation is associated with a known malicious host and, in response, labeling the low-dimensional flow representation to provide a labeled low-dimensional flow representation included in the sub-set of labeled low-dimensional flow representations (Stupak: [0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware [labeled low-dimensional flow] based on the determined suspicious activity pattern; [0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar [unlabeled low-dimensional flow] to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114).
Regarding claim 13, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The computer-readable storage medium of claim 9, wherein determining that at least one blacklisted IP address is present in the flow data comprises:
receiving threat information from one or more threat information (TI) feeds (Sanghavi: [0019] the information obtained by the routing device may include, for example, a list of blacklisted domain identifiers associated with blacklisted domains, network addresses (i.e., Internet protocol (IP) addresses) for one or more sinkhole servers associated with the blacklisted domain identifiers, network address ranges (i.e., IP ranges) of blacklisted devices, network address prefixes (i.e., IP prefixes) of blacklisted devices, and/or the like. In some implementations, the network addresses for the one or more sinkhole servers associated with the blacklisted domain identifiers include sinkhole server identifiers, which may correspond to IPv4 addresses and/or IPv6 addresses. In some implementations, the routing device may obtain the information from the security platform via downloading, fetching, subscribing to receive a feed from the security platform, streaming the information in real-time or near real-time, receiving push notifications, and/or the like);
comparing blacklisted IP addresses in a set of blacklisted IP addresses provided in the threat information to IP addresses included in the flow data (Sanghavi: [0018] receiving DNS requests, comparing domain names received in the DNS requests to blacklisted domain identifiers stored in the DNS sinkhole data structure, and responding to the DNS requests with a network address of a sinkhole server that is associated with the blacklisted domain identifier); and 
determining that an IP address included in the flow data matches a blacklisted IP address (Sanghavi: [0018] In some implementations, the blacklisted domains may be associated with an attacker or an attacker's website, which a customer (e.g., a country, a network service provider, a network operator, etc.) determines should be blocked).
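The comparison recited in the three steps above can be sketched, for illustration only, with Python's standard ipaddress module. The function name find_blacklisted and the sample addresses (taken from documentation-reserved ranges) are assumptions; the treatment of blacklist entries as either single addresses or prefixes loosely mirrors the IP ranges and prefixes mentioned in Sanghavi [0019] but is not an implementation of that reference.

```python
import ipaddress


def find_blacklisted(flow_ips, blacklist):
    """Return the flow IP addresses that match any blacklist entry."""
    # Each entry may be a single address or a prefix such as "198.51.100.0/24".
    networks = [ipaddress.ip_network(entry, strict=False) for entry in blacklist]
    hits = []
    for ip in flow_ips:
        address = ipaddress.ip_address(ip)
        if any(address in network for network in networks):
            hits.append(ip)
    return hits


# Hypothetical threat-feed entries and flow addresses.
threat_feed = ["198.51.100.0/24", "203.0.113.7"]
flow_addresses = ["192.0.2.10", "198.51.100.5", "203.0.113.7"]
matches = find_blacklisted(flow_addresses, threat_feed)
```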
Regarding claim 14, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The computer-readable storage medium of claim 9, wherein operations further include extracting the historical flow data in response to determining that the at least one blacklisted IP address is present in the flow data (Stupak: [0054] In some aspects, the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. For example, the cloud AI engine 112 detect specific internet activity, access to specific IP addresses, specific distribution traces, specific code patterns, or specific behavioral patterns; [0049] the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. In some aspects, the cloud AI engine 112 may detect specific internet activity, access to specific IP addresses, specific distribution traces, specific code patterns, or specific behavioral patterns).
Regarding claim 15, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The computer-readable storage medium of claim 9, wherein automatically executing a remedial action with respect to the potentially malicious host comprises configuring a firewall system to at least partially block communication with the potentially malicious host (Sanghavi: [0017] The routing device may, in some implementations, include a firewall module as described herein, by which the forwarding component may be configured or instructed to forward traffic (i.e., packets) to the security device for inspection; [0018] As shown in FIG. 1A, and by reference number 102, the routing device may receive or obtain information from a security platform, including information stored in the DNS sinkhole data structure or database (DB) of the security platform. The security platform may include a local or cloud-based security solution (e.g., a firewall, etc.) configured to detect threats and implement DNS sinkhole functionality. The DNS sinkhole functionality may include, for example, receiving DNS requests, comparing domain names received in the DNS requests to blacklisted domain identifiers stored in the DNS sinkhole data structure, and responding to the DNS requests with a network address of a sinkhole server that is associated with the blacklisted domain identifier).
Regarding claim 16, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The computer-readable storage medium of claim 9, wherein each low-dimensional flow representation comprises a three-dimensional (3D) flow representation (Ding: [0096] Referring now to FIG. 4, an illustration of t-SNE of 3D “linked rings” with different perplexities and their average Wasserstein costs may be provided (402-416); [0097] Referring now to FIG. 5, an illustration of Wasserstein cost distinguish misleading visualization: 502 one 3D Gaussian blob; 504 t-SNE of one blob with cost 0.060; 506 t-SNE of one blob with cost 0.046; 508 two 3D Gaussian blobs; 510 t-SNE of two blobs with cost 0.045; and, 512 t-SNE of two blobs with cost 0.021).
Regarding claim 17, Stupak discloses:
A system, comprising:
one or more processors; and
a computer-readable storage device coupled to the one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations for identifying and remediating zero-day attacks on a network, the operations comprising:
receiving flow data representative of communication traffic of the network ([0043] an agent of the cloud AI engine 112 may monitor file and network activity and may determine how, through which protocols and applications each file has appeared on the system; [0041] the security cloud 102 is configured to collect and analyze the information received from the computing devices 104; [0045] The method 200 begins at step 201, in which the cloud AI engine 112 collects from a plurality of agents 106 executing on a respective computing device 104, analysis data corresponding to executables 108 on the respective computing device);
determining that a suspicious Internet protocol (IP) address is present in the flow data ([0049] the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. In some aspects, the cloud AI engine 112 may detect specific internet activity, access to specific IP addresses), and in response:
providing a set of high-dimensional flow (See FIG. 3A; i.e. traffic from uninfected computing devices such as 304) representations of network traffic by processing historical flow data (i.e. historical data stored in cloud storage) through a deep learning (DL) model ([0051] the cloud AI engine 112 detects that the new suspicious activity is not similar to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114), 
each high-dimensional flow representation in the set of high-dimensional flow representations comprising a high-dimensional vector representing each host within the historical flow data ([0041] Using the collected data, in one aspect the cloud storage 114 may provide both a detailed historical record of past behavior in each of the computing devices 104, as well as aggregated statistics representing the behavior of the computing devices 104),
providing a set of low-dimensional flow (See FIG. 3A; i.e. known traffic from infected computing devices such as 302A & 302B) representations of the network traffic based on the set of high-dimensional flow representations ([0046] In some aspects, the cloud AI engine 112 may detect a malware epidemic based on previously known distribution methods. In another aspect, the cloud AI engine 112 may detect a security flaw in a user application that can be used for malware based on previously known distribution methods; [0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern), 
at least one low-dimensional flow representation representing the known malicious host within the network traffic ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern) and at least one low-dimensional flow representation representing an unknown [malicious] host ([0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114), and
labeling at least a portion of the set of low-dimensional flow representations to provide a sub-set of labeled (See [0047] i.e. known malware data based on determined suspicious activity pattern) low-dimensional flow representations and a sub-set of unlabeled (See [0051] i.e. new suspicious activity is not similar to attacks previously known) low-dimensional flow representations ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware [labeled low-dimensional flow] based on the determined suspicious activity pattern; [0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar [unlabeled low-dimensional flow] to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114); and
identifying a host associated with an unlabeled (i.e. identification of zero-day malware) low-dimensional flow representation as a potentially malicious host ([0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware based on the determined suspicious activity pattern. In some aspects, the cloud AI engine 112 may identify a zero-day attack by malware not found in a database of previously known malware based on the suspicious activity pattern. In some aspects, the cloud AI engine may identify a zero-day attack by malware identified as modified from a previously known version based on the suspicious activity pattern and using a database of previously known malware), and 
in response, automatically executing a remedial action with respect to the potentially malicious host (See FIG. 2; [0048] At step 204, the cloud AI engine 112 may determine a remedial action and distribute to the plurality of agents 106 one or more commands to protect the respective computing device 104 from the malware … In an exemplary aspect, the cloud AI engine 112 may analyze the code and generate a set of new remedial steps to remedy or disinfect a client device of malware).
Stupak fails to disclose:
	determining that at least one blacklisted Internet protocol (IP) address is present in the flow data, the at least one blacklisted IP address representing a known malicious host; reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations; clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Sanghavi discloses:
	determining that at least one blacklisted Internet protocol (IP) address is present in the flow data, the at least one blacklisted IP address representing a known malicious host (See FIG. 6; [0110-0111]; steps 630 and 640 disclose receiving DNS flow data containing a destination domain identifier that corresponds to a blacklisted domain identifier of the blacklisted domain identifiers).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak reference to include a network device which obtains a set of rules that specify match criteria, associated with the blacklisted domains, that include source network addresses and/or destination network addresses for comparison to packet source network addresses and/or packet destination network addresses associated with incoming packets, as disclosed by Sanghavi.
The motivation to include such a network device is to capture and block malicious network traffic from spreading into the enterprise network.
The combination of Stupak and Sanghavi fails to disclose:
	reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations; clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Ding discloses:
	reducing dimensions of the high-dimensional flow representations in the set of high-dimensional flow representations ([0006] In accordance with an embodiment, there is provided a system for determining a reliability score indicative of a level of fidelity between high dimensional (HD) data and corresponding dimension-reduced (LD) data; see also [0047-0048], which disclose various machine learning algorithms for dimensionality reduction).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak and Sanghavi references to include machine learning mathematical techniques, as disclosed by Ding.
	The motivation to include machine learning mathematical techniques is to create data visualizations and, more particularly, to determine the fidelity of visualizations of multi-dimensional data sets.
The combination of Stupak, Sanghavi and Ding fails to disclose:
	clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations.
However, Miller discloses:
	clustering the unlabeled low-dimensional flow representations in the sub-set of unlabeled low-dimensional flow representations in view of the labeled low-dimensional flow representations in the sub-set of the labeled low-dimensional flow representations ([0006] It is an object of the present invention to provide a system for and a means of prioritized detection of clusters of anomalous samples in an unlabeled data batch, each characterized by its samples being anomalous on the same (in general low-dimensional) subset of the full (high-dimensional) feature set; [0030] The present invention relates generally to the detection of groups/clusters of anomalous samples, with, in general, very high feature dimensionality, that may represent unknown classes (i.e., either classes of phenomena that have not been previously identified or which are known to exist but which are not expected to be observed in a given environment). Moreover, jointly, while detecting these groups/clusters of anomalies, the present invention identifies salient, low-dimensional feature subsets on which the groups manifest their atypicality).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the Stupak, Sanghavi and Ding references to include a method and system that detects clusters of anomalous samples on a low-dimensional feature subset of the high-dimensional feature set, as disclosed by Miller.
	The motivation to include Miller’s method and system is to be able to detect and present anomalous clusters in large and complicated datasets.
Regarding claim 18, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The system of claim 17, wherein providing a set of low-dimensional flow representations of the network traffic based on the set of high-dimensional flow representations comprises processing the set of high-dimensional flow representations using one of t-distributed stochastic neighbor embedding (t-SNE) and principal component analysis (PCA) to provide the set of low-dimensional flow representations (Ding: See Figures 2 & 3; [0047] For linear dimensionality reduction (LDR) mapping methods (e.g., PCA) information loss can be characterized by the null-space, whose components are all mapped to {0}. Knowing this limitation of the linear methods, many nonlinear dimensionality reduction (NLDR) methods were developed, each of which applied different methods to attempt to preserve relevant information; [0048] These include distance preservation methods, for example multidimensional scaling (MDS), Sammon mapping, Isomap, curvilinear component analysis, kernel PCA; topology preservation methods including local linear embedding, Laplacian eigenmaps; neighborhood preservation methods including stochastic neighborhood embedding (SNE), and t-SNE).
Regarding claim 19, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The system of claim 17, wherein identifying a host associated with an unlabeled low-dimensional flow representation as a potentially malicious host comprises executing k-nearest neighbor (k-NN) clustering over the sub-set of labeled low-dimensional flow representations and the sub-set of unlabeled low-dimensional flow representations; and classifying the unlabeled low-dimensional flow representation as potentially malicious in response to the unlabeled low-dimensional flow representation being clustered with one or more labeled low-dimensional flow representations (Ding: Correlation Between Many-to-One and K-Nearest Neighbor Classification [0105] In some embodiments, assuming kNN has a high accuracy before dimension reduction, for kNN to be accurate in low dimensional space on point y, it is important that y's neighbors correspond to f−1(y)'s neighbors, which means y needs to have a low degree of many-to-one; [0106] As shown in FIG. 6, 602, 606, and 610 respectively show many-to-one vs kNN accuracy under t-SNE of NewsAgg with different perplexities, t-SNE of MNIST with different perplexities and dimensionality reduction of MNIST with different methods, including PCA, MDS, LLE, Isomap and t-SNE. Similarly 604, 608, and 610 show results on 1 − Precision vs kNN accuracy. As depicted, many-to-one has strong negative correlation with kNN accuracy. While for precision the relationship is either non-existent, in 604, or weak, in 608 and 612).
Regarding claim 20, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The system of claim 17, wherein labeling at least a portion of the set of low-dimensional flow representations comprises determining that a low-dimensional flow representation is associated with a known malicious host and, in response, labeling the low-dimensional flow representation to provide a labeled low-dimensional flow representation included in the sub-set of labeled low-dimensional flow representations (Stupak: [0047] At step 203, the cloud AI engine 112 determines that at least one executable 108 on a computing device 104 is malware [labeled low-dimensional flow] based on the determined suspicious activity pattern; [0051] In one aspect, the cloud AI engine 112 detects that the new suspicious activity is not similar [unlabeled low-dimensional flow] to attacks previously known, for example, based on a comparison with historical data stored in cloud storage 114).
Regarding claim 21, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The system of claim 17, wherein determining that at least one blacklisted IP address is present in the flow data comprises: 
receiving threat information from one or more threat information (TI) feeds (Sanghavi: [0019] the information obtained by the routing device may include, for example, a list of blacklisted domain identifiers associated with blacklisted domains, network addresses (i.e., Internet protocol (IP) addresses) for one or more sinkhole servers associated with the blacklisted domain identifiers, network address ranges (i.e., IP ranges) of blacklisted devices, network address prefixes (i.e., IP prefixes) of blacklisted devices, and/or the like. In some implementations, the network addresses for the one or more sinkhole servers associated with the blacklisted domain identifiers include sinkhole server identifiers, which may correspond to IPv4 addresses and/or IPv6 addresses. In some implementations, the routing device may obtain the information from the security platform via downloading, fetching, subscribing to receive a feed from the security platform, streaming the information in real-time or near real-time, receiving push notifications, and/or the like);
comparing blacklisted IP addresses in a set of blacklisted IP addresses provided in the threat information to IP addresses included in the flow data (Sanghavi: [0018] receiving DNS requests, comparing domain names received in the DNS requests to blacklisted domain identifiers stored in the DNS sinkhole data structure, and responding to the DNS requests with a network address of a sinkhole server that is associated with the blacklisted domain identifier); and 
determining that an IP address included in the flow data matches a blacklisted IP address (Sanghavi: [0018] In some implementations, the blacklisted domains may be associated with an attacker or an attacker's website, which a customer (e.g., a country, a network service provider, a network operator, etc.) determines should be blocked).
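The comparing and determining steps mapped above can be illustrated, under stated assumptions, with the Python standard-library ipaddress module. The sketch assumes that the threat information supplies blacklist entries as single addresses or as prefixes (Sanghavi [0019] mentions both IP ranges and IP prefixes) and that the flow data supplies plain address strings. The function name find_blacklisted and the sample addresses, taken from the documentation-reserved ranges, are hypothetical.

```python
import ipaddress

def find_blacklisted(flow_ips, blacklist_entries):
    """Return the flow IP addresses that fall within any blacklist entry."""
    # A bare address such as "203.0.113.5" is parsed as a single-host network.
    networks = [ipaddress.ip_network(entry, strict=False)
                for entry in blacklist_entries]
    matches = []
    for ip_text in flow_ips:
        ip = ipaddress.ip_address(ip_text)
        # Membership is False when the address and network versions differ.
        if any(ip in network for network in networks):
            matches.append(ip_text)
    return matches

# Hypothetical threat-feed entries: an IPv4 prefix, a single address, an IPv6 prefix.
blacklist_entries = ["198.51.100.0/24", "203.0.113.5", "2001:db8::/32"]

# Hypothetical addresses observed in the flow data.
flow_ips = ["192.0.2.10", "198.51.100.23", "203.0.113.5"]

matched = find_blacklisted(flow_ips, blacklist_entries)
```

A non-empty result corresponds to the determination that at least one blacklisted IP address is present in the flow data.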
Regarding claim 22, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The system of claim 17, wherein operations further include extracting the historical flow data in response to determining that the at least one blacklisted IP address is present in the flow data (Stupak: [0054] In some aspects, the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. For example, the cloud AI engine 112 detect specific internet activity, access to specific IP addresses, specific distribution traces, specific code patterns, or specific behavioral patterns; [0049] the cloud AI engine 112 may further analyze the malware and detect specific traces linked to certain persons or parties responsible for its distribution and creation. In some aspects, the cloud AI engine 112 may detect specific internet activity, access to specific IP addresses, specific distribution traces, specific code patterns, or specific behavioral patterns).
Regarding claim 23, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The system of claim 17, wherein automatically executing a remedial action with respect to the potentially malicious host comprises configuring a firewall system to at least partially block communication with the potentially malicious host (Sanghavi: [0017] The routing device may, in some implementations, include a firewall module as described herein, by which the forwarding component may be configured or instructed to forward traffic (i.e., packets) to the security device for inspection; [0018] As shown in FIG. 1A, and by reference number 102, the routing device may receive or obtain information from a security platform, including information stored in the DNS sinkhole data structure or database (DB) of the security platform. The security platform may include a local or cloud-based security solution (e.g., a firewall, etc.) configured to detect threats and implement DNS sinkhole functionality. The DNS sinkhole functionality may include, for example, receiving DNS requests, comparing domain names received in the DNS requests to blacklisted domain identifiers stored in the DNS sinkhole data structure, and responding to the DNS requests with a network address of a sinkhole server that is associated with the blacklisted domain identifier).
Regarding claim 24, the combination of Stupak, Sanghavi, Ding and Miller discloses:
The system of claim 17, wherein each low-dimensional flow representation comprises a three-dimensional (3D) flow representation (Ding: [0096] Referring now to FIG. 4, an illustration of t-SNE of 3D “linked rings” with different perplexities and their average Wasserstein costs may be provided (402-416); [0097] Referring now to FIG. 5, an illustration of Wasserstein cost distinguish misleading visualization: 502 one 3D Gaussian blob; 504 t-SNE of one blob with cost 0.060; 506 t-SNE of one blob with cost 0.046; 508 two 3D Gaussian blobs; 510 t-SNE of two blobs with cost 0.045; and, 512 t-SNE of two blobs with cost 0.021).



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SYED M AHSAN whose telephone number is (571)272-5018. The examiner can normally be reached 8:30 AM - 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey L. Nickerson, can be reached on 469-295-9235. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Jeffrey Nickerson/Supervisory Patent Examiner, Art Unit 2432
/S.M.A./Patent Examiner, Art Unit 2432