DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is in response to communication filed on 03/14/2022.
Status of claims in the instant application:
Claims 1-20 are pending.
Claims 1 and 16 have been amended.
No new claim has been added.
No claim has been canceled.
Response to Arguments
Applicant submitted a corrected/updated specification on 03/14/2022. Examiner has examined the updated specification, and it has been accepted.
Applicant's arguments, see the remarks filed on 03/14/2022, have been fully considered but they are not persuasive. Therefore, the Applicant is directed to Examiner’s response below.
Applicant states, page [13] of the remarks filed on 03/14/2022, “In the rejection of claim 4, Hulten Para [0054, 0067] is said to teach "a rollup aggregating data for an IPRID template which includes a domain name portion of a uniform resource locator, wherein the domain name portion includes a placeholder designating an unknown value;". Applicant respectfully disagrees, for at least the following reasons:
"The cited material does not teach a hierarchical rollup. Attention is respectfully directed to Applicant's specification at [00377- 00380] for a discussion and examples of what is meant by hierarchical rollup. This can be discussed during the requested interview.”
	In response, Examiner disagrees with Applicant’s characterization of the prior art as not disclosing the claimed feature of “hierarchical rollup”. Examiner notes the following from the Ringlein prior art as cited in the office action:
“Ringlein, Para [0065-0069, 0033-0034], FIG. 2: … FIG. 2 is an example diagram of a security knowledge graph for an example security incident or alert in accordance with one illustrative embodiment … As shown in FIG. 2, the security knowledge graph consists of the security offense source IP as a root node of the security knowledge graph. Network observables, such as an Internet Protocol (IP) address, Universal Resource Locator (URL), domain name, file hashes, and the like, are represented as nodes connected to the root node via different network events … The source IP address connected to an external IP address that resolved to a domain name (the domain name being an external network observable) indicated by the other black nodes connected by edges to the root node … The illustrative embodiments provide mechanisms for implementing a machine learning based model, such as a neural network model, that is trained to recognize patterns of features extracted from the security incident itself, a security knowledge graph associated with the security incident, as well as metrics generated from analysis of the security knowledge graph, and predict a disposition of the security incident based on the recognized patterns. In extracting features from the security knowledge graph, selected features may be extracted from the nodes, edges, and overall topology of the graph in accordance with a specific configuration of the feature extraction mechanisms. The feature extraction mechanisms utilize a single traversal of the security knowledge graph associated with the security incident to collect the extracted features rather than having to perform multiple traversal …”
Examiner interprets that data related to an entity (i.e., IP address, domain name, URL, etc.) are collected by traversing the knowledge graph, and hence this constitutes a “hierarchical rollup”.
	Examiner also notes that Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Examiner also notes that, in response to Applicant referring to several sections of the specification of the instant application, the claims are interpreted in light of the specification, but limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Examiner also notes (see MPEP § 2111.01.II) that “Though understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim. For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment.” Superguide Corp. v. DirecTV Enterprises, Inc., 358 F.3d 870, 875, 69 USPQ2d 1865, 1868 (Fed. Cir. 2004).
Examiner also notes (see MPEP § 2111) that the USPTO determines the “scope of claims in patent applications not solely on the basis of the claim language, but upon giving claims their broadest reasonable construction ‘in light of the specification as it would be interpreted by one of ordinary skill in the art.’” In re Am. Acad. of Sci. Tech. Ctr., 367 F.3d 1359, 1364[, 70 USPQ2d 1827, 1830] (Fed. Cir. 2004).
Applicant states, page [13] of the remarks filed on 03/14/2022, “The cited material does not teach a placeholder designating an unknown value. Attention is respectfully directed to Applicant's specification at [00310, 00376- 00380] for a discussion and examples of what is meant by a placeholder designating an unknown value. In particular, Hulten's [0067] training data is a count, which is not a placeholder designating an unknown value. Moreover, even if an "obfuscated URL, HREF mismatch" was itself in the training data, there is no teaching in the cited material that it would be used as a placeholder in a hierarchical rollup. This can be discussed during the requested interview.”
	In response, Examiner disagrees with Applicant’s characterization of the prior art as not disclosing the claimed feature of “a placeholder designating an unknown value”. Examiner notes the following from the Hulten prior art as cited in the office action:
	“Hulten, Para [0006, 0008, 0054, 0067]: … various aspects of the subject matter described herein are directed towards processing data from at least one data source related to phishing sites, and using a predictive model to determine whether a site is likely to be a phishing site. For example, processing the data may comprise generating a report for each of a plurality of data sources, aggregating the reports and applying the predictive model to the aggregated reports. The predictive model may be built using machine learning based on training data, e.g., including known phishing sites and/or known non-phishing sites … The model is strengthened by aggregating phishing-related data from a plurality of sources, which, for example, may include at least one source corresponding to an email service and at least one source corresponding to an internet access service. The features and properties of each site may be logged, and used to develop more accurate training data. The model is strengthened further by using known phishing sites as well as known non-phishing sites, e.g., sites that appear to have features that would indicate phishing, but in actuality have been graded as non-phishing sites … Based on the data sources 301-307, the system 300 is able to collect a significant amount of statistical information about URLs, including their features and properties. Note that each property can be tracked per URL, per domain, and per IP address (via a DNS lookup on the domain). Properties may also be tracked at several different time resolutions, e.g., in the last ten minutes, hour, day, week, month, and all time (particularly for known good URL sources) … Number of times the object appeared in the source with a phishing trick (numeric IP, obfuscated URL, HREF mismatch, "look-alike URL", and so forth) …”
Examiner interprets the obfuscated URL and the look-alike URL as disclosing the “unknown” value and the “placeholder”.
Examiner again notes that, in response to Applicant referring to several sections of the specification of the instant application, the claims are interpreted in light of the specification, but limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
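For illustration only, and without characterizing the claims or the cited references, the following minimal Python sketch shows one way per-hostname counts could be rolled up into progressively more general templates in which a placeholder stands in for an unknown portion of the domain name. The function names `rollup_templates` and `hierarchical_rollup`, the `*` placeholder, and the example hostnames are hypothetical; they appear in neither the application nor Hulten or Ringlein.

```python
from collections import Counter


def rollup_templates(hostname, placeholder="*"):
    """Return the hostname followed by progressively more general templates.

    Assumption (hypothetical): leading labels are replaced by a single
    placeholder, one level at a time, always keeping the last two labels.
    """
    labels = hostname.split(".")
    templates = [hostname]
    for i in range(1, len(labels) - 1):
        templates.append(".".join([placeholder] + labels[i:]))
    return templates


def hierarchical_rollup(observations):
    """Aggregate (hostname, count) pairs at every level of the hierarchy."""
    counts = Counter()
    for hostname, n in observations:
        for template in rollup_templates(hostname):
            counts[template] += n
    return counts


if __name__ == "__main__":
    # Hypothetical observations: (hostname, number of times seen).
    totals = hierarchical_rollup([("a.b.example.com", 2), ("c.example.com", 3)])
    for template, total in sorted(totals.items()):
        print(template, total)
```

In this sketch, a template such as `*.example.com` accumulates the counts of every hostname beneath it, which is one possible reading of a rollup whose domain name portion contains a placeholder.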
Applicant states, page [13] of the remarks filed on 03/14/2022, “In the rejection of claim 7, Ringlein Para [0065-0069] is also said to teach hierarchical rollups of data about IPRIDs. Applicant respectfully disagrees. This material discusses a "security knowledge graph" in connection with "threat intelligence". It does not mention "machine learning" or "training data". This can be discussed during the requested interview.”
	In response, Examiner directs Applicant to the following from Hulten prior art that was already cited in the previous office action:
“Hulten, Abstract, Para [0006-0008], FIG. 5: … Described is a technology by which phishing-related data sources are processed into aggregated data and a given site evaluated the aggregated data using a predictive model to automatically determine whether the given site is likely to be a phishing site. The predictive model may be built using machine learning based on training data, e.g., including known phishing sites and/or known non-phishing sites …”
The cited portion of the prior art clearly discloses both “machine learning” and “training data”.
Examiner also notes that the rejections are based on a combination of prior art references. In response to Applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Applicant states, page [14] of the remarks filed on 03/14/2022, “Accordingly, all rejections should be withdrawn, because the combined references do not teach machine learning based on reputation training data that includes a rollup feature based on a hierarchical rollup of data about IPRIDs which include a placeholder designating an unknown value.”
	In response, Examiner notes that all of the above features argued by Applicant have already been shown to be disclosed by the prior art of record. As such, Applicant’s arguments are not persuasive.
Applicant states, page [14] of the remarks filed on 03/14/2022, “Claim 3. Hulten Para [0080] recites the number of times a URL appeared in a spam source, but claim 3 recites "a number of submissions". These are not the same. The number of submissions refers to the number of submissions for analysis; see, e.g., Applicant's specification at [00198, 00364].”
	In response, Examiner disagrees with Applicant’s interpretation of the prior art. Examiner maintains that a URL appearing in a spam source a number of times broadly discloses how many times transactions/communications are made from/to that URL, thereby disclosing “a number of submissions” involving the URL/IPRID.
Applicant states, page [14] of the remarks filed on 03/14/2022, “Claim 5. Hulten Para [0135] does not describe a training data grid having weekday columns and week rows. No grid is required to meet this description by Hulten. Hulten also does not mention "grid".”
	In response, Examiner notes that Hulten (as cited in the office action) discloses collecting different data for a URL (web site) over time, such as over a day, over a week, or over a month. Each of these time periods can be considered an axis (variable) of a data structure (grid), and the report for the URL can be accessed/analyzed based on any of the time periods (variables); thus the overall data structure is interpreted as a grid.
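For illustration only, and without characterizing the claims or Hulten, the following minimal Python sketch shows one way dated per-URL counts could be arranged in a structure with one row per week and one column per weekday. The function name `build_week_grid`, the use of ISO week numbering for the rows, and the example dates and counts are all assumptions made for this sketch.

```python
from datetime import date


def build_week_grid(daily_counts):
    """Arrange {date: count} data into rows keyed by ISO (year, week).

    Each row is a list of seven totals, one per weekday, Monday first.
    Assumption (hypothetical): days with no data are recorded as zero.
    """
    grid = {}
    for day, count in daily_counts.items():
        iso = day.isocalendar()
        row = grid.setdefault((iso[0], iso[1]), [0] * 7)
        row[day.weekday()] += count
    return grid


if __name__ == "__main__":
    # Hypothetical daily observation counts for a single URL.
    observations = {
        date(2022, 3, 14): 4,
        date(2022, 3, 15): 1,
        date(2022, 3, 21): 2,
    }
    for week, row in sorted(build_week_grid(observations).items()):
        print(week, row)
```

Here each time resolution (week, weekday) serves as one axis of the structure, so a value can be looked up by either variable.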
Applicant states, page [14] of the remarks filed on 03/14/2022, “Claims 6 and 11. The cited Hulten table in Para [0170] does not include any IPRID or IPRID template, so it will not function as the claimed lookup table. Attention is respectfully directed to Applicant's specification at, e.g., [00164, 00313, 00357].”
	In response, Examiner does not agree with Applicant’s characterization of the prior art, and further notes that, in Para [0169-0170] of Hulten as cited in the previous office action, the “leaf” is a “Web Site”, and the table in Para [0170] of Hulten does include “five” leaves (web sites, i.e., the claimed IPRIDs).
Applicant states, page [14] of the remarks filed on 03/14/2022, “Claim 8. The cited material in Hulten Para [0035-0038] does not teach the tuples recited in claim 8. Indeed, Hulten does not mention "tuple". Nor does it teach "expanding training data by date or by original IPRID to include a set of label tuples which all have the same final IPRID and the same label but have different dates or different original IPRIDs or both." Hulten does not mention "expanding" training data. Attention is respectfully directed to Applicant's specification at, e.g., [00320-00323] for a discussion of training data expansion.”
	In response, Examiner does not agree with Applicant’s characterization of the prior art. Examiner notes that “tuple” is broadly interpreted as data collected in a group/set, as in the prior art. Also, the cited portion of Hulten discloses that the web site phishing data can be collected/aggregated for a number of days and they can be analyzed based on dates.
Examiner again notes that, in response to Applicant referring to several sections of the specification of the instant application, the claims are interpreted in light of the specification, but limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Applicant states, page [14] of the remarks filed on 03/14/2022, “Claims 9 and 10. One of skill in the art would understand that "join" is a term of art as used by Applicant with respect to training data manipulation; see, e.g., Applicant's specification at [00264, 00321-00324, 00359, 00372-00373, 00384, 00395]. By contrast, Hulten does not mention "join".”
	In response, Examiner notes that the term “join”, as best understood, is interpreted as combining data over a number of dates/days. The cited portion of Hulten discloses accumulating domain/IPRID data over N days, and it also discloses observing URL data over selected periods of time. Examiner interprets that observing a URL/host data stream over selected periods of time and aggregating the observations discloses “join”.
Examiner again notes that, in response to Applicant referring to several sections of the specification of the instant application, the claims are interpreted in light of the specification, but limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Applicant states, page [14] of the remarks filed on 03/14/2022, “Claim 13. Ringlein Para [0025] does not teach "separate planes for respective aggregate features." Indeed, Ringlein does not mention "plane".”
	In response, Examiner notes that Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant states, page [15] of the remarks filed on 03/14/2022, “Claim 14. A performance measurement of a system, at "some date", is not the same as "a grid of label tuples which contain feature values that are calculated, from feature data, relative to a receipt date of the feature data."”
	In response, Examiner notes that claim 14 depends on claim 7. Examiner has clarified the “tuples” with respect to claim 7, which also applies to claim 14, and Applicant is directed to Examiner’s response for claim 7. Hulten, as cited in the previous office action, also discloses measuring system performance to identify phishing sites based on dated data that can range over a number of days.
Applicant states, page [15] of the remarks filed on 03/14/2022, “Claim 15. Some sources being more current than others is not the same as "inferring a label for a current date based on aggregate feature data for prior dates." Also, as noted for claims 6 and 11, no lookup table as claimed has been shown in the references.”
	In response, Examiner disagrees with Applicant’s characterization of the prior art. Hulten discloses data collected and dated over a number of days/dates. It would be obvious based on dates, to infer, based on the date label, as to which of the data is 
Claims 16-20 claim a “computer-readable storage medium” without reciting it to be “non-transitory” in the claims.
Examiner interprets the claimed “computer-readable storage medium” to be “non-transitory” based on the following description in the specification of the instant application:
“Para [00108]: Whenever reference is made to data or instructions, it is understood that these items configure a computer-readable memory and/or computer-readable storage medium, thereby transforming it to a particular article, as opposed to simply existing on paper, in a person's mind, or as a mere signal being propagated on a wire, for example. For the purposes of patent protection in the United States, a memory or other computer-readable storage medium is not a propagating signal or a carrier wave or mere energy outside the scope of patentable subject matter under United States Patent and Trademark Office (USPTO) interpretation of the In re Nuijten case. No claim covers a signal per se or mere energy in the United States, and any claim interpretation that asserts otherwise in view of the present disclosure is unreasonable on its face. Unless expressly stated otherwise in a claim granted outside the United States, a claim does not cover a signal per se or mere energy.
Para [00109]: Moreover, notwithstanding anything apparently to the contrary elsewhere herein, a clear distinction is to be understood between (a) computer readable storage media and computer readable memory, on the one hand, and (b) transmission media, also referred to as signal media, on the other hand. A transmission medium is a propagating signal or a carrier wave computer readable medium. By contrast, computer readable storage media and computer readable memory are not propagating signal or carrier wave computer readable media. Unless expressly stated otherwise in the claim, “computer readable medium” means a computer readable storage medium, not a propagating signal per se and not mere energy.”
	Therefore, claims 16-20 are statutory, and not rejected as “signal per se” under 35 USC 101.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Pub. No.: US 2007/0192855 A1 to Hulten et al. (hereinafter “Hulten”) in view of Pub. No.: US 2020/0401696 A1 to Ringlein et al. (hereinafter “Ringlein”).
Regarding Claim 1. Hulten discloses An internet protocol resource identification (IPRID) reputation assessment system (Hulten, Abstract, Para [0016-0019], FIG. 1: Described is a technology by which phishing-related data sources are processed into aggregated data and a given site evaluated the aggregated data using a predictive model to automatically determine whether the given site is likely to be a phishing site … FIG. 1 illustrates an example of a suitable computing system environment 100 on which the invention may be implemented …), comprising:
a memory (Hulten, Para [0019], FIG. 1: … With reference to FIG. 1, an exemplary system for implementing the invention includes a general purpose computing device in the form of a computer 110. Components of the computer 110 may include, but are not limited to, a processing unit 120, a system memory 130 …);
a processor in operable communication with the memory (Hulten, Para [0019], FIG. 1: … a processing unit 120, a system memory 130, and a system bus 121 that couples various system components including the system memory to the processing unit 120 …), the processor and the memory configured in a trained [convolutional neural network] which was trained using IPRID reputation training data (Hulten, Abstract, Para [0006-0008], FIG. 5: … Described is a technology by which phishing-related data sources are processed into aggregated data and a given site evaluated the aggregated data using a predictive model to automatically determine whether the given site is likely to be a phishing site. The predictive model may be built using machine learning based on training data, e.g., including known phishing sites and/or known non-phishing sites …);
However, Hulten does not explicitly teach, but Ringlein from same or similar field of endeavor teaches, “a trained convolutional neural network which was trained using IPRID reputation training data (Ringlein, Para [0075-0079]: … The features extracted from the security incident/alert and the security knowledge graph may be used as a basis for the feature extraction engine (FEE) 144 to generate metric features that are added to the extracted feature set for the security incident/alert. These metrics provide a statistical measure of aspects of the security incident that provide insights into the nature of the security incident. For example, metric features that may be include a mean magnitude, a mean toxicity, a sum of toxicity of nodes indicated by sources that assign reputation and toxicity value … Returning again to FIG. 1A, the extracted features/metrics 143 are input to a cognitive predictive computer model 148 which is then trained 146 based on the extracted features/metrics 143 and the corresponding correct labels provided by the security analyst 131 and associated with the corresponding security alert entry exported 142 to the FEE 144. The cognitive predictive computer model 148 may be security incident machine learning (ML) mechanism, such as previously described above, which may employ a neural network model, such as a convolutional neural network (CNN) model, that is trained through a machine learning process using, as input, the extracted features/metrics from the exported security alerts/security knowledge graphs and the corresponding ground truth of the correct disposition generated by the security analyst 131 and stored in the security alert database 138 …)”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ringlein into the teachings of Hulten, because it discloses that “the illustrative embodiments provide mechanisms for implementing a machine learning based model, such as neural network model, that is trained to recognize patterns of features extracted from the security incident itself, a security knowledge graph associated with the security incident, as well as metrics generated from analysis of the security knowledge graph, and predict a disposition of the security incident based on the recognized patterns. In extracting features from the security knowledge graph, selected features may be extracted from the nodes, edges, and overall topology of the graph in accordance with a specific configuration of the feature extraction mechanisms. The feature extraction mechanisms utilize a single traversal of the security knowledge graph associated with the security incident to collect the extracted features rather than having to perform multiple traversals (Ringlein, Para [0033])”.
	Hulten further discloses:
“wherein the IPRID reputation training data includes at least one rollup feature based on a [hierarchical] rollup of data about IPRIDs which includes at least one placeholder designating an unknown value (Hulten, Para [0006, 0008,  0054, 0067]: … various aspects of the subject matter described herein are directed towards processing data from at least one data source related to phishing sites, and using a predictive model to determine whether a site is likely to be a phishing site. For example, processing the data may comprise generating a report for each of a plurality of data sources, aggregating the reports and applying the predictive model to the aggregated reports. The predictive model may be built using machine learning based on training data, e.g., including known phishing sites and/or known non-phishing sites … The model is strengthened by aggregating phishing-related data from a plurality of sources, which, for example, may include at least one source corresponding to an email service and at least one source corresponding to an internet access service. The features and properties of each site may be logged, and used to develop more accurate training data. The model is strengthened further by using known phishing sites as well as known non-phishing sites, e.g., sites that appear to have features that would indicate phishing, but in actuality have been graded as non-phishing sites  …  Based on the data sources 301-307, the system 300 is able to collect a significant amount of statistical information about URLs, including their features and properties. Note that each property can be tracked per URL, per domain, and per IP address (via a DNS lookup on the domain). 
Properties may also be tracked at several different time resolutions, e.g., in the last ten minutes, hour, day, week, month, and all time (particularly for known good URL sources) … Number of times the object appeared in the source with a phishing trick (numeric IP, obfuscated URL, HREF mismatch, "look-alike URL", and so forth) …; Examiner’s interpretation: the obfuscated URL and look-alike URL disclose the unknown value and the placeholder);” and
However, Hulten does not explicitly teach, but Ringlein from same or similar field of endeavor teaches, “hierarchical rollup of data about IPRIDs (Ringlein, Para [0065-0069, 0033-0034], FIG. 2: … FIG. 2 is an example diagram of a security knowledge graph for an example security incident or alert in accordance with one illustrative embodiment … As shown in FIG. 2, the security knowledge graph consists of the security offense source IP as a root node of the security knowledge graph. Network observables, such as an Internet Protocol (IP) address, Universal Resource Locator (URL), domain name, file hashes, and the like, are represented as nodes connected to the root node via different network events … The source IP address connected to an external IP address that resolved to a domain name (the domain name being an external network observable) indicated by the other black nodes connected by edges to the root node … The illustrative embodiments provide mechanisms for implementing a machine learning based model, such as a neural network model, that is trained to recognize patterns of features extracted from the security incident itself, a security knowledge graph associated with the security incident, as well as metrics generated from analysis of the security knowledge graph, and predict a disposition of the security incident based on the recognized patterns. In extracting features from the security knowledge graph, selected features may be extracted from the nodes, edges, and overall topology of the graph in accordance with a specific configuration of the feature extraction mechanisms. The feature extraction mechanisms utilize a single traversal of the security knowledge graph associated with the security incident to collect the extracted features rather than having to perform multiple traversal …); Examiner’s Note: data related to entity (i. e. IP address, domain name URL etc.) 
are collected by traversing from the entity to the root entity, and hence this is a hierarchical rollup)”;
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further combine the teachings of Ringlein into the teachings of Hulten, because it discloses that “the threat intelligence may be provided by external sources, such as threat intelligence services such as IBM X-Force, CrowdStrike, and the like, that have threat hunters constantly monitoring the threat landscape and creating information on what are the indicators associated with existing threats/emerging threats. For example, it may be known to threat intelligence services that the IP address 64.22.22.22 is an IP address that is known to be associated with a webpage that services exploit kits. Here 64.22.22.22 is an indicator, so if one sees an offense containing this IP address, using threat intelligence information, one can determine that an exploit kit could have exploited one of the organization's computers (Ringlein, Para [0067])”;
Hulten further discloses:
“wherein the IPRID reputation assessment system is configured to enhance security of a guarded system by performing at least one of the following: the trained convolutional neural network inferring a label which distinguishes a malicious IPRID from other IPRIDs (Hulten, Para [0179, 0048]: … A report aggregator 332 is provided to aggregate information from the report stores 321-327 into a description 334 of how each host (or URL) has been observed in the data streams over the selected time periods. The aggregator 332 also applies a predictive model 336 (the training of which is described with reference to FIG. 5) to the aggregated information to estimate the probability that the specified object is a phishing site/page … These examples of how phishing sites 550 and non-phishing sites 552 (each of which are updated as graders grade new sites) appear in the data sources are fed as training data/aggregated host statistics 556 to a machine learning algorithm 558. In turn the machine learning algorithm 558 produces a predictive model 560, which, for example, may be used as the predictive model 336 of FIG. 3. The training data/aggregated host statistics 556 contain data for known phishing sites at several time periods. Any algorithm that builds a classification model/probabilistic model is appropriate for generating the predictive model 560 (or the predictive model 336 of FIG. 3), e.g., one implementation used a learning algorithm such as decision tree induction (FIG. 4), while others may employ logistic regression, support vector machines, some hybrid combination of these, and so forth. The predictive model may be changed as needed, e.g., hourly, daily, whenever a new site appears, and so forth, including combinations of these, as new information becomes available. Note that client filtering code also may be updated based on what is learned as new information becomes available, e.g., client training may also be performed.
Further, while a single model that integrates information from all of the data sources may be used, it is also feasible to use a separate model that predicts per data source whether a host/URL is phishing or not (e.g. P(host seen in FBL is phishing| how it appeared in FBL)), with a meta learner used to combine the probabilities of the individual models. Standard ensemble learning methods may be used to build a model (e.g. boosting) …), or the trained convolutional neural network producing a lookup table which distinguishes malicious IPRIDs from other IPRIDs”; Examiner notes that Ringlein already discloses “the trained convolutional neural network.”
Regarding Claim 2. The combination of Hulten-Ringlein discloses the system of claim 1, Hulten further discloses, “wherein the convolutional neural network is trained using IPRID reputation training data that includes at least three of the following aggregate features:
a count of distinct submissions (Hulten, Para [0055-0056]: … For email-related data sources, e.g., including FBL spam, FBL good, "This is junk," "This is fraud," honeypots, dynamic trap accounts, and raw e-mail samples), a given URL may be associated with the following features/properties: Number of times the object (URL, web server domain, web server IP) appeared in the source …);
a count of distinct final hostnames;
a count of submissions of a particular IPRID (Hulten, Para [0036, 0040, 0054-0056], FIG. 3: … there are sources corresponding to browser (e.g., Internet Explorer 7.x) submissions 301, toolbar submissions 302, FBL (feedback loop, e.g., provided by volunteer users of a service) submissions 303, honeypots (closely monitored seeded/dummy email addresses) 304, and email service-supplied (e.g., Hotmail) fraud and junk mail submissions, 305 and 306, respectively … Although the current lag in average FBL message arrival time is about one day, which tends to reduce the value of locating currently-operating phishing sites via this data source, such information may be relevant for a long-operating site, and at least be used in training the predictive model … For email-related data sources, e.g., including FBL spam, FBL good, "This is junk," "This is fraud," honeypots, dynamic trap accounts, and raw e-mail samples), a given URL may be associated with the following features/properties:  Number of times the object (URL, web server domain, web server IP) appeared in the source …);
a count of submissions of a particular IPRID from within a document as opposed to from use in a network communication protocol exchange as a uniform resource locator (Hulten, Para [0059-0060]: Number of times the object appeared with a phishing related word (e.g. login, password, credit, and so forth) in the body or subject of the message … Whether common brands or phishing-related words appear in the host name or the URL …);”
a count of redirects to a particular IPRID;
a count of classifications of a particular IPRID as non-malicious;
a count of times a particular IPRID was unreachable due to an endpoint network error;
a count of classifications of a particular IPRID as a malicious phishing IPRID;
a count of classifications of a particular IPRID as a malicious IPRID other than a phishing IPRID;
a count of any network errors encountered when attempting to reach a particular IPRID;
a count of HTTP code 200 responses encountered when attempting to reach a particular IPRID;
a count of HTTP code 301 responses encountered when attempting to reach a particular IPRID;
a count of HTTP code 0 responses encountered when attempting to reach a particular IPRID;
a count of HTTP code 400 responses encountered when attempting to reach a particular IPRID;
a count of HTTP code 401 responses encountered when attempting to reach a particular IPRID;
a count of HTTP code 403 responses encountered when attempting to reach a particular IPRID;
a count of HTTP code 404 responses encountered when attempting to reach a particular IPRID;
a count of HTTP code responses other than individually tallied response codes were encountered when attempting to reach a particular IPRID.
Regarding Claim 3. The combination of Hulten-Ringlein discloses the system of claim 1, Hulten further discloses, “wherein the convolutional neural network is trained using IPRID reputation training data that includes at least two of the following aggregate features:
a number of submissions (Hulten, Para [0080]: … The system 300 also may track the number of times a URL appeared recently in a spam source and never in a good source …);
a successful detonation statistic;
a failed detonation statistic;
a verdict statistic indicating classification as non-malicious; or
a verdict statistic indicating classification as malicious (Hulten, Para [0092-0094]: … By way of example, consider a new web site that gets reported as being a phishing site by a user of an internet access service. The system 300 examines statistics about how that site appeared in numerous data sources, and uses this information, along with a probabilistic model, to determine the probability that the site actually is a phishing site. If this probability is above a target threshold, the site can be automatically propagated to the URL reputation service with a bad reputation and, in any case, the probability can be used to prioritize grading …).
Regarding Claim 4. The combination of Hulten-Ringlein discloses the system of claim 1, Hulten further discloses, “wherein the convolutional neural network is trained using IPRID reputation training data that includes at least one of the following rollup features:
a rollup aggregating data for an IPRID template which includes a domain name portion of a uniform resource locator, wherein the domain name portion includes a placeholder designating an unknown value (Hulten, Para [0054, 0067]: …  Based on the data sources 301-307, the system 300 is able to collect a significant amount of statistical information about URLs, including their features and properties. Note that each property can be tracked per URL, per domain, and per IP address (via a DNS lookup on the domain). Properties may also be tracked at several different time resolutions, e.g., in the last ten minutes, hour, day, week, month, and all time (particularly for known good URL sources) … Number of times the object appeared in the source with a phishing trick (numeric IP, obfuscated URL, HREF mismatch, "look-alike URL", and so forth) …);”
a rollup aggregating data for an IPRID template which includes a path portion of a uniform resource locator, wherein the path portion includes a placeholder designating an unknown value;
a rollup aggregating data for an IPRID template which includes a query portion of a uniform resource locator, wherein the query portion includes a placeholder designating an unknown value; or
a rollup aggregating data for an IPRID template which matches multiple IP addresses or multiple IP address octets or both, and includes a placeholder designating an unknown value.
Regarding Claim 5. The combination of Hulten-Ringlein discloses the system of claim 1, Hulten further discloses, “wherein the convolutional neural network is trained using IPRID reputation training data that is characterized in at least one of the following ways:
the training data is free of combination features;
the training data is organized as a grid having weekday columns and week rows (Hulten, Para [0135]: … Turning to a consideration of training, training examples are produced from the data reports and phishing classifications described above. Each example corresponds to a web host (or site URL) that appeared in one of the data sources, and the features of the example record information about the contexts in which the host appeared. Examples are produced by aggregating these properties over time-windows of the data sources, e.g., by using every browser-generated complaint report over the past day, every FBL spam report over the past week, and every FBL good report over the past month …);
the training data is organized in vector space planes that correspond to data sources;
the training data is organized in vector space planes that correspond to data meanings; or
the training data is organized in input branches that correspond to groups of data having the same meaning;
the training data includes original domains as opposed to domains reached through redirection; or
the training data includes domains reached through redirection.
Regarding Claim 6. The combination of Hulten-Ringlein discloses the system of claim 1, Hulten further discloses, “comprising an instance of the lookup table which distinguishes malicious IPRIDs from other IPRIDs, the lookup table instance including IPRIDs and corresponding maliciousness probability values for a particular corpus of IPRIDs (Hulten, Para [0169-0170]: Applying this model to the testing data, along with hand examination of some of the messages from the FBL that linked to hosts that had a high probability of being phishing sites according to the model, proved that a large majority of the test sites that fell into the two most-probable phishing leaves were indeed phishing. As can be readily appreciated, investigating any new hosts that fall into these leaves is a reliable way to find new phishing sites. Depending on the classification, reaching a leaf for a site may correspond to taking action to block a site, unblock a site, suggest hand grading, warn users about the possibility of phishing, and any other action … The following table shows some example statistics on the phish hit rate of the five "most phishy" leaves in the tree: TABLE-US-00001 P(is Number Number Phish) of of Non- From Phish Phish Leaf Hit Model Found Examined Rate 0.94 27 2 93.0% 0.72 3 0 100.0% 0.66 6 32 15.8% 0.65 2 26 7.1% 0.45 1 2 33.0% 0.21 4 23 14.8% …).”
Regarding Claim 7. Hulten discloses an internet protocol resource identification (IPRID) reputation assessment method (Hulten, Abstract: Described is a technology by which phishing-related data sources are processed into aggregated data and a given site evaluated the aggregated data using a predictive model to automatically determine whether the given site is likely to be a phishing site …), comprising:
obtaining IPRID reputation training data which includes rollup features that are based on [hierarchical] rollups of data about IPRIDs which include placeholders designating unknown values (Hulten, Para [0054, 0067]: …  Based on the data sources 301-307, the system 300 is able to collect a significant amount of statistical information about URLs, including their features and properties. Note that each property can be tracked per URL, per domain, and per IP address (via a DNS lookup on the domain). Properties may also be tracked at several different time resolutions, e.g., in the last ten minutes, hour, day, week, month, and all time (particularly for known good URL sources) … Number of times the object appeared in the source with a phishing trick (numeric IP, obfuscated URL, HREF mismatch, "look-alike URL", and so forth) …);
However, Hulten does not explicitly teach, but Ringlein from the same or a similar field of endeavor teaches, “hierarchical rollups of data about IPRIDs (Ringlein, Para [0065-0069], FIG. 2: … FIG. 2 is an example diagram of a security knowledge graph for an example security incident or alert in accordance with one illustrative embodiment … As shown in FIG. 2, the security knowledge graph consists of the security offense source IP as a root node of the security knowledge graph. Network observables, such as an Internet Protocol (IP) address, Universal Resource Locator (URL), domain name, file hashes, and the like, are represented as nodes connected to the root node via different network events … The source IP address connected to an external IP address that resolved to a domain name (the domain name being an external network observable) indicated by the other black nodes connected by edges to the root node …); Examiner’s Note: data related to an entity (i.e., IP address, domain name, URL, etc.) are collected by traversing from the entity to the root entity, and hence it is a hierarchical rollup)”;
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ringlein into the teachings of Hulten, because it discloses that “the threat intelligence may be provided by external sources, such as threat intelligence services such as IBM X-Force, CrowdStrike, and the like, that have threat hunters constantly monitoring the threat landscape and creating information on what are the indicators associated with existing threats/emerging threats. For example, it may be known to threat intelligence services that the IP address 64.22.22.22 is an IP address that is known to be associated with a webpage that services exploit kits. Here 64.22.22.22 is an indicator, so if one sees an offense containing this IP address, using threat intelligence information, one can determine that an exploit kit could have exploited one of the organization's computers (Ringlein, Para [0067])”.
Hulten further discloses:
“training a machine learning model using the IPRID reputation training data (Hulten, Abstract, Para [0006-0008], FIG. 5: … Described is a technology by which phishing-related data sources are processed into aggregated data and a given site evaluated the aggregated data using a predictive model to automatically determine whether the given site is likely to be a phishing site. The predictive model may be built using machine learning based on training data, e.g., including known phishing sites and/or known non-phishing sites …); and
utilizing the trained machine learning model by producing a lookup table which distinguishes malicious IPRIDs from other IPRIDs (Hulten, Para [0169-0170]: Applying this model to the testing data, along with hand examination of some of the messages from the FBL that linked to hosts that had a high probability of being phishing sites according to the model, proved that a large majority of the test sites that fell into the two most-probable phishing leaves were indeed phishing. As can be readily appreciated, investigating any new hosts that fall into these leaves is a reliable way to find new phishing sites. Depending on the classification, reaching a leaf for a site may correspond to taking action to block a site, unblock a site, suggest hand grading, warn users about the possibility of phishing, and any other action … The following table shows some example statistics on the phish hit rate of the five "most phishy" leaves in the tree: TABLE-US-00001 P(is Number Number Phish) of of Non- From Phish Phish Leaf Hit Model Found Examined Rate 0.94 27 2 93.0% 0.72 3 0 100.0% 0.66 6 32 15.8% 0.65 2 26 7.1% 0.45 1 2 33.0% 0.21 4 23 14.8% …).”
Regarding Claim 8. The combination of Hulten-Ringlein discloses the method of claim 7, Hulten further discloses, “wherein the method is characterized in at least one of the following ways:
the IPRID reputation training data includes label tuples, with each label tuple having an IPRID, a date, and a label, and obtaining IPRID reputation training data comprises expanding training data by date to include a set of label tuples which all have the same IPRID and the same label but have different dates (Hulten, Para [0035-0038]: … Turning to a more detailed example, FIG. 3 contains a high-level view of one example architecture (also referred to as a phish-finder system or anti-phishing system) 300 in which a phishing detection implementation may operate. As generally represented in FIG. 3, a variety of Phish Data sources 301-307 are available to the example anti-phishing system 300 … Note that although a limited number of example sources are shown, there is no limitation to the number and type of data; indeed the "etc" source labeled 307 in FIG. 3 represents additional sources and types of phishing-related data. For example, it is likely that legitimate enterprises will share some of their data with one another, providing many more sources. Other examples include data from IP/domain registrars … Although not readily apparent in FIG. 3, some of the sources are more current with respect to data than others, e.g., honeypot data 304 is very up-to-date (nearly instantaneous), whereas browser, toolbar and user-provided data 301, 302, 305 and 306 is moderately up-to-date, while FBL-like data 303 are relatively dated (but nevertheless useful) …; Examiner’s Note: data collected from a source/IPRID, i.e. IP/Domain, can be from different dates); or
the IPRID reputation training data includes label tuples, with each label tuple having an original IPRID, a final IPRID, a date, and a label, and obtaining IPRID reputation training data comprises expanding training data by date or by original IPRID to include a set of label tuples which all have the same final IPRID and the same label but have different dates or different original IPRIDs or both.
Regarding Claim 9. The combination of Hulten-Ringlein discloses the method of claim 7, Hulten further discloses, “wherein the obtaining comprises joining {domain, date, label} tuples with one or more rollup features (Hulten, Para [0046-0048]: … Returning to FIG. 3, in general, the raw data from each of these sources 301-307 is processed, e.g., by respective data processing means 311-317 into machine-readable (and optionally human-readable) reports 321-327 …  there is one report store per data stream which contains the information extracted from the data on the stream, e.g., accumulated for the past N days. The value of N may be determined empirically and may be different for different data sources, and in fact there may be multiple N values for a single data source; e.g., multiple types of information may be extracted from the same data source and each type of information may be aggregated over multiple different time spans … A report aggregator 332 is provided to aggregate information from the report stores 321-327 into a description 334 of how each host (or URL) has been observed in the data streams over the selected time periods …)”.
Regarding Claim 10. The combination of Hulten-Ringlein discloses the method of claim 7, Hulten further discloses, “wherein obtaining comprises joining {domain, date, label} tuples with one or more aggregate features that are based on aggregated data about respective IPRIDs, and training comprises feeding {domain, date, label, aggregate(s)} tuples to a neural network (Hulten, Para [0172-0178]: … FIG. 5 is a high-level representation of the training side of a PhishFinder system. Data sources 501-507, data processors 511-517 and report stores 521-527 may be similar components to counterpart components represented in the system of FIG. 3, e.g., the components may be shared between the training and performance sides of the same system. However, the report stores 521-527 for the training system may contain data on longer time periods than does the performance system, e.g., roughly N days of data before the first phishing site that will be trained on first appeared … In general, a training report aggregator 554 aggregates information from the report stores 501-507 into a description of how each phishing host (or URL) and confirmed non-phishing host (or URL) was observed in the data streams over the selected time windows. The training report aggregator 554 may generate several aggregated reports per known phishing site, for example using one or more of the following methods: … Using information available immediately upon the first observation of the site in any of the data sources, or one for the first observation in each data source where the site was observed. [0175] One for every J observations of the site in any data source (with a maximum of K). Or N reports divided so each captures an even number of the observations of the site in the data sources …).”
Regarding Claim 11. The combination of Hulten-Ringlein discloses the method of claim 7, Hulten further discloses, “wherein the producing produces a lookup table including IPRIDs and corresponding maliciousness probability values for more than one hundred thousand IPRIDs (Hulten, Para [0169-0170, 0092-0094]: … The following table shows some example statistics on the phish hit rate of the five "most phishy" leaves in the tree: TABLE-US-00001 P(is Number Number Phish) of of Non- From Phish Phish Leaf Hit Model Found Examined Rate 0.94 27 2 93.0% 0.72 3 0 100.0% 0.66 6 32 15.8% 0.65 2 26 7.1% 0.45 1 2 33.0% 0.21 4 23 14.8% …) … By way of example, consider a new web site that gets reported as being a phishing site by a user of an internet access service. The system 300 examines statistics about how that site appeared in numerous data sources, and uses this information, along with a probabilistic model, to determine the probability that the site actually is a phishing site. If this probability is above a target threshold, the site can be automatically propagated to the URL reputation service with a bad reputation and, in any case, the probability can be used to prioritize grading …).
Regarding Claim 12. The combination of Hulten-Ringlein discloses the method of claim 7, Ringlein further discloses, “wherein the obtaining comprises:
generating a base slot for a domain (Ringlein, Para [0038]: During the training of the security incident ML model, the training data is provided to the feature extraction engine which extracts the selected set of features from the security incidents and their corresponding security knowledge graphs … the security incident ML model may output a vector having a plurality of vector slots, where each vector slot corresponds to a particular disposition, and a value in the corresponding vector slot represents a probability (or confidence) that the corresponding disposition is the correct disposition for the security incident …);
generating a rollup slot for the domain (Ringlein, Para [0038]: During the training of the security incident ML model, the training data is provided to the feature extraction engine which extracts the selected set of features from the security incidents and their corresponding security knowledge graphs … the security incident ML model may output a vector having a plurality of vector slots, where each vector slot corresponds to a particular disposition, and a value in the corresponding vector slot represents a probability (or confidence) that the corresponding disposition is the correct disposition for the security incident …);
assigning to the base slot a value that depends at least in part on whether a reputation score is known for the domain (Ringlein, Para [0038]: During the training of the security incident ML model, the training data is provided to the feature extraction engine which extracts the selected set of features from the security incidents and their corresponding security knowledge graphs … the security incident ML model may output a vector having a plurality of vector slots, where each vector slot corresponds to a particular disposition, and a value in the corresponding vector slot represents a probability (or confidence) that the corresponding disposition is the correct disposition for the security incident …); and
assigning a rollup value to the rollup slot (Ringlein, Para [0038]: During the training of the security incident ML model, the training data is provided to the feature extraction engine which extracts the selected set of features from the security incidents and their corresponding security knowledge graphs … the security incident ML model may output a vector having a plurality of vector slots, where each vector slot corresponds to a particular disposition, and a value in the corresponding vector slot represents a probability (or confidence) that the corresponding disposition is the correct disposition for the security incident …).”
The motivation to further combine Ringlein remains the same as in claim 7.
Regarding Claim 13. The combination of Hulten-Ringlein discloses the method of claim 7, Ringlein further discloses, “wherein the machine learning model includes a convolutional neural network (Ringlein, Para [0079]: … Returning again to FIG. 1A, the extracted features/metrics 143 are input to a cognitive predictive computer model 148 which is then trained 146 based on the extracted features/metrics 143 and the corresponding correct labels provided by the security analyst 131 and associated with the corresponding security alert entry exported 142 to the FEE 144. The cognitive predictive computer model 148 may be security incident machine learning (ML) mechanism, such as previously described above, which may employ a neural network model, such as a convolutional neural network (CNN) model, that is trained through a machine learning process using, as input, the extracted features/metrics from the exported security alerts/security knowledge graphs and the corresponding ground truth of the correct disposition generated by the security analyst 131 and stored in the security alert database 138 …), and wherein training comprises feeding the convolutional neural network a grid which has separate planes for respective aggregate features (Ringlein, Para [0025]: … Security Incident and Event Management (SIEM) is an approach to security management that combines security information management with security event monitoring functions into a single security management system. A SIEM system aggregates data from various data sources in order to identify deviations in the operation of the computing devices associated with these data sources from a normal operational state, and then take appropriate responsive actions to the identified deviations. SIEM systems may utilize multiple collection agents that gather security related events from computing devices, network equipment, firewalls, intrusion prevention systems, antivirus systems, and the like …).”
The motivation to further combine Ringlein remains the same as in claim 7.
Regarding Claim 14. The combination of Hulten-Ringlein discloses the method of claim 7, Ringlein further discloses, “wherein the machine learning model includes a convolutional neural network (Ringlein, Para [0079]: … Returning again to FIG. 1A, the extracted features/metrics 143 are input to a cognitive predictive computer model 148 which is then trained 146 based on the extracted features/metrics 143 and the corresponding correct labels provided by the security analyst 131 and associated with the corresponding security alert entry exported 142 to the FEE 144. The cognitive predictive computer model 148 may be security incident machine learning (ML) mechanism, such as previously described above, which may employ a neural network model, such as a convolutional neural network (CNN) model, that is trained through a machine learning process using …), and wherein training comprises feeding the convolutional neural network a grid of label tuples which contain feature values that are calculated, from feature data, [relative to a receipt date of the feature data] (Ringlein, Para [0079]: … When extracting features from the security knowledge graph, the security knowledge graph is traversed from the root node and selected features of the encountered nodes, edges, and the overall topology are extracted and stored as part of a feature set representative of the security incident. For example, node features such as “toxicity”, “reputation” (if the threat source provides reputation data), and “node type”, may be extracted.
Further examples of features that may be extracted from the nodes and/or security incident include category.label, (the label provided by such sources as IBM X-Force Exchange which denotes the category of this node, such as botnet, spam, etc.), emailBody (a binary feature determining whether this node represents an email), filetype (such as exe, doc, pdf), has executed_file (feature from the offense metadata which says whether, for example, a piece of malware ran or not), Reputation.label, Reputation.score, (a score from, for example, IBM X-Force Exchange, which denotes just how bad a particular item, such as an IP address or domain, is with regard to its level of threat), Signal, Reputation.toxicity (a value from 0 to 1 indicating the degree that something seems to be malicious), Category.toxicity, (similar to reputation.toxicity, but instead rates categories such as malware, spam, botnet, etc. This is a value indicating the “danger” of this category), and Hashes.toxicity (similar to the other toxicities, but this rates file hashes from such sources as IBM X-Force Exchange or the unstructured Watson for Cybersecurity corpus). Examples of node types may be a “domain” node, “hash” node, “IP” node, etc. …).”
The motivation to further combine Ringlein remains the same as in claim 7.
Hulten further discloses, “calculated, from feature data, relative to a receipt date of the feature data (Hulten, Para [0087]: … Turning to an explanation of some example ways to identify phishing sites and measuring the performance of the system 300, consider that at some date data indicates that phishing sites are active for N days on average (where N may include a fraction, and may be less than one) before being shut down …).”
Regarding Claim 15. The combination of Hulten-Ringlein discloses the method of claim 7, Hulten further discloses, “wherein producing a lookup table comprises inferring a label for a current date based on aggregate feature data for prior dates (Hulten, Para [0037]: … Although not readily apparent in FIG. 3, some of the sources are more current with respect to data than others, e.g., honeypot data 304 is very up-to-date (nearly instantaneous), whereas browser, toolbar and user-provided data 301, 302, 305 and 306 is moderately up-to-date, while FBL-like data 303 are relatively dated (but nevertheless useful). …).”
Regarding Claim 16. This claim contains all the same or similar limitations as claim 1, and hence is rejected on the same basis as claim 1.
Regarding Claim 17. The combination of Hulten-Ringlein discloses the storage medium of claim 16, Hulten further discloses, “wherein:
training uses IPRID reputation training data that includes both aggregate features and rollup features (Hulten, Abstract, Para [0054, 0067]: … Described is a technology by which phishing-related data sources are processed into aggregated data and a given site evaluated the aggregated data using a predictive model to automatically determine whether the given site is likely to be a phishing site. The predictive model may be built using machine learning based on training data, e.g., including known phishing sites and/or known non-phishing sites …  Based on the data sources 301-307, the system 300 is able to collect a significant amount of statistical information about URLs, including their features and properties. Note that each property can be tracked per URL, per domain, and per IP address (via a DNS lookup on the domain). Properties may also be tracked at several different time resolutions, e.g., in the last ten minutes, hour, day, week, month, and all time (particularly for known good URL sources) … Number of times the object appeared in the source with a phishing trick (numeric IP, obfuscated URL, HREF mismatch, "look-alike URL", and so forth) …); and
utilizing the trained machine learning model comprises inferring respective maliciousness probabilities for at least one hundred thousand IPRIDs (Hulten, Para [0169-0170, 0092-0094]: … The following table shows some example statistics on the phish hit rate of the five "most phishy" leaves in the tree: TABLE-US-00001 P(is Number Number Phish) of of Non- From Phish Phish Leaf Hit Model Found Examined Rate 0.94 27 2 93.0% 0.72 3 0 100.0% 0.66 6 32 15.8% 0.65 2 26 7.1% 0.45 1 2 33.0% 0.21 4 23 14.8% …) … By way of example, consider a new web site that gets reported as being a phishing site by a user of an internet access service. The system 300 examines statistics about how that site appeared in numerous data sources, and uses this information, along with a probabilistic model, to determine the probability that the site actually is a phishing site. If this probability is above a target threshold, the site can be automatically propagated to the URL reputation service with a bad reputation and, in any case, the probability can be used to prioritize grading …) and also comprises producing a lookup table which includes the IPRIDS and the respective maliciousness probabilities (Hulten, Para [0169-0170, 0092-0094]: … The following table shows some example statistics on the phish hit rate of the five "most phishy" leaves in the tree: TABLE-US-00001 P(is Number Number Phish) of of Non- From Phish Phish Leaf Hit Model Found Examined Rate 0.94 27 2 93.0% 0.72 3 0 100.0% 0.66 6 32 15.8% 0.65 2 26 7.1% 0.45 1 2 33.0% 0.21 4 23 14.8% …) … By way of example, consider a new web site that gets reported as being a phishing site by a user of an internet access service. The system 300 examines statistics about how that site appeared in numerous data sources, and uses this information, along with a probabilistic model, to determine the probability that the site actually is a phishing site.
If this probability is above a target threshold, the site can be automatically propagated to the URL reputation service with a bad reputation and, in any case, the probability can be used to prioritize grading …).”
Regarding Claim 18. The combination of Hulten-Ringlein discloses the storage medium of claim 16, Hulten further discloses, “wherein training uses IPRID reputation training data that includes at least two of the following:
a feature based on counts of classifications of a domain (Hulten, Para [0031]: … the data is processed via data processing means 220 to extract data as to any site having properties that indicates a phishing site. To this end, the data processing means 220 may rely on criteria such as suspicious word/phrase lists 222 (e.g., "credit card" or "bank account"), domain lists 224 (e.g., good reputation versus unknown versus bad), traffic lists 226 regarding amount and characteristics of traffic to a site, and other data 228. For example, geographic data (via the IP address) may be used to determine if a site is being hosted in a country or other physical location having a reputation as a location for phishing scams and/or located remotely relative to the enterprise that the site is purporting to represent. Seasonality may be a factor, e.g., hourly, daily, day of week, weekly, monthly and so forth. In essence, virtually any criteria may be used to evaluate the properties of a site …);
a feature based on whois data;
a feature based on geolocation data (Hulten, Para [0031]: … the data is processed via data processing means 220 to extract data as to any site having properties that indicates a phishing site. To this end, the data processing means 220 may rely on criteria such as suspicious word/phrase lists 222 (e.g., "credit card" or "bank account"), domain lists 224 (e.g., good reputation versus unknown versus bad), traffic lists 226 regarding amount and characteristics of traffic to a site, and other data 228. For example, geographic data (via the IP address) may be used to determine if a site is being hosted in a country or other physical location having a reputation as a location for phishing scams and/or located remotely relative to the enterprise that the site is purporting to represent. Seasonality may be a factor, e.g., hourly, daily, day of week, weekly, monthly and so forth. In essence, virtually any criteria may be used to evaluate the properties of a site …); or
a feature based on tenant data.”
Regarding Claim 19. The combination of Hulten-Ringlein discloses the storage medium of claim 16, Hulten further discloses, “wherein training uses IPRID reputation training data that includes at least two of the following:
a feature based on raw aggregate data (Hulten, Para [0093]: … the anti-phishing system 300 uses various data sources to find phishing sites, including sources that are closely affiliated with the email and internet access services being offered, (e.g., non-third party sources). The combination of sources provides a stronger model, especially when aggregated across both email-based and browser-based sources, and the model is further strengthened by using data sources that contain known non-phishing sites (e.g. FBL good mail). Features are extracted about the sites, including aggregations done at a host/site level, and probabilistic models are used to make predictions regarding phishing sites …);
a feature based on non-aggregate data (Hulten, Para [0179]: … while a single model that integrates information from all of the data sources may be used, it is also feasible to use a separate model that predicts per data source whether a host/URL is phishing or not (e.g. P(host seen in FBL is phishing| how it appeared in FBL)), with a meta learner used to combine the probabilities of the individual models. Standard ensemble learning methods may be used to build a model (e.g. boosting) …);
a feature based on aggregate rollup data; or
a feature based on non-aggregate rollup data.”
Regarding Claim 20. The combination of Hulten-Ringlein discloses the storage medium of claim 16, Ringlein further discloses, “wherein utilizing the trained machine learning model comprises utilizing at least one of the following: 
a convolutional neural network (Ringlein, Para [0036]: … The security incident ML model, in some illustrative embodiments, is a neural network based model that is trained through a supervised or unsupervised machine learning operation. The neural network model may be a convolutional neural network (CNN) …);
a decision tree classifier;
a long short term memory model;
a logistical regression model; or
a deep neural network.”
The motivation to further combine Ringlein remains the same as in claim 16 (i.e., claim 1).
Pertinent Prior Art: The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
	PGPUB US 20190014149 A1, Cleveland et al.: Cleveland discloses a method of detecting a phishing event that comprises acquiring an image of visual content rendered in association with a source, and determining that the visual content includes a password prompt. The method comprises performing an object detection, using an object detection convolutional network, on a brand logo in the visual content, to detect one or more targeted brands. Spatial analysis of the visual content may be performed to identify one or more solicitations of personally identifiable information. The method further comprises determining, based on the object detection and the spatial analysis, that at least a portion of the visual content resembles content of a candidate brand, and comparing the domain of the source with one or more authorized domains of the candidate brand. A phishing event is declared when the comparing indicates that the domain of the source is not one of the authorized domains of the candidate brand.
	The described embodiments are directed to systems for, and methods of, preventing phishing breaches. The described embodiments are configured to proactively identify deceptive phishing communications, such as emails and URLs, across an array of communications platforms.
	PGPUB US 20140033307 A1, Schmidtler: Schmidtler discloses a phishing classification model that detects a phishing website based on one or more feature vectors for the website. The phishing classification model may operate on a server and may further select a website, generate a feature vector for a landing page of the website, create a feature vector for every iframe that is a descendent of the landing page, and derive a final feature vector from the feature vectors of the landing page and the descendent iframe pages. Further, machine learning techniques may be applied to generate, or train, a classification model based upon one or more known phishing websites. Based on the feature vector, the classification modeler may classify a website as either a phishing website or as a non-phishing website. Feedback in the form of human verification may further be incorporated.
	The invention comprises an automatic classification system that identifies phishing sites. The system utilizes Machine Learning techniques to automate the learning of the classification system.
	PGPUB US 20180063168 A1, Sofka: Sofka discloses a computer-implemented data processing method comprising: executing a recurrent neural network (RNN) comprising nodes each implemented as a Long Short-Term Memory (LSTM) cell and comprising links between nodes that represent outputs of LSTM cells and inputs to LSTM cells, wherein each LSTM cell implements an input layer, hidden layer and output layer of the RNN; receiving network traffic data associated with networked computers; extracting feature data representing features of the network traffic data and providing the feature data to the RNN; classifying individual Uniform Resource Locators (URLs) as malicious or legitimate using LSTM cells of the input layer, wherein inputs to the LSTM cells are individual characters of the URLs, and wherein the LSTM cells generate feature representation; and based on the feature representation, generating signals to a firewall device specifying either admitting or denying the URLs.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MAHABUB S AHMED whose telephone number is (571)272-0364.  The examiner can normally be reached on 9AM-5PM EST M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kambiz Zand, can be reached at (571)272-3811.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/MAHABUB S AHMED/Examiner, Art Unit 2434 

/DANT B SHAIFER HARRIMAN/Primary Examiner, Art Unit 2434