DETAILED ACTION
This communication is responsive to the application # 16/875,540 filed on May 15, 2020. Claims 1-20 are pending and are directed toward SYSTEMS AND METHODS FOR AN AT-RISK SYSTEM IDENTIFICATION VIA ANALYSIS OF ONLINE HACKER COMMUNITY DISCUSSIONS.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5, 8, 10, 13, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Tavabi et al. (DarkEmbed: Exploit Prediction with Neural Language Models, IAAI-18, 2018-04-27, 10 pages), in view of Shakarian et al. (Chapter 8 Cyber Attribution: An Argumentation-Based Approach, pages 151-171, 2015), hereinafter referred to as Tavabi and Shakarian.
As per claim 1, Tavabi teaches a method for computer-implemented identification of at-risk systems (Recent works used machine learning techniques to predict exploited vulnerabilities by analyzing discussions about vulnerabilities on social media. Tavabi, page 7849), comprising:
accessing input data, wherein the input data comprises discussion data relevant to a system (The model takes as input a tokenized text corpus C ={w1, w2, . . . , wn} and creates a context for each word, Tavabi, page 7851);
defining a dataset, wherein the dataset includes the input data (We used a dataset containing almost 2,500,000 messages posted on a variety of darkweb and deepweb sites over the period from 2010 through August 2017. Tavabi, page 7852);
filtering the input data included in the dataset to select portions of the input data which comprises predetermined relevant topics (Since our goal is to predict vulnerabilities that are likely to be exploited, the posts referencing vulnerabilities after the exploitation date were removed from the data. This filtering step left 4898 posts mentioning 1886 distinct CVEs, Tavabi, page 7852);
sorting the input data included in the dataset (some vulnerabilities were mentioned in more than one post. For the posts mentioning more than one vulnerability, we only considered the less frequently mentioned CVE. Tavabi, page 7852);
Tavabi does not teach an argumentation model. Shakarian, however, teaches defining an argumentation model configured to construct arguments for a given query, wherein the arguments are constructed using the dataset (we choose a structured argumentation framework (Rahwan et al. 2009) due to several characteristics that make such frameworks highly applicable to cyber-warfare domains. Unlike the EM, which describes probabilistic information about the state of the real world, the AM must allow for competing ideas—it must be able to represent contradictory information. The algorithmic approach allows for the creation of arguments based on the AM that may “compete” with each other to describe who conducted a given cyber-operation. In this competition—known as a dialectical process—one argument may defeat another based on a comparison criterion that determines the prevailing argument. Resulting from this process, the InCA framework will determine arguments that are warranted (those that are not defeated by other arguments) thereby providing a suitable explanation for a given cyber-operation. Shakarian, page 158, see also 8.4 Attribution Queries, pages 167-168);
Tavabi and Shakarian are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian. This would have been desirable because the transparency provided by the system can allow analysts to identify potentially incorrect input information and fine-tune the models or, alternatively, collect more information. In short, argumentation-based reasoning has been studied as a natural way to manage a set of inconsistent information—it is the way humans settle disputes. As we will see, another desirable characteristic of (structured) argumentation frameworks is that, once a conclusion is reached, we are left with an explanation of how we arrived at it and information about why a given argument is warranted; this is very important information for analysts to have (Shakarian, page 158).

Tavabi in view of Shakarian further teaches defining a machine learning model configured to use as input the dataset (We propose DarkEmbed, an efficient algorithm which utilizes neural language models to learn features of conversations on the D2Web. Tavabi, page 7850); and
generating a risk assessment for the system based on the dataset and the given query (Using DarkEmbed, we identify keywords in D2Web indicative of exploit probability, i.e., words which are associated with high and low rates of exploitation. Tavabi, page 7850).
As per claim 2, Tavabi in view of Shakarian teaches the method of claim 1, wherein the input data is provided by a data supplier (NVD and ExploitDB are among those data sources used in different methods. NVD: NIST provides this database which has a list of vulnerabilities disclosed, it also contains descriptions, CVSS score and other metrics for each vulnerability. ExploitDB 4: It is a repository for exploits, reported by security researchers. Tavabi, page 7850).
As per claim 3, Tavabi in view of Shakarian teaches the method of claim 1, wherein the discussion data included in the input data is retrieved from darkweb based marketplaces and forums (indeed, many darkweb and deepweb sites also create forums for other illicit activities, such as drug markets and the sale of stolen goods. Tavabi, page 7851).
As per claim 5, Tavabi in view of Shakarian teaches the method of claim 1, further comprising configuring the machine learning model to use as input a reduced set of system components corresponding to the system (Since the TF-IDF vectors can be quite large, classification methods using them would experience slow processing time and large memory usage. To reduce the size of document vectors, instead of the entire vocabulary, often a subset of the most frequent words is used to represent the documents. These document vectors are then used in the classification task. Tavabi, page 7852).
As per claim 10, Tavabi in view of Shakarian teaches the method of claim 1, wherein a machine learning model filters the input data (we only considered posts mentioning a single vulnerability. We used embedding of size 150 (blog posts are lengthier than darkweb posts), CVSS score and number of times a vulnerability was mentioned in this dataset as features. Note that the optimal embedding size was obtained through cross validation. With 1613 blog posts in our dataset, we were able to achieve F1 = 0.80 and AUC = 0.87. Tavabi).
Claims 13, 17, and 18 have limitations similar to those treated in the above rejection, and are met by the references as discussed above, and are rejected for the same reasons of obviousness as used above.
Claims 4, 11 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Tavabi et al. (DarkEmbed: Exploit Prediction with Neural Language Models, IAAI-18, 2018-04-27, 10 pages), in view of Shakarian et al. (Chapter 8 Cyber Attribution: An Argumentation-Based Approach, pages 151-171, 2015), in view of Nunes et al. (Argumentation Models for Cyber Attribution, arXiv:1607.02171v1 [cs.AI] 7 Jul 2016, 9 pages) hereinafter referred to as Tavabi, Shakarian and Nunes.
As per claim 4, Tavabi in view of Shakarian teaches the method of claim 1, but does not teach that the input data is sorted according to time. Nunes, however, teaches wherein the input data is sorted according to time (The experiment was performed as follows. The dataset was divided according to the target team, building 20 subsets, and all the attacks were then sorted according to time. Nunes, page 5).
Tavabi, Shakarian, and Nunes are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Nunes. This would have been desirable because after running the queries to return the set of possible culprits, the average search space across all target teams is 5.85 teams. This is a significant reduction in search space across all target teams (Nunes, page 6).
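For illustration only, the step quoted from Nunes (dividing the dataset by target team and then sorting the attacks according to time) can be sketched as follows. The record fields `target_team`, `timestamp`, and `exploit`, the function name, and the sample records are hypothetical and do not appear in Nunes or in the application; this is a minimal sketch under those assumptions, not the reference's implementation.

```python
from collections import defaultdict
from datetime import datetime


def split_and_sort_by_time(attacks):
    """Group attack records by target team, then sort each group chronologically."""
    subsets = defaultdict(list)
    for attack in attacks:
        subsets[attack["target_team"]].append(attack)
    for team in subsets:
        # Earliest attack first within each team's subset.
        subsets[team].sort(key=lambda a: a["timestamp"])
    return dict(subsets)


# Hypothetical sample records (not taken from any cited reference).
attacks = [
    {"target_team": "T1", "timestamp": datetime(2016, 1, 3), "exploit": "b"},
    {"target_team": "T2", "timestamp": datetime(2016, 1, 2), "exploit": "c"},
    {"target_team": "T1", "timestamp": datetime(2016, 1, 1), "exploit": "a"},
]
by_team = split_and_sort_by_time(attacks)
```

Each value in `by_team` is one team's subset in chronological order, which corresponds to the claimed input data being sorted according to time.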

As per claim 11, Tavabi in view of Shakarian in view of Nunes teaches the method of claim 1, wherein the argumentation model employs defeasible logic programming (DeLP is a formalism that combines logic programming with defeasible argumentation; full details are discussed in [8]. Nunes, page 4).
Tavabi, Shakarian, and Nunes are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Nunes. This would have been desirable because DeLP incorporates defeasible argumentation, which decides which arguments are warranted and it blocks arguments that are in conflict and a winner cannot be determined (Nunes, page 4).
Claim 19 has limitations similar to those treated in the above rejection, is met by the references as discussed above, and is rejected for the same reasons of obviousness as used above.
Claims 6, 7, 12 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Tavabi et al. (DarkEmbed: Exploit Prediction with Neural Language Models, IAAI-18, 2018-04-27, 10 pages), in view of Shakarian et al. (Chapter 8 Cyber Attribution: An Argumentation-Based Approach, pages 151-171, 2015), in view of Nunes et al. (Darknet and Deepnet Mining for Proactive Cybersecurity Threat Intelligence, IEEE 2016, 6 pages), hereinafter referred to as Tavabi, Shakarian and Nunes2.
As per claim 6, Tavabi in view of Shakarian teaches the method of claim 1, but does not teach selecting a set of components related to the system from a platform, a vendor, and a product set. Nunes2, however, teaches constraining the machine learning model to select a set of components related to the system from a platform, a vendor, and a product set (We are providing this information to cyber-security professionals to support their strategic cyber-defense planning to address questions such as, 1) What vendors and users have a presence in multiple darknet/deepnet markets/forums? 2) What zero-day exploits are being developed by malicious hackers? 3) What vulnerabilities do the latest exploits target? Nunes2, page 1).
Tavabi, Shakarian, and Nunes2 are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Nunes2. This would have been desirable because, as Nunes2 explains: Our current system is operational and actively collecting approximately 305 cyber threats each week. Table 2 shows the current database statistics. It shows the total data collected and the data related to malicious hacking. The vendor and user statistics cited only consider those individuals associated in the discussion or sale of malicious hacking-related material, as identified by the system. The data is collected from two sources on the darknet/deepnet: markets and forums (Nunes2, page 1).

As per claim 7, Tavabi in view of Shakarian in view of Nunes2 teaches the method of claim 1, further comprising identifying an at-risk platform, an at-risk vendor, and an at-risk product set (We are providing this information to cyber-security professionals to support their strategic cyber-defense planning to address questions such as, 1) What vendors and users have a presence in multiple darknet/deepnet markets/forums? 2) What zero-day exploits are being developed by malicious hackers? 3) What vulnerabilities do the latest exploits target? Nunes2, page 1).
Tavabi, Shakarian, and Nunes2 are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Nunes2. This would have been desirable because, as Nunes2 explains: Our current system is operational and actively collecting approximately 305 cyber threats each week. Table 2 shows the current database statistics. It shows the total data collected and the data related to malicious hacking. The vendor and user statistics cited only consider those individuals associated in the discussion or sale of malicious hacking-related material, as identified by the system. The data is collected from two sources on the darknet/deepnet: markets and forums (Nunes2, page 1).

As per claim 12, Tavabi in view of Shakarian teaches the method of claim 1, but does not teach removing non-alphanumeric characters. Nunes2, however, teaches removing non-alphanumeric characters from the discussion data (Text Cleaning. Product title and descriptions on marketplaces often have much text that serves as noise to the classifier (e.g. *****SALE*****). To deal with these instances, we first removed all non-alphanumeric characters from the title and description. Nunes2, page 9).
Tavabi, Shakarian, and Nunes2 are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Nunes2. This would have been desirable because removing non-alphanumeric characters, in tandem with standard stop-word removal, greatly improved classification performance (Nunes2, page 9).
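For illustration only, the text-cleaning step quoted from Nunes2 (removing all non-alphanumeric characters, in tandem with stop-word removal) can be sketched as follows. The function name `clean_text`, the abbreviated stop-word list, and the lowercasing step are assumptions made for this sketch; none of them is taken from Nunes2 or from the application.

```python
import re

# Illustrative, deliberately short stop-word list (an assumption, not from Nunes2).
STOP_WORDS = frozenset({"a", "an", "and", "for", "in", "of", "on", "the", "to"})


def clean_text(text):
    """Strip non-alphanumeric characters, then drop stop words."""
    # Replace every run of non-alphanumeric characters with a single space.
    alnum_only = re.sub(r"[^A-Za-z0-9]+", " ", text)
    # Lowercasing is an added assumption so stop-word matching is case-insensitive.
    tokens = alnum_only.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]


cleaned = clean_text("*****SALE***** Exploit kit for the latest browser!!")
```

Here the decorative asterisks and trailing punctuation are discarded before tokenization, so only the word tokens reach any downstream classifier.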
As per claim 14, Tavabi in view of Shakarian in view of Nunes2 teaches the system of claim 13, wherein the instructions executed by the processor further comprise computing a list of system components included in the discussion data relevant to the system (We performed evaluation on two such English forums. The dataset consisted of 781 topics with 5373 posts. Table 5 gives instance of topics defined as being relevant or not. We label 25% of the topics and perform a 10-fold cross validation using supervised methods. Nunes2, page 10).
Tavabi, Shakarian, and Nunes2 are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Nunes2. This would have been desirable because leveraging unlabeled data in a semi-supervised technique improved the recall while maintaining the precision (Nunes2, page 10).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Tavabi et al. (DarkEmbed: Exploit Prediction with Neural Language Models, IAAI-18, 2018-04-27, 10 pages), in view of Shakarian et al. (Chapter 8 Cyber Attribution: An Argumentation-Based Approach, pages 151-171, 2015), in view of Deb et al. (Predicting Cyber-Events by Leveraging Hacker Sentiment, MDPI, Published: 15 November 2018, 18 pages), hereinafter referred to as Tavabi, Shakarian and Deb.
As per claim 9, Tavabi in view of Shakarian teaches the method of claim 1, but does not teach an application programming interface (API). Deb, however, teaches wherein an application programming interface accesses the input data (Working with researchers at Arizona State University, we were able to develop a database of posts from forums on both the dark web and surface web which discuss computer security and network vulnerability topics. To protect the future utility of these sources, each forum has been coded with a number (forumid) from 1 to 350. The data consist of the forumid, date the post was made, and the text of the post. The data in this study was from 1 January 2016 to 31 January 2018. Deb, page 7).
Tavabi, Shakarian, and Deb are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Deb. This would have been desirable because, as Deb explains: The data was collected by ASU and we used an API to pull and store the data in a local server and access it via Apache Lucene’s Elastic Search engine (Deb, page 7).

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Tavabi et al. (DarkEmbed: Exploit Prediction with Neural Language Models, IAAI-18, 2018-04-27, 10 pages), in view of Shakarian et al. (Chapter 8 Cyber Attribution: An Argumentation-Based Approach, pages 151-171, 2015), in view of NIST (NVD CWE Slice, retrieved from WEB Archive from October 10, 2018, 1 page), hereinafter referred to as Tavabi, Shakarian and NIST.
As per claim 15, Tavabi in view of Shakarian teaches the system of claim 13, but does not teach a hierarchy. NIST, however, teaches wherein the instructions executed by the processor further comprise computing a hierarchy of vulnerable systems and vulnerabilities associated with the vulnerable systems (All individual CWEs are held within a hierarchical structure that allows for multiple levels of abstraction. CWEs located at higher levels of the structure (i.e. Configuration) provide a broad overview of a vulnerability type and can have many children CWEs associated with them. CWEs at deeper levels in the structure (i.e. Cross Site Scripting) provide a finer granularity and usually have fewer or no children CWEs. NIST, page 1).
Tavabi, Shakarian, and NIST are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of NIST. This would have been desirable because, as NIST explains: The Common Weakness Enumeration Specification (CWE) provides a common language of discourse for discussing, finding and dealing with the causes of software security vulnerabilities as they are found in code, design, or system architecture. Each individual CWE represents a single vulnerability type (NIST, page 1).

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Tavabi et al. (DarkEmbed: Exploit Prediction with Neural Language Models, IAAI-18, 2018-04-27, 10 pages), in view of Shakarian et al. (Chapter 8 Cyber Attribution: An Argumentation-Based Approach, pages 151-171, 2015), in view of Samtani et al. (Exploring Emerging Hacker Assets and Key Hackers for Proactive Cyber Threat Intelligence, Journal of Management Information Systems, 34:4, pages 1023-1053, Published online: 02 Jan 2018), hereinafter referred to as Tavabi, Shakarian and Samtani.
As per claim 20, Tavabi in view of Shakarian further in view of Samtani teaches the system of claim 13, further comprising: a commercial data collection platform configured to identify and translate languages used in the discussion data (Regardless of subset, we use Google Translate to translate all posts to English, remove special characters, split identifiers (e.g., readFile to read File), fold case, and remove stop-words. Samtani, page 1033).
Tavabi, Shakarian, and Samtani are analogous art to the claimed invention because they are from a similar field of endeavor of systems, components and methodologies for providing secure communication between computer systems. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Tavabi in view of Shakarian in view of Samtani. This would have been desirable because, as Samtani explains: Directly collecting and analyzing large amounts of hacker forum data present unique technical challenges that limit researchers’ abilities to produce comprehensive CTI studies. These challenges include vast amounts of textual data, robust anticrawling measures, foreign-language barriers, little-known hacking terms, and complex forum structures (Samtani, page 1025).



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLEG KORSAK whose telephone number is (571)270-1938. The examiner can normally be reached on Monday-Friday 7:30am - 5:00pm EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Saleh Najjar, can be reached on (571)272-4006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/OLEG KORSAK/
Primary Examiner, Art Unit 2492