DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Claims 9-21 and 23-29 are presented for examination. 

Continued Examination under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on December 28, 2021 has been entered.

Response to Amendment
	Applicant’s amendment has obviated most, but not all, of the objections to the specification and drawings given in the previous Office Actions.  To the extent that an objection or rejection appears in the previous Office Action(s) but not this Office Action, that objection or rejection is withdrawn.  To the extent that it appears both in a previous Office Action(s) and this Office Action, the objection or rejection is maintained.

Specification
The disclosure is objected to because of the following informalities: 
In paragraph 29, “data contains correct answers, which are known as targets or target attributes; thus a properly” should be “data contain correct answers, which are known as targets or target attributes; thus, a properly”.
In paragraph 150, “a most recent seed data” should be “most recent seed data”.
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
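As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;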
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
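Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.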
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “labeling module” in claim 9; “expansion module” in claim 11; “confidence module” in claim 12; “voting module” in claims 13 and 26; and “composite key module” in claim 14.
Please note that, notwithstanding the recitation of “one or more processing units” in the claims, the claims nonetheless invoke 35 USC § 112(f) because the claims do not recite algorithms sufficient for performing the entire claimed functions.  See MPEP § 2181(II)(B).
Regarding the “labeling module,” the specification, at paragraph 95, states that 
[a] labeling module 1310 assigns a communications-category label to an unlabeled entity such as a princip[al] entity which may be an email or clustering entity which may be an email feature. The communications-category label may be one of good, spam, phishing, bulk, malware, or another label. External sources of information such as one or more whitelist(s) 1312, one or more blacklist(s) 1314, and manual labeling 1316 may be used by the labeling module 1310 to label an unlabeled entity. These sources of information provide "seed" labels that are used to form clusters which in turn are leveraged to assign communications-category labels to previously unlabeled entities. 

Therefore, any computer software that labels unlabeled entities by traversing the features of an expansion graph will be deemed to read on the claim.
Regarding the “expansion module,” the specification, at paragraph 100, states that 
[e]xpansion module 1320 may function to "expand" a communications-category label to a second feature of the unlabeled communication based on the expansion graph(s) 1318. Recall that both the communication itself, such as an email, may have a category label and individual features of the communication such as hash of the email content or the identity of an embedded URL may also have a category label. These category labels are not 

Therefore, any computer software that follows the edges and nodes of a directed graph to assign labels to unlabeled features of a communication will be deemed to read on the claim.
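By way of illustration only, the reading stated above (software that follows the edges of a directed graph to assign labels to unlabeled features) may be sketched as follows. The graph contents, the node names, and the function name `expand_labels` are hypothetical and are not drawn from the specification or from any cited reference.

```python
from collections import deque

def expand_labels(edges, seed_labels):
    """Propagate labels along directed edges to nodes that have no label yet.

    edges: dict mapping a node to the list of nodes it points to (hypothetical).
    seed_labels: dict mapping already-labeled nodes to their category label.
    """
    labels = dict(seed_labels)
    queue = deque(seed_labels)
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in labels:  # only unlabeled nodes receive a label
                labels[neighbor] = labels[node]
                queue.append(neighbor)
    return labels

# Hypothetical graph: a labeled sender domain points to an email,
# which in turn points to two of its features.
edges = {
    "sender_domain": ["email_1"],
    "email_1": ["content_hash", "embedded_url"],
}
labels = expand_labels(edges, {"sender_domain": "spam"})
```

In this sketch a node that already carries a label is never overwritten; that choice is an assumption made for simplicity.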
Regarding the “confidence module,” the specification, at paragraph 101, states that
[a] confidence module 1322 assigns a probability to the communications-category label based on the expansion graph(s) 1318. As discussed above, one or more of the edges in the expansion graph(s) 1318 may be associated with a confidence degrading ratio. Thus, the confidence in the initial source for the communications-category label and the confidence degrading ratios for any edges of the expansion graph(s) 1318 traversed to assign the confidence-category label to the second feature may be used to identify a probability that the communications-category label assigned to the feature of the unlabeled communication is correct. Different sources of labels may have different starting confidence levels. For example, the confidence level of a label derived from manual labeling 1316 may be given a high confidence level of, for example, 0.9. However, a label derived from a single blacklist 1314 may be given a lower confidence level such as, for example, 0.4. Thus, a combination of the initial confidence levels for the source of the label and the confidence degrading ratios of the edges of the expansion graph(s) 1318 traversed to apply that label to a second, different feature of the communication may be considered by the confidence module 1322. 

Therefore, any computer software that applies a probability to a label based either on an initial confidence level or on a confidence determined by traversing the graph will be deemed to read on the claim.
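By way of illustration only, one possible way to combine an initial confidence level with the confidence degrading ratios of the traversed edges is sketched below. The starting values 0.9 and 0.4 are the examples quoted from paragraph 101; the use of multiplication, the particular ratios, and the function name `label_confidence` are assumptions, since the quoted passage says only that the two quantities "may be considered" together.

```python
import math

def label_confidence(initial_confidence, degrading_ratios):
    """Assumed rule: multiply the source's starting confidence by the
    degrading ratio of every edge traversed to reach the feature."""
    return initial_confidence * math.prod(degrading_ratios)

# Hypothetical path of two edges with ratios 0.9 and 0.8.
path_ratios = [0.9, 0.8]
manual_label = label_confidence(0.9, path_ratios)     # manual labeling source
blacklist_label = label_confidence(0.4, path_ratios)  # single-blacklist source
```

Under this assumed rule, a label that starts from the manually labeled source remains more confident than one that starts from a single blacklist after the same traversal.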
Regarding the “voting module,” the specification, at paragraphs 102-107, states that voting rules such as removing labels in the case of conflict; discounting labels if the confidence is below a threshold or if the number of sources pointing to a label is below a threshold; removing minority votes; and considering older labels less than newer labels may be applied to the labels in the event of conflict between two or more labels applied to the same communication.  For purposes of examination, any of these voting rules will be deemed to read on the claim.  Paragraph 71 
Regarding the “composite key module,” the specification, at paragraphs 108-10, states that 
[a] composite key module 1326 may combine two or more entity types to create a single key for clustering. A "key" for clustering is an entity type that can be assigned a communication-category label such as good, spam, bulk, phishing, malware, etc. As mentioned above, the entity types may be princip[al] entities which are the communications themselves such as emails and clustering entities which represent features of the princip[al] entities (e.g., sender email address, URL included in email, etc.). Thus, typically features of an email such as the hash of the contents of an email may be used as a key for forming a cluster. All emails that are associated with the same hash of the contents, meaning that the contents are identical or very similar, can be placed in the same cluster. Ultimately, the cluster may be assigned a communication-category label and that label may be applied to all of the entities in the cluster. 
However, some types of keys may lead to false positives and return a label that is not correct. One way to reduce false positives is to make the keys more specific by combining multiple features into a single key-a composite key. For example, the IP address of the sender and the AuthInfo Code for the sending domain may be combined to form a composite key. The AuthInfo Code is an alphanumeric security code that exists for most top-level domains and is only known to the domain owner or administrative contact. Thus, instead of clustering based only on the IP address of the sender, the IP address and a single AuthInfo code are used as the key for a cluster. This may be appropriate if some of the emails coming from the IP address are good while others are spam and creating more specific clusters additionally based on the AuthInfo code is effective at separating the good email and the spam email into different clusters. Composite keys may be used in this way to avoid grouping emails with different communications-category labels in the same cluster. 
The composite key module 1326 may create composite keys and new clusters in response to evaluation of existing clusters performed by other modules such [as] the confidence module 1322 or the voting module 1324. For example, if the confidence module 1322 identifies clusters or specific entities in clusters that have confidence levels below a threshold level, that may trigger the composite key module 1326 to identify a composite key that can be used to create clusters with higher confidence levels. Similarly, if the voting module 1324 identifies clusters or entities with labels that remain ambiguous even after applying the voting rules, that may suggest that the two or more keys should be combined to create a composite key and generate new clusters. 

Therefore, any computer software that combines two or more features of a communication for purposes of applying a label to a communication will be deemed to read on the claim.
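By way of illustration only, the composite-key example quoted above (sender IP address combined with an AuthInfo code) may be sketched as follows. The record layout, the field names `sender_ip` and `auth_code`, the sample values, and the function name `cluster_by_key` are hypothetical.

```python
from collections import defaultdict

def cluster_by_key(emails, key_fields):
    """Group email ids by the tuple of the chosen feature values.

    One field gives a simple key; two or more fields give a composite key.
    """
    clusters = defaultdict(list)
    for email in emails:
        key = tuple(email[field] for field in key_fields)
        clusters[key].append(email["id"])
    return dict(clusters)

# Hypothetical emails that share one sender IP but not one AuthInfo code.
emails = [
    {"id": "a", "sender_ip": "192.0.2.1", "auth_code": "X1"},
    {"id": "b", "sender_ip": "192.0.2.1", "auth_code": "X1"},
    {"id": "c", "sender_ip": "192.0.2.1", "auth_code": "Y2"},
]
by_ip = cluster_by_key(emails, ["sender_ip"])
by_composite = cluster_by_key(emails, ["sender_ip", "auth_code"])
```

Clustering on the IP address alone places all three emails in one cluster, whereas the composite key separates them into two more specific clusters.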

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 9-10, 12, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Kennedy et al. (US 10489587) (“Kennedy”) in view of Miserendino et al. (US 20170032279) (“Miserendino”) and further in view of Bosma et al., “A Framework for Unsupervised Spam Detection in Social Networking Sites,” in Euro. Conf. Info. Retrieval 364-75 (2012) (“Bosma”).
Regarding claim 9, Kennedy discloses  “[a] system comprising: 
one or more processing units (Kennedy, Fig. 6, processor 614); 
one or more memory units coupled to the one or more processing units (see Kennedy Fig. 1, memory 140, and Fig. 6 (showing system memory 616 coupled to processor 614 via communication infrastructure 612)); 
see Kennedy Figs. 1 (showing that the system comprises memory 140 that contains, inter alia, classification module 108 and sub-classification module 110), 5 (showing a decision tree 502 [expansion graph] that performs sub-classification)), that comprises:
a principal type entity (see Kennedy, Fig. 5, node 504 [which qualifies as a “principal type entity” insofar as it is hierarchically primary in the decision tree]) …;  
a first clustering type entity that represents a first feature of [a] … communication (analysis module stored in memory performs an analysis of an unknown file by applying, to the unknown file, a machine-learning heuristic that employs at least one decision tree – Kennedy, col. 2, ll. 40-61; decision tree [expansion graph] may be used to classify the unknown file [communication] by identifying a set of leaf nodes of the decision tree arrived at by the analysis performed by the machine-learning heuristic, where each leaf node is associated with a probability of being one or more particular types of malicious file [feature of communication] – id. at col. 7, l. 45-col. 8, l. 6; see also Fig. 5 [showing node 504 [principal type entity, “principal” because it is the hierarchically highest node in the graph] directionally connected to leaf nodes 510, 514, and 518 [clustering entities]; note also that the edges are “derivative” edges because a label for the email can be derived from the labels corresponding to percentages 512-20 in leaf nodes 510-18; compare specification paragraph 76 (“[D]erivative edges indicate that the label for a principal entity may be derived from labels of clustering type entities.”)]) …; 
a second clustering type entity that represents a second feature of the unlabeled communication (Kennedy Fig. 5 shows multiple leaf nodes [clustering type entities] [e.g., leaf node 510 may be the first clustering type entity and leaf node 518 may be the second]; each leaf node has a percentage likelihood of being a particular type of malware; for example, an unknown file that arrives at leaf node 510 may have an 80% chance of being ransomware [chance of being ransomware = first feature] and any unknown file that arrives at leaf node 518 may have a 40% chance of being ransomware [40% chance of being ransomware = second feature; 80% chance of being ransomware = first feature] – id. at col. 8, ll. 17-39) …;
a directional, derivative edge [between] the first clustering type entity [and] the principal type entity (see Kennedy Fig. 5 and note that node 504 [principal type entity] is connected directionally to leaf node 510 [first clustering entity] via two edges that pass through node 506);
a directional, clustering edge from the principal type entity to the second clustering type entity (see Kennedy Fig. 5 and note that node 504 [principal type entity] is connected directionally to leaf node 514 [second clustering entity] via two edges that pass through node 508); and 
a labeling module, stored in the one or more memory units, that is configured to assign the communications-category label from the first clustering type entity to the unlabeled communication based on the derivative edge [between] the first clustering type entity [and] the principal type entity, thereby creating a labeled communication (sub-classification module may use the same decision tree employed by the machine learning heuristic to sub-classify the unknown file [based on features of the file]; each leaf node may have a percentage likelihood of being a particular type of malware; for example, any unknown file that arrives at leaf node 510 may have an 80% chance of being ransomware, any unknown file that arrives at leaf node 514 may have a 60% chance of being a botnet application, and any unknown file that arrives at leaf node 518 may have a 40% chance of being ransomware – Kennedy, col. 8, ll. 17-39; sub-classification module [labeling module] sub-classifies [assigns a communications-category label from the leaf nodes/clustering-type entities to] the unknown file as the particular type of malicious file based on the decision tree – id. at col. 2, ll. 40-61; see also Fig. 5 (showing that the determination of the percentage 512 of the file being ransomware is based on the edges connecting node 504 to leaf node 510)) and that is configured to assign the communications-category label from the principal type entity to the second clustering type entity (any unknown file that arrives at leaf node 518 may have a 40% chance of being ransomware, and any unknown file that arrives at leaf node 510 may have an 80% chance of being ransomware [so that the same communications-category label “ransomware” may be applied to each] – Kennedy, col. 8, ll. 17-39; see also Fig. 5 (showing that the label assigned by leaf node 518 is based on traversal of the decision tree from node 504 [principal type entity] to leaf node 518)).”
Kennedy does not appear to disclose explicitly the further limitations of the claim.  However, Miserendino discloses “a feature of a communication other than personally identifiable information (PII) (in a trust relationship scenario for sharing feature vectors, if user A has a trust relationship scenario with user B and user C, but user B and user C have not entered into a trust relationship with each other, user B may use feature vectors user A has chosen to share but not data user C has chosen to share; by only sharing feature vectors, users may protect their confidential or sensitive file data from the other users with which they share – Miserendino, paragraph 31)….”
Miserendino and the instant application both relate to machine learning for malicious communication classification and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kennedy to make the features used by the model non-PII, as disclosed by Miserendino, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would protect the confidentiality of the users whose data are used in the training set.  See Miserendino, paragraph 31.
Neither Kennedy nor Miserendino appears to disclose explicitly the further limitations of the claim.  However, Bosma discloses that the “principal type entity … is an unlabeled communication (spam detection framework uses links between messages and other objects to propagate spam scores; in the Reporter Model the problem space is a bipartite graph having two node types: reporters and messages [message node = principal type entity/communication]; the algorithm calculates two scores: a hub score and an authority score; the authority score can be seen as a weighted version of a spam score based on raw report counts [i.e., prior to running the algorithm the message nodes represent messages that are not labeled as spam or not spam] – Bosma, secs. 3 and 3.1 and Fig. 1)….”
see Bosma Fig. 2 and note the arrows going from the reporter and author nodes [clustering type entities] to the message nodes [principal type entities])….”
Bosma and the instant application both relate to spam classification and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kennedy and Miserendino to include message nodes representing unlabeled communications among the nodes, as disclosed by Bosma, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would ensure that the system has a node representing the entire message as opposed to merely a feature thereof, thereby increasing the compactness of the model.  See Bosma, sec. 3.1 (reporter model contains only reporter nodes and message nodes).

Regarding claim 10, Kennedy, as modified by Miserendino and Bosma, discloses that “the communications-category label is good, spam, phishing, bulk, or malware (sub-classification module may classify the file as a particular type of malware such as ransomware or a botnet – Kennedy, col. 8, ll. 17-39) and [the system]: process[es] the labeled communication by storing the labeled communication or deleting the labeled communication based on the communications-category label (system may incorrectly sub-classify an unknown file but still successfully perform a security action on the file because the file is classified as malicious or non-malicious independent of the sub-classification of the file; for example, the system may incorrectly classify a keylogger as ransomware but may still successfully remove [delete] the keylogger – Kennedy, col. 9, ll. 30-41).”  

Regarding claim 12, Kennedy, as modified by Miserendino and Bosma, discloses “a confidence module, stored in the one or more memory units, configured to assign a probability to the communications-category label based on the expansion graph (sub-classification module may use the same decision tree employed by the machine learning heuristic to sub-classify the unknown file [based on features of the file]; each leaf node may have a percentage likelihood of being a particular type of malware; for example, any unknown file that arrives at leaf node 510 may have an 80% chance [probability] of being ransomware, any unknown file that arrives at leaf node 514 may have a 60% chance of being a botnet application, and any unknown file that arrives at leaf node 518 may have a 40% chance of being ransomware [so the sub-classification module contains a confidence module] – Kennedy, col. 8, ll. 17-39).”  
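By way of illustration only, the leaf-node probabilities relied upon above may be represented as a simple lookup. The node numbers and percentages are those recited in this Office Action from Kennedy, col. 8, ll. 17-39; the table name `LEAF_LABELS` and the function name `sub_classify` are hypothetical, and the sketch does not purport to reproduce Kennedy's implementation.

```python
# Leaf node number -> (label, probability), as characterized in this action.
LEAF_LABELS = {
    510: ("ransomware", 0.80),
    514: ("botnet application", 0.60),
    518: ("ransomware", 0.40),
}

def sub_classify(leaf_node):
    """Return the label and its probability for the leaf a file arrives at."""
    return LEAF_LABELS[leaf_node]

label, probability = sub_classify(510)
```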

Regarding claim 25, Kennedy, as modified by Miserendino and Bosma, discloses that “the second clustering type entity contains other unlabeled communications that cluster together based on the second feature (any unknown file that arrives at leaf node 514 may have a 60% chance of being a botnet application [i.e., all files/unlabeled communications that arrive at leaf node 514 are clustered together and labeled as “botnet applications” based on the feature that there is a 60% probability that they are botnet applications] – Kennedy, col. 8, ll. 17-39).”

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Kennedy in view of Miserendino and Bosma and further in view of Aziz et al. (US 10462173) (“Aziz”).
Regarding claim 11, Kennedy, as modified by Miserendino and Bosma, discloses “assign[ing] the communications-category label to a … feature of the unlabeled communication based on the expansion graph (sub-classification module may use the same decision tree employed by the machine learning heuristic to sub-classify the unknown file [based on features of the file]; each leaf node may have a percentage likelihood of being a particular type of malware; for example, any unknown file that arrives at leaf node 510 may have an 80% chance of being ransomware, any unknown file that arrives at leaf node 514 may have a 60% chance of being a botnet application, and any unknown file that arrives at leaf node 518 may have a 40% chance of being ransomware [labels] – Kennedy, col. 8, ll. 17-39).”
statistically detected characteristics and/or dynamically observed behaviors (collectively, “features”) of a suspicious object may be provided to a classification engine for classification of the object as malicious or benign; the classification engine may generate a classification of the suspicious object based on a correlation of the features with known features of malware and benign objects [i.e., the system compares the features of the object under investigation to features labeled benign and malware thereby to label the features under investigation, and thereby the object, as malware or benign] – Aziz, col. 4, ll. 29-46)….”
Aziz and the instant application both relate to classification of malicious communications and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kennedy, Bosma, and Miserendino to assign a label to multiple features of a communication for the purpose of labeling the communication itself, as disclosed by Aziz, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would increase the power and the accuracy of the classifier by making the ultimate classification depend on multiple factors.  See Aziz, col. 4, ll. 29-46.

Claims 13 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Kennedy in view of Miserendino and Bosma and further in view of Lin (US 7051077) (“Lin”).
Regarding claim 13, neither Kennedy, Bosma, nor Miserendino appears to disclose explicitly the further limitations of the claim.  However, Lin discloses “a voting module, stored in the one or more memory units, configured to apply a set of voting rules to resolve conflicts between the communications-category label assigned to the unlabeled communication based on the first feature and a second communications-category label also assigned to the unlabeled communication based on a second feature (system and method for identifying e-mail messages as unwanted or spam or as wanted or ham includes combining the output of two or more spam classifiers; each classifier is given a single vote, and the votes are combined using fuzzy logic – Lin, col. 3, ll. 22-38; system may be used in a situation in which two of three tools determine a message is spam [i.e., one identifies it as ham, and thus there is a conflict] – id. at col. 2, ll. 34-61; see also Fig. 2, memory 230).”
Lin and the instant application both relate to machine learning for email classification and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kennedy, Bosma, and Miserendino to apply voting rules in case of conflict among multiple classifications, as disclosed by Lin, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide a more useful result by providing the output of multiple classifiers and providing a mechanism by which to resolve classifications that are in tension with each other.  See Lin, col. 3, ll. 22-38.

Regarding claim 26, Kennedy, as modified by Miserendino, Bosma, and Lin, discloses “a voting module, stored in the one or more memory units, configured to select a single label for the second clustering type entity based on votes that include the communications-category label assigned from the principal type entity and at least two other communications-category labels assigned from different principal type entities (voting mechanism may allow the system to combine the non-standardized outputs of two or more classifiers [principal type entities] to produce a single classification output – Lin, col. 10, ll. 47-61; classifier conversion device is stored in a memory and is used to standardize classification results – id. at col. 10, l. 62-col. 11, l. 26; voting chairman or control module uses a voting formula to combine the standardized classification results [into a single label] – id. at col. 12, l. 53-col. 13, l. 8; number or type of classifiers is not fixed [so there may be three or more] – id. at col. 14, ll. 36-50; see also Fig. 2 (showing that results 213-217 of classifier tools 1-N 212-216 are inserted into voting mechanism 220 containing memory 224, 262, 230 and outputs a single classification output 250)).”  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kennedy, Miserendino, and Bosma to include a voting module that selects a single label based on votes on the results from multiple principal type entities, as disclosed by Lin, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide a more useful result by providing the output of multiple classifiers and providing a mechanism by which to resolve classifications that are in tension with each other.  See Lin, col. 3, ll. 22-38.
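By way of illustration only, selecting a single label from three or more votes may be sketched as a plurality vote. This is a simplification adopted for illustration; it is not the fuzzy-logic voting formula described in Lin, and the function name `select_single_label` and the tie-handling rule are hypothetical.

```python
from collections import Counter

def select_single_label(votes):
    """Return the label with the most votes, or None if there is no clear winner."""
    counts = Counter(votes).most_common()
    if not counts:
        return None  # no votes were cast
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie between the top labels: conflict left unresolved
    return counts[0][0]

# Two of three hypothetical classifiers vote "spam"; one votes "good".
winner = select_single_label(["spam", "spam", "good"])
```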

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kennedy in view of Miserendino and Bosma and further in view of Zhang et al. (US 20160314182) (“Zhang”).
Regarding claim 14, neither Kennedy, Bosma, nor Miserendino appears to disclose explicitly the further limitations of the claim.  However, Zhang discloses “a composite key module, stored in the one or more memory units, configured to generate a cluster based on two or more features (clustering system of a communications classifier clusters documents based at least in part on classification terms; classification terms engine identifies a plurality of classification terms [features] that are indicative of a particular classification – Zhang, paragraphs 29-30; see also Fig. 5, memory subsystem 525).”  
Zhang and the instant application both relate to the classification of documents and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kennedy, Bosma, and Miserendino to generate a cluster based on two or more features, as disclosed by Zhang, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would increase the power of the classifier by ensuring that the classification is based on multiple features rather than merely one.  See Zhang, paragraphs 29-30.

Claims 15 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Sheu et al., “An Intelligent Three-Phase Spam Filtering Method Based on Decision Tree Data Mining,” in 9 Sec. Comm. Networks 4013-26 (2016) (“Sheu”) in view of Miserendino and further in view of Bosma.
Regarding claim 15, Sheu discloses “[a] method comprising: 
accessing an expansion graph of relationships between a … node and a plurality of feature nodes, the plurality of feature nodes corresponding to features … (efficient spam filtering method may be based on a decision tree [expansion graph] data mining technique – Sheu, abstract; tree contains root node, children nodes under it, and leaf nodes without children – id. at p. 4014, first paragraph on right-hand column; ID3 algorithm divides all data instances into children nodes according to their values of a selected critical attribute [feature] – id. at p. 4014, last paragraph on right-hand column; leaf node will be labeled by the value of a target attribute [category label] possessed by the majority of data instances therein [so it is a message node in that it categorizes the whole message] – id. at p. 4015, first paragraph);
extracting a first feature from the unlabeled message (nine critical attributes [features] of an email are defined by surveying [extracting] the important fields of the email’s header section; these nine critical attributes are divided into three categories of “sender,” “title,” and “time and size” – Sheu, sec. 3.1.1, first paragraph); 
correlating the first feature with a first feature node of the plurality of feature nodes in the expansion graph (ID3 decision tree algorithm is applied to analyze the associative rules among the nine critical attributes and the target attribute of the emails (spammy or legitimate) – Sheu, sec. 3.1.1, first paragraph; captured attributes of emails are input to ID3 to build a decision tree that will bring out the potential association rules of an “if-then” pattern between the critical attributes and the target attribute [building the decision tree = correlating the feature with a feature node] – id. at p. 4018, first paragraph), wherein the first feature node has a first category label (leaf node [feature node] will be labeled by the value of a target attribute [category label] possessed by the majority of data instances therein – Sheu, p. 4015, first paragraph); [and]
assigning the first category label to the unlabeled message based on a directional edge in the expansion graph from the first feature node to the … node thereby creating a labeled message (rule constructing module checks critical attributes of emails and applies ID3 to compute potential association rules of the “if-then” pattern, which are stored in a rules database; in a classification phase, the unknown emails are classified by applying the rules database – Sheu, p. 4015, first three paragraphs of sec. 3 [note that since the rules database was created from the decision tree, and the leaf nodes of the decision tree are labels, the label assignment is based on the decision tree including its directional edge from the feature node to the message/label node])….” 
extracting a second feature from the unlabeled message (nine critical attributes [features] of an email are defined by surveying [extracting] the important fields of the email’s header section; these nine critical attributes are divided into three categories of “sender,” “title,” and “time and size” – Sheu, sec. 3.1.1, first paragraph);
correlating the second feature with a second feature node of the plurality of feature nodes in the expansion graph (ID3 decision tree algorithm is applied to analyze the associative rules among the nine critical attributes and the target attribute of the emails (spammy or legitimate) – Sheu, sec. 3.1.1, first paragraph; captured attributes of emails are input to ID3 to build a decision tree that will bring out the potential association rules of an “if-then” pattern between the critical attributes and the target attribute [building the decision tree = correlating the feature with a feature node] – id. at p. 4018, first paragraph; see also Fig. 3 and immediately preceding paragraph (showing that feature nodes A and B both correspond to different critical attributes, with B being the critical attribute with the most information gain if A is not determinative));
assigning the first category label to the second feature node in the expansion graph based on a directional, clustering edge [between] the … node [and] the second feature node (see Sheu Fig. 3 and immediately preceding and subsequent paragraphs and note that feature node B,  along with the B-D and B-E edges, determines whether the email is labeled with the label of D or the label of E, such that the category label of node D or node E can be ascribed to the email that passes through node B; note also that the label of each leaf node is given by the value of the target attribute possessed by the majority of data instances therein [thereby rendering the edges between feature node B and leaf nodes D and E “clustering edges”]) ….”
Sheu does not appear to disclose explicitly the further limitations of the claim.  However, Miserendino discloses “features that do not include personally identifiable information (PII) (in a trust relationship scenario for sharing feature vectors, if user A has a trust relationship scenario with user B and user C, but user B and user C have not entered into a trust relationship with each other, user B may use feature vectors user A has chosen to share but not data user C has chosen to share; by only sharing feature vectors, users may protect their confidential or sensitive file data from the other users with which they share – Miserendino, paragraph 31)….”
Miserendino further discloses “creating a training dataset comprising the labeled message (a third-party facility may construct a base malware classifier using a supervised machine learning algorithm such as decision trees; the learning is conducted over a training set composed of samples of identical type and covering all desired classes; in an embodiment, only two classes are used, malicious and benign [labels] – Miserendino, paragraph 21; see also Fig. 1, left-hand side); 
generating a machine learning model by supervised learning using the training dataset (third party extracts features that are more likely to be present in malicious and/or benign files from the training set and conducts the learning to create a model using a supervised machine learning algorithm – Miserendino, paragraph 21; see also Fig. 1, left-hand side); and 
classifying a new message with the machine learning model (after training the model, third party tests the model using a test set [i.e., classifies a new message using the model] – Miserendino, paragraph 21).”  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sheu to refrain from using PII in creating the training dataset and then See Miserendino, paragraphs 21, 31.
Neither Sheu nor Miserendino appears to disclose explicitly the further limitations of the claim.  However, Bosma discloses “a message node [that] represent[s] an unlabeled message (spam detection framework uses links between messages and other objects to propagate spam scores; in the Reporter Model the problem space is a bipartite graph having two node types: reporters and messages; the algorithm calculates two scores: a hub score and an authority score; the authority score can be seen as a weighted version of a spam score based on raw report counts [i.e., prior to running the algorithm the message nodes represent messages that are not labeled as spam or not spam] – Bosma, secs. 3 and 3.1 and Fig. 1)….”
Bosma further discloses “a directional … edge from the message node to [a] second … node (see Bosma Fig. 3 and note the directional edges from the second message node to the first and third message nodes)….”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu and Miserendino to include message nodes representing unlabeled communications among the nodes, as disclosed by Bosma, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would ensure that the system has a node representing the entire message as opposed to merely a feature thereof, thereby increasing the compactness of the model.  See Bosma, sec. 3.1 (reporter model contains only reporter nodes and message nodes).

Regarding claim 24, Sheu, as modified by Miserendino, discloses that “the expansion graph comprises a directional, clustering edge from the … node to a second feature node (see Sheu Fig. 1 and note that the tree is traversed, for instance, both from A → B → D and from A → B → E [second feature node; edge between B → E = clustering edge]; the directionality of the graph is implied by the fact that C is categorized as a leaf node in the first paragraph of p. 4015)….”
Bosma further discloses that “the second feature node receives the first category label from the message node (final hub and authority scores are found using an iterative procedure; the message scores are calculated and normalized, and based on the new message scores, new reporter scores are calculated [i.e., the message nodes send the score [category label] to the reporter node [feature node]] – Bosma, last paragraph before sec. 3.2).”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu and Miserendino such that the message node sends a label to a feature node, as disclosed by Bosma, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would inform the feature nodes of the result of the message node’s processing, thereby improving the accuracy of the system in future calculations.  See Bosma, last paragraph before sec. 3.2 (final scores are calculated iteratively, and the steps are repeated until convergence occurs).

Claims 16 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Sheu in view of Miserendino and Bosma and further in view of Kennedy.
Regarding claim 16, Sheu, as modified by Miserendino and Bosma, discloses that “the category label comprises one or more of good message, spam message, phishing message, bulk message, or malware message (target attribute [label] is the attribute that is the concerned objective of the research; here, the target attribute is “email type” and may be labeled “S” for “spammy” [spam] or “L” for “legitimate” [good] – Sheu, p. 4014, last paragraph) ….”
Neither Sheu, Bosma, nor Miserendino appears to disclose explicitly the further limitations of the claim.  However, Kennedy discloses that “[the method] further compris[es]: processing the new message according to the first category label, the processing comprising storing, quarantining, or deleting (system may incorrectly sub-classify an unknown file but still successfully perform a security action on the file because the file is classified as malicious or non-malicious independent of the sub-classification of the file; for example, the system may incorrectly classify a keylogger as ransomware but may still successfully remove [delete] the keylogger – Kennedy, col. 9, ll. 30-41).”  
Kennedy and the instant application relate to machine learning for classification of malicious documents and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu, Bosma, and Miserendino to delete the file upon determining that it is malicious, as disclosed by Kennedy, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would protect the user by automatically deleting potentially problematic files without human intervention.  See Kennedy, col. 9, ll. 30-41.

Regarding claim 18, Sheu, as modified by Miserendino, Bosma, and Kennedy, discloses that “the directional edge is associated with a probability and assigning the category label is based on the probability (sub-classification module may use the same decision tree employed by the machine-learning heuristic to sub-classify the unknown file by (i) identifying leaf nodes each of which includes a percentage for the particular type of malicious file, (ii) adding the percentage from each leaf node, and (iii) sub-classifying the malicious file based on the sum of the percentages; for example, any unknown file that arrives at leaf node 510 may have an 80% chance [probability] of being ransomware, any unknown file that arrives at leaf node 514 may have a 60% chance of being a botnet application, and any unknown file that arrives at leaf node 518 may have a 40% chance of being ransomware – Kennedy, col. 8, ll. 17-39; see also Fig. 5 [showing that the nodes are connected by directional edges]).”  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu, Bosma, and Miserendino to classify the files based on probabilities associated with the graph, as disclosed by Kennedy, and an ordinary artisan could See Kennedy, col. 8, ll. 17-39.

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Sheu in view of Miserendino and Bosma and further in view of Chickering et al. (US 20070038705) (“Chickering”).
Regarding claim 17, Sheu, as modified by Miserendino and Bosma, discloses that “the plurality of feature nodes comprise at least … a message sender node (Sheu Table I, p. 4017, shows that the nine critical attributes of the email [from which the feature nodes are developed] include a sender attribute category) ….”
Neither Sheu, Bosma, nor Miserendino appears to disclose explicitly the further limitations of the claim.  However, Chickering discloses that “the plurality of feature nodes comprise at least … a URL node (algorithm for spam filtering determines how to partition messages to filter spam best; the partitioning property that results in the best spam classification is added as a test to the decision tree, and the learning algorithm is called recursively on the resulting data partitions; one candidate partition is whether there are any links or URLs in the message – Chickering, paragraph 39 and Table 1)….”
Chickering and the instant application both relate to machine learning for spam classification and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu, Bosma, and Miserendino to include whether the message contains a URL as a node in the graph, as disclosed by Chickering, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would increase the robustness of the resulting model by checking for a feature known to be present in spam emails.  See Chickering, paragraph 39 and Table 1.

Claims 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sheu in view of Miserendino and Bosma and further in view of Aziz and Lin.
Regarding claim 19, Sheu, as modified by Miserendino and Bosma, discloses “assigning a … category label to the unlabeled message based on a[n] … expansion graph (rule constructing module checks critical attributes of emails and applies ID3 to compute potential association rules of the “if-then” pattern, which are stored in a rules database; in a classification phase, the unknown emails are classified by applying the rules database – Sheu, p. 4015, first three paragraphs of sec. 3 [note that since the rules database was created from the decision tree, and the leaf nodes of the decision tree are labels, the label assignment is based on the decision tree]) ….”
Neither Sheu, Bosma, nor Miserendino appears to disclose explicitly the further limitations of the claim.  However, Aziz discloses “assigning a … category label to the unlabeled message based on … a second feature of the unlabeled message (statistically detected characteristics and/or dynamically observed behaviors (collectively, “features”) of a suspicious object may be provided to a classification engine for classification of the object as malicious or benign; the classification engine may generate a classification of the suspicious object based on a correlation of the features with known features of malware and benign objects [i.e., the system compares the features of the object under investigation to features labeled benign and malware thereby to label the features under investigation, and thereby the object, as malware or benign] – Aziz, col. 4, ll. 29-46).”  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu, Bosma, and Miserendino to apply a label to a message based on multiple features of the message, as disclosed by Aziz, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would enhance the accuracy of the classifier by allowing the ultimate classification to be based on multiple features.  See Aziz, col. 4, ll. 29-46.
system for voting among classifiers that classify a message as spam may be used in a situation in which two of three tools determine a message is spam – id. at col. 2, ll. 34-61)….”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu, Miserendino, Bosma, and Aziz to apply multiple labels to the communication, as disclosed by Lin, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide more information to the classifier, allowing it to work more effectively and improve its accuracy.  See Lin, col. 3, ll. 22-38.

Regarding claim 20, Sheu, as modified by Miserendino, Aziz, and Lin, discloses “resolving a conflict between the first category label and the second category label based on a set of voting rules that specify priority between conflicting category labels (system for applying voting rules to different spam classifiers may be used in a situation in which two of three tools determine a message is spam [i.e., one identifies it as ham, and thus there is a conflict] – Lin, col. 2, ll. 34-61; to combine the classifiers, each classifier is given a single vote, and the votes are then combined using fuzzy logic – id. at col. 3, ll. 22-38).”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu, Miserendino, and Aziz to develop a voting mechanism to resolve conflicts between two or more classifications, as disclosed by Lin, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide a more useful result by providing the output of multiple classifiers and providing a mechanism by which to resolve classifications that are in tension with each other.  See Lin, col. 3, ll. 22-38.

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Kennedy in view of Miserendino and Bosma and further in view of Schuld et al., “An Introduction to Quantum Machine Learning,” in 56.2 Contemporary Physics 172-85 (2015) (“Schuld”).
Regarding claim 21, neither Kennedy, Miserendino, nor Bosma appears to disclose explicitly the further limitations of the claim.  However, Schuld discloses that “the expansion graph is specific to the communications-category label (Schuld Fig. 9 contains a decision tree [expansion graph] whose terminal nodes are either “unsure,” “spam,” or “not spam” [i.e., the graph is specific to the label “spam” and not other possible classifications such as “malware,” “phishing,” etc.]).”
Schuld and the instant application both relate to graph-based classifiers for spam classification and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kennedy, Miserendino, and Bosma to make the graph specific to a category label, as disclosed by Schuld, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would simplify the resulting model and enhance accuracy by ensuring that the system is not burdened with the computationally more complex task of performing multi-label classification with a single graph.  See Schuld, sec. 3.5, first paragraph and Fig. 9.

Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over Sheu in view of Miserendino and Bosma and further in view of Schuld.
Regarding claim 23, neither Sheu, Miserendino, nor Bosma appears to disclose explicitly the further limitations of the claim.  However, Schuld discloses that “the expansion graph is specific to a communications-category label (Schuld Fig. 9 contains a decision tree [expansion graph] whose terminal nodes are either “unsure,” “spam,” or “not spam” [i.e., the graph is specific to the label “spam” and not other possible classifications such as “malware,” “phishing,” etc.]).”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sheu, Miserendino, and Bosma to make the graph See Schuld, sec. 3.5, first paragraph and Fig. 9.
 
Regarding claim 27, *** “the expansion graph further comprises: 
a third clustering type entity that that represents a third feature of the unlabeled communication other than personally identifiable information (PII); 
a fourth clustering type entity that represents a fourth feature of the unlabeled communication other than personally identifiable information (PII); 
a directional, derivative edge from the third clustering type entity to the principal type entity; 
a second directional, clustering edge from the principal type entity to the second clustering type entity (see Kennedy Fig. 5 and note that node 504 [principal type entity] is directionally connected both to leaf node 510 [first clustering type entity] and leaf node 514 [second clustering type entity]); 
a third directional, clustering edge from the principal type entity to the third clustering type entity; 
a fourth directional, clustering edge from the principal type entity to the fourth clustering type entity; 
a fifth directional, clustering edge from the first clustering type entity to the second clustering type entity; 
a sixth directional, clustering edge from the first clustering type entity to the fourth clustering type entity; 

an eighth directional, clustering edge from the third clustering type entity to the second clustering type entity ().”

Regarding claim 28, *** “[a] method comprising:
accessing an expansion graph of relationships between a message node representing an unlabeled message and a plurality of feature nodes (), wherein the expansion graph is specific to a communications-category label () and wherein the plurality of feature nodes comprise at least two of a message hash node, a message sender node, a URL node, or a sender host node (); 
extracting a feature from the unlabeled message (); 
correlating the feature with a one of the plurality of feature nodes in the expansion graph, wherein the one of the plurality of feature nodes has a first category label (); 
assigning the first category label to the unlabeled message based on a directional, derivative edge in the expansion graph from the feature node to the message node thereby creating a labeled message, wherein the directional, derivative edge is associated with a probability and assigning the first category label is based on the probability (sub-classification module may use the same decision tree employed by the machine-learning heuristic to sub-classify the unknown file by (i) identifying leaf nodes each of which includes a percentage for the particular type of malicious file, (ii) adding the percentage from each leaf node, and (iii) sub-classifying the malicious file based on the sum of the percentages; for example, any unknown file that arrives at leaf node 510 may have an 80% chance [probability] of being ransomware, any unknown file that arrives at leaf node 514 may have a 60% chance of being a botnet application, and any unknown file that arrives at leaf node 518 may have a 40% chance of being ransomware – Kennedy, col. 8, ll. 17-39; see also Fig. 5 [showing that the nodes are connected by directional edges]); 
assigning a second category label to the unlabeled message based on a second expansion graph and a second feature of the unlabeled message (); 
applying a set of voting rules to resolve a conflict between the first category label and the second category label (); 
creating a training dataset comprising the labeled message (a third-party facility may construct a base malware classifier using a supervised machine learning algorithm such as decision trees; the learning is conducted over a training set composed of samples of identical type and covering all desired classes; in an embodiment, only two classes are used, malicious and benign [labels] – Miserendino, paragraph 21; see also Fig. 1, left-hand side); 
generating a machine learning model by supervised learning using the training dataset (third party extracts features that are more likely to be present in malicious and/or benign files from the training set and conducts the learning to create a model using a supervised machine learning algorithm – Miserendino, paragraph 21; see also Fig. 1, left-hand side); and 
classifying a new message with the machine learning model (after training the model, third party tests the model using a test set [i.e., classifies a new message using the model] – Miserendino, paragraph 21).”  



Response to Arguments
Applicant's arguments filed December 28, 2021 (“Remarks”) have been fully considered but, to the extent not rendered moot by a new ground of rejection, they are not persuasive.
Applicant first argues that “a recent seed data” in paragraph 150 is proper because it is the first time the seed data are mentioned.  Remarks at 11.  While Examiner is sensitive to such a concern, “a data” is grammatically incorrect because “data” is properly used only in the plural.  To the extent that Applicant is concerned about the use of the introductory article, “a set of recent seed data” would also be acceptable.
Applicant then argues that the Kennedy/Miserendino/Bosma combination does not render claim 9 obvious because (a) Kennedy allegedly describes a decision tree rather than an expansion graph; (b) Kennedy allegedly does not disclose a derivative edge from the first clustering type entity to the principal type entity or a clustering edge from the principal type entity to the second clustering type entity; and (c) Bosma allegedly does not teach a first clustering type entity.  Remarks at 14-16.
Regarding (a), the term “expansion graph” is not expressly defined either by the claim or the specification, and Examiner can find no evidence that it was an accepted term of art before the effective filing date.  At most, the term “expansion graph” means a graph data structure that represents inferences based on relationships between different types of clustered entities.  See specification paragraph 6.  Here, because the decision tree of Kennedy is a type of graph that represents inferences regarding the type of file being evaluated based on the relationships between 
Regarding (b), while Examiner agrees that the directional edges of the decision tree of Kennedy are unidirectional, Figure 2 of Bosma shows a graph in which the edges go in the opposite direction, i.e., from author and reporter nodes [clustering type entities] to the message node [principal type entity].  Thus, the combination of Kennedy, which discloses principal-to-clustering edges, and Bosma, which discloses clustering-to-principal edges, meets the disputed claim language.
Regarding (c), much like “expansion graph,” “clustering type entity” appears to be a term of art coined by Applicant that is defined neither by the claim nor by the specification.  At most, specification paragraph 76 indicates that the “clustering type” entities are entities that correspond to features of the email.  Thus, as most broadly reasonably construed in light of the specification, a “clustering type entity” may be construed as any node in the graph that represents a feature of the communication.  Here, because the author and reporter nodes of Bosma correspond to the author of a message and an individual who reported the message, respectively, both of which are features of the message, they are “clustering type entities” as they are most broadly reasonably construed.
Applicant then argues that the Sheu/Miserendino/Bosma combination does not render claim 15 obvious because neither Sheu nor Bosma discloses directional edges going both to and from a node representing an unlabeled message to expand a category label from one feature node to another.  Remarks at 19-20.  However, this is again a case where the combination discloses edges going in both directions even though neither reference standing alone discloses bidirectional sets of edges.  Here, while the decision tree of Figure 1 of Sheu does not contain edges, it is clear 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849.  The examiner can normally be reached on M-R 7a-5:30p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.



/R.C.V./             Examiner, Art Unit 2125

/KAMRAN AFSHAR/            Supervisory Patent Examiner, Art Unit 2125