Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
	This communication is in response to the amended application filed on 07/27/2022.
Claims 1-20 are pending.
Claims 1, 4-5, 8-9 and 16-17 are amended.
Claims 18-20 are new. 
Remarks
Applicant’s arguments filed on 07/27/2022 (‘Remarks’) have been considered.
Regarding the 35 U.S.C. § 112 rejections
The 35 U.S.C. § 112 rejections have been withdrawn due to the amendments to the claims. 
Regarding the prior art rejections
Applicant’s arguments are moot due to the new reference, Bergstrom (Pub. No. US 2019/0007426 A1), being used in the current rejection. 
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. § 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-5, 8-13 and 16-19 are rejected under 35 U.S.C. § 103 as being unpatentable over Abbasi (Pub. No. US 2017/0083703 A1) in view of Bergstrom (Pub. No. US 2019/0007426 A1).

Regarding claim 1, Abbasi teaches an attack group classifying apparatus comprising: a processor; and a non-transitory storage medium storing instructions thereon, the instructions when executed by the processor cause the processor to: extract, from a data set including documents of specific formats, features for identifying attack groups using the documents of the specific formats (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); an information generating unit configured to generate a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); and a learning model unit configured to execute a machine learning algorithm on the machine learning data set to generate a classification model for identifying the attack groups (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware).
Abbasi does not explicitly teach identifying attack groups representing a country or an entity that injects malware in the document. 
However, Bergstrom teaches extract, from a data set including documents of a specific format, features for identifying attack groups using the documents of the specific formats, each of the attack groups representing a country or an entity that injects malware in the documents (Bergstrom ¶ [0037], “evaluate a file to which a URL present [extracting features] in an e-mail [document] links at one or more pre-determined time intervals”; see also ¶ [0040], “blacklisting the sender's [entity] email address [by blacklisting the sender, the attack group is identified], the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails” – the sender who injects URL malware into an email is blacklisted by the system in Bergstrom).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi and Bergstrom to teach identifying a sender who injects malware into a document because doing so allows for “blacklisting the sender's email address, the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails from the sender/domain of the sender” (Bergstrom ¶ [0040]), which reduces the chance of malware infection in computer systems. See also Bergstrom Abstract regarding mitigation of network attacks.

Regarding claim 2, Abbasi and Bergstrom teach the apparatus of claim 1. Abbasi furthermore teaches wherein the specific formats of the documents include at least one of an e-mail file format, a document file format and an executable file format (Abbasi ¶ [0037], types of formats include executable).

Regarding claim 3, Abbasi and Bergstrom teach the apparatus of claim 1. Abbasi furthermore teaches wherein the features include at least one of location information, language information, time information, system information, file attribute information and n-gram information (Abbasi ¶ [0040], dataset includes type designation).

Regarding claim 4, Abbasi and Bergstrom teach the apparatus of claim 1. Abbasi furthermore teaches wherein the instructions when executed by the processor cause the processor to generate the machine learning data set by pre-processing the features into categorical features and numerical features, for classification (Abbasi ¶ [0040], “dataset of malware samples each comprising (1) a representation of the sample (e.g., hash value) [numerical features], (2) type designation (e.g., file, executable, etc.) [categorical features]”).

Regarding claim 5, Abbasi and Bergstrom teach the apparatus of claim 4. Abbasi furthermore teaches wherein instruction for executing the machine learning algorithm comprises instructions to execute the machine learning algorithm after the machine learning data set is classified into a training data set and a test data set (Abbasi ¶ [0037], the data set is classified as a training data set or a test data set; see also ¶ [0040]).

Regarding claim 8, Abbasi and Bergstrom teach the apparatus of claim 1. Abbasi furthermore teaches wherein the instructions further cause the processor to apply the classification model to an arbitrary document of a specific format to classify attack groups of the arbitrary document of the specific format (Abbasi ¶ [0040], the malware sample is assigned a class).

Regarding claim 9, Abbasi teaches a learning method for classifying attack groups comprising: collecting a data set including documents of specific formats and extracting features for identifying attack groups using the documents of the specific formats from the data set (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); generating a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); and generating a classification model for identifying the attack groups by executing a machine learning algorithm on the machine learning data set (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware).
Abbasi does not explicitly teach identifying attack groups representing a country or an entity that injects malware in the documents. 
However, Bergstrom teaches extracting features for identifying attack groups using the documents of the specific formats from the data set, each of the attack groups representing a country or an entity that injects malware in the documents (Bergstrom ¶ [0037], “evaluate a file to which a URL present [extracting features] in an e-mail [document] links at one or more pre-determined time intervals”; see also ¶ [0040], “blacklisting the sender's [entity] email address [by blacklisting the sender, the attack group is identified], the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails” – the sender who injects URL malware into an email is blacklisted by the system in Bergstrom).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi and Bergstrom to teach identifying a sender who injects malware into a document because doing so allows for “blacklisting the sender's email address, the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails from the sender/domain of the sender” (Bergstrom ¶ [0040]), which reduces the chance of malware infection in computer systems. See also Bergstrom Abstract regarding mitigation of network attacks.

Abbasi and Bergstrom teach all the limitations of claims 10-13, as asserted above with regard to claims 2-5, respectively.

Regarding claim 16, Abbasi teaches an attack group classifying method comprising: collecting a data set including documents of specific formats and extracting features for identifying attack groups using the documents of the specific formats from the data set (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); generating a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); generating a classification model for identifying the attack groups by executing a machine learning algorithm on the machine learning data set (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware); and classifying attack groups of an arbitrary document of a specific format by applying the classification model to the arbitrary document of the specific format (Abbasi ¶ [0040], the malware sample is assigned a class).
Abbasi does not explicitly teach identifying attack groups representing a country or an entity that injects malware in the document. 
However, Bergstrom teaches extracting features for identifying attack groups using the documents of the specific formats from the data set, each of the attack groups representing a country or an entity that injects malware in the documents (Bergstrom ¶ [0037], “evaluate a file to which a URL present [extracting features] in an e-mail [document] links at one or more pre-determined time intervals”; see also ¶ [0040], “blacklisting the sender's [entity] email address [by blacklisting the sender, the attack group is identified], the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails” – the sender who injects URL malware into an email is blacklisted by the system in Bergstrom).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi and Bergstrom to teach identifying a sender who injects malware into a document because doing so allows for “blacklisting the sender's email address, the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails from the sender/domain of the sender” (Bergstrom ¶ [0040]), which reduces the chance of malware infection in computer systems. See also Bergstrom Abstract regarding mitigation of network attacks.

Regarding claim 17, Abbasi teaches a non-transitory computer-readable storage medium including computer-executable instructions, which cause, when executed by a processor, the processor to perform a learning method for classifying attack groups, the method comprising: collecting a data set including documents of specific formats and extracting features for identifying attack groups using the documents of the specific formats from the data set (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); generating a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); generating a classification model for identifying the attack groups by executing a machine learning algorithm on the machine learning data set (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware); and classifying attack groups of an arbitrary document of a specific format by applying the classification model to the arbitrary document of the specific format (Abbasi ¶ [0040], the malware sample is assigned a class).
Abbasi does not explicitly teach identifying attack groups representing a country or an entity that injects malware in the document. 
However, Bergstrom teaches extracting features for identifying attack groups using the documents of the specific formats from the data set, each of the attack groups representing a country or an entity that injects malware in the documents (Bergstrom ¶ [0037], “evaluate a file to which a URL present [extracting features] in an e-mail [document] links at one or more pre-determined time intervals”; see also ¶ [0040], “blacklisting the sender's [entity] email address [by blacklisting the sender, the attack group is identified], the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails” – the sender who injects URL malware into an email is blacklisted by the system in Bergstrom).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi and Bergstrom to teach identifying a sender who injects malware into a document because doing so allows for “blacklisting the sender's email address, the sender's domain and/or the sender's mail server so that the protected network including mail servers, e.g. mail server 104 reject subsequent e-mails from the sender/domain of the sender” (Bergstrom ¶ [0040]), which reduces the chance of malware infection in computer systems. See also Bergstrom Abstract regarding mitigation of network attacks.

Regarding claim 18, Abbasi and Bergstrom teach the apparatus of claim 1. 
Abbasi does not explicitly teach receiving advanced persistent threat (APT) reports, obtaining hash values of malwares in the APT report; and obtaining the malwares that matches the hash values.
However, Bergstrom teaches receiving advanced persistent threat (APT) reports (Bergstrom ¶ [0007], “an electronic mail (email) directed to a user of an enterprise and containing a potentially malicious link is received by a mail server of the enterprise.”), obtaining hash values of malwares in the APT report (Bergstrom ¶ [0007], “a file to which the potentially malicious link points at the first time is caused by the mail server to be evaluated within a sandbox environment and a first hash value is generated based on contents of the file to which the potentially malicious link points at the first time.”); and obtaining the malwares that matches the hash values (“a file to which the potentially malicious link points at the first time is caused by the mail server to be evaluated… a file to which the potentially malicious link points to at the second time is caused by the mail server to be evaluated.”).

Regarding claim 19, Abbasi and Bergstrom teach the apparatus of claim 1. Abbasi furthermore teaches wherein the instructions further cause the processor to classify malware into each of the attack groups (Abbasi Abstract, “classify the malware sample to at least one known malware family”).

Claims 6-7 and 14-15 are rejected under 35 U.S.C. § 103 as being unpatentable over Abbasi (Pub. No. US 2017/0083703 A1) in view of Bergstrom (Pub. No. US 2019/0007426 A1) and further in view of Becker (Pat. No. US 6,460,049 B1).

Regarding claim 6, Abbasi and Bergstrom teach the apparatus of claim 5.
Abbasi does not explicitly teach wherein when the machine learning algorithm is executed, a K-fold cross validation algorithm for generating the classification model is executed after the machine learning data set is classified into a K-number of sub data sets.
However, Becker teaches wherein when the machine learning algorithm is executed, a K-fold cross validation algorithm for generating the classification model is executed after the machine learning data set is classified into a K-number of sub data sets (Becker column 20, line 67 through column 21, line 8, “Estimate Accuracy uses cross-validation, resulting in longer running times. Cross-validation splits the data into k folds (commonly 10) and builds k classifiers. The process can be repeated multiple times to increase the reliability of the estimate. A user can set the number k and the number of times”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi, Bergstrom and Becker to teach utilizing K-fold validation because it “assesses the accuracy of a classifier that would be built”. Becker column 20, line 67 through column 21, line 8. Furthermore, this is merely combining prior art elements according to known methods to yield predictable results. MPEP 2143(I).

Regarding claim 7, Abbasi, Bergstrom and Becker teach the apparatus of claim 6.
Abbasi does not explicitly teach wherein each of the sub data sets includes the training data set and the test data set at a ratio of K-1:1.
However, Becker teaches wherein each of the sub data sets includes the training data set and the test data set at a ratio of K-1:1 (Becker column 20, line 67 through column 21, line 8, “Estimate Accuracy uses cross-validation, resulting in longer running times. Cross-validation splits the data into k folds (commonly 10) and builds k classifiers. The process can be repeated multiple times to increase the reliability of the estimate. A user can set the number k and the number of times”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi, Bergstrom and Becker to teach utilizing K-fold validation because it “assesses the accuracy of a classifier that would be built”. Becker column 20, line 67 through column 21, line 8. Furthermore, this is merely combining prior art elements (K-Fold) according to known methods (setting the K number) to yield predictable results. MPEP 2143(I).
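
For illustration only, and not as part of the cited disclosure, the relationship between splitting data into k folds and the K-1:1 ratio can be sketched as follows. The sketch assumes a simple round-robin partition; the function name k_fold_splits and the sample data are hypothetical and appear in neither Becker nor the claims.

```python
def k_fold_splits(samples, k):
    """Yield one (training, test) pair per fold for k-fold cross-validation."""
    if k < 2 or k > len(samples):
        raise ValueError("k must be between 2 and the number of samples")
    # Partition the data into k folds (round-robin assignment is an assumption).
    folds = [samples[i::k] for i in range(k)]
    for held_out in range(k):
        # One fold is held out as the test set.
        test = folds[held_out]
        # The remaining k-1 folds together form the training set.
        training = [s for i, fold in enumerate(folds) if i != held_out for s in fold]
        yield training, test


# Hypothetical data: 20 sample identifiers split with k = 5.
splits = list(k_fold_splits(list(range(20)), 5))
ratios = [(len(training), len(test)) for training, test in splits]
```

With 20 samples and k = 5, each of the five splits holds 16 training samples and 4 test samples, that is, a ratio of 4:1 (K-1:1).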

Abbasi, Bergstrom and Becker teach all the limitations of claims 14-15, as asserted above with regard to claims 6-7, respectively.

Claim 20 is rejected under 35 U.S.C. § 103 as being unpatentable over Abbasi (Pub. No. US 2017/0083703 A1) in view of Bergstrom (Pub. No. US 2019/0007426 A1) and further in view of Peterson (Pat. No. US 8,370,942 B1).

Regarding claim 20, Abbasi and Bergstrom teach the apparatus of claim 1.
Abbasi and Bergstrom do not explicitly teach wherein each of the attack groups represent a country. 
However, Peterson teaches wherein each of the attack groups represent a country (Peterson column 5, lines 6-24, “The malware source analysis component 101 can determine in which country a source is located”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi, Bergstrom and Peterson to teach identifying the country in which the malware source is located because “it is known that disproportionate amount of malware 103 originates from certain countries (e.g., China, Lithuania, etc.).” Content sourced from those countries is therefore more likely to be malware than content from other countries. Peterson column 5, lines 6-24.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GREGORY P TOLCHINSKY whose telephone number is (571) 270-0599. The examiner can normally be reached Monday through Friday, 9:30 AM to 6:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Umar Cheema, can be reached on 571-270-3037. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




Gregory P. Tolchinsky
/G.P.T./Examiner, Art Unit 2456    10/18/2022

/Brian Whipple/Primary Examiner, Art Unit 2456    10/18/2022