Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This communication is in response to the application filed on 11/05/2018. 
Claims 1-17 are pending.
Priority
Acknowledgment is made of Applicant’s claim for foreign priority under 35 U.S.C. § 119(a)-(d). The certified copy of the foreign application has been retrieved.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11/22/2019 is in compliance with the provisions of 37 C.F.R. § 1.97. Accordingly, the IDS is being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. § 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. § 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. § 112(f) or pre-AIA  35 U.S.C. § 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “a feature extracting unit configured to”, “an information generating unit configured to” and “a learning model unit configured to” in claim 1 (these units are also recited in claims 4-5) and “an attack group classifying unit” in claim 8. 
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 U.S.C. § 112
The following is a quotation of the first paragraph of 35 U.S.C. § 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1, 4-5 and 8 are rejected under 35 U.S.C. § 112(a) or 35 U.S.C. § 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. § 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Specifically, “a feature extracting unit configured to”, “an information generating unit configured to” and “a learning model unit configured to” in claim 1 (these units are also recited in claims 4-5) and “an attack group classifying unit” in claim 8 are not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. § 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim Rejections - 35 U.S.C. § 112
The following is a quotation of 35 U.S.C. § 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claim limitations in claims 1, 4-5 and 8 invoke 35 U.S.C. § 112(f). However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function of the units and to clearly link the structure, material, or acts to the function. Therefore, claims 1, 4-5 and 8 are indefinite and are rejected under 35 U.S.C. § 112(b).
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. § 112(f) or pre-AIA  35 U.S.C. § 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. § 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. § 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. §132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 C.F.R. § 1.75(d) and MPEP §§ 608.01(o) and 2181.
Claim Rejections - 35 U.S.C. § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA  35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-5, 8-13 and 16-17 are rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Abbasi (Pub. No. US 2017/0083703 A1)
Regarding claim 1, Abbasi teaches an attack group classifying apparatus comprising: a feature extracting unit configured to extract, from a data set including documents of specific formats, features for identifying attack groups using the documents of the specific formats (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); an information generating unit configured to generate a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); and a learning model unit configured to execute a machine learning algorithm on the machine learning data set to generate a classification model for identifying the attack groups (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware).
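For illustration only, the kind of Longest Common Subsequence comparison referred to above can be sketched as follows. This is a minimal sketch under stated assumptions, not code taken from Abbasi or from the application: the function names `lcs_length` and `classify`, the rule identifiers, and the class labels are hypothetical, and assigning the class whose reference sequence shares the longest subsequence with the sample is an assumed decision rule.

```python
from typing import Dict, Sequence, Tuple


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def classify(sample: Sequence[str], references: Dict[str, Sequence[str]]) -> Tuple[str, int]:
    """Return the label whose reference sequence has the longest LCS with the sample (assumed rule)."""
    best = max(references, key=lambda label: lcs_length(sample, references[label]))
    return best, lcs_length(sample, references[best])


# Hypothetical reference rule sequences keyed by hypothetical class labels.
references = {
    "class_a": ["r1", "r4", "r7", "r9"],
    "class_b": ["r2", "r3", "r5"],
}
sample = ["r1", "r2", "r4", "r9"]
label, score = classify(sample, references)
```

In this hypothetical data the sample shares the subsequence r1, r4, r9 with the first reference sequence and only r2 with the second, so the first label is selected.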

Regarding claim 2, Abbasi teaches the apparatus of claim 1. Abbasi furthermore teaches wherein the specific formats of the documents include at least one of an e-mail file format, a document file format and an executable file format (Abbasi ¶ [0037], types of formats include executable).

Regarding claim 3, Abbasi teaches the apparatus of claim 1. Abbasi furthermore teaches wherein the features include at least one of location information, language information, time information, system information, file attribute information and n-gram information (Abbasi ¶ [0040], dataset includes type designation).

Regarding claim 4, Abbasi teaches the apparatus of claim 1. Abbasi furthermore teaches wherein the information generating unit generates the machine learning data set by pre-processing for classifying the features into categorical features and numerical features (Abbasi ¶ [0040], “dataset of malware samples each comprising (1) a representation of the sample (e.g., hash value) [numerical features], (2) type designation (e.g., file, executable, etc.) [categorical features]”).
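For illustration only, pre-processing that separates extracted features into categorical and numerical groups might look like the sketch below. The feature names (`type_designation`, `language`, `file_size`, `entropy`) and the function name `split_features` are hypothetical and are not drawn from Abbasi or from the application; treating every non-numeric value as categorical is an assumption of the sketch.

```python
from typing import Any, Dict, Tuple


def split_features(record: Dict[str, Any]) -> Tuple[Dict[str, str], Dict[str, float]]:
    """Split one feature record into categorical (string) and numerical (float) features."""
    categorical: Dict[str, str] = {}
    numerical: Dict[str, float] = {}
    for name, value in record.items():
        # bool is a subclass of int in Python, so it is treated as categorical explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            numerical[name] = float(value)
        else:
            categorical[name] = str(value)
    return categorical, numerical


# Hypothetical feature record for a single document.
record = {
    "type_designation": "executable",
    "language": "ko",
    "file_size": 2048,
    "entropy": 7.2,
}
categorical, numerical = split_features(record)
```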

Regarding claim 5, Abbasi teaches the apparatus of claim 4. Abbasi furthermore teaches wherein the learning model unit executes the machine learning algorithm after the machine learning data set is classified into a training data set and a test data set (Abbasi ¶ [0037], training set is classified as a training set of a test data set; see also ¶ [0040]).

Regarding claim 8, Abbasi teaches the apparatus of claim 1. Abbasi furthermore teaches an attack group classifying unit configured to apply the classification model to an arbitrary document of a specific format to classify attack groups of the arbitrary document of the specific format (Abbasi ¶ [0040], the malware sample is assigned a class).

Regarding claim 9, Abbasi teaches a learning method for classifying attack groups comprising: collecting a data set including documents of specific formats and extracting features for identifying attack groups using the documents of the specific formats from the data set (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); generating a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); and generating a classification model for identifying the attack groups by executing a machine learning algorithm on the machine learning data set (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware).
Abbasi teaches all the limitations of claims 10-13, as asserted above with regard to claims 2-5, respectively.

Regarding claim 16, Abbasi teaches an attack group classifying method comprising: collecting a data set including documents of specific formats and extracting features for identifying attack groups using the documents of the specific formats from the data set (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); generating a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); generating a classification model for identifying the attack groups by executing a machine learning algorithm on the machine learning data set (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware); and classifying attack groups of an arbitrary document of a specific format by applying the classification model to the arbitrary document of the specific format (Abbasi ¶ [0040], the malware sample is assigned a class).

Regarding claim 17, Abbasi teaches a non-transitory computer-readable storage medium including computer-executable instructions, which cause, when executed by a processor, the processor to perform a learning method for classifying attack groups, the method comprising: collecting a data set including documents of specific formats and extracting features for identifying attack groups using the documents of the specific formats from the data set (Abbasi ¶ [0040], malware samples are a dataset which includes files, executables, etc., from which features such as type designation, rules aggregation sequence, etc., are extracted; see also Abstract regarding classifying malware [identifying attack groups] by analyzing malware samples; see also ¶ [0037], “the malware classification logic 190 is configured to generate a malware training dataset. The malware training dataset is a collection of malware samples each comprising (1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), (3) the reference rule sequence, and (4) the class name or label of the malware”); generating a machine learning data set based on the extracted features (Abbasi ¶ [0040], malware dataset is created based on extracted features of “(1) a representation of the sample (e.g., hash value), (2) type designation (e.g., file, executable, etc.), and (3) the rule aggregation sequence”; see also ¶ [0037]); generating a classification model for identifying the attack groups by executing a machine learning algorithm on the machine learning data set (Abbasi ¶ [0040], Longest Common Subsequence algorithm is executed on the generated dataset to classify malware); and classifying attack groups of an arbitrary document of a specific format by applying the classification model to the arbitrary document of the specific format (Abbasi ¶ [0040], the malware sample is assigned a class).
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA  35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. § 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 6-7 and 14-15 are rejected under 35 U.S.C. § 103 as being unpatentable over Abbasi (Pub. No. US 2017/0083703 A1) in view of Becker (Pat. No. US 6,460,049 B1).

Regarding claim 6, Abbasi teaches the apparatus of claim 5.
Abbasi does not explicitly teach wherein when the machine learning algorithm is executed, a K-fold cross validation algorithm for generating the classification model is executed after the machine learning data set is classified into a K-number of sub data sets.
However, Becker teaches wherein when the machine learning algorithm is executed, a K-fold cross validation algorithm for generating the classification model is executed after the machine learning data set is classified into a K-number of sub data sets (Becker column 20, line 67 and column 21, lines 1-8, “Estimate Accuracy uses cross-validation, resulting in longer running times. Cross-validation splits the data into k folds (commonly 10) and builds k classifiers. The process can be repeated multiple times to increase the reliability of the estimate. A user can set the number k and the number of times”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi and Becker to teach utilizing K-fold validation because it “assesses the accuracy of a classifier that would be built”. Becker column 20, line 67 and column 21, lines 1-8. Furthermore, this is merely combining prior art elements according to known methods to yield predictable results. MPEP 2143(I).
Regarding claim 7, Abbasi and Becker teach the apparatus of claim 6.
Abbasi does not explicitly teach wherein each of the sub data sets includes the training data set and the test data set at a ratio of K-1:1.
However, Becker teaches wherein each of the sub data sets includes the training data set and the test data set at a ratio of K-1:1 (Becker column 20, line 67 and column 21, lines 1-8, “Estimate Accuracy uses cross-validation, resulting in longer running times. Cross-validation splits the data into k folds (commonly 10) and builds k classifiers. The process can be repeated multiple times to increase the reliability of the estimate. A user can set the number k and the number of times”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Abbasi and Becker to teach utilizing K-fold validation because it “assesses the accuracy of a classifier that would be built”. Becker column 20, line 67 and column 21, lines 1-8. Furthermore, this is merely combining prior art elements (K-Fold) according to known methods (setting the K number) to yield predictable results. MPEP 2143(I).
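For illustration only, the relationship between K folds and a K-1:1 training-to-test ratio can be sketched as follows. This is a generic sketch of K-fold splitting, not code from Becker, Abbasi, or the application; the function name `k_fold_splits` and the sample data are hypothetical, and the sketch assumes the number of samples is a multiple of K so that the ratio is exact.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_fold_splits(samples: Sequence[T], k: int) -> List[Tuple[List[T], List[T]]]:
    """Partition samples into k folds and return k (training, test) pairs.

    In each pair one fold is held out as the test set and the remaining
    k-1 folds form the training set.
    """
    if k < 2 or k > len(samples):
        raise ValueError("k must be between 2 and the number of samples")
    # Fold i takes every k-th sample starting at index i.
    folds = [list(samples[i::k]) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        splits.append((train, test))
    return splits


# Ten hypothetical samples with K = 5: each run trains on 8 samples and tests on 2.
splits = k_fold_splits(list(range(10)), k=5)
```

With ten samples and K = 5, every run uses four folds for training and one for testing, which is the K-1:1 proportion; each sample appears in exactly one test set across the K runs.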

Abbasi and Becker teach all the limitations of claims 14-15 as asserted above with regard to claims 6-7, respectively. 
Conclusion
The following prior art made of record and not relied upon is considered pertinent to  Applicant's disclosure: 
Cakir (Bugra Cakir and Erdogan Dogdu. 2018. Malware classification using deep learning methods. In Proceedings of the ACMSE 2018 Conference (ACMSE '18). Association for Computing Machinery, New York, NY, USA, Article 10, 1–5. DOI:https://doi.org/10.1145/3190645.3190692) teaches “Malware classification using deep learning methods.” Cakir Title.
Grehant (Pub. No. US 2017/0193052 A1) teaches that “A second method is the cross-validation discussed in Kohavi, Ron. “A study of cross-validation and bootstrap for accuracy estimation and model selection.” Ijcai. Vol. 14. No. 2. 1995. Cross validation consists in breaking the available tagged data into training data and test data. The model is trained based on the training data and then tested on the test data. When tested, the output of the trained model is compared to the actual value of the target data. K-fold consists in multiple (K, e.g. K=5) cross-validations to make better use of the available tagged data. In the first cross-validation, the tagged data is split in K sets of approximatively the same size (it is approximate because the size of the tagged data set may be different from a multiple of K). Then, for each successive run, the test data set is made with samples not previously used in the test set (in previous runs), and the training data at each run is the remainder of the tagged data set. Performance of the model is measured for each run. The final performance measure is typically the average of all runs.” Grehant ¶ [0007].
Westhues (Pub. No. US 2021/0366048 A1) teaches “artificial neural network model may be validated and cross-validated using standard techniques such as hold-out, K-fold”. Westhues ¶ [0025].
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GREGORY P TOLCHINSKY whose telephone number is (571)270-0599. The examiner can normally be reached Monday-Friday (9:30-6:30 PM).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Umar Cheema, can be reached at 571-270-3037. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



Gregory P. Tolchinsky
/G.P.T./Examiner, Art Unit 2456
04/08/2022

/UMAR CHEEMA/Supervisory Patent Examiner, Art Unit 2456