FINAL ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Amendment A, received on 17 May 2021, has been entered into the record. In this amendment, claims 1, 10, and 16 have been amended.
Claims 1-20 are presented for examination.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 17 May 2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Response to Arguments
With regard to the objection to the abstract, the applicant has submitted amendments, and the examiner hereby withdraws the objection.
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 10,621,349 B2 in view of Saxe et al. (US Patent 9,690,938 B1 and Saxe hereinafter). The ‘349 Patent teaches the instant claims except for receiving a feature vector comprising a plurality of features derived from a file; determining, by a machine learning model using the feature vector, that the file comprises malicious code; preventing, based on the determining, the file from being accessed or executed. Nonetheless, these features are well known in the art and would have been an obvious modification of the teachings disclosed by ‘349, as taught by Saxe. Saxe discloses compressing a received file and adding each of the hash values into a vector (col. 4, lines 25-37) that is received by machine learning processes to determine whether the file is classified as malware (col. 3, lines 11-15; col. 5, lines 28-30), and taking appropriate action such as deleting or quarantining the file when the file is identified as malware (col. 18, lines 8-21). Given the teaching of Saxe, a person having ordinary skill in the art before the effective filing date of the claimed invention would have readily recognized the desirability and advantages of modifying the teachings of the ‘349 Patent with the teachings of Saxe by receiving a feature vector, determining a file has malicious code, and preventing the file from being accessed or executed in order to provide protection against malware threats while using machine learning techniques to reduce the amount of time needed for threat detection (col. 1, lines 39-41).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
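A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.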

Claims 1-5, 8-14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Saxe in view of Zhou et al. (US 2013/0291111 A1 and Zhou hereinafter), and further in view of Tristan et al. (US 2019/0095805 A1 and Tristan hereinafter).
As to claim 1, Saxe discloses a system and method for machine learning based malware detection, the system and method having:
receiving a feature vector comprising a plurality of features derived from a file (col. 4, lines 25-37; col. 5, lines 28-30); 
determining, by a machine learning model using the feature vector, that the file comprises malicious code (col. 3, lines 11-15; col. 5, lines 28-30); 
preventing, based on the determining, the file from being accessed or executed (col. 18, lines 8-21); 
wherein the machine learning model is generated by: 
hashing, each of a plurality of features in a feature set, to result in a corresponding identifier value, wherein the feature set is generated from a sample and the sample includes at least a portion of a file (col. 21, lines 1-11, 22-24; Figure 8).
Saxe fails to specifically disclose:
indexing, based on the hashing, one or more hashed features to generate a plurality of index vectors, wherein values in the index vector are populated based on the identifier values;
generating, using the index vectors, a training dataset; 
training, using the training dataset, the machine learning model using the training dataset. 

Zhou discloses a system and method for program identification based on machine learning, the system and method having:
generating, using the index vectors (feature vectors), a training dataset (training models) (0117, lines 13-16; 0399, lines 15-20); 
training, using the training dataset, the machine learning model using the training dataset (0399, lines 9-20). 
Given the teaching of Zhou, a person having ordinary skill in the art before the effective filing date of the claimed invention would have readily recognized the desirability and advantages of modifying the teachings of Saxe with the teachings of Zhou by training a machine learning model to identify a file with a malicious code. Zhou recites motivation by disclosing that training a machine learning model to identify malicious code in a file allows for providing protection while saving manpower and improving identification efficiency (0399, lines 20-22). It is obvious that the teachings of Zhou would have improved the teachings of Saxe by training a machine learning model to identify malicious code in a file in order to provide protection in an efficient manner.
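For illustration only, the generating and training steps discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Zhou or from the application: the perceptron is a stand-in chosen for brevity, and the names train_perceptron and predict, the two-element vectors, and the labels (1 for malicious, 0 for benign) are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label): 1 = malicious, 0 = benign


def predict(weights: List[float], bias: float, x: List[float]) -> int:
    """Return 1 if the linear score is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train_perceptron(dataset: List[Sample], epochs: int = 10, lr: float = 1.0):
    """Train a perceptron on a labeled training dataset of feature vectors.

    Hypothetical stand-in for "training the machine learning model using
    the training dataset"; the cited references do not specify this algorithm.
    """
    dim = len(dataset[0][0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        for x, y in dataset:
            err = y - predict(weights, bias, x)
            if err != 0:
                # Move the decision boundary toward the misclassified sample.
                weights = [w + lr * err * xi for w, xi in zip(weights, x)]
                bias += lr * err
    return weights, bias


# Training dataset generated from (hypothetical) feature vectors.
training_dataset = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
weights, bias = train_perceptron(training_dataset)
```

Once trained, predict(weights, bias, vector) classifies a new feature vector; in the claimed arrangement a positive result would trigger the preventing step.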

Saxe in view of Zhou fails to specifically disclose:
indexing, based on the hashing, one or more hashed features to generate a plurality of index vectors, wherein values in the index vector are populated based on the identifier values.
Nonetheless, this feature is well known in the art and would have been an obvious modification of the teachings disclosed by Saxe in view of Zhou, as taught by Tristan.
Tristan discloses a system and method for ensemble decision using feature hashing models, the system and method having:
indexing, based on the hashing, one or more hashed features to generate a plurality of index vectors, wherein values in the index vector are populated based on the identifier values (0021, lines 8-12; 0031, lines 1-17; 0032, lines 1-5).
Given the teaching of Tristan, a person having ordinary skill in the art before the effective filing date of the claimed invention would have readily recognized the desirability and advantages of modifying the teachings of Saxe in view of Zhou with the teachings of Tristan by indexing hashed features to generate index vectors. Tristan recites motivation by disclosing that mapping the set of features into another set by using a hash function reduces the size of the feature set (0020, lines 8-13). It is obvious that the teachings of Tristan would have improved the teachings of Saxe in view of Zhou by indexing hashed features to generate index vectors in order to reduce the size of the feature set.
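For illustration only, the indexing limitation discussed above (hashing each feature to an identifier value and populating a fixed-size index vector from those values) can be sketched as follows. This is a minimal sketch of generic feature hashing under stated assumptions, not the method of Tristan, Saxe, or the application: the function names, the choice of SHA-256, the vector size of 16, and the sample feature names are all hypothetical.

```python
import hashlib
from typing import Dict, List


def hash_feature(name: str) -> int:
    """Map a feature name to a stable integer identifier value."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def index_vector(features: Dict[str, float], size: int = 16) -> List[float]:
    """Build a fixed-size index vector populated from hashed feature names.

    Each feature's identifier value selects a slot (identifier mod size);
    features that collide in the same slot are summed.
    """
    vec = [0.0] * size
    for name, value in features.items():
        slot = hash_feature(name) % size
        vec[slot] += value
    return vec


# Hypothetical features derived from a file sample.
sample_features = {"section_count": 3.0, "imports_kernel32": 1.0}
vec = index_vector(sample_features, size=16)
```

The vector length is fixed by size regardless of how many distinct feature names occur, which is the size reduction the cited motivation refers to.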

As to claims 10 and 16, Saxe discloses:
at least one programmable data processor (col. 3, lines 32-39); 
memory storing instructions which, when executed by the at least one programmable data processor, result in operations comprising (col. 3, lines 40-51): 
receiving a feature vector comprising a plurality of features derived from a file (col. 4, lines 25-37; col. 5, lines 28-30); 
determining, by a machine learning model using the feature vector, that the file comprises malicious code (col. 3, lines 11-15; col. 5, lines 28-30); 
preventing, based on the determining, the file from being accessed or executed (col. 18, lines 8-21); 
wherein the machine learning model is generated by: 
hashing, each of a plurality of features in a feature set, to result in a corresponding identifier value, wherein the feature set is generated from a sample and the sample includes at least a portion of a file (col. 21, lines 1-11, 22-24; Figure 8).
Saxe fails to specifically disclose:
indexing, based on the hashing, one or more hashed features to generate a plurality of index vectors, wherein values in the index vector are populated based on the identifier value;
generating, using the index vector, a training dataset; 
training, using the training dataset, the machine learning model using the training dataset. 
Nonetheless, these features are well known in the art and would have been an obvious modification of the teachings disclosed by Saxe, as taught by Zhou.
Zhou discloses:
generating, using the index vectors (feature vectors), a training dataset (training models) (0117, lines 13-16; 0399, lines 15-20); 
training, using the training dataset, the machine learning model using the training dataset (0399, lines 9-20). 
Given the teaching of Zhou, a person having ordinary skill in the art before the effective filing date of the claimed invention would have readily recognized the desirability and advantages of modifying the teachings of Saxe with the teachings of Zhou by training a machine learning model to identify a file with a malicious code. Please refer to the motivation recited above with respect to claim 1 as to why it is obvious to apply the teachings of Zhou to the teachings of Saxe.

Saxe in view of Zhou fails to specifically disclose:
indexing, based on the hashing, one or more hashed features to generate a plurality of index vectors, wherein values in the index vector are populated based on the identifier value.
Nonetheless, this feature is well known in the art and would have been an obvious modification of the teachings disclosed by Saxe in view of Zhou, as taught by Tristan.
Tristan discloses:
indexing, based on the hashing, one or more hashed features to generate a plurality of index vectors, wherein values in the index vector are populated based on the identifier value (0021, lines 8-12; 0031, lines 1-17; 0032, lines 1-5).
Given the teaching of Tristan, a person having ordinary skill in the art before the effective filing date of the claimed invention would have readily recognized the desirability and advantages of modifying the teachings of Saxe in view of Zhou with the teachings of Tristan by indexing hashed features to generate index vectors. Please refer to the motivation recited above with respect to claim 1 as to why it is obvious to apply the teachings of Tristan to the teachings of Saxe in view of Zhou.

As to claims 2, 11, and 17, Saxe fails to specifically disclose:
wherein a format of the file is selected from a group consisting of: a portable executable format, a document format, a file format, an executable format, a script format, an image format, a video format, and an audio format. 
Nonetheless, these features are well known in the art and would have been an obvious modification of the teachings disclosed by Saxe, as taught by Zhou.
Zhou discloses:
wherein a format of the file is selected from a group consisting of: a portable executable format, a document format, a file format, an executable format, a script format, an image format, a video format, and an audio format (0115, lines 1-2). 
Given the teaching of Zhou, a person having ordinary skill in the art before the effective filing date of the claimed invention would have readily recognized the desirability and advantages of modifying the teachings of Saxe with the teachings of Zhou by using a particular file format. Please refer to the motivation recited above with respect to claim 1 as to why it is obvious to apply the teachings of Zhou to the teachings of Saxe.

As to claims 3, 12, and 18, Saxe discloses:
wherein the index includes a value corresponding to a hashed feature (col. 22, lines 14-23). 
Saxe does not explicitly disclose wherein the index includes a value corresponding to a sign attribute; however, Saxe discloses an index corresponding to field names that are hashed, where hash values are limited into a range such as [0,16) (col. 21, lines 22-33, 49-64; col. 22, lines 14-23). Since values can be limited to a range, where the range only includes positive values, the values can be considered a sign attribute.

As to claims 4, 13, and 19, Saxe discloses:
wherein the value is determined based on a name of each hashed feature (col. 11, lines 14-23). 

As to claims 5, 14, and 20, Saxe does not explicitly disclose wherein the sign attribute includes at least one of the following: a positive value and a negative value; however, Saxe discloses an index corresponding to field names that are hashed, where hash values are limited into a range such as [0,16) (col. 21, lines 22-33, 49-64; col. 22, lines 14-23). Since values can be limited to a range, where the range only includes positive values, the values can be considered a sign attribute including positive values.
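For illustration only, the two notions discussed above (a hash value limited to a range such as [0,16), and a sign attribute that is positive or negative) can be contrasted in a short sketch. This is a generic signed-hashing sketch under stated assumptions and is not drawn from Saxe or from the application: the function name bucket_and_sign, the use of SHA-256, and the rule that derives the sign from one bit of the digest are hypothetical.

```python
import hashlib
from typing import Tuple


def bucket_and_sign(name: str, size: int = 16) -> Tuple[int, int]:
    """Return (bucket, sign) for a feature name.

    bucket is the hash value limited to the range [0, size).
    sign is +1 or -1, taken from the low bit of a separate digest byte
    (a hypothetical rule used only for this illustration).
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % size
    sign = 1 if digest[8] & 1 == 0 else -1
    return bucket, sign


bucket, sign = bucket_and_sign("field_name")
```

In this sketch the bucket is always non-negative, while the sign is a separate value that may be positive or negative.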

As to claim 8, Saxe discloses:
wherein the index vector has a predetermined size (col. 22, lines 14-23). 

As to claim 9, Saxe discloses:
wherein at least one of the hashing, the indexing, the generating, and the training is performed by at least one processor of at least one computing system, wherein the computing system comprises: at least one software component, at least one hardware component, and any combination thereof (col. 22, lines 50-61). 

Prior Art Made of Record
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Dirac et al. (US 2015/0379430 A1) discloses a system and method for efficient duplicate detection for machine learning data sets.
Pearmain et al. (US 2018/0082191 A1) discloses a system and method for proximal factorization machine interface engine.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARAH SU whose telephone number is (571)270-3835.  The examiner can normally be reached on 7:30 AM - 4:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lynn Feild, can be reached on 571-272-2092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SARAH SU/Primary Examiner, Art Unit 2431