DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/12/2021 has been entered.
 
Response to Arguments
35 U.S.C. 103
	Applicant’s arguments filed with respect to the rejection(s) of claims 1-20 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, and in light of Applicant’s amendments, new grounds of rejection are made in view of Tatarinov (U.S Pub # 20140366137).
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 15 and 18-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  Regarding claims 15 and 18-20, the instant claims recite the limitation of "one or more computer-readable media storing instructions."  The claims are directed toward an article of manufacture and normally would be statutory.  However, the specification, at paragraph 0015, defines or exemplifies the computer-readable storage medium in an open-ended and non-limiting manner such as "Another innovative aspect of the subject matter described in this specification can be embodied in one or more non-transitory computer readable media storing instructions that when executed by one or more computers cause the one or more computers to perform operations."  Thus, under the broadest reasonable interpretation of "computer readable media," the claim is directed toward non-statutory type computer-readable storage media such as transitory signals and propagating waves.  Applicant is advised to amend the respective claims to exclude such transitory embodiments by adding “non-transitory” to the computer readable media, which would render the claims statutory.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 7-8, 11, 14-15 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Schipka (U.S Pub # 20090013405) in view of Tatarinov (U.S Pub # 20140366137) and in further view of Stranne (U.S Pub # 20110154495).
With regards to claim 1, Schipka discloses a method of clustering files by a file characterization system comprising one or more computers, wherein the method comprises:
receiving, by the one or more computers, a plurality of files, wherein each file in the plurality of files is in a format, and the formats of the files include formats that are different from each other ([0049-0050] scanning system can receive and handle multiple different file formats);
for each of the files of the plurality of files:
determining, by the one or more computers, a format of the file ([0053] input file 100 is supplied to the file format identifier 21 which determines the file format of the file);
accessing data identifying, for each of the file formats, a respective set of file features to be extracted from files having that format, wherein the respective set of file features for the file format is different from the respective sets of file features for other respective file formats ([0050] For each file format, the scanning system 1 uses a set of predetermined features which include features based on the file format);
selecting, by the one or more computers and based on the format of the file, file features associated with the format, the file features being the respective set of file features identified for the file format of a file ([0050, 0053] For each file format, the scanning system 1 uses a set of predetermined features which include features based 
extracting, by the one or more computers and for each file feature of the file features, a respective feature value for the file feature from the file ([0059] analyses the file 100 to detect the set of features which define the feature space in respect of the given file format to which the analyzer is specific. [0063] Some features may be simply indicated to be present or not, for example indicated by a binary value in the representation 23. Other features may have associated therewith a value which varies over a range. In this case the value may be present in the representation 24). 
	Schipka does not disclose, however Tatarinov discloses:
	wherein files that have matching feature values for each file feature of the file features have a same hash ([0028] hash sums comparison may be used on all types of resources);
storing, in an index, the hash, and for each hash, an identification of each file for which the hash was generated from the feature values for the file ([0025] table 2 [0026] in addition to the known resources of malicious executable files, the resource database may store known resources of executable not containing malicious code.);
receiving a file that is a known malware file ([0024] receive an executable file that is analyzed against known resources of malicious executable files);
selecting, based on the format of the known malware file, file features associated with the format ([0029-0035] selecting features based on the type/format of files);
generating, by the one or more computers and based on the feature values, a hash for the file ([0028-0029] hash sum analysis).  

	One of ordinary skill in the art would have been motivated to make this modification in order to detect malicious executable files based on the similarity of various types of extractable of the executable files (Tatarinov [0007]).
Stranne discloses:
	extracting, for each file feature of the file features, a respective feature value for the file feature from the known malware file ([0060] extract a set of binary comparable features.);
generating, based on the feature values, a hash for the known malware file ([0060] Then the hash of these features are calculated);
submitting, as a search query to search the index, the generated hash for the known malware file ([0061] the look-up table is accessed to look up the calculated hashes);
receiving, in response to submitting the search query, data identifying all files that are indexed by the generated hash ([0062] each time a hash is located in the look-up table, this table entry is marked in a suitable manner); and
identifying the files received as belonging to a family of malware files that includes the known malware file ([0063] if a data collection is found to include all features of a specific genetic signature, the data collection is determined to belong to the variant of malware associated with this signature. [0050] where a variant is a set of malware with a genetic signature).

	One of ordinary skill in the art would have been motivated to make this modification in order to identify a copy of a particular file by simple hash detection (Stranne [0006]).
	Claims 8 and 15 correspond to claim 1 and are rejected accordingly.
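For illustration only, the steps recited in claim 1 can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from the application or from Schipka, Tatarinov, or Stranne: the magic-byte format check, the per-format feature sets in FEATURES_BY_FORMAT, all function names, and the use of SHA-256 are hypothetical choices made for the example.

```python
import hashlib
from collections import defaultdict

# Hypothetical feature extractors; each returns a feature value as a string.
def _size(data):
    return str(len(data))

def _pe_header(data):
    return data[:2].hex()

def _pdf_version(data):
    return data[5:8].decode("ascii", errors="replace")

# Assumed mapping: each file format has its own, different set of features.
FEATURES_BY_FORMAT = {
    "pe": [("size", _size), ("header", _pe_header)],
    "pdf": [("version", _pdf_version)],
}

def determine_format(data):
    """Determine the file format from leading magic bytes (assumption)."""
    if data.startswith(b"MZ"):
        return "pe"
    if data.startswith(b"%PDF-"):
        return "pdf"
    return "unknown"

def feature_hash(data):
    """Select features by format, extract their values, and hash them."""
    fmt = determine_format(data)
    extractors = FEATURES_BY_FORMAT.get(fmt, [])
    values = [f"{name}={fn(data)}" for name, fn in extractors]
    joined = "\n".join([fmt] + values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def build_index(files):
    """Map each hash to the identifiers of all files that produced it."""
    index = defaultdict(list)
    for file_id, data in files.items():
        index[feature_hash(data)].append(file_id)
    return dict(index)

def find_family(index, known_malware):
    """Query the index with the hash of a known malware file."""
    return index.get(feature_hash(known_malware), [])
```

In this sketch, files whose format-specific feature values all match receive the same hash, so querying the index with the hash of a known malware file returns every indexed file sharing those values. For example, with files `{"a.exe": b"MZ\x90\x00", "b.exe": b"MZ\x91\x01", "c.pdf": b"%PDF-1.7 x"}`, a query with `b"MZ\xff\xff"` returns `["a.exe", "b.exe"]`.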
	With regards to claim 4, Schipka further discloses:
wherein at least one file feature of the file features is: a file size, a file type, or a metadata value ([0055] file type).  
Claims 11 and 18 correspond to claim 4 and are rejected accordingly.
With regards to claim 7, Schipka further discloses:
identifying, by the one or more computers and based on the format of the file, predetermined file features ([0050] For each file format, the scanning system 1 uses a set of predetermined features which include features based on the file format), and 
updating, in response to extracting the respective feature values by the one or more computers and based on the values of the extracted respective feature values, the predetermined file features ([0045] The training subsystem 30 is operated periodically to update the parameters as new reference files 100 are added to the corpus).  
	Claim 14 corresponds to claim 7 and is rejected accordingly.
Claims 6, 12-13 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Schipka (U.S Pub # 20090013405) in view of Tatarinov (U.S Pub # 20140366137) and Stranne (U.S Pub # 20110154495), and in further view of Krukov.
With regards to claim 6, Schipka does not disclose, however Krukov discloses:
combining the feature values to generate a combined representation of the features of the file ([0041] the first set of features is combined with the byte representation of the second set of features by means of a concatenation); and 
applying a hashing function to the combined representation to generate the hash of the file ([0041] compute the hash of the compound file 105 with the use of the first and second sets of features).  
It would have been obvious for one of ordinary skill in the art before the date the current invention was effectively filed to have combined the antivirus system of Schipka, Tatarinov and Stranne with the system of Krukov to identify files from their hash.
	One of ordinary skill in the art would have been motivated to make this modification in order to calculate and compare a hash sum of files in a database to identify a file as malicious (Krukov [0006]).
Claims 13 and 20 correspond to claim 6 and are rejected accordingly.
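The combining and hashing steps recited in claim 6 can be sketched as below. This is an illustrative sketch only: Krukov [0041] is cited above only for a concatenation of byte representations, so the length-prefixed encoding, the SHA-256 hashing function, and the function names here are assumptions and are not drawn from the application or the cited references.

```python
import hashlib

def combine_feature_values(values):
    """Concatenate byte values, prefixing each with a 4-byte big-endian length.

    The length prefix is an assumed design choice: it keeps the boundaries
    between feature values unambiguous in the combined representation.
    """
    combined = bytearray()
    for value in values:
        combined += len(value).to_bytes(4, "big")
        combined += value
    return bytes(combined)

def hash_combined(values):
    """Apply a hashing function (SHA-256, assumed) to the combined representation."""
    return hashlib.sha256(combine_feature_values(values)).hexdigest()
```

With plain concatenation, the value lists `[b"ab", b"c"]` and `[b"a", b"bc"]` would produce the same combined bytes and therefore the same hash; the length prefix in this sketch keeps them distinct.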
With regards to claim 12, Schipka does not disclose, however Krukov discloses:
indexing, in an index that lists a plurality of files by respective hashes for the plurality of files, the file using the generated hash ([0046] a file may be a file whose hash and information about which is stored in the database of hashes 130).  
It would have been obvious for one of ordinary skill in the art before the date the current invention was effectively filed to have combined the antivirus system of Schipka, Tatarinov and Stranne with the system of Krukov to identify files from their hash.

Claim 19 corresponds to claim 12 and is rejected accordingly.
Conclusion

                                                                                                                                                      
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TONY WU whose telephone number is (571)272-2033.  The examiner can normally be reached on Monday-Friday (9-5).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone, can be reached on (703)270-3750.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 






/T.W./Examiner, Art Unit 2166
/MARK D FEATHERSTONE/Supervisory Patent Examiner, Art Unit 2166