DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

2.	Claims 1-18 are pending.

Information Disclosure Statement
3.	The information disclosure statement (IDS) submitted on 5/10/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
4.	The drawings have been reviewed and are accepted as being in compliance with the provisions of 37 CFR 1.121.

Priority
5.	Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed on 5/10/2021.
Claim Rejections - 35 USC § 101
6.	35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


7.	Claims 1-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Step 1: Claim 1 recites a “method of comparing a candidate file with an exemplar file, comprising: receiving the candidate file comprising candidate file data; processing the candidate file data to generate a candidate file fingerprint representing the candidate file…comparing the candidate file fingerprint with an exemplar file fingerprint representing the exemplar file…wherein, processing the candidate file data to generate a candidate file fingerprint representing the candidate file, comprises: applying a rolling hash function to the candidate file data to generate a sequence of strings, and adding to the candidate file fingerprint a fingerprint string comprising a substring from the sequence of strings when a predetermined string pattern appears in the sequence of strings.…” The claim recites a series of steps and therefore is a process.

Step 2A Prong One: Claim 1 recites “comparing”, specifically comparing “a candidate file fingerprint” comprising a plurality of strings and “comparing” portions of those files. These limitations are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind, but for the recitation of “processing”, which could use generic computer components.

Step 2A Prong Two: The judicial exception is not integrated into a practical application. The claim recites the additional elements of "receiving the candidate file comprising candidate file data...", which amounts to data gathering (MPEP 2106.05(g)), and "processing the candidate file data to generate a candidate file fingerprint", which is a mere generic presentation of another file fingerprint of collected and analyzed data. See “An application program interface for extracting and processing information from a diversity of types of hard copy documents – Content Extraction, 776 F.3d at 1345, 113 USPQ2d at 1356” (MPEP 2106.05(g)).

As per Claim 2, the claim recites generating the exemplar file fingerprint by “receiving the exemplar file comprising exemplar file data; and processing the exemplar file data to generate the exemplar file fingerprint representing the exemplar file, the exemplar file fingerprint comprising the plurality of fingerprint strings each representing a portion of the exemplar file data; wherein, processing the exemplar file data to generate the exemplar file fingerprint, comprises: applying the rolling hash function to the exemplar file data to generate a sequence of strings, and adding to the exemplar file fingerprint a fingerprint string comprising a substring from the sequence of strings when the predetermined string pattern appears in the sequence of strings.” At this step a person could observe the sets of information (“strings”) and, based on their similarity, give a response, which is a mental process; reciting “processing” does not amount to an additional element. These limitations are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind, but for the recitation of generic computer components.

As per Claim 3, the claim recites comparing and calculating a Jaccard similarity index across the fingerprint strings of the candidate file fingerprint and the exemplar file fingerprint. At this step a person could observe the sets of information (“strings”) and, based on their similarity, give a response, which is a mental process; reciting “processing” does not amount to an additional element. These limitations are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind, but for the recitation of generic computer components.

“The Jaccard similarity is calculated by dividing the number of observations in both sets by the number of observations in either set. In other words, the Jaccard similarity can be computed as the size of the intersection divided by the size of the union of two sets.” At this step a person could observe the sets of information (“strings”) and, based on their similarity, give a response, which is a mental process. These limitations are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind, but for the recitation of generic computer components.
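NOTE: For illustration only, the definition quoted above (size of the intersection divided by the size of the union) can be sketched in Python as follows. The function name and the sample fingerprint strings are hypothetical and are not taken from the application or from any cited reference; treating two empty sets as identical is likewise an assumption.

```python
def jaccard_index(a, b):
    """Return |A intersect B| / |A union B| for two collections of fingerprint strings."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # assumption: two empty fingerprints are treated as identical
    return len(set_a & set_b) / len(union)


# Hypothetical fingerprint strings, for illustration only.
candidate = {"3f9a", "77c0", "a1b2", "e4d5"}
exemplar = {"3f9a", "a1b2", "0c0c"}

# Two strings are shared and five distinct strings exist overall: 2 / 5.
print(jaccard_index(candidate, exemplar))  # 0.4
```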

As per Claim 4, the claim recites comparing and computing values to match the exemplar file. At this step a person could compare and compute values. These limitations are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind, but for the recitation of generic computer components (“computing”).

As per Claim 5, the claim recites receiving, processing, and applying a rolling function to the sequence of predetermined file fingerprints, and disposing the files in a common directory, or on a common disk, or distributed across an estate of computers and/or associated storage systems. The claim specifically recites distributing the files and applying a rolling hash function. These limitations are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind (“applying rolling function”), but for the recitation of generic computer components.

As per Claim 6, the claim recites that the candidate file and/or the exemplar file is an executable file or a Dynamic Link Library file. At this step, using a program to compare the “candidate file” is the mere use of a program, rather than the addition of an inventive concept.
“A dynamic link library (DLL) is a collection of small programs that larger programs can load when needed to complete specific tasks. The small program, called a DLL file, contains instructions that help the larger program handle what may not be a core function of the original program.”

As per Claim 7, the claim recites that applying a rolling hash function to the candidate file data to generate a sequence of strings comprises executing a Rabin-Karp Rolling Hash algorithm. At this step, adding a particular algorithm associated with string searching and pattern matching is the mere use of that algorithm, rather than the addition of an inventive concept; it is hence considered a mental process using a “mathematical formula”.
See Gottschalk v. Benson, in which a patent for an algorithm for converting binary coded decimal to pure binary, based on a programmed conversion of numerical information, was held to be patent-ineligible. It was held that a mathematical formula with no substantial practical application except in connection with a digital computer cannot be patented.
“The Rabin-Karp algorithm is a string-searching algorithm that uses hashing to find patterns in strings. A string is an abstract data type that consists of a sequence of characters.”
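NOTE: For illustration only, a Rabin-Karp style polynomial rolling hash may be sketched in Python as follows. The window size, base, and modulus are arbitrary assumptions chosen for the sketch; they are not values disclosed by the application or by any cited reference, and the function name is hypothetical.

```python
def rolling_hashes(data: bytes, window: int, base: int = 257, mod: int = 1_000_000_007):
    """Yield (offset, hash) for every window of `window` bytes in `data`.

    Each hash is derived from the previous one in constant time by dropping
    the oldest byte and appending the newest, instead of rehashing the window.
    """
    if window <= 0 or len(data) < window:
        return
    high = pow(base, window - 1, mod)  # weight of the byte leaving the window
    h = 0
    for byte in data[:window]:
        h = (h * base + byte) % mod
    yield 0, h
    for i in range(window, len(data)):
        h = (h - data[i - window] * high) % mod  # drop the oldest byte
        h = (h * base + data[i]) % mod           # shift and append the newest byte
        yield i - window + 1, h


# Identical windows produce identical hashes: "abc" occurs at offsets 0 and 3.
hashes = dict(rolling_hashes(b"abcabc", 3))
print(hashes[0] == hashes[3])  # True
```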

Claims 10-17 are rejected for the same reasons as Claims 1-7 and are wholly directed towards generating fingerprints representing a candidate file, in which a “rolling hash function is applied”, which is clearly performing a mathematical calculation regarding the fingerprints. “A hash function is a mathematical function that converts a numerical input value into another compressed numerical value”. See Gottschalk v. Benson.
The mentioned claims also add the functionality of submasks applied to the string sequences, which is comparing positions of the sequence of strings. These limitations, under their broadest reasonable interpretation, cover performance of the limitation in the mind (comparing the strings or “masking”), but for the recitation of generic computer components.

Claims 8-9 and 18 merely state that the instructions are executable by a processor, but do not necessitate that the instructions are executed by the processor. As such, the step of execution is optional and each claim is directed towards instructions (e.g. non-functional descriptive material) tangibly embodied on the computer program product, which is clearly disclosed above as non-statutory.
Merely claiming nonfunctional descriptive material, i.e., abstract ideas, stored on a computer-readable medium, in a computer, or on an electromagnetic carrier signal, does not make it statutory. See Diehr, 450 U.S. at 185-86, 209 USPQ at 8 (noting that the claims for an algorithm in Benson were unpatentable as abstract ideas because “[t]he sole practical application of the algorithm was in connection with the programming of a general purpose computer.”).

Claim Rejections - 35 USC § 102
8.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


9.	Claim(s) 1-18 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kriz et al (US 2017/0279841).
As per Claim 1, Kriz discloses:
A method of comparing a candidate file with an exemplar file, comprising: receiving the candidate file comprising candidate file data; (Par [0020], “In some aspects, the offset (i.e., position from the beginning of the data) of the anchor value can be determined.” Par [0029], “the file can be parsed in to components and the components in the file can be analyzed”; see “anchor value”) processing the candidate file data to generate a candidate file fingerprint representing the candidate file, the candidate file fingerprint comprising a plurality of fingerprint strings each representing a portion of the candidate file data; (See Figure 3, “fingerprint generator” and Par [0019], “For example, properties of a rolling hash function (e.g., values where at least five bits equal to zero) may be used to determine anchor values.”) and comparing the candidate file fingerprint with an exemplar file fingerprint representing the exemplar file, (Par [0031], “the data similarity fingerprint can be useful in other application environments. Such environments can include log file analysis, comparing text or binary files, or other file/data comparison environment”)
the exemplar file comprising exemplar file data and the exemplar file fingerprint comprising a plurality of fingerprint strings each representing a portion of the exemplar file data; (Par [0003], “In the field of malware detection, it can be useful to determine if a file or a portion of a file is similar to a file that is known to contain malware, or is known to be free of malware”) wherein, processing the candidate file data to generate a candidate file fingerprint representing the candidate file, comprises: applying a rolling hash function to the candidate file data to generate a sequence of strings, and adding to the candidate file fingerprint a fingerprint string comprising a substring from the sequence of strings when a predetermined string pattern appears in the sequence of strings. (See Figures 1 and 3 and Par [0019], “For example, properties of a rolling hash function (e.g., values where at least five bits equal to zero) may be used to determine anchor values. The anchor values may be determined by a machine programmed with the desired anchor values to be used for the various data types or using selected properties or functions applied to the data object.”).
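NOTE: For illustration only, the selection step discussed above (retaining a value when a predetermined pattern appears) can be sketched in Python as follows. The sketch assumes that “at least five bits equal to zero” is read as the five low-order bits being zero, that each hash value is rendered as an eight-character hexadecimal string, and that the first four characters are kept as the fingerprint string. None of these choices, nor the function name or sample values, is taken from Kriz or from the application.

```python
def select_fingerprint_strings(hash_values, zero_bits=5, keep=4):
    """Keep a substring of each hash string whose low `zero_bits` bits are zero."""
    mask = (1 << zero_bits) - 1
    fingerprint = []
    for value in hash_values:
        text = format(value, "08x")          # hash rendered as a string (assumed encoding)
        if value & mask == 0:                # assumed pattern: low-order bits all zero
            fingerprint.append(text[:keep])  # substring retained as the fingerprint string
    return fingerprint


# Hypothetical hash values; only the first and third end in five zero bits.
sample_hashes = [0x1234ABE0, 0x00000F21, 0xDEADBE40, 0x0BADF00D]
print(select_fingerprint_strings(sample_hashes))  # ['1234', 'dead']
```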

As per Claim 2, the rejection of claim 1 is incorporated and Kriz further discloses: wherein the exemplar file fingerprint is generated by: receiving the exemplar file comprising exemplar file data; and processing the exemplar file data to generate the exemplar file fingerprint representing the exemplar file, the exemplar file fingerprint comprising the plurality of fingerprint strings each representing a portion of the exemplar file data; (See Figures 1-3, comparing and generating according to portions based on the “anchor values”) wherein, processing the exemplar file data to generate the exemplar file fingerprint, comprises: applying the rolling hash function to the exemplar file data to generate a sequence of strings, and adding to the exemplar file fingerprint a fingerprint string comprising a substring from the sequence of strings when the predetermined string pattern appears in the sequence of strings. (See Figures 1-3, comparing and generating according to portions based on the “anchor values” and see also par [0019], “rolling hash function”).

As per Claim 3, the rejection of claim 1 is incorporated and Kriz further discloses: wherein comparing the candidate file fingerprint with the exemplar file fingerprint representing the exemplar file, comprises: calculating a Jaccard similarity index across the fingerprint strings of the candidate file fingerprint and the exemplar file fingerprint. (Par [0016], “The choice of anchor values and the resulting distances between anchor values can be used to generate coordinates for a data similarity fingerprint. In some aspects, there are N coordinates for a data similarity fingerprint, where N is the number of different anchor values… (Euclidean) metric. This allows efficient (approximate) algorithms for nearest neighbor searches, for example kd or k-means trees when comparing data similarity fingerprint values to be used”)
NOTE: “The Jaccard similarity is calculated by dividing the number of observations in both sets by the number of observations in either set. In other words, the Jaccard similarity can be computed as the size of the intersection divided by the size of the union of two sets”

As per Claim 4, the rejection of claim 1 is incorporated and Kriz further discloses: wherein comparing the candidate file fingerprint with the exemplar file fingerprint representing the exemplar file, comprises: computing a value indicative of the similarity of the comparison, and further comprising: indicating, based on a predetermined threshold of the value, that the candidate file matches the exemplar file. (Par [0016], “The choice of anchor values and the resulting distances between anchor values can be used to generate coordinates for a data similarity fingerprint. In some aspects, there are N coordinates for a data similarity fingerprint, where N is the number of different anchor values… (Euclidean) metric. This allows efficient (approximate) algorithms for nearest neighbor searches, for example kd or k-means trees when comparing data similarity fingerprint values to be used”; computed value giving “anchor value” and generate coordinates for a data similarity fingerprint; see also Figure 1).

As per Claim 5, the rejection of claim 1 is incorporated and Kriz further discloses: second candidate file comprising second candidate file data; and processing the at least a second candidate file data to generate at least a second candidate file fingerprint representing the at least a second candidate file, the at least a second candidate file fingerprint comprising a plurality of fingerprint strings each representing a portion of the at least a second candidate file data; (See Figures 1-3 and paragraphs [0019-0020]) and wherein, processing the at least a second candidate file data to generate at least a second candidate file fingerprint representing the at least a second candidate file, comprises: applying the rolling hash function to the at least a second candidate file data to generate a sequence of strings, (See Figure 3, “fingerprint generator” and Par [0019], “For example, properties of a rolling hash function (e.g., values where at least five bits equal to zero) may be used to determine anchor values.”) and adding to the candidate file fingerprint a fingerprint string comprising a substring from the sequence of strings when the predetermined string pattern appears in the sequence of strings; (Par [0014], “highlighted. As shown in FIG. 1, example 106 has four instances of the anchor value “0a.” A distance between anchor values comprises the number of data units between instances of an anchor value” and Par [0023-0024], “At block 210, the coordinates for each anchor value are assembled into a vector representing the data similarity fingerprint”; see also Figures 1 and 2) and wherein the candidate file and the at least a second candidate file are disposed in a common directory, or on a common disk, or distributed across an estate of computers and/or associated storage systems. (Par [0037], “The disk drive unit 416 includes a machine-readable medium 422 on which is stored one or more sets of instructions 424 and data structures (e.g., software instructions)”).

As per Claim 6, the rejection of claim 1 is incorporated and Kriz further discloses: wherein the candidate file and/or the exemplar file is an executable file or a Dynamic Link Library file. (Par [0019], “For example, a set of anchor values may be chosen for PE (portable executable) files typically found in Android OS environments, while a different set of anchor values may be chosen when the object data comprises IOS executable files”).

As per Claim 7, the rejection of claim 1 is incorporated and Kriz further discloses: wherein: applying a rolling hash function to the candidate file data to generate a sequence of strings comprises executing a Rabin-Karp Rolling Hash algorithm. (Par [0019], “For example, properties of a rolling hash function (e.g., values where at least five bits equal to zero) may be used to determine anchor values”).

As per Claim 8, Kriz discloses: A computer program product comprising instructions which when executed on a processor cause the processor to carry out the method according to claim 1. (See Claim 1 and par [0026]).
 
As per Claim 9, the rejection of Claim 1 is incorporated and further Kriz discloses: wherein the method is performed on a single core of a processor. (Par [0026]).

As per Claim 10, Kriz discloses:
A method of generating a candidate file fingerprint representing a candidate file, comprising: receiving the candidate file comprising candidate file data; and processing the candidate file data to generate the candidate file fingerprint representing the candidate file, the candidate file fingerprint comprising a plurality of fingerprint strings each representing a portion of the candidate file data; (Par [0020], “In some aspects, the offset (i.e., position from the beginning of the data) of the anchor value can be determined.” Par [0029], “the file can be parsed in to components and the components in the file can be analyzed”, the “anchor value” being the portion) wherein, processing the candidate file data to generate a candidate file fingerprint representing the candidate file, comprises: applying a rolling hash function to the candidate file data to generate a sequence of strings; (See Figure 3, “fingerprint generator” and Par [0019], “For example, properties of a rolling hash function (e.g., values where at least five bits equal to zero) may be used to determine anchor values.”) and adding to the candidate file fingerprint a fingerprint string comprising a substring from the sequence of strings when a predetermined string pattern appears in the sequence of strings. (See Figures 1 and 3 and Par [0019], “For example, properties of a rolling hash function (e.g., values where at least five bits equal to zero) may be used to determine anchor values. The anchor values may be determined by a machine programmed with the desired anchor values to be used for the various data types or using selected properties or functions applied to the data object.”).

As per Claim 11, the rejection of Claim 10 is incorporated and Kriz further discloses: wherein adding to the candidate file fingerprint the fingerprint string comprising a substring from the sequence of strings when the predetermined string pattern appears in the sequence of strings, comprises: applying a submask to the sequence of strings. (Par [0020], “At block 204, the first instances of the anchor values are located by a computer system (see FIG. 4). In some aspects, the offset (i.e., position from the beginning of the data) of the anchor value can be determined.” The “offset” being the submask, related to the position of the “anchor value”).

As per Claim 12, the rejection of Claim 11 is incorporated and Kriz further discloses: strings, comprises: for each of n positions in the submask, comparing a value in the submask with a corresponding value in each string in the sequence of strings, and adding to the candidate file fingerprint the fingerprint string comprising the substring from the sequence of strings if every value in the submask is identical to its corresponding value in the string in the sequence of strings. (Claim 1, “the processor automatically adding the single value to the similarity fingerprint.” and Par [0020], “At block 204, the first instances of the anchor values are located by a computer system (see FIG. 4). In some aspects, the offset (i.e., position from the beginning of the data) of the anchor value can be determined.” The “offset” being the submask, related to the position of the “anchor value”; see also Figure 1).
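NOTE: For illustration only, the position-by-position submask comparison recited in Claim 12 can be sketched in Python as follows. The sketch assumes that the submask is aligned with the leading positions of each string and that the retained substring is the remainder of the string after the submask positions. These choices, the function names, and the sample strings are hypothetical and are not taken from the application or from Kriz.

```python
def matches_submask(string, submask):
    """True when every submask position equals the corresponding string position."""
    if len(string) < len(submask):
        return False
    return all(string[i] == submask[i] for i in range(len(submask)))


def apply_submask(strings, submask):
    """Collect the substring following the submask from every matching string."""
    return [s[len(submask):] for s in strings if matches_submask(s, submask)]


# Hypothetical sequence of hash strings; two of the three begin with "00".
sequence = ["00a7f3", "01b2c4", "0099ee"]
print(apply_submask(sequence, "00"))  # ['a7f3', '99ee']
```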

As per Claim 13, the rejection of Claim 10 is incorporated and Kriz further discloses: wherein adding to the candidate file fingerprint the fingerprint string comprising the substring from the sequence of strings when the predetermined string pattern appears in the sequence of strings, comprises: only adding to the candidate file fingerprint the fingerprint string comprising the substring from the sequence of strings if said fingerprint string is distinct from every other fingerprint string already included in the candidate file fingerprint. (Par [0016], “The choice of anchor values and the resulting distances between anchor values can be used to generate coordinates for a data similarity fingerprint. In some aspects, there are N coordinates for a data similarity fingerprint, where N is the number of different anchor values… (Euclidean) metric. This allows efficient (approximate) algorithms for nearest neighbor searches, for example kd or k-means trees when comparing data similarity fingerprint values to be used”; computed value giving “anchor value” and generate coordinates for a data similarity fingerprint; see also Figure 1).

As per Claim 14, the rejection of Claim 10 is incorporated and Kriz further discloses: further comprising: processing the candidate file data to generate a second candidate file fingerprint representing the candidate file, the second candidate file fingerprint comprising a plurality of fingerprint strings each representing a portion of the candidate file data; (See Figures 1-3 and paragraphs [0019-0020])
wherein, processing the candidate file data to generate a second candidate file fingerprint representing the candidate file, comprises: applying a second rolling hash function to the candidate file data to generate a second sequence of strings, and adding to the second candidate file fingerprint a fingerprint string comprising a substring from the second sequence of strings when a second predetermined string pattern appears in the second sequence of strings; and wherein the second candidate file fingerprint is generated simultaneously with the candidate file fingerprint. (Par [0014], “highlighted. As shown in FIG. 1, example 106 has four instances of the anchor value “0a.” A distance between anchor values comprises the number of data units between instances of an anchor value”; “For example, in the case of a string, the number of data units is the number of characters. For binary data, the distance can be the number of bytes between anchor values. In example 106, there is one character between the first instance of anchor value “0a” and the second instance, thus the distance is one (1). The distance between the second instance of the anchor value “0a” and the third instance is eleven” and Par [0023-0024], “At block 210, the coordinates for each anchor value are assembled into a vector representing the data similarity fingerprint”; see also Figures 1 and 2).
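NOTE: For illustration only, the distance between instances of an anchor value described in the quoted passage (the number of data units lying between consecutive instances) can be sketched in Python as follows. The sample string is hypothetical and is not the data of example 106 in Kriz; it is constructed only so that the distances come out as one and eleven, the values mentioned in the quotation. The function name is likewise an assumption.

```python
def anchor_distances(data: str, anchor: str):
    """Return the number of characters between consecutive instances of `anchor`."""
    positions = []
    start = data.find(anchor)
    while start != -1:
        positions.append(start)
        start = data.find(anchor, start + len(anchor))  # skip past this instance
    # Distance counts only the characters strictly between two instances.
    return [nxt - (cur + len(anchor)) for cur, nxt in zip(positions, positions[1:])]


# Hypothetical data: "0a", one filler character, "0a", eleven filler characters, "0a".
sample = "0a" + "-" + "0a" + "-" * 11 + "0a"
print(anchor_distances(sample, "0a"))  # [1, 11]
```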

As per Claim 15, the rejection of Claim 10 is incorporated and Kriz further discloses: further comprising: linking the candidate file fingerprint to the candidate file. (Par [0014], “highlighted. As shown in FIG. 1, example 106 has four instances of the anchor value “0a.” A distance between anchor values comprises the number of data units between instances of an anchor value” and par [0023-0024], “At block 210, the coordinates for each anchor value are assembled into a vector representing the data similarity fingerprint” see also Figures 1 and 2, “assembling” being the linking as claimed).

As per Claim 16, the rejection of Claim 10 is incorporated and Kriz further discloses: wherein the candidate file is an executable file or a Dynamic Link Library file. (Par [0019], “For example, a set of anchor values may be chosen for PE (portable executable) files typically found in Android OS environments, while a different set of anchor values may be chosen when the object data comprises IOS executable files”).

As per Claim 17, the rejection of Claim 10 is incorporated and Kriz further discloses: wherein: applying a rolling hash function to the candidate file data to generate a sequence of strings comprises executing a Rabin-Karp Rolling Hash algorithm. (Par [0019], “For example, properties of a rolling hash function (e.g., values where at least five bits equal to zero) may be used to determine anchor values”).

As per Claim 18, the rejection of Claim 10 is incorporated and Kriz further discloses: A computer program product comprising instructions which when executed on a processor cause the processor to carry out the method according to claim 10. (See Claim 10 and Par [0026]).

Conclusion
10.	The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
	Ren (US 2020/0019605) relates to FILE FINGERPRINT GENERATION, specifically generating a first hash based on the first sequence; selecting a second sequence from the string of characters based on the first sequence, wherein the second sequence is shifted from the first sequence; generating a second hash based on the second sequence; and generating a fingerprint for the file based on the first hash and the second hash.
11.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANGELICA RUIZ whose telephone number is (571)270-3158. The examiner can normally be reached M-F 10:00 am to 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre M Vital can be reached on (571) 272-4215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ANGELICA RUIZ/           Primary Examiner, Art Unit 2162