Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/11/2022 has been entered.
 
Response to Arguments
Claim Rejections - 35 U.S.C. § 102
Applicant’s arguments filed on 10/11/2022, directed at the amended claims submitted on 10/11/2022, were considered but are moot in view of the new rejections made below in response to Applicant’s latest amendments.

Claim Rejections - 35 U.S.C. § 103
Applicant’s arguments, see Remarks, pages 4-6, filed on 10/11/2022, with respect to claims 42-46 have been fully considered and are persuasive.  The rejection of claims 42-46 has been withdrawn. 

Claim Objections
Claim 26 is objected to because of the following informalities: claim 26 recites “a meta data”. Because “data” is an uncountable noun, it should not be preceded by the indefinite article “a”. Appropriate correction is required.


Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 26, 27, 29-33, 35, 37, 38, 40, 41, 47 and 48 are rejected under 35 U.S.C. 103 as being unpatentable over Jarrett (US 2012/0324579), further in view of Zaitsev (US 2014/0223566).

Regarding claims 26 and 48, Jarrett teaches A method for identifying at least one digital content element, the digital content element forming a part of a set of digital content (see [0045] and Figs. 4 and 5: “Scanning module 504 is configured to scan content 118 in storage 106 using malware signatures of malware signature scan set 318 to detect malware. …For example, scanning module 504 may separately scan/search each file and/or other object of content 118 for the data/code of each malware signature. If the data/code of a malware signature is found in a file or other object of content 118, the file or other object is deemed to be infected by malware that the malware signature is configured to detect”. The Examiner interprets “the data/code of a malware signature … found in a file or other object of content 118” as at least one digital content element. The Examiner further interprets “a file or other object of content 118” as digital content), the method comprising: 
providing the digital content element (see [0069] and Fig. 8: “Scanning module 504 may scan infected content 120 in quarantine storage 116 using malware signatures of modified malware signature scan set 802 to detect malware”); 
comparing the digital content element with a first set of data (see [0069] and Fig. 8: “the content is scanned using the modified malware signature scan set to determine whether the content is infected with malware”. The Examiner interprets “the modified malware signature scan set” as a first set of data) provided by a combination of a second set of data and a third set of data, wherein the second set of data represents known data of interest to a searching entity and wherein the third set of data represents known non-identifying digital content elements (see [0068] and Figs. 7 and 8: “malware signature manager 502 may remove any malware signatures identified by signature identifiers in revocation list 124 from malware signature scan set 318 (FIG. 5) to generate a modified malware signature scan set 802 in FIG. 8”. The Examiner interprets “malware signature scan set 318 (FIG. 5)” as wherein the second set of data represents known data of interest to a searching entity. The Examiner further interprets “signature identifiers in revocation list 124” as wherein the third set of data represents known non-identifying digital content elements. And see [0035]: “antimalware provider 104 may revoke or repair some malware signatures determined or suspected as being defective, causing false positive malware determinations to occur. In such case, antimalware provider 104 may transmit a revocation list 124 to computing devices 102a-102c. Revocation list 124 identifies one or more malware signatures that have been revoked. In some embodiments, revocation list 124 may identify one or more malware signatures that may be causing false positive malware determinations for repair”); and 
if the digital content element is detected within the first set of data, then identifying the digital content element as digital content of interest (see [0069] and Figs. 7 and 8: “Scanning module 504 may scan infected content 120 in quarantine storage 116 using malware signatures of modified malware signature scan set 802 to detect malware….If content of infected content 120 is determined to be infected with malware, operation proceeds to step 712”).

Jarrett fails to teach that the known non-identifying digital content elements represented by the third set of data include one or more non-unique digital content elements relating to a file structure or a meta data.
In the same field of endeavor, Zaitsev teaches that a server-based system for generation of heuristic scripts for malware detection includes an automatic heuristics generation system for generating heuristic scripts for curing malware infections (see abstract). 
Specifically, Zaitsev teaches a combination of a second set of data and a third set of data (see [0030] and Fig. 1: “In order to avoid a large number of false positives, and to increase the effectiveness of detection of malicious software, a settings/malicious objects database 150 that contains information for malware detection, including AV settings, is used, together with a safe objects database 140”. The Examiner interprets the malicious objects database 150 as a second set of data. The Examiner further interprets the safe objects database 140 as a third set of data), 
wherein the second set of data represents known data of interest to a searching entity (see [0030] and Fig. 1: “These databases contain information about known malicious objects (in the malicious objects database 150)”) and 
wherein the third set of data represents known non-identifying digital content elements (see [0031]: “The safe object database 140 is needed in order to exclude from the heuristic scenarios detection of objects that are, in fact, safe, and to thereby reduce the number of false positives”. And see [0034] and Fig. 1: “The database 140 is used to exclude known clean objects, i.e., to prevent false positives”) including one or more non-unique digital content elements relating to a file structure or a meta data (see [0051]: “As one example, consider a case with 10 different types of metadata. The less metadata is used for the scenario generation, the higher the probability that the scenario can identify all subsequent versions of the malware, but, at the same time, the higher the probability of false positives--which in turn forces the use of more metadata, to reduce the risk of false positives. With more metadata, the probability of false positives is lower, but, so is the probability of detection of new variants of malware. The general concept is to use as little metadata as possible taking the risk of false positives into consideration, but to try to use metadata that is only specific to malicious objects, rather than clean ones”. And see [0050]: “The source of the data for these scripts are the databases 120, 140 and 150. The databases contain some metadata that are encountered for both clean and malicious objects. The scenarios search through these metadata, to definitively identify a program as safe or malicious. Any information provided by the client computers 100 can be useful in these databases for both the generation of the scripts and for detecting false positives and taking further measures to reduce them”.  
The Examiner interprets the database 140 “used to exclude known clean objects, i.e., to prevent false positives” (see [0034]), wherein the database 140 contains “some metadata that are encountered for both clean and malicious objects” (see [0050]), as wherein the third set of data represents known non-identifying digital content elements including one or more non-unique digital content elements relating to a file structure or a meta data.
And see [0047]: “2. A control check is performed, to see if legitimate files (i.e., safe objects) with the same metadata had been encountered, for example, checking for which system functions it calls (WinAPI), which libraries it loads (DLLs), what it modifies in other process (interprocess interaction), such as changing cookies in a browser, changes of variables in services and drivers, which connections it opens, whether it is signed, and by whom, how often it is launched, and on which computer, from where it is downloaded (e.g., from a trusted website, or not), which files it creates, modifies, deletes, etc. (e.g., hosts or autorun.inf, ini files in a system folder). The metadata can be static (one file only) or dynamic (based on a running process). If yes, then this type of metadata cannot be used easily for the scenario. In this situation additional metadata may be used, such as described above, or a decision may be taken that this particular variety of malware cannot be detected using metadata alone, without a risk of false positive”). 

Both Jarrett and Zaitsev teach reducing false positives in malware detection by excluding a third set of data from a second set of data, wherein the second set of data represents known data of interest to a searching entity and wherein the third set of data represents known non-identifying digital content elements. Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to substitute the revocation list 124 of Jarrett with the database 140 “used to exclude known clean objects” and containing “some metadata that are encountered for both clean and malicious objects” taught by Zaitsev as the third set of data representing known non-identifying digital content elements. It would have been obvious because Zaitsev teaches the following: “A control check is performed, to see if legitimate files (i.e., safe objects) with the same metadata had been encountered… If yes, then this type of metadata cannot be used easily for the scenario… a decision may be taken that this particular variety of malware cannot be detected using metadata alone, without a risk of false positive” (see [0047]). When the substitution described above is made, Jarrett modified in view of Zaitsev would teach wherein the third set of data represents known non-identifying digital content elements including one or more non-unique digital content elements relating to a file structure or a meta data, as recited in claim 26.

Regarding claim 27, Jarrett further teaches wherein the step of comparing comprises: 
comparing the digital content element with the second set of data; and if the digital content element is detected within the second set of data (see [0033] and Fig. 5: “Quarantine storage 116 is used to store content of working storage 114 that is determined to be infected with malware by antimalware client 110a during a malware scanning process. The content stored in quarantine storage 116 is moved from working storage 114 and stored as infected content 120”), then comparing the digital content element with the third set of data, and 
wherein the step of identifying comprises: if the digital content element is detected within the second set of data (see [0064]-[0066] and Fig. 8: “the following signature identifiers may be included in signature identifiers 510 in quarantine storage 116 with associated content: 34567 movieplayer.exe  12345 virus.exe, runkey  As shown in the above example, the signature identifier 34567 may be included in signature identifiers 510 in association with the file "movieplayer.exe" stored in quarantine storage 116. As such, the malware signature identified by signature identifier 34567 was used to detect malware in moveiplayer.exe. Furthermore, the signature identifier 12345 may be included in signature identifiers 510 in association with the files "virus.exe" and "runkey" stored in quarantine storage 116. As such, the malware signature identified by signature identifier 12345 was used to detect malware in each of virus.exe and runkey”) and if the digital content element is not detected within the third set of data, then identifying the digital content element as digital content of interest (see [0063] and Fig. 8: “malware signature manager 502 is configured to compare revocation list 124 to signature identifiers 510 in quarantine storage 116 to determine whether any content of infected content 120 was previously determined by scanning module 504 to be infected by a revoked malware signature”. And see [0067]: “Continuing this example, malware signature manager 502 may compare the example of revocation list 124 above (containing signature identifiers 34567 and 32187) to signature identifiers 510 (containing signature identifiers 34567 and 12345). For instance, in an embodiment, malware signature manager 502 may include a compare module configured to perform the comparison. 
By the comparison, malware signature manager 502 may determine that the malware signature identified by signature identifier 34567 used to detect malware in the file movieplayer.exe is present in revocation list 124, and therefore a revoked malware signature was used to detect malware in movieplayer.exe”. The Examiner interprets signature identifier 12345 in association with the files "virus.exe" and "runkey" remaining in quarantine storage 116 after comparing the example of revocation list 124 above (containing signature identifiers 34567 and 32187) to signature identifiers 510 (containing signature identifiers 34567 and 12345)  as if the digital content element is detected within the second set of data and if the digital content element is not detected within the third set of data, then identifying the digital content element as digital content of interest).

Regarding claim 29, Jarrett further teaches wherein the step of comparing comprises: creating the first set of data by subtracting the third set of data from the second set of data; and comparing the digital content element with the first set of data, and wherein the step of identifying comprises: if the digital content element is detected within the first set of data, then identifying the digital content element as digital content of interest (see [0007]: “A revocation list is received that indicates one or more signature identifiers for revoked malware signatures. If the signature identifier stored in association with the stored content is included in the revocation list, the content is rescanned using the modified malware signature scan set to determine whether the content is infected with malware. Furthermore, the malware signature is removed from a malware signature scan set to generate a modified malware signature scan set. … If the rescan indicates the content is infected with malware defined by the modified malware signature scan set, one or more signature identifiers for malware signatures in the modified malware signature scan set used to detect the content as infected with malware are stored in association with the content in the quarantine storage”).

Regarding claim 30, Jarrett further teaches wherein creating the first set of data comprises: comparing each element in the second set of data with each element in the third set of data; and if an element of the second set of data is not detected within the third set of data, then adding the element to the first set of data (see [0007]: “A revocation list is received that indicates one or more signature identifiers for revoked malware signatures. If the signature identifier stored in association with the stored content is included in the revocation list, the content is rescanned using the modified malware signature scan set to determine whether the content is infected with malware. Furthermore, the malware signature is removed from a malware signature scan set to generate a modified malware signature scan set”. And see [0067]: “Continuing this example, malware signature manager 502 may compare the example of revocation list 124 above (containing signature identifiers 34567 and 32187) to signature identifiers 510 (containing signature identifiers 34567 and 12345). For instance, in an embodiment, malware signature manager 502 may include a compare module configured to perform the comparison. By the comparison, malware signature manager 502 may determine that the malware signature identified by signature identifier 34567 used to detect malware in the file movieplayer.exe is present in revocation list 124, and therefore a revoked malware signature was used to detect malware in movieplayer.exe”).
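
For illustration only, the element-by-element construction of the first set of data recited in claim 30 can be sketched as follows. The sketch is not taken from Jarrett; the function name and the sample identifiers are hypothetical, and Python sets merely stand in for whatever data structures an actual implementation would use.

```python
def build_first_set(second_set, third_set):
    """Keep each element of the second set that is not detected in the third set."""
    first_set = set()
    for element in second_set:
        # An element found in the third set (known non-identifying) is left out.
        if element not in third_set:
            first_set.add(element)
    return first_set


# Hypothetical sample identifiers, chosen only to exercise the function.
second_set = {"34567", "12345"}
third_set = {"34567", "32187"}
first_set = build_first_set(second_set, third_set)
```

With these sample values the result contains only "12345". The loop is equivalent to the set difference `second_set - third_set`, which corresponds to the subtraction recited in claim 29.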

Regarding claim 31, Jarrett further teaches wherein the set of digital content comprises at least one data file, and wherein the digital content element is a fragment of the data file (see [0033]: “Content 118 may include any type of content, including files (e.g., data files, audio files, video files, configuration files, web page files, etc.), applications, and/or other content objects”).

Regarding claim 32, Jarrett further teaches wherein the digital content element is defined in the structure of the set of digital content (see [0045] and Figs. 4 and 5: “Scanning module 504 is configured to scan content 118 in storage 106 using malware signatures of malware signature scan set 318 to detect malware. …For example, scanning module 504 may separately scan/search each file and/or other object of content 118 for the data/code of each malware signature. If the data/code of a malware signature is found in a file or other object of content 118, the file or other object is deemed to be infected by malware that the malware signature is configured to detect”. The Examiner interprets “the data/code of a malware signature … found in a file or other object of content 118” as the digital content element… defined in the structure of the set of digital content).

Regarding claim 33, Jarrett further teaches wherein the digital content element is a block (see [0045] and Figs. 4 and 5: “Scanning module 504 is configured to scan content 118 in storage 106 using malware signatures of malware signature scan set 318 to detect malware. …For example, scanning module 504 may separately scan/search each file and/or other object of content 118 for the data/code of each malware signature. If the data/code of a malware signature is found in a file or other object of content 118, the file or other object is deemed to be infected by malware that the malware signature is configured to detect”. The Examiner interprets “the data/code of a malware signature … found in a file or other object of content 118” as wherein the digital content element is a block).

Regarding claim 35, Jarrett further teaches wherein the block corresponds to one of: a memory block; a disk storage block; a disk storage sector; or a block comprising at least one data file (see [0064]-[0066]: “For instance, in one example provided for purposes of illustration, the following signature identifiers may be included in signature identifiers 510 in quarantine storage 116 with associated content: 34567 movieplayer.exe 12345 virus.exe, runkey”).

Regarding claim 37, Jarrett further teaches wherein the digital content element has been encoded by way of one of: a hashing function; or a locality-sensitive hashing function (see [0038]: “antimalware clients 110 may be configured to generate the identifiers of the signature identifier list based on the corresponding malware signatures stored at antimalware clients 110 (e.g., by performing hashes of the malware signatures using a hash generator, etc.)”).

Regarding claim 38, Jarrett further teaches wherein at least one of the second set of data or the third set of data have been encoded by way of a hashing function (see [0038]: “antimalware clients 110 may be configured to generate the identifiers of the signature identifier list based on the corresponding malware signatures stored at antimalware clients 110 (e.g., by performing hashes of the malware signatures using a hash generator, etc.)”).

Regarding claim 40, Jarrett further teaches receiving a fourth set of data identified as known, wherein the fourth set of data comprises misidentified digital content elements; comparing the fourth set of data with the second set of data; and if a misidentified digital content element is detected within the second set of data, adding the misidentified digital content element to the third set of data (see [0006]: “A revocation list is generated that includes one or more signature identifiers for malware signatures previously used to detect malware in the content that was manually restored. The revocation list is transmitted to the plurality of clients. The revocation list indicates revoked malware signatures to the clients so that the revoked malware signatures are not used in subsequent scanning, and so that quarantined content may be rescanned without using the revoked malware signatures to determine whether the quarantined content is infected”).
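
For illustration only, the update of the third set of data recited in claim 40 can be sketched as follows. The sketch is an assumption made for explanatory purposes and is not a description of how Jarrett generates its revocation list; all names and sample values are hypothetical.

```python
def update_third_set(fourth_set, second_set, third_set):
    """Add to the third set each misidentified element that is detected in the second set."""
    updated = set(third_set)  # copy, so the caller's set is not modified
    for element in fourth_set:
        if element in second_set:
            updated.add(element)
    return updated


# Hypothetical sample values.
fourth_set = {"34567", "77777"}   # elements reported as misidentified
second_set = {"34567", "12345"}   # known data of interest
third_set = {"32187"}             # known non-identifying elements
new_third_set = update_third_set(fourth_set, second_set, third_set)
```

In this sketch "34567" is added to the third set because it appears in both the fourth set and the second set, while "77777" is ignored because it is not present in the second set.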

Regarding claim 41, Jarrett further teaches wherein at least one of the second set of data and the third set of data comprises a plurality of respective subsets of data (see [0033]: “Content 118 may include any type of content, including files (e.g., data files, audio files, video files, configuration files, web page files, etc.), applications, and/or other content objects”).

Regarding claim 47, Jarrett further teaches A method for removing digital content elements misidentified as known from a set of digital content identified as known, the method comprising: providing a fourth set of data identified as known according to the method of claim 26, wherein the fourth set of data comprises misidentified digital content elements; and adding the misidentified digital content element to the third set of data (see [0055]: “Referring back to FIG. 6, in step 604, a revocation list is generated that includes a signature identifier for the particular malware signature. For example, referring to FIG. 3, in response to receiving a threshold number of restore indications 316 that include a particular signature identifier from clients, malware signatures distributor 304 may add the signature identifier to a revocation list”. And see [0006]: “A revocation list is generated that includes one or more signature identifiers for malware signatures previously used to detect malware in the content that was manually restored. The revocation list is transmitted to the plurality of clients. The revocation list indicates revoked malware signatures to the clients so that the revoked malware signatures are not used in subsequent scanning, and so that quarantined content may be rescanned without using the revoked malware signatures to determine whether the quarantined content is infected”).

Claim 28 is rejected under 35 U.S.C. 103 as being unpatentable over Jarrett (US 2012/0324579), further in view of Zaitsev (US 2014/0223566), and further in view of Kratzer (US 2008/0168558).

Regarding claim 28, Jarrett modified in view of Zaitsev fails to teach wherein the step of comparing comprises: comparing the digital content element with the third set of data; and if the digital content element is not detected within the third set of data, then comparing the digital content element with the second set of data, and wherein the step of identifying comprises: if the digital content element is detected within the second set of data and if the digital content is not detected within the third set of data, then identifying the digital content as digital content of interest.
In the same field of endeavor, Kratzer teaches wherein the step of comparing comprises: comparing the digital content element with the third set of data (see [0025] and Fig. 5: “at step 510 it is determined whether the detected network data has an associated entry in the current whitelist”); and
if the digital content element is not detected within the third set of data, then comparing the digital content element with the second set of data, and wherein the step of identifying comprises: if the digital content element is detected within the second set of data and if the digital content is not detected within the third set of data, then identifying the digital content as digital content of interest (see [0026]: “Referring to FIG. 5, if it is determined that the current whitelist does not include an entry associated with the detected network data, then at step 560, one or more intrusion prevention policies is applied”).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to improve the malware identification method of Jarrett modified in view of Zaitsev by letting the step of comparing comprise: comparing the digital content element with the third set of data; and if the digital content element is not detected within the third set of data, then comparing the digital content element with the second set of data, and wherein the step of identifying comprises: if the digital content element is detected within the second set of data and if the digital content is not detected within the third set of data, then identifying the digital content as digital content of interest, as taught by Kratzer. It would have been obvious because Kratzer teaches: “when the processing load of the IPS appliance 140 (FIG. 1) is above the predetermined threshold level, the intrusion prevention system functionality is dynamically executed to achieve greater effective throughput by selectively analyzing network data using the whitelist of safe network data for non-malicious content to minimize processing load” (see Kratzer [0020]).
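
For illustration only, the ordering recited in claim 28, in which the third set of data is consulted before the second set of data, can be sketched as follows. The sketch is not Kratzer's implementation; the function name, the `stats` counter, and the sample values are hypothetical and serve only to show that a hit in the third set avoids the second comparison.

```python
def classify(element, third_set, second_set, stats):
    """Return True if the element is identified as digital content of interest."""
    stats["third_lookups"] += 1
    if element in third_set:
        # Detected in the third set: the second comparison is skipped entirely.
        return False
    stats["second_lookups"] += 1
    return element in second_set


# Hypothetical sample values.
stats = {"third_lookups": 0, "second_lookups": 0}
results = [
    classify(e, {"safe"}, {"bad"}, stats)
    for e in ("safe", "bad", "other")
]
```

Only the elements that miss the third set incur a lookup in the second set, which is consistent with the reduced processing load that Kratzer gives as the reason for consulting the whitelist first.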

Claim 34 is rejected under 35 U.S.C. 103 as being unpatentable over Jarrett (US 2012/0324579), further in view of Zaitsev (US 2014/0223566), and further in view of Romanenko (US 2013/0139265).

Regarding claim 34, Jarrett modified in view of Zaitsev fails to teach wherein the block corresponds to a network packet or a payload portion of a network packet.
In the same field of endeavor, Romanenko teaches wherein the block corresponds to a network packet or a payload portion of a network packet (see [0005]: “A list of clean objects is constructed for …e-mail messages”. The Examiner interprets e-mail message as a payload portion of a network packet).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to improve the malware identification method of Jarrett modified in view of Zaitsev by letting the block correspond to a network packet or a payload portion of a network packet, as taught by Romanenko. It would have been obvious because doing so predictably achieves the commonly understood benefit of detecting malware in a network packet or a payload portion of a network packet.

Claim 36 is rejected under 35 U.S.C. 103 as being unpatentable over Jarrett (US 2012/0324579), further in view of Zaitsev (US 2014/0223566), and further in view of Bogorad (US 8,621,625).

Regarding claim 36, Jarrett modified in view of Zaitsev fails to teach wherein the block has a fixed size.
In the same field of endeavor, Bogorad teaches wherein the block has a fixed size (see col. 4, lines 44-54: “After identifying the unchecked file, scanning module 104 may identify a set of characteristics of the unchecked file (step 304). The set of characteristics may include one or more characteristics and/or attributes of the unchecked file. For example, the set of characteristics may include the file name of the unchecked file, information that may be indicative of the presence of one or more code sequences generally indicative of malware, one or more hashes of one or more portions of the unchecked file, and/or one or more hashes of one or more sections or other fixed-size chunks of the unchecked file”).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to improve the malware identification method of Jarrett modified in view of Zaitsev by letting the block undergoing malware detection have a fixed size, as taught by Bogorad. It would have been obvious because Bogorad teaches that doing so enables hashes of the fixed-size chunks of the unchecked file to be generated to determine whether the unchecked file is related to a clean file in a set of known-clean files (see col. 5, lines 51-62).
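
For illustration only, hashing fixed-size chunks of a file can be sketched as follows. The sketch is an assumption and does not reproduce Bogorad's implementation; the chunk size of 4096 bytes, the choice of SHA-256, and the function name are hypothetical.

```python
import hashlib


def chunk_hashes(data, chunk_size=4096):
    """Return one SHA-256 hex digest per fixed-size chunk of `data` (bytes)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        hashlib.sha256(data[offset:offset + chunk_size]).hexdigest()
        for offset in range(0, len(data), chunk_size)
    ]


# Hypothetical input: 10,000 identical bytes split into 4096-byte chunks.
hashes = chunk_hashes(b"a" * 10000)
```

In this sketch the final chunk is shorter than the others when the input length is not a multiple of the chunk size. Identical chunks yield identical digests, so the digests can be compared against the digests of chunks from known files.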

Claim 39 is rejected under 35 U.S.C. 103 as being unpatentable over Jarrett (US 2012/0324579), further in view of Zaitsev (US 2014/0223566), and further in view of Elliot (US 2014/0280155).

Regarding claim 39, Jarrett modified in view of Zaitsev fails to teach wherein the second set of data and the third set of data is one of: a cuckoo filter; or a bloom filter.
In the same field of endeavor, Elliot teaches wherein the second set of data and the third set of data is one of: a cuckoo filter; or a bloom filter (see [0028]: “Exemplary embodiments will now be described that solve the first problem discussed above, i.e., efficiently comparing a particular object (hereinafter a "target object") to all objects in a corpus. The disclosed embodiments utilize a Bloom filter to identify slugs associated with objects in the corpus that do not match the slug for the target object. This quick recognition is performed by discarding slugs that are associated with a different bin in the Bloom filter than the bin associated with the slug for the target object”).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to improve the malware identification method of Jarrett modified in view of Zaitsev by letting the second set of data and the third set of data be a bloom filter, as taught by Elliot. It would have been obvious because Elliot teaches the following: “Bloom filters have the property that two slugs falling into different bins within the Bloom filter are certain to have different properties and thus reflect different objects. Therefore, if the slug for the target object does not fall into the same bin as the slug for a particular object in the corpus, the target object does not match the particular object in the corpus and may thus be removed from future consideration in such embodiments” (see [0029]).
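
For illustration only, a set of data held as a bloom filter can be sketched as follows. The sketch shows the conventional textbook structure, a bit array probed at several hash-derived positions, and is not a description of Elliot's bins; the class name, the sizes, and the use of SHA-256 are hypothetical.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the item with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical usage with a byte-string element.
third_set_filter = BloomFilter()
third_set_filter.add(b"34567")
```

A negative answer from `might_contain` is conclusive, so an element that misses the filter can be removed from further consideration without an exact comparison. A positive answer may be a false positive and would ordinarily be confirmed against the underlying data.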

Allowable Subject Matter
Claims 42-46 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHIMEI ZHU whose telephone number is (571)270-7990. The examiner can normally be reached 10am-6pm Monday-Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Farid Homayounmehr, can be reached at 571-272-3739. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ZHIMEI ZHU/
Examiner, Art Unit 2495