DETAILED ACTION

Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The present office action is responsive to communications received on 12/17/2020. Claims 1-20 are pending.

Response to Arguments
The arguments/remarks filed by the applicant on 12/17/2020 have been fully considered and are responded to below.

Applicant's amendments to the claims have overcome the objections previously set forth in the Non-Final Office Action mailed 10/6/2020. All previous claim objections have been withdrawn.

Applicant’s arguments, ‘the "score" of McDougal is distinguishable from the sensitivity score presently claimed. McDougal explains that its score "is indicative of a likelihood that the communication is a malicious communication." By contrast, claim 1 recites: "wherein the sensitivity score is indicative of a probability of the electronic file containing sensitive or protected information." In light of these shortcomings, McDougal fails to disclose: "determining, by the traffic analysis service, a sensitivity score for the electronic file based on the file metadata, wherein the sensitivity score is indicative of a probability of the electronic file containing sensitive or protected information," as recited in claim 1, and similarly recited in claims 10 and 19 (as amended). Therefore, claims 1, 10 and 19 are patentable over 

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-6, 8-15, and 17-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to 
In particular, the Examiner finds that the Specification does not provide adequate support for the full scope of the claim limitation “determining … a sensitivity score for the electronic file based on the file metadata, wherein the sensitivity score is indicative of a probability of the electronic file containing sensitive or protected information” recited in claims 1, 10 and 19.  According to MPEP 2161.01(I) regarding determining whether there is adequate written description for a computer-implemented functional claim limitation, generic claim language in the original disclosure does not satisfy the written description requirement if it fails to support the scope of the genus claimed. Ariad, 598 F.3d at 1349-50, 94 USPQ2d at 1171 ("[A]n adequate written description of a claimed genus requires more than a generic statement of an invention’s boundaries.") (citing Eli Lilly, 119 F.3d at 1568, 43 USPQ2d at 1405-06); Enzo Biochem, Inc. v. Gen-Probe, Inc., 323 F.3d 956, 968, 63 USPQ2d 1609, 1616 (Fed. Cir. 2002) (holding that generic claim language appearing in ipsis verbis in the original specification did not satisfy the written description requirement because it failed to support the scope of the genus claimed); Fiers v. Revel, 984 F.2d 1164, 1170, 25 USPQ2d 1601, 1606 (Fed. Cir. 1993) (rejecting the argument that "only similar language in the specification or original claims is necessary to satisfy the written description requirement").
In this instance, the Examiner notes that the claims cover determination of the sensitivity score for all forms of file metadata, whereas the Specification only appears to support particular species, i.e. words used in the file path and/or file name, word weight and frequency of occurrence, and malware detected on an endpoint, to determine the sensitivity score; see [p.28, line 19 - p.29, line 13]. For example, the claims would also cover other forms of file metadata, e.g. file types such as docx, pdf, xlsx, etc., which are not adequately supported by the Specification. Thus, the Examiner finds that the Specification does not support the scope of the genus claimed.

The dependent claims, with the exception of claims 7 and 16, that are included in the statement of rejection but not specifically addressed in the body of the rejection have inherited the deficiencies of their parent claims and have not resolved those deficiencies. Therefore, they are rejected based on the same rationale as applied to their parent claims above.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4, 8, 10, 11, 13, 17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over McDougal (US 20130081142 A1) in view of Grzymala-Busse (US 20100024037 A1).

Regarding claim 1, McDougal teaches a method comprising:
obtaining, by a traffic analysis service that monitors a network, file metadata regarding an electronic file; (FIG. 6: intercept communication (step 605); extract meta data (step 610);) Here McDougal discloses intercepting a communication at a first node of a security system. The method also includes extracting communication metadata associated with the communication. The communication metadata comprises a plurality of different fields. The method additionally includes determining if the communication comprises an attached file and, if so, extracting file metadata associated with the file. (¶2)
determining, by the traffic analysis service, a sensitivity score for the electronic file based on the file metadata; (determine score for each field of meta data (step 615); combine score from each field to generate combined score (step 625);)
detecting, by the traffic analysis service, the electronic file within traffic in the network; ([0029] FIG. 2: File type module 220 may be configured to determine the type of file that ingest module 210 receives. File type module 220 may also be configured to determine the metadata for a communication. File type module 220 may determine the type of a file. For example, file type module 220 may examine an extension associated with the file to determine the type of the file. As another example, file type module 220 may examine portions of the file in order to determine its type. File type module 220 may look at where it received the file to determine its type. File type module 220 may look at characters (magic numbers) in a header of a file to determine its type. In this manner, file type module 220 may detect the correct type of the file even if the file's extension has been removed or changed. As another example, certain types of files may be determined based on both magic number(s) and the file extension. File type module 220 may parse out and store each field value for each metadata field.) Here McDougal discloses detailed examples of detecting a file within traffic in the network.
causing, by the traffic analysis service, performance of a mitigation action regarding the detection of the electronic file within the traffic, based on the sensitivity score of the electronic file. 

McDougal teaches determining a sensitivity score, but does not explicitly teach wherein the sensitivity score is indicative of a probability of the electronic file containing sensitive or protected information. This aspect of the claim is identified as a difference.
However, Grzymala-Busse in an analogous art explicitly teaches 
determining, by the traffic analysis service, a sensitivity score for the electronic file based on the file metadata, wherein the sensitivity score is indicative of a probability of the electronic file containing sensitive or protected information; ([0002, 0032] identifying sensitive information and quarantining or removing same from computers and computer networks. FIG 2: flow chart of a data identification and sanitization. At step 130 “data parsing”. If data parser does find potentially sensitive information, the data is analyzed at step 140 “information retrieval/data validation/metadata creation” to determine if the data “makes sense” (i.e. the data is compared to attributes relating to sensitive information to determine whether the data exhibits any of those attributes) in the context of being sensitive information. If the data does “make sense” it is scored at step 150 “metadata evaluation/analysis/scoring”.)
causing, by the traffic analysis service, performance of a mitigation action regarding the detection of the electronic file within the traffic, based on the sensitivity score of the electronic file. ([0032] The security evaluator can be set by the user to define a desired level of scrutiny. The level may 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the “classifying communications” concept of McDougal, and the “identifying potentially sensitive information” approach of Grzymala-Busse, for providing identity theft security and easing the burden of businesses in securing sensitive information and complying with externally-imposed standards of security by identifying sensitive information and quarantining or removing same from computers and computer networks and by intercepting sensitive information and directing its further processing or storage (Grzymala-Busse [0002]).
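For illustration only, the per-field scoring and score combination cited above from McDougal FIG. 6 (steps 615 and 625), followed by a score-based mitigation decision, can be sketched as follows. This is a minimal sketch and not the method of McDougal, Grzymala-Busse, or the applicant: the field names, the scoring rules, the use of a plain mean to combine scores, the 0.5 threshold, and the function names score_fields, combine_scores and choose_action are all hypothetical assumptions.

```python
# Hypothetical sketch: score each metadata field, combine the scores,
# and pick a mitigation action. All names, rules and thresholds are
# illustrative assumptions, not taken from any cited reference.
from typing import Callable, Dict

FieldScorer = Callable[[str], float]


def score_fields(metadata: Dict[str, str],
                 scorers: Dict[str, FieldScorer]) -> Dict[str, float]:
    """Score every metadata field that has a scorer (cf. step 615)."""
    return {name: scorers[name](value)
            for name, value in metadata.items() if name in scorers}


def combine_scores(field_scores: Dict[str, float]) -> float:
    """Combine per-field scores (cf. step 625); a plain mean is assumed."""
    if not field_scores:
        return 0.0
    return sum(field_scores.values()) / len(field_scores)


def choose_action(combined: float, threshold: float = 0.5) -> str:
    """Return 'alert' when the combined score reaches the assumed threshold."""
    return "alert" if combined >= threshold else "allow"


# Assumed scoring rules for two example fields.
scorers = {
    "file_name": lambda v: 1.0 if "confidential" in v.lower() else 0.0,
    "author": lambda v: 0.5 if v == "finance-team" else 0.0,
}
metadata = {
    "file_name": "Q3_Confidential_Report.docx",
    "author": "finance-team",
    "timestamp": "2020-12-17",  # no scorer defined, so it is ignored
}
field_scores = score_fields(metadata, scorers)
combined = combine_scores(field_scores)
action = choose_action(combined)
```

In this sketch, fields without a scorer are skipped rather than scored as zero, so an unscored field does not dilute the combined value; a different design could weight fields unequally.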

Regarding claim 2, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. McDougal further teaches wherein the mitigation action comprises sending an alert to a user interface that identifies the electronic file and a sender of the traffic. ([0060, 0085, 0076, 0020] if the communication is classified as potentially containing malicious code, adjudication and disposition module 410 may cause an alert to be sent to an analyst or system administrator that the communication has been characterized as potentially containing malicious code. The identification of the metadata fields or field values may be provided in a report to a user. The metadata may be associated with the attachment (e.g., file version, author, rendering software, time stamp, etc.) of the communication. The database may contain field values associated with metadata extracted from previous communications intercepted at the first node of the security system. Based on a database of previous metadata fields and field values, for a particular sender the database or collection of information may identify whether the sender is likely to send malicious or non-malicious communications.) In summary, the file identification and the sender/author, as part of the metadata, are sent in an alert in any potentially malicious situation.

Regarding claim 4, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. McDougal further teaches wherein the file metadata comprises user profile information associated with the electronic file. ([0076] The metadata may be associated with the communication itself (e.g., the communication's header, hidden fields, etc.) or with the attachment (e.g., file version, author, rendering software, time stamp, etc.) of the communication.)

Regarding claim 8, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. McDougal further teaches wherein the sensitivity score for the electronic file is determined based further on whether malware was detected on an endpoint on which the electronic file is hosted. ([0020] The score derived from the database or collection of information may change over time based on the success or failure of previous classifications. For example, a sender may initially be identified as not likely to send malicious emails. However, if the next five emails from the sender are all classified as malicious, then the database or collection of information may be updated to associate the sender as someone likely to send malicious emails.) Here McDougal discloses that score 

Regarding claims 10 and 19, the scope of the claims is similar to that of claim 1. Accordingly, the claims are rejected using a similar rationale.

Regarding claim 11, the scope of the claim is similar to that of claim 2. Accordingly, the claim is rejected using a similar rationale.

Regarding claim 13, the scope of the claim is similar to that of claim 4. Accordingly, the claim is rejected using a similar rationale.

Regarding claim 17, the scope of the claim is similar to that of claim 8. Accordingly, the claim is rejected using a similar rationale.

Claims 3, 5, 9, 12, 14, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over McDougal (US 20130081142 A1) in view of Grzymala-Busse (US 20100024037 A1) and Treat (US 9641544 B1).

Regarding claim 3, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. But McDougal does not teach wherein the traffic is encrypted, and wherein detecting, by the traffic analysis service, the electronic file within traffic in the network comprises: using 
However, Treat in an analogous art explicitly teaches wherein the traffic is encrypted, ([Col 6: Line 23-27] monitoring network communications, including SSL or other encrypted network communications) and wherein detecting, by the traffic analysis service, the electronic file within traffic in the network comprises: 
using a machine learning-based classifier ([Col 6: Line 46-48] machine learning techniques (MLT) to generate real-time, rapidly evolving behavior profiles for users across the enterprise network) to predict a plaintext data size of the traffic; and ([Col 6: Line 18-21, Col 8: Line 45-51] Various new and distinct forms of preventive, predictive, and prescriptive models are introduced to effectively counter internal attackers and achieve insider threat prevention. For example, assume that Alice is an employee of ACME Company who has access to an enterprise network for ACME Company, and assume that an example security policy for ACME Company (e.g., insider threat prevention (ITP) policy) caps file transfers to 10 megabytes (MB) within a predefined period of time to an offsite site (e.g., Box, Gmail, or other apps/web services). In this case, a network device can detect a file transfer activity that violates this example ITP policy.) Here Treat predicts 10 megabytes (MB) as plaintext data size of the traffic.
matching a file size of the electronic file to the predicted plaintext data size of the traffic. ([Col 11: Line 46-53, Col 26: Line 2-14] provide real-time content scanning, such as for monitoring and/or controlling file transfer activities (including data limits on file transfers and/or destination-based restrictions on such file transfers), and/or other information to match signatures (e.g., file-based, protocol-based, and/or other types/forms of signatures for detecting malware or suspicious behavior). Inputs, which can be utilized by prevention controller 902 for performing the disclosed insider threat prevention techniques, can include the following, but can also include additional inputs: user identification (ID), file transfer application (app ID), file size, number of file transfers, destination for a file transfer, file type, and file name.) In summary, Treat discloses that file size can be utilized to compare/match to the signatures/activities, which can be the data size of the traffic.
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the “classifying communications” concept of McDougal, and the “threat prevention” approach of Treat, so various new and distinct forms of preventive, predictive, and prescriptive models are introduced to effectively counter internal attackers and achieve insider threat prevention (Treat [Col 6: Line 18-21]).
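For illustration only, the final "matching" step of claim 3 can be sketched as a tolerance comparison between a predicted plaintext size and a set of known file sizes. This is a minimal sketch under stated assumptions and is not the technique of Treat or of the applicant: the predicted size is taken as a given input (the machine learning-based prediction itself is not modeled), and the function name match_files_by_size, the 2% relative tolerance, and the example file names and sizes are hypothetical.

```python
# Hypothetical sketch: match known file sizes against a predicted
# plaintext size. The prediction is an input here; the classifier that
# would produce it is out of scope. Tolerance and data are assumptions.
from typing import Dict, List


def match_files_by_size(predicted_plaintext_bytes: int,
                        known_file_sizes: Dict[str, int],
                        tolerance: float = 0.02) -> List[str]:
    """Return names of files whose size lies within a relative tolerance
    of the predicted plaintext size, sorted by name."""
    matches = []
    for name, size in known_file_sizes.items():
        if size > 0 and abs(size - predicted_plaintext_bytes) <= tolerance * size:
            matches.append(name)
    return sorted(matches)


# Assumed inventory of file sizes in bytes.
known = {"payroll.xlsx": 1_000_000, "logo.png": 52_000, "notes.txt": 1_015_000}
matches = match_files_by_size(1_010_000, known)
```

A relative tolerance is used because any size estimate derived from encrypted traffic would be approximate; with this tolerance, several files of similar size can match the same prediction.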

Regarding claim 5, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. McDougal in view of Grzymala-Busse and Treat further teaches wherein determining the sensitivity score for the electronic file based on the file metadata comprises:
using a machine learning-based classifier to classify the file metadata, wherein the classifier is trained using a training dataset that comprises file metadata for a plurality of files that has been labeled with sensitivity scores. ([Treat Col 7: Line 61 - Col 8: Line 10] correlation of metrics and/or monitored network activities to determine that a given network activity is an anomalous activity. For example, a static default value can be utilized for thresholds for metrics to provide a baseline user behavior profile. These thresholds and/or metrics can be updated (e.g., trained) based on monitored activities for one or more users. As an example, in a learning mode/training mode, these thresholds and/or metrics can be dynamically tuned to adjust such default values for thresholds for metrics to provide a dynamically generated baseline user behavior profile. In an example implementation, one or more Machine Learning Techniques (MLT) can be utilized for implementing these trainings and 

Regarding claim 9, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. McDougal in view of Grzymala-Busse and Treat further teaches wherein obtaining the file metadata regarding an electronic file comprises:
receiving, at the traffic analysis service, the file metadata from an agent executed by an endpoint on which the electronic file is hosted. ([Treat Col 8: Line 36-39, Col 26: Line 2-14] endpoint security agents that can be deployed and executed on client devices to monitor network traffic, applications, and user activities on an enterprise network. Communications can include inputs (analogous to the claim limitation “metadata”) sent from endpoint security agents executed on each of the client devices to prevention controller 902. Inputs can include the following: user identification (ID), file transfer application (app ID), file size, number of file transfers, destination for a file transfer, file type, and file name.)

Regarding claim 12, the scope of the claim is similar to that of claim 3. Accordingly, the claim is rejected using a similar rationale.

Regarding claims 14 and 20, the scope of the claims is similar to that of claim 5. Accordingly, the claims are rejected using a similar rationale.

Regarding claim 18, the scope of the claim is similar to that of claim 9. Accordingly, the claim is rejected using a similar rationale.

Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over McDougal (US 20130081142 A1) in view of Grzymala-Busse (US 20100024037 A1) and Manadhata (US 8621233 B1).

Regarding claim 6, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. But McDougal does not teach wherein the sensitivity score is determined further based on a frequency of the file appearing on endpoints across at least a portion of the network. This aspect of the claim is identified as a difference.
However, Manadhata in an analogous art explicitly teaches wherein the sensitivity score is determined further based on a frequency of the file appearing on endpoints across at least a portion of the network. ([Col 7: Line 61-65] A frequency weighting module 324 weights the similarity score for the pair of names based on the frequency distribution of the file names, where a file name's frequency is measured by the number of endpoints 112 on which an instance of the file having the given name is found.) Here Manadhata summarizes the invention in [Abstract] as “generate a score indicating a confidence that the computer file contains malicious software. The score is weighted based on file name frequency, the age of the file, and the prevalence of the file. The weighted score is used to determine whether the computer file contains malicious software.”
(Manadhata [Col 1: Line 6-7, 45-47]).
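For illustration only, the frequency measure quoted above (the number of endpoints on which a file having a given name is found) and its use to weight a score can be sketched as follows. This is a minimal sketch and not Manadhata's frequency weighting module: the function names endpoint_frequency and frequency_weighted_score, the linear down-weighting formula, the direction of the weighting (more widespread names receive lower scores), and the example inventory are all hypothetical assumptions.

```python
# Hypothetical sketch: count the endpoints on which a file name appears
# and weight a base score by that frequency. The linear down-weighting
# and its direction are assumptions, not taken from the cited reference.
from typing import Dict, Set


def endpoint_frequency(inventory: Dict[str, Set[str]], file_name: str) -> int:
    """Number of endpoints on which a file with the given name is found."""
    return sum(1 for files in inventory.values() if file_name in files)


def frequency_weighted_score(base_score: float, frequency: int,
                             total_endpoints: int) -> float:
    """Scale the base score down as the file name becomes more widespread."""
    if total_endpoints == 0:
        return base_score
    return base_score * (1.0 - frequency / total_endpoints)


# Assumed inventory: endpoint identifier -> file names seen on it.
inventory = {
    "ep1": {"report.docx", "readme.txt"},
    "ep2": {"readme.txt"},
    "ep3": {"readme.txt"},
    "ep4": {"readme.txt"},
}
freq_report = endpoint_frequency(inventory, "report.docx")
weighted_report = frequency_weighted_score(0.5, freq_report, len(inventory))
weighted_readme = frequency_weighted_score(
    0.5, endpoint_frequency(inventory, "readme.txt"), len(inventory))
```

Under the assumed formula, a name found on every endpoint is weighted to zero, while a name found on a single endpoint keeps most of its base score.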

Regarding claim 15, the scope of the claim is similar to that of claim 6. Accordingly, the claim is rejected using a similar rationale.

Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over McDougal (US 20130081142 A1) in view of Grzymala-Busse (US 20100024037 A1) and Hanusiak (US 20180336323 A1).

Regarding claim 7, McDougal in view of Grzymala-Busse teaches all the features with respect to claim 1, as outlined above. McDougal further teaches wherein the metadata comprises a file name or file path, ([0049] a report may be sent based on the monitored behavior and results. The report may include information such as the name of the file.). But McDougal does not teach wherein determining the sensitivity score for the electronic file comprises: matching one or more words that appear in the file name or file path of the electronic file to words appearing in file names or file paths of a plurality of electronic files; and calculating the sensitivity score for the electronic file based in part on frequencies of the one or more matched words appearing in the file names or file paths of the plurality of electronic files. This aspect of the claim is identified as a difference.
However, Hanusiak in an analogous art explicitly teaches wherein determining the sensitivity 
matching one or more words that appear in the file name or file path of the electronic file to words appearing in file names or file paths of a plurality of electronic files; and ([0065] compares file-i from the list of files 520 with an id-file-j from an identifier file database 124 and determines a similarity score between file-i and id-file-j, as shown at 610. In one or more examples, the similarity score is determined based on the names of the files, a number of characters matching between the two file names.) Reference McDougal in ¶19 discloses that URL may be used in classifying communications (e.g., the URL may be tokenized, meaning “break text into individual linguistic units, such as words.”). Reference Hanusiak discloses matching linguistic units among files to determine similarity score. Therefore, the combination discloses the whole limitation.
calculating the sensitivity score for the electronic file based in part on frequencies of the one or more matched words appearing in the file names or file paths of the plurality of electronic files. ([0065] the similarity score is determined based on the names of the files, a number of characters matching between the two file names. Alternatively, or in addition, the similarity score is a ratio between the number of matching characters and non-matching characters. Further, in other examples, other attributes of the files are used for computing the similarity score. For example, sizes of the files, date of creation, date of modification, location of the file (path), are used for computing the similarity score. The similarity score thus indicates a similarity between the file from the list of files 520 and a file from the identifier file database 124.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the “classifying communications” concept of McDougal, and the “characters matching between file names” approach of Hanusiak, to be able to evaluate similarity between files efficiently by simply comparing matching characters among filenames (Hanusiak [0065]).
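For illustration only, the file-name similarity quoted above from Hanusiak [0065] (a count of matching characters, or a ratio between matching and non-matching characters) can be sketched as follows. This is a minimal sketch and not Hanusiak's implementation: the reference as quoted does not say how characters are aligned, so position-by-position comparison is an assumption, as are the function names matching_characters and similarity_ratio and the choice to return infinity for identical names.

```python
# Hypothetical sketch: similarity between two file names as the ratio of
# matching to non-matching characters. Position-wise alignment is an
# assumption; the quoted passage does not specify the alignment.


def matching_characters(name_a: str, name_b: str) -> int:
    """Count positions at which both names hold the same character."""
    return sum(1 for a, b in zip(name_a, name_b) if a == b)


def similarity_ratio(name_a: str, name_b: str) -> float:
    """Ratio of matching to non-matching characters.

    Non-matching characters are counted against the longer name, so extra
    trailing characters count as mismatches. Identical names have no
    mismatches and are reported as infinity (an assumed convention).
    """
    matched = matching_characters(name_a, name_b)
    non_matched = max(len(name_a), len(name_b)) - matched
    if non_matched == 0:
        return float("inf")
    return matched / non_matched


matched = matching_characters("report_2020.docx", "report_2021.docx")
ratio = similarity_ratio("report_2020.docx", "report_2021.docx")
```

Note that this compares characters rather than words; the word-level matching and word-frequency calculation recited in claim 7 would require tokenizing the file name or path first, which this sketch does not do.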

Regarding claim 16, the scope of the claim is similar to that of claim 7. Accordingly, the claim is rejected using a similar rationale.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20100088305 A1, "Detection of Confidential Information", by Fournier, teaches calculating a score indicative of the likelihood that a file contains confidential information, comparing the score to a threshold; and if the score exceeds the threshold, flagging the stored data as likely to contain confidential information.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a).   Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Carl Colin can be reached on 571-272-3862.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/H.Y./Examiner, Art Unit 2493

/Kevin Bechtel/Primary Examiner, Art Unit 2491