DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to Application filed 08/09/2021 with a Track One request filed 08/09/2021 and granted on 11/05/2021.
Claims 1-22 are pending.

Priority
 
This application is claimed as a continuation of U.S. Patent Application No. 17/079,937 filed 10/26/2020, which claims priority from U.S. Provisional Patent Application No. 62/925,569 filed 10/24/2019 and is claimed as a continuation of International Application PCT/US20/57245 filed 10/24/2020.  The ‘937 application and the provisional application provide sufficient support for the claimed invention of this application.  Therefore, the effective filing date of the claimed invention of this application is 10/24/2019.

Information Disclosure Statement

The Information Disclosure Statement (IDS) filed by Applicant on 08/09/2021 has been considered.  A copy of the considered IDS as initialed, signed and dated by Examiner is enclosed with this Office action.

Specification

The disclosure is objected to because of the following informalities: 

Regarding paragraph [0001] of the Specification, as U.S. Patent Application No. 17/079,937 has been patented, its information should be supplemented with its patent information (e.g., Patent No.).

Appropriate correction is required.

Claim Objections

Claims 4-6, 12 and 19 are objected to because of the following informalities: 

Regarding claim 4, the limitation “that data file” in line 19 and the limitation “that displayed data” in line 24 should be “that displayed data file”.

Regarding claim 5, the limitation “an identified one or more protected information element” starting in line 7 and the limitation “an identified one or more entity identifications” starting in line 8 should be corrected.  The article “an” conflicts with “one or more”, and the noun forms (singular versus plural) are inconsistent.

Regarding claim 6, the limitation “the protected element type” in line 14 should be “the protected information element type” for consistency in claim language.

Regarding claim 12, the limitation “that data file” in line 19 and the limitation “that displayed data” in line 24 should be “that displayed data file”.

Regarding claim 19, the limitation “the protected element type” in line 14 should be “the protected information element type” for consistency in claim language.

Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claims 12-22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 12 recites the limitation "the human reviewer" in line 11.  There is insufficient antecedent basis for this limitation in the claim.

Other dependent claims 13-22 are rejected as incorporating and failing to resolve the deficiency of claim 12 upon which they depend.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1-16 and 18-22 (effective filing date 10/24/2019) are rejected under 35 U.S.C. 103 as being unpatentable over Bhowan et al. (U.S. Publication No. 2019/0213354, Publication date 07/11/2019 or effectively filed date 01/09/2018), in view of Paterson et al. (U.S. Publication No. 2018/0067932, Publication date 03/08/2018), in view of Williamson et al. (U.S. Publication No. 2018/0232528, Publication date 08/16/2018), and further in view of Park et al. (U.S. Publication No. 2021/0377423, effectively filed date 09/26/2019).

As to claim 1, Bhowan et al. teaches:
“A method of identifying protected information elements associated with unique entities in data file collections comprising” (see Bhowan et al., Abstract and [0016]-[0017] for automatically identifying personal information (i.e., protected information) included in a corpus of documents to generate a set of user profiles (i.e., unique entities)): 
“a. receiving, by a computer, a first data file collection comprising a plurality of data files stored on or associated with an enterprise IT network” (see Bhowan et al., [0017] for receiving/processing a corpus of documents (i.e., data files) relating to an organization), wherein: 
“i. the first data file collection includes the plurality of data files comprising structured, unstructured, and semi-structured file types” (see Bhowan et al., [0018] for obtaining information from a corpus of documents including structured documents, semi-structured documents and unstructured documents; also see [0019] for different types of documents (e.g., forms, service tickets, training materials, etc.)); and 
“ii. at least a portion of the plurality of data files comprises one or more protected information elements associated with one or more unique entities having one or more entity identifications” (see Bhowan et al., [0022] wherein each document in the corpus of documents may include a set of values indicating personal information for particular individuals (i.e., unique entities)); 
“b. analyzing, by the computer, the plurality of data files to identify a presence of protected information elements” (see Bhowan et al., [0022] for processing/analyzing the information included in the corpus of documents to identify a set of values indicating personal information, wherein the set of values indicating personal information as disclosed is interpreted as protected information elements as recited); 
“c. generating, by the computer, information about the first data file collection comprising” (see Bhowan et al., [0022] for identifying/generating sets of values indicating personal information for particular individuals from analyzing the corpus of documents (i.e., the first data file collection); also see [0016] and [0030] for generating the set of user profiles): 
“iii. a listing of protected information element types in the plurality of data files” (see Bhowan et al., [0072] for analyzing the information included in the corpus of documents to identify/generating types of personal information wherein a set of types of personal information as disclosed is interpreted as listing of protected information element types as recited; also see [0073]);
“viii. a count of data files including at least one protected information element” (see Bhowan et al., [0012] for tracking which electronic documents include personal information; also see [0038] for a list/count of documents/files that include a threshold amount of personal information); and
“ix. an entity count, wherein the entity count includes more than one entity identification associated with some unique entities” (see Bhowan et al., [0016] and [0063] for generating a set of user profiles associated with particular individuals, wherein particular set of individuals (i.e., corresponding set of profiles) is interpreted as equivalent to an entity count as recited); and 
“d. configuring, by the computer, the generated information about the first data file collection for use in one or more of: 
i. a user notification; 
ii. a report;
iii. a dashboard; or 
iv. machine learning information for use in evaluating additional data file collections” (see Bhowan et al., [0105]-[0106] for using information generated from the corpus of documents (e.g., user profiles, index, etc.) to provide/report to a user in response to a request, wherein any information provided to a user can be broadly interpreted as a report or user notification as recited; also see [0073] for using the information included in the corpus of documents to train a machine learning model to identify personal information; also see [0104] for notifying a user with an indication regarding personal information associated with the user (e.g., has been removed or recommended to be removed)).
However, Bhowan et al. does not explicitly teach information generated about the corpus of documents (i.e., first data file collection) including the specific listing and counts recited as follows:
“i. count of data files”;
“ii. a listing of data file types”; 
“iv. a count of protected information element types”; 
“v. a count of protected information elements in the plurality of data files”; 
“vi. a count of protected information elements in each data file”; 
“vii. a count of protected information elements per each data file type”.
On the other hand, Paterson et al. teaches information generated about a plurality of documents/files comprising:
“i. count of data files” (see Paterson et al., [0016] for a document count); and
“ii. a listing of data file types” (see Paterson et al., [0016] and [0071] for a plurality/listing of document categories).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Paterson et al.'s teaching into Bhowan et al.’s system by implementing a feature of counting the files/documents included in a collection and identifying a list of categories/types of files/documents, since the features of counting elements and listing categories/types are well-known and well-used in the art for viewing and reporting elements/entities in a system.  An ordinarily skilled artisan would have been motivated to do so, as suggested by Paterson et al. (see [0064] and [0071]), to provide Bhowan et al.’s system with an effective way to organize, browse and view documents from the corpus of documents based on the identified document categories and a real-time count of the documents contained in each folder or sub-folder.  In addition, both references (Bhowan et al. and Paterson et al.) teach features that are directed to analogous art and to the same field of endeavor, such as providing access to a corpus of documents based on associations between documents and entity identifiers.  This close relation between the two references strongly suggests an expectation of success when combined.
However, Bhowan et al. as modified by Paterson et al. does not explicitly teach information generated about the corpus of documents (i.e., first data file collection) including the specific counts recited as follows:
 “iv. a count of protected information element types”; 
“v. a count of protected information elements in the plurality of data files”; 
“vi. a count of protected information elements in each data file”; 
“vii. a count of protected information elements per each data file type”.
On the other hand, Williamson et al. teaches information generated about a data source (e.g., a collection of documents) comprising:
“iv. a count of protected information element types” (see Williamson et al., Fig. 5 and [0044] for displaying a number/count of sensitive data types (e.g., NAME, ADDRESS, CCN, SSN, etc.));
“v. a count of protected information elements in the plurality of data files” (see Williamson et al., Fig. 5 and [0092] for displaying an indication of the number of data portions identified as sensitive data associated with each data source (i.e., collection of files/document (see [0025])), wherein each data portion detected as sensitive data is interpreted as a protected information element as recited); and
“vii. a count of protected information elements per each data file type” (see Williamson et al., Fig. 5 and [0092] for displaying an indication of the number of data portions identified as sensitive data associated with each data source, wherein each data portion detected as sensitive data is interpreted as a protected information element as recited and wherein each data source (see [0025]) can be interpreted as equivalent to each data file type as recited).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Williamson et al.'s teaching into Bhowan et al.’s system (as modified by Paterson et al.) by implementing a feature of counting protected information types and protected information elements in a collection of documents, since the feature of counting elements is well-known and well-used in the art for viewing and reporting elements/entities in a system.  An ordinarily skilled artisan would have been motivated to do so, as suggested by Williamson et al. (see [0044] and Fig. 5), to provide Bhowan et al.’s system with an effective way to view personal/sensitive information associated with documents in the corpus of documents.  In addition, both references (Bhowan et al. and Williamson et al.) teach features that are directed to analogous art and to the same field of endeavor, such as processing a source/corpus of documents to identify personal/sensitive information and allowing users to take some action regarding personal/sensitive information.  This close relation between the two references strongly suggests an expectation of success when combined.
However, Bhowan et al. as modified by Paterson et al. and Williamson et al. does not explicitly teach information generated about the corpus of documents (i.e., first data file collection) including the specific count recited as follows:
 “vi. a count of protected information elements in each data file”.
On the other hand, Park et al. teaches information generated about a data source comprising:
“vi. a count of protected information elements in each data file” (see Park et al., Fig. 6 and [0062]-[0063] for displaying the total number (8) of personal information elements identified in a document).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Park et al.'s teaching into Bhowan et al.’s system (as modified by Paterson et al. and Williamson et al.) by implementing a feature of counting protected information elements in a document, since the feature of counting elements is well-known and well-used in the art for viewing and reporting elements/entities in a system.  An ordinarily skilled artisan would have been motivated to do so, as suggested by Park et al. (see [0063] and Fig. 6), to provide Bhowan et al.’s system with an effective way to view and perform actions regarding personal/sensitive information associated with each document in the corpus of documents.  In addition, both references (Bhowan et al. and Park et al.) teach features that are directed to analogous art and to the same field of endeavor, such as processing documents to identify personal/sensitive information and allowing users to take some action regarding personal/sensitive information.  This close relation between the two references strongly suggests an expectation of success when combined.

Regarding claim 2, this claim is rejected based on the same arguments as above to reject claim 1 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“generating a plan associated with human review of at least a portion of the plurality of data files in the first data file collection for identification, by one or more human reviewers, of protected information element types associated with the one or more unique entities having one or more entity identifications” (see Bhowan et al., [0106] for generating a request for a list of documents that include a threshold amount of personal information for a user (e.g., manager) to perform risk management assessments, wherein a request as disclosed is interpreted as a plan as recited).

Regarding claim 3, this claim is rejected based on the same arguments as above to reject claim 1 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the generated information about the first data file collection is configured for use in the dashboard, and wherein the dashboard is configured for display of at least the following generated information on a user device” (see Bhowan et al., [0038] and [0106] for displaying information to a user interface; also see Williamson et al., Fig. 5 for dashboard; also see Park et al., Fig. 6 for dashboard):
“a. the count of data files” (see Bhowan et al., [0038] for a list of documents; also see Paterson et al., [0016], [0064] and Fig. 7A for document count);
“b. the listing of data file types” (see Paterson et al., [0016] for a plurality/listing of document categories);
“c. the listing of protected information element types in the plurality of data files” (see Bhowan et al., [0072] for types of personal information; also see Williamson et al., Fig. 5 for sensitive data types view; also see Park et al., Fig. 6);
“d. the count of protected information element types” (see Williamson et al., Fig. 5 for sensitive data types view; also see Park et al., Fig. 6);
“e. the count of protected information elements” (see Williamson et al., Fig. 5 and [0091]-[0092] for count/number of data portions detected with sensitive data; also see Park et al., Fig. 6);
“f. the count of protected information elements in each data file” (see Park et al., Fig. 6 and [0063] for a total of 8 pieces of personal information in a document);
“g. the count of protected information elements per each data file type” (see Williamson et al., Fig. 5 and [0092] for the number of data portions detected as sensitive data associated with each data source (i.e., different data type));
“h. the count of data files including at least one protected information element” (see Bhowan et al., [0038] for a list of documents that include a threshold amount of personal information); and
“i. the entity count” (see Bhowan et al., [0016] for a set of user profiles (i.e., entities); also see Paterson et al., Fig.  for listing of Entities).

Regarding claim 4, this claim is rejected based on the same arguments as above to reject claim 1 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the generated information about the first data file collection is configured as the machine learning information” (see Bhowan et al., [0073] and [0079] for using information of corpus of documents (e.g., labeled training data) to train a machine learning model; also see Williamson et al., [0041]) and the method further comprises:
“a. generating, by the computer, a second data file collection comprising each first collection data file identified by the computer as including one or more protected information elements” (see Bhowan et al., [0106] for generating a list of documents (i.e., a second data file collection) that include personal information for one or more individuals for a request);
“b. configuring, by the computer, a plurality of identified data files in the second data file collection for display and selection on a user device” (see Bhowan et al., [0106] for displaying the list of documents to a user);
“c. displaying, by the computer, one or more of the plurality of identified data files on the user device” (see Bhowan et al., [0106] for displaying the list of documents to a user);
“d. analyzing, by a human reviewer, the one or more displayed data files to confirm the computer identification of the one or more protected information elements in each of the one or more displayed data files” (see Bhowan et al., [0106] for performing risk management assessment by a user (i.e., human reviewer) by identifying which documents have personal information; also see Williamson et al., [0077] for receiving user feedback indicating whether reported/displayed sensitive data portions are actually sensitive data and whether data portions reported not to be sensitive data are actually sensitive data) wherein:
“i. if the human reviewer confirms that the one or more protected information elements are not present in the displayed data file” (see Bhowan et al., [0106] for determining which documents have personal information; also see Williamson et al., [0077]), the method further comprises:
“1. selecting, by the human reviewer, that displayed data file for removal from the second data file collection; and
2. removing, by the computer, that data file from the second data file collection” (see Paterson et al., [0078] and Fig. 5 for a feature of moving a document (e.g., removing a document from a folder/collection)); or
“ii. if the human reviewer confirms that the one or more protected information elements are present in the displayed data file” (see Bhowan et al., [0106] for determining which documents have personal information; also see Williamson et al., [0077]), the method further comprises:
“1. selecting, by the human reviewer, that displayed data to remain in the second data file collection; and
2. linking, by either or both the human reviewer or the computer, each of the one or more protected information elements with a unique entity having one or more entity identifications” (see Bhowan et al., [0026] and [0030] for linking a set of values indicating personal information with a same individual (i.e., unique entity) to create a user profile for each user; also see Paterson et al., [0078] for a feature of storing a document (e.g., maintaining a document in a folder/collection)); and
“e. recording, by the computer, information associated with the human reviewer's actions” (see Williamson et al., [0077] for receiving/recording user feedback/actions); and
“f. incorporating, by the computer, information derived from the human reviewer's actions into the machine learning information for use in subsequent data file analyses” (see Williamson et al., [0041] and [0077] for using the user feedback for training the machine learning module of the data classifier for determining whether a data portion is sensitive).

Regarding claim 5, this claim is rejected based on the same arguments as above to reject claim 4 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“a. identifying, by the computer, additional data files in either or both of the first and second data file collections having a presence of” (see Bhowan et al., [0106] for identifying a list of documents based on a request):
“i. one or more protected information elements associated with one or more unique entities having one or more entity identifications” (see Bhowan et al., [0106] wherein the documents include personal information for one or more individuals (i.e., one or more unique entities)); or
“ii. one or more entity identifications associated with a unique entity” (see Bhowan et al., [0023] and [0026] wherein personal information includes a name value, a personal identification value (i.e., entity identification) associated with an individual (i.e., unique entity));
“b. determining, by the computer or by the human reviewer, whether an identified one or more protected information element or an identified one or more entity identifications is associated with a unique entity” (see Bhowan et al., [0026] for determining whether a value and other values are personal information for the same individual);
“c. generating, by the computer, data file linkage information for each protected information element determined to be associated with a unique entity” (see Bhowan et al., [0028] and [0030] for associating/linking related values into a set of user profiles, wherein each user profile representing a unique entity can be interpreted as linkage information as recited); and
“d. configuring, by the computer, the data file linkage information for use in one or more of:
i. the user notification; 
ii. the report;
iii. the dashboard; or 
iv. the machine learning information for use in subsequent data file analyses” (see Bhowan et al., [0037] for using information linking the personal information in the user profile to identify personal information from document(s) and notify a user (e.g., providing information to the user)).

Regarding claim 6, this claim is rejected based on the same arguments as above to reject claim 4 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the first and second data file collections include data files comprising tabular data associated with a plurality of unique entities having one or more entity identifications” (see Bhowan et al., [0018] for data sources including structured documents (e.g., files including table or column or row)), and the method further comprises:
“a. identifying, by the computer, a first data file comprising tabular data having one or more rows or columns including either or both of one or more protected information elements and one or more entity identifications associated with a unique entity” (see Bhowan et al., [0028]-[0029] for identifying values indicating personal information (i.e., protected information elements) associated with a same individual/entity; also see Park et al., Fig. 6 for a table);
“b. configuring, by the computer, the first data file for display and selection on the user device” (see Park et al., Fig. 6);
“c. displaying, by the computer, the first data file on the user device” (see Park et al., Fig. 6);
“d. identifying, by the human reviewer, one or more columns or rows on the displayed first data file as corresponding to a protected information element type or an entity identification” (see Park et al., Fig. 6 and [0064] for providing management menu to allow a user to interact with the user interface);
“e. generating, by the computer, linkage information for the protected element type and a corresponding entity identification” (see Bhowan et al., [0028] and [0030] for associating/linking related values into a set of user profiles, wherein each user profile representing a unique entity can be interpreted as linkage information as recited);
“f. recording, by the computer, information derived from the human reviewer's actions” (see Williamson et al., [0077] for receiving/recording user feedback/actions) in:
“i. identifying the protected information element type” (see Williamson et al., [0041] wherein user feedback includes identifying new patterns to identify sensitive data); 
“ii. identifying the entity identification” (see Williamson et al., [0041] wherein user feedback includes identifying new patterns to identify sensitive data); and 
“iii. generating the linkage information” (see Williamson et al., [0041] wherein user feedback includes newly received reference data, configuration data or contextual matching (i.e., linkage information)); and
“g. incorporating, by the computer, the recorded information into the machine learning information for use in subsequent data file analyses” (see Williamson et al., [0041] and [0077] for using the user feedback for training the machine learning module of the data classifier for determining whether a data portion is sensitive). 

Regarding claim 7, this claim is rejected based on the same arguments as above to reject claim 4 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein a plurality of entity identifications for a unique entity are present in at least a portion of the data files of the first and second data file collections and the method further comprises performing, by the computer, an entity resolution step, thereby generating resolved unique entity identifications for at least a portion of the unique entities in the first and second data file collections” (see Bhowan et al., [0082] for performing a co-reference resolution technique (i.e., an entity resolution step) to identify relationships between values indicating multiple types of personal information that relates to the same individual).

Regarding claim 8, this claim is rejected based on the same arguments as above to reject claim 7 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein each resolved unique entity identification is linked to one or more protected information elements” (see Bhowan et al., [0016] and [0082] for generating user profiles to link related values of personal information, wherein each user profile includes personal information for a particular individual), and 
“wherein linkage information for the resolved unique entity identification and the one or more protected information elements is configured for use in one or more of:
a. the user notification;
b. the report;
c. the dashboard;
d. the machine learning information for use in subsequent data file analyses; or
e. a notification to a unique entity having one or more protected information elements present in one or more data files in the first or second data file collections” (see Bhowan et al., [0037] for using information linking the personal information in the user profile to identify personal information from document(s) and notify a user of personal information included in one or more documents, wherein providing information to a user is interpreted as providing a user notification or a notification to a unique entity as recited; also see Park et al., Fig. 6 for providing personal information associated with individuals on an interface (e.g., a report or a dashboard)).

Regarding claim 9, this claim is rejected based on the same arguments as above to reject claim 1 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the generated information about the first data file collection is derived from analysis, by the computer, of the enterprise IT network after receipt of a notification of a data breach event” (see Bhowan et al., [0016]-[0017] for identifying/generating personal information included in a corpus of documents related to an organization based on an audit request/event; also see Williamson et al., [0021] and [0023] for scanning one or more input data sources for sensitive data to be protected).

Regarding claim 10, this claim is rejected based on the same arguments as above to reject claim 1 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein at least a portion of the one or more protected information elements is associated with one or more compliance-related activities defined by one or more of laws, regulations, policies, procedures, and contractual obligations associated with the protected information elements” (see Bhowan et al., [0012] for tracking which documents include personal information according to laws and regulations to protect personal information; also see Williamson et al., [0023]).

As to claim 11, Bhowan et al. teaches:
“A method of analyzing a collection of data files in data file collections for the presence of protected information elements associated with unique entities comprising” (see Bhowan et al., Abstract and [0016]-[0017] for automatically identifying personal information (i.e., protected information) included in a corpus of documents to generate a set of user profiles (i.e., unique entities)): 
“a. receiving, by a computer, a first data file collection comprising a plurality of data files stored on or associated with an enterprise IT network” (see Bhowan et al., [0017] for receiving/processing a corpus of documents (i.e., data files) relating to an organization), wherein:
“i. the first data file collection includes the plurality of data files comprising structured, unstructured, and semi-structured file types” (see Bhowan et al., [0018] for obtaining information from a corpus of documents including structured documents, semi-structured documents and unstructured documents; also see [0019] for different types of documents (e.g., forms, service tickets, training materials, etc.)); and 
“ii. at least a portion of the plurality of data files comprise one or more protected information elements associated with one or more unique entities having one or more entity identifications” (see Bhowan et al., [0022] wherein each document in the corpus of documents may include a set of values indicating personal information for particular individuals (i.e., unique entities)); 
“b. analyzing, by the computer, the plurality of data files in the first data file collection for a presence of protected information elements” (see Bhowan et al., [0022] for processing/analyzing the information included in the corpus of documents to identify a set of values indicating personal information, wherein the set of values indicating personal information as disclosed is interpreted as protected information elements as recited); 
“c. generating, by the computer, information about the first data file collection comprising” (see Bhowan et al., [0022] for identifying/generating sets of values indicating personal information for particular individuals from analyzing the corpus of documents (i.e., the first data file collection); also see [0016] and [0030] for generating the set of user profiles): 
“iii. a listing of protected information element types in the plurality of data files” (see Bhowan et al., [0072] for analyzing the information included in the corpus of documents to identify/generate types of personal information, wherein a set of types of personal information as disclosed is interpreted as a listing of protected information element types as recited; also see [0073]);
“viii. a count of data files including at least one protected information element” (see Bhowan et al., [0012] for tracking which electronic documents include personal information; also see [0038] for a list/count of documents/files that include a threshold amount of personal information); and
“ix. an entity count, wherein the entity count includes more than one entity identification associated with some unique entities” (see Bhowan et al., [0016] and [0063] for generating a set of user profiles associated with particular individuals, wherein particular set of individuals (i.e., corresponding set of profiles) is interpreted as equivalent to an entity count as recited); 
“d. generating, by the computer, a plan associated with human review of at least a portion of the plurality of data files in the first data file collection for identification, by one or more human reviewers, of a presence of one or more protected information element types associated with the one or more unique entities having one or more entity identifications” (see Bhowan et al., [0106] for generating a request for a list of documents that include a threshold amount of personal information for a user (e.g., manager) to perform risk management assessments, wherein a request as disclosed is interpreted as a plan as recited); and 
“e. configuring, by the computer, the human reviewer plan for use in one or more of: 
i. a user notification; 
ii. a report;
iii. a dashboard; or 
iv. machine learning information for use in data file analysis” (see Bhowan et al., [0105]-[0106] for using information generated/retrieved from the corpus of documents (e.g., user profiles, index, etc.) to provide/report to a user in response to a request, wherein any information provided to a user can be broadly interpreted as a report or user notification as recited; also see [0073] for using the information included in the corpus of documents to train a machine learning model to identify personal information; also see [0104] for notifying a user with an indication regarding personal information associated with the user (e.g., has been removed or recommended to be removed)).
However, Bhowan et al. does not explicitly teach that the information generated about the corpus of documents (i.e., first data file collection) includes the specific listing and counts recited as follows:
“i. count of data files”;
“ii. a listing of data file types”; 
“iv. a count of protected information element types”; 
“v. a count of protected information elements in the plurality of data files”; 
“vi. a count of protected information elements in each data file”; 
“vii. a count of protected information elements per each data file type”.
On the other hand, Paterson et al. teaches information generated about a plurality of documents/files comprising:
“i. count of data files” (see Paterson et al., [0016] for a document count); and
“ii. a listing of data file types” (see Paterson et al., [0016] and [0071] for a plurality/listing of document categories).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Paterson et al.'s teaching into Bhowan et al.’s system by implementing a feature of counting the files/documents included in a collection and identifying a list of categories/types of files/documents, since counting elements and listing categories/types are well-known and well-used features in the art for viewing and reporting elements/entities in a system.  An ordinarily skilled artisan would have been motivated to do so, as suggested by Paterson et al. (see [0064] and [0071]), to provide Bhowan et al.’s system with an effective way to organize, browse and view documents from the corpus of documents based on the identified document categories and a real-time count of the documents contained in each folder or sub-folder.  In addition, both references (Bhowan et al. and Paterson et al.) teach features directed to analogous art and to the same field of endeavor, namely providing access to a corpus of documents based on associations between documents and entity identifiers.  This close relation between the references strongly suggests an expectation of success when combined.
However, Bhowan et al. as modified by Paterson et al. does not explicitly teach that the information generated about the corpus of documents (i.e., first data file collection) includes the specific counts recited as follows:
“iv. a count of protected information element types”; 
“v. a count of protected information elements in the plurality of data files”; 
“vi. a count of protected information elements in each data file”; 
“vii. a count of protected information elements per each data file type”.
On the other hand, Williamson et al. teaches information generated about a data source (e.g., a collection of documents) comprising:
“iv. a count of protected information element types” (see Williamson et al., Fig. 5 and [0044] for displaying a number/count of sensitive data types (e.g., NAME, ADDRESS, CCN, SSN, etc.));
“v. a count of protected information elements in the plurality of data files” (see Williamson et al., Fig. 5 and [0092] for displaying an indication of the number of data portions identified as sensitive data associated with each data source (i.e., collection of files/document (see [0025])), wherein each data portion detected as sensitive data is interpreted as a protected information element as recited); and
“vii. a count of protected information elements per each data file type” (see Williamson et al., Fig. 5 and [0092] for displaying an indication of the number of data portions identified as sensitive data associated with each data source, wherein each data portion detected as sensitive data is interpreted as a protected information element as recited, and wherein each data source (see [0025]) can be interpreted as equivalent to each data file type as recited).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Williamson et al.'s teaching into Bhowan et al.’s system (as modified by Paterson et al.) by implementing a feature of counting protected information types and protected information elements in a collection of documents, since counting elements is a well-known and well-used feature in the art for viewing and reporting elements/entities in a system.  An ordinarily skilled artisan would have been motivated to do so, as suggested by Williamson et al. (see [0044] and Fig. 5), to provide Bhowan et al.’s system with an effective way to view personal/sensitive information associated with documents in the corpus of documents.  In addition, both references (Bhowan et al. and Williamson et al.) teach features directed to analogous art and to the same field of endeavor, namely processing a source/corpus of documents to identify personal/sensitive information and allowing users to take some action regarding that personal/sensitive information.  This close relation between the references strongly suggests an expectation of success when combined.
However, Bhowan et al. as modified by Paterson et al. and Williamson et al. does not explicitly teach that the information generated about the corpus of documents (i.e., first data file collection) includes the specific count recited as follows:
“vi. a count of protected information elements in each data file”.
On the other hand, Park et al. teaches information generated about a data source comprising:
“vi. a count of protected information elements in each data file” (see Park et al., Fig. 6 and [0062]-[0063] for displaying the total number (8) of personal information elements identified in a document).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Park et al.'s teaching into Bhowan et al.’s system (as modified by Paterson et al. and Williamson et al.) by implementing a feature of counting protected information elements in a document, since counting elements is a well-known and well-used feature in the art for viewing and reporting elements/entities in a system.  An ordinarily skilled artisan would have been motivated to do so, as suggested by Park et al. (see [0063] and Fig. 6), to provide Bhowan et al.’s system with an effective way to view and perform actions regarding personal/sensitive information associated with each document in the corpus of documents.  In addition, both references (Bhowan et al. and Park et al.) teach features directed to analogous art and to the same field of endeavor, namely processing documents to identify personal/sensitive information and allowing users to take some action regarding that personal/sensitive information.  This close relation between the references strongly suggests an expectation of success when combined.

Regarding claim 12, this claim is rejected based on the same arguments as above to reject claim 11 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the generated information about the first data file collection is configured as the machine learning information” (see Bhowan et al., [0073] and [0079] for using information of corpus of documents (e.g., labeled training data) to train a machine learning model; also see Williamson et al., [0041]) and the method further comprises:
“a. generating, by the computer, a second data file collection comprising each first collection data file identified by the computer as including one or more protected information elements” (see Bhowan et al., [0106] for generating a list of documents (i.e., a second data file collection) that include personal information for one or more individuals for a request);
“b. configuring, by the computer, a plurality of identified data files in the second data file collection for display and selection on a user device” (see Bhowan et al., [0106] for displaying the list of documents to a user);
“c. displaying, by the computer, one or more of the plurality of identified data files on the user device” (see Bhowan et al., [0106] for displaying the list of documents to a user);
“d. analyzing, by a human reviewer, the one or more displayed data files to confirm the computer identification of the one or more protected information elements in each of the one or more displayed data files” (see Bhowan et al., [0106] for performing risk management assessment by a user (i.e., human reviewer) by identifying which documents have personal information; also see Williamson et al., [0077] for receiving user feedback indicating whether reported/displayed sensitive data portions are actually sensitive data and whether data portions reported not to be sensitive data are actually sensitive data) wherein:
“i. if the human reviewer confirms that the one or more protected information elements are not present in the displayed data file” (see Bhowan et al., [0106] for determining which documents have personal information; also see Williamson et al., [0077]), the method further comprises:
“1. selecting, by the human reviewer, that displayed data file for removal from the second data file collection; and
2. removing, by the computer, that data file from the second data file collection” (see Paterson et al., [0078] and Fig. 5 for a feature of moving a document (e.g., removing a document from a folder/collection)); or
“ii. if the human reviewer confirms that the one or more protected information elements are present in the displayed data file” (see Bhowan et al., [0106] for determining which documents have personal information; also see Williamson et al., [0077]), the method further comprises:
“1. selecting, by the human reviewer, that displayed data to remain in the second data file collection; and
2. linking, by either or both the human reviewer or the computer, each of the one or more protected information elements with a unique entity having one or more entity identifications” (see Bhowan et al., [0026] and [0030] for linking a set of values indicating personal information with a same individual (i.e., unique entity) to create a user profile for each user; also see Paterson et al., [0078] for a feature of storing a document (e.g., maintaining a document in a folder/collection)); and
“e. recording, by the computer, information associated with the human reviewer's actions” (see Williamson et al., [0077] for receiving/recording user feedback/actions); and
“f. incorporating, by the computer, information derived from the human reviewer's actions into the machine learning information for use in subsequent data file analyses” (see Williamson et al., [0041] and [0077] for using the user feedback for training the machine learning module of the data classifier for determining whether a data portion is sensitive).

Regarding claim 13, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the one or more displayed data files are derived from the plurality of determined data files by a filtering process in which a data file comprising a larger number of protected information elements is selected for display prior to a data file comprising a fewer number of protected information elements” (see Bhowan et al., [0038] for displaying a list of documents that include a threshold amount of personal information; also see Williamson et al., Fig. 5 for displaying sensitive data types according to a number of sensitive data elements associated with it).

Regarding claim 14, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the one or more displayed data files are derived from the plurality of determined data files by a filtering step in which a data file is selected for display according to a generated probability that the computer’s determination about the presence of the one or more protected information elements is a correct identification of the presence of the one or more protected information elements” (see Bhowan et al., [0038] for displaying a list of documents that include a threshold amount of personal information; also see Williamson et al., [0044] for displaying files/tables where the sensitive data portions were detected and confidence scores of each detection).

Regarding claim 15, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein linkage information for the linked protected information and unique entity is generated by the human reviewer” (see Bhowan et al., [0030]-[0031] for generating and maintaining a set of user profiles for individuals) and the method further comprises:
“a. analyzing, by the computer, each of the first and second data file collections to determine the presence of additional data files including one or more protected information elements associated with the unique entity” (see Bhowan et al., [0017] and [0100] wherein the identification platform automatically identifies personal information in a corpus of documents and updates an index by adding a new user profile or updating an existing user profile); and
“b. when a determination is made that an additional data file includes one or more protected information elements associated with the unique entity, linking, by the computer or the human reviewer, the one or more protected information elements in that data file with the unique entity” (see Bhowan et al., [0100] for updating an existing user profile with personal information associated with an individual).

Regarding claim 16, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein a plurality of entity identifications for a unique entity are present in at least a portion of the data files of the first and second data file collections and the method further comprises performing, by the computer, an entity resolution step, wherein the entity resolution step comprises combining a plurality of entity identifications with a unique entity, thereby generating resolved entity identifications for at least a portion of the unique entities in the data files” (see Bhowan et al., [0082] for performing a co-reference resolution technique (i.e., an entity resolution step) to identify relationships between values indicating multiple types of personal information that relates to the same individual).

Regarding claim 18, this claim is rejected based on the same arguments as above to reject claim 16 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein each resolved entity identification is linked to one or more protected information elements” (see Bhowan et al., [0016] and [0082] for generating user profiles to link related values of personal information, wherein each user profile includes personal information for a particular individual), and 
“wherein linkage information for the resolved entity identification and one or more protected information elements is configured for use in one or more of:
a. the user notification;
b. the report;
c. the dashboard;
d. the machine learning information for use in subsequent data file analyses; or
e. a notification to a unique entity having one or more protected information elements present in one or more data files in the first or second data file collections” (see Bhowan et al., [0037] for using information linking the personal information in the user profile to identify personal information from document(s) and notify a user of personal information included in one or more documents, wherein providing information to a user is interpreted as providing a user notification or a notification to a unique entity as recited; also see Park et al., Fig. 6 for providing personal information associated with individuals on an interface (e.g., a report or a dashboard)).

Regarding claim 19, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the first and second data file collections include data files comprising tabular data associated with a plurality of unique entities having one or more entity identifications” (see Bhowan et al., [0018] for data sources including structured documents (e.g., files including table or column or row)), and the method further comprises:
“a. identifying, by the computer, a first data file comprising tabular data having one or more rows or columns including either or both of one or more protected information elements and one or more entity identifications associated with a unique entity” (see Bhowan et al., [0028]-[0029] for identifying values indicating personal information (i.e., protected information elements) associated with a same individual/entity; also see Park et al., Fig. 6 for a table);
“b. configuring, by the computer, the first data file for display and selection on the user device” (see Park et al., Fig. 6);
“c. displaying, by the computer, the first data file on the user device” (see Park et al., Fig. 6);
“d. identifying, by the human reviewer, one or more columns or rows on the displayed first data file as corresponding to a protected information element type or an entity identification” (see Park et al., Fig. 6 and [0064] for providing a management menu to allow a user to interact with the user interface);
“e. generating, by the computer, linkage information for the protected element type and a corresponding entity identification” (see Bhowan et al., [0028] and [0030] for associating/linking related values into a set of user profiles, wherein each user profile representing a unique entity can be interpreted as linkage information as recited);
“f. recording, by the computer, information derived from the human reviewer's actions” (see Williamson et al., [0077] for receiving/recording user feedback/actions) in:
“i. identifying the protected information element type” (see Williamson et al., [0041] wherein user feedback includes identifying new patterns to identify sensitive data); 
“ii. identifying the entity identification” (see Williamson et al., [0041] wherein user feedback includes identifying new patterns to identify sensitive data); and 
“iii. generating the linkage information” (see Williamson et al., [0041] wherein user feedback includes newly received reference data, configuration data or contextual matching (i.e., linkage information)); and
“g. incorporating, by the computer, the recorded information into the machine learning information for use in subsequent data file analyses” (see Williamson et al., [0041] and [0077] for using the user feedback for training the machine learning module of the data classifier for determining whether a data portion is sensitive).

Regarding claim 20, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the first data file collection is derived from analysis of the enterprise IT network after receipt of a notification of a data breach event” (see Bhowan et al., [0016]-[0017] for identifying/generating personal information included in a corpus of documents related to an organization based on an audit request/event; also see Williamson et al., [0021] and [0023] for scanning one or more input data sources for sensitive data to be protected).

Regarding claim 21, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein at least a portion of the one or more protected information elements is associated with one or more compliance-related activities defined by one or more of laws, regulations, policies, procedures, and contractual obligations associated with the protected information elements” (see Bhowan et al., [0012] for tracking which documents include personal information according to laws and regulations to protect personal information; also see Williamson et al., [0023]).

Regarding claim 22, this claim is rejected based on the same arguments as above to reject claim 12 and is similarly rejected including the following:
Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches:
“wherein the subsequent data file analyses of a data file collection of interest comprises analysis of one or more of:
a. the first data file collection derived from the enterprise IT network;
b. the second data file collection;
c. a third data file collection derived from a bulk data file collection stored on or associated with the enterprise IT network; or
d. a fourth data file collection derived from a bulk data file collection stored on or associated with a second enterprise IT network that is different from the enterprise IT network” (see Bhowan et al., [0018]-[0019] for processing/analyzing documents from one or more data sources; also see Williamson et al., Fig. 5 and [0021]-[0022] for performing sensitive data scanning on one or more input data sources wherein each data source is a collection of files/documents (see [0025])).

Claim 17 (effective filing date 10/24/2019) is rejected under 35 U.S.C. 103 as being unpatentable over Bhowan et al. (U.S. Publication No. 2019/0213354, Publication date 07/11/2019 or effectively filed date 01/09/2018), in view of Paterson et al. (U.S. Publication No. 2018/0067932, Publication date 03/08/2018), in view of Williamson et al. (U.S. Publication No. 2018/0232528, Publication date 08/16/2018), in view of Park et al. (U.S. Publication No. 2021/0377423, effectively filed date 09/26/2019), and further in view of Elangovan et al. (U.S. Patent No. 10,706,843, effectively filed date 03/09/2017).

As to claim 17, Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. teaches all limitations as recited in claim 16 and further teaches:
“b. displaying, by the computer, at least a portion of the resolved data files on the user device” (see Bhowan et al., [0106] for displaying a list of documents for user assessment of which documents have personal information; also see Williamson et al., [0041] for displaying and getting feedback from a user regarding sensitive data portions); and
“d. recording, by the computer, information associated with the human reviewer's entity resolution confirmation actions for use in subsequent data file analyses” (see Bhowan et al., [0073] and [0079] for training one or more machine learning models or algorithms of the identification platform based on historical information and labeled training data; also see Williamson et al., [0041] for using user feedback to train the machine learning module of the data classifier).
However, Bhowan et al. as modified by Paterson et al., Williamson et al. and Park et al. does not explicitly teach a feature of using a confidence level for entity identification resolution and providing resolution action(s) involving a human reviewer as recited as follows:
“a. generating, by the computer, a confidence level that a resolved entity identification for a unique entity correctly resolves the plurality of entity identifications to that unique entity and, if the generated confidence level is below a threshold level, configuring one or more resolved data files linked with the unique entity for display and selection on the user device”;
“c. confirming, by the human reviewer, that the computer-generated entity resolution correctly assigned each of the plurality of entity identifications to the unique entity, wherein:
i. if the human reviewer confirms that the computer-generated entity resolution correctly assigned each of the plurality of entity identifications to the unique entity, maintaining the previously generated resolved entity identification; and
ii. if the human reviewer does not confirm that the computer-generated entity resolution correctly assigned each of the plurality of entity identifications to the unique entity, removing each incorrect resolved entity identification”.
On the other hand, Elangovan et al. teaches a feature of using a confidence level for entity identification resolution and providing resolution action(s) involving a human reviewer as recited as follows:
“a. generating, by the computer, a confidence level that a resolved entity identification for a unique entity correctly resolves the plurality of entity identifications to that unique entity” (see Elangovan et al., [column 39, lines 4-40] for performing entity resolution based on a confidence score associated with a contact identifier), and, 
“if the generated confidence level is below a threshold level, configuring one or more resolved data files linked with the unique entity for display and selection on the user device:
c. confirming, by the human reviewer, that the computer-generated entity resolution correctly assigned each of the plurality of entity identifications to the unique entity, wherein:
i. if the human reviewer confirms that the computer-generated entity resolution correctly assigned each of the plurality of entity identifications to the unique entity, maintaining the previously generated resolved entity identification; and
ii. if the human reviewer does not confirm that the computer-generated entity resolution correctly assigned each of the plurality of entity identifications to the unique entity, removing each incorrect resolved entity identification” (see Elangovan et al., [column 39, lines 63] for contact/entity resolution involving a user’s feedback; it should be noted that this part recites a contingent limitation in which the recited step(s) occur only when a condition is satisfied (e.g., if the generated confidence level is below a threshold level); in other words, when the condition is not satisfied (e.g., if the generated confidence level is above the threshold level), those steps do not occur; therefore, this contingent limitation recited in a method claim does not need to be demonstrated by the prior art (see MPEP § 2111.04, subsection II)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Elangovan et al.'s teaching into Bhowan et al.’s system (as modified by Paterson et al., Williamson et al. and Park et al.) by implementing a feature of performing entity resolution using a confidence value and involving a user, since it is well-known and well-used in the art to use both machine output (e.g., some calculated score/value) and user feedback to improve an entity/identity resolution process.  An ordinarily skilled artisan would have been motivated to do so to provide Bhowan et al.’s system with an effective way to improve the identification platform based on user feedback.  In addition, both references (Bhowan et al. and Elangovan et al.) teach features directed to analogous art and to the same field of endeavor, namely a system for identifying personal/sensitive information associated with individuals for use in a system.  This close relation between the references strongly suggests an expectation of success when combined.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUONG THAO CAO whose telephone number is (571)272-2735. The examiner can normally be reached Monday - Friday: 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish Thomas, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Phuong Thao Cao/Primary Examiner, Art Unit 2164