DETAILED ACTION
Response to Amendment
The amendment filed on 07/15/22 has been entered. Claims 1, 3-20 remain pending in the application.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 3-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 20 similarly recite gathering a training set comprising data items and known attributes and features for each data item; identifying known attributes for corresponding features of each data item based on gathered contextual information; building a learning model based on the identified known attributes and corresponding data items; and employing the learning model as an initial rendering of a model of the known attributes and features; identifying, without a matching procedure for the data items in a data set, a set of features that define a context for a plurality of data items in the large data set, each feature of the set of features defining metadata about a form and use of the plurality of data items; determining, for each feature of the set of features, a source in which an attribute for the feature is stored; identifying, for each feature of the set of features, the attribute based on referencing the determined source for the feature, wherein the attribute is indicative of a sensitivity of each of the plurality of data items; associating the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data; and determining, based on the model defining metadata indicating a form and use of the plurality of data items, whether each of the plurality of data items is a sensitive data item.
The limitations of identifying known attributes for corresponding features of each data item based on gathered contextual information; building a learning model based on the identified known attributes and corresponding data items; identifying, without a matching procedure for the data items in a data set, a set of features that define a context for a plurality of data items in the large data set, each feature of the set of features defining metadata about a form and use of the plurality of data items; determining, for each feature of the set of features, a source in which an attribute for the feature is stored; identifying, for each feature of the set of features, the attribute based on referencing the determined source for the feature, wherein the attribute is indicative of a sensitivity of each of the plurality of data items; and determining, based on the model defining metadata indicating a form and use of the plurality of data items, whether each of the plurality of data items is a sensitive data item, as drafted, are processes that, under their broadest reasonable interpretation, cover mental processes but for the recitation of implementing them on generic computer components. That is, nothing in the claim element precludes these steps from practically being performed in the mind. For example, “identifying known attributes” in the context of this claim encompasses the user observing, analyzing, and/or judging the sensitivity of data items, such as a value of 10 for social security number and a value of 1 for the color of someone’s shirt based on what is known to be sensitive or not. The “building a learning model” encompasses the user writing down, with the aid of a pen and piece of paper, a table for the data items and a sensitivity value for each data item so that it can be referenced at another time.
The “identifying…a set of features” encompasses the user observing, analyzing, and/or judging information about the form and use of data from a large data set. The “determining, for each feature of the set of features, a source” encompasses the user looking up features in scanned pages including observing and analyzing which scanned page (source) lists each feature. That is, [Pgs. 8-9 lines 27-2] of the specification specifies that the sources can be scanned data. Therefore, the sources, under BRI and consistent with the specification, could be printed pages of scanned data. The “identifying, for each feature of the set of features, the attribute” encompasses the user observing and analyzing the determined scanned page including the features of each data item to judge an attribute for each feature. Lastly, the “determining, based on the model defining metadata indicating a form and use of the plurality of data items, whether each of the plurality of data items is a sensitive data item” encompasses the user referencing the data items in the table that was written down in order to find the corresponding sensitivity value listed for the data items. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, claims 1, 20 recite an abstract idea (Step 2A, Prong 1).
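For illustration only, the written table of data items and sensitivity values described above can be sketched as a simple lookup. This is a minimal sketch, not the claimed method or any disclosed implementation: the values 10 and 1 come from the example above, while the names `SENSITIVITY_TABLE`, `lookup_sensitivity`, `is_sensitive`, and the cutoff `SENSITIVE_THRESHOLD` are hypothetical and appear nowhere in the claims or the specification.

```python
from typing import Optional

# Hypothetical stand-in for the pen-and-paper table: data item -> sensitivity value.
# The values 10 and 1 mirror the example given in the rejection.
SENSITIVITY_TABLE = {
    "social security number": 10,
    "shirt color": 1,
}

# Assumed cutoff for calling an item "sensitive"; no cutoff is stated in the record.
SENSITIVE_THRESHOLD = 5


def lookup_sensitivity(item: str) -> Optional[int]:
    """Reference the table to find the sensitivity value listed for an item."""
    return SENSITIVITY_TABLE.get(item)


def is_sensitive(item: str) -> bool:
    """Judge an item sensitive when its listed value meets the assumed cutoff."""
    value = lookup_sensitivity(item)
    return value is not None and value >= SENSITIVE_THRESHOLD
```

Under these assumptions, `is_sensitive("social security number")` evaluates to `True` and `is_sensitive("shirt color")` to `False`; an item missing from the table is treated as not sensitive, which is itself a design assumption of the sketch.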
This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of – non-transitory computer-readable medium; a processor; gathering a training set comprising data items and known attributes and features for each data item; and employing the learning model as an initial rendering of a model of the known attributes and features; associating the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data. The non-transitory computer-readable medium and processor are recited at a high level of generality (i.e., as generic computer devices performing generic computer functions) and do not meaningfully limit the claim. The additional elements of associating the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data represent insignificant extra-solution activities to the judicial exception and are mere data gathering steps. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2).
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of gathering a training set comprising data items and known attributes and features for each data item; and employing the learning model as an initial rendering of a model of the known attributes and features; associating the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data represent insignificant extra-solution activities that are well-understood, routine, and conventional activities previously known to the industry. That is, these limitations represent well-understood, routine, conventional activities in the fields of data processing and/or data storage and retrieval and are merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, these additional elements do not cause the claim to amount to significantly more than the judicial exception (Step 2B). Accordingly, claims 1, 20 are not patent eligible.
Independent claim 14 recites a memory to store a training set, comprising data items and known attributes and features for each data item; and a processor operatively coupled to the memory, the processor to: build a learning model based on the known attributes and features for each data item; employ the learning model as an initial rendering of a model of the known attributes and features; identify, without a matching procedure for the data items in a data set, a set of features that define a context for a plurality of data items in the large data set, each feature of the set of features defining metadata about a form and use of the plurality of data items; determine, for each feature of the set of features, a source in which an attribute for the feature is stored; identify, for each feature of the set of features, the attribute based on referencing the determined source for the feature, wherein the attribute is indicative of a sensitivity of each of the plurality of data items; associate the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data; and determine, based on the model defining metadata indicating a form and use of the plurality of data items, whether each of the plurality of data items is a sensitive data item.
The limitations of build a learning model based on the known attributes and features for each data item; identify, without a matching procedure for the data items in a data set, a set of features that define a context for a plurality of data items in the large data set, each feature of the set of features defining metadata about a form and use of the plurality of data items; determine, for each feature of the set of features, a source in which an attribute for the feature is stored; identify, for each feature of the set of features, the attribute based on referencing the determined source for the feature, wherein the attribute is indicative of a sensitivity of each of the plurality of data items; and determine, based on the model defining metadata indicating a form and use of the plurality of data items, whether each of the plurality of data items is a sensitive data item, as drafted, are processes that, under their broadest reasonable interpretation, cover mental processes but for the recitation of implementing them on generic computer components. That is, nothing in the claim element precludes these steps from practically being performed in the mind. For example, the “build a learning model” encompasses the user writing down, with the aid of a pen and piece of paper, a table for the data items and a sensitivity value for each data item so that it can be referenced at another time. The “identify…a set of features” encompasses the user observing, analyzing, and/or judging information about the form and use of data from a large data set. The “determine, for each feature of the set of features, a source” encompasses the user looking up features in scanned pages including observing and analyzing which scanned page (source) lists each feature. That is, [Pgs. 8-9 lines 27-2] of the specification specifies that the sources can be scanned data. Therefore, the sources, under BRI and consistent with the specification, could be printed pages of scanned data.
The “identify, for each feature of the set of features, the attribute” encompasses the user observing and analyzing the determined scanned page including the features of each data item to judge an attribute for each feature. Lastly, the “determine, based on the model defining metadata indicating a form and use of the plurality of data items, whether each of the plurality of data items is a sensitive data item” encompasses the user referencing the data items in the table that was written down in order to find the corresponding sensitivity value listed for the data items. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, claim 14 recites an abstract idea (Step 2A, Prong 1).
This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of – a memory to store a training set comprising data items and known attributes and features for each data item; a processor operatively coupled to the memory, the processor to: employ the learning model as an initial rendering of a model of the known attributes and features; associate the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data. The memory and processor are recited at a high level of generality (i.e., as generic computer devices performing generic computer functions) and do not meaningfully limit the claim. The additional element of associate the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data represents an insignificant extra-solution activity to the judicial exception and is a mere data gathering step. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2).
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of employ the learning model as an initial rendering of a model of the known attributes and features; associate the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items, the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data represent insignificant extra-solution activities that are well-understood, routine, and conventional activities previously known to the industry. That is, these limitations represent well-understood, routine, conventional activities in the fields of data processing and/or data storage and retrieval and are merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, these additional elements do not cause the claim to amount to significantly more than the judicial exception (Step 2B). Accordingly, claim 14 is not patent eligible.
Claims 3-13, 15-19 depend on claims 1, 14 and include all the limitations of these claims. Therefore, claims 3-13, 15-19 are directed to the same abstract idea and the analysis must proceed to (Step 2A, Prong 2). 
Claims 3, 4, 17 recite additional limitations pertaining to the generating of the training set including data sensitivity. This judicial exception is not integrated into a practical application. The additional elements represent further mental process steps of mentally observing features defining a context, such as in the table written down by the user as aforementioned, and judging a sensitivity attribute based on the feature with which it is associated within the table. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. This additional step is considered an abstract idea (mental process step) and does not integrate the judicial exception into a practical application. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements represent further mental process steps. Therefore, these additional limitations are not sufficient to amount to significantly more than the judicial exception. Claims 3, 4, 17 are not patent eligible.
Claims 5, 18 recite additional limitations pertaining to the generating of a feature set for each data item. This judicial exception is not integrated into a practical application. This additional element represents a further mental process step of writing down, using a pen and piece of paper, a table of the features of the data items, mentally judging the sensitivity of data items, such as a value of 10 for social security number and a value of 1 for the color of someone’s shirt based on what is known to be sensitive or not, and writing down these values for each corresponding feature in the table. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. This additional step is considered an abstract idea (mental process step) and does not integrate the judicial exception into a practical application. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element represents a further mental process step. Therefore, these additional limitations are not sufficient to amount to significantly more than the judicial exception. Claims 5, 18 are not patent eligible.
Claim 6 recites the additional limitations identifying a source indicative of an attribute for each said feature; and retrieving the attribute; and storing the attribute in conjunction with each of the plurality of data items. The “identifying” limitation encompasses the user identifying the scanned page which includes an attribute, such as a sensitivity, for each feature of a set of features. Therefore, this claim further recites an abstract idea (mental process).
These additional limitations do not integrate the abstract idea into a practical application and merely represent insignificant extra-solution activities to the judicial exception.  Accordingly, the remaining additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the remaining additional elements represent well-understood, routine, conventional activity previously known to the industry. That is, these remaining limitations represent well-understood, routine, conventional activity in the fields of data processing and/or data storage and retrieval and are merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, these additional elements do not cause the claim to amount to significantly more than the judicial exception.
Claim 7 recites an additional limitation pertaining to referencing the source. This judicial exception is not integrated into a practical application. The additional element represents a further mental process step. That is, this limitation encompasses the user looking up features in scanned pages including observing and analyzing which scanned page (source) lists each feature. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. This additional step is considered an abstract idea (mental process step) and does not integrate the judicial exception into a practical application.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element represents a further mental process step. Therefore, this additional limitation is not sufficient to amount to significantly more than the judicial exception. Claim 7 is not patent eligible.
Claims 8-11 recite additional limitations pertaining to the computing of a value of the attribute and determining of the attribute. This judicial exception is not integrated into a practical application. The additional elements represent a further mental process step. That is, these limitations encompass the user looking up features in scanned pages including observing and analyzing which scanned page (source) lists each feature for scanned pages that they have the privilege of viewing as well as determining the value based on the format of the data and how frequently the data is accessed, which could be obtained by the user in order to help make the determination.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. These additional steps are considered an abstract idea (mental process step) and do not integrate the judicial exception into a practical application. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements represent a further mental process step. Therefore, this additional limitation is not sufficient to amount to significantly more than the judicial exception. Claims 8-11 are not patent eligible.
Claims 12, 19 recite additional limitations pertaining to aggregating the plurality of data items. This additional limitation does not integrate the abstract idea into a practical application and merely represents an insignificant extra-solution activity to the judicial exception and is a mere data gathering step. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element represents a well-understood, routine, conventional activity previously known to the industry. That is, this limitation represents a well-understood, routine, conventional activity in the fields of data processing and/or data storage and retrieval and is merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, this additional element does not cause the claim to amount to significantly more than the judicial exception.
Claim 13 recites additional limitations pertaining to the training of the model based on receiving attributes. This additional limitation does not integrate the abstract idea into a practical application and merely represents an insignificant extra-solution activity to the judicial exception and is a mere data gathering step. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element represents a well-understood, routine, conventional activity previously known to the industry. That is, this limitation represents a well-understood, routine, conventional activity in the fields of data processing and/or data storage and retrieval and is merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, this additional element does not cause the claim to amount to significantly more than the judicial exception.
Claim 15 recites an additional limitation pertaining to the training set and building an initial rendering of the model. This judicial exception is not integrated into a practical application. The additional element corresponding to the training set represents a further mental process step. That is, this limitation encompasses the user observing contexts (gathering contextual information) of data items and analyzing/looking up the corresponding attribute for features corresponding to the contexts such as in the scanned pages. The “building” encompasses the user writing down a table of features and attributes such as the sensitivity for each of the features. These steps could also include a human further verifying that the result is correct. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. This additional step is considered an abstract idea (mental process step) and does not integrate the judicial exception into a practical application.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element represents a further mental process step. Therefore, this additional limitation is not sufficient to amount to significantly more than the judicial exception. Claim 15 is not patent eligible.
Claim 16 recites an additional limitation pertaining to the training set. This judicial exception is not integrated into a practical application. The additional element represents a further mental process step. That is, this limitation encompasses further defining the training set data. These steps could also include a human further verifying that the result is correct. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. These additional steps are considered an abstract idea (mental process step) and do not integrate the judicial exception into a practical application.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element represents a further mental process step. Therefore, this additional limitation is not sufficient to amount to significantly more than the judicial exception. Claim 16 is not patent eligible.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3-8, 10-13, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Enuka (US 2020/0050966) in view of Praveen (US 2018/0285599) and further in view of Crabtree (US 2021/0019674).
Regarding claim 1, Enuka discloses:
A method for classifying data in large data sets, comprising: gathering a training set, comprising data items and known attributes and features for each data item; identifying known attributes for corresponding features of each data item … at least by ([0162] “FIG. 9 shows exemplary labeled training data 900 that may be provided to train the machine learning models on a number of supervised use cases (e.g., a minimum of 4,000 use cases). As shown, each row of the training data 900 may comprise an attribute field name 901 corresponding to an attribute field in an identity data source, a scanned field name 902 corresponding to a scanned field in a scanned data source for which a confidence level is determined, and a label 950 indicating whether the scanned field should be classified as containing personal information associated with the same attribute as that of the attribute field.”) and the training set of data items is the labeled training data, as shown for example in Fig. 9, while the known attributes and features are the attributes, such as elements 901 to 950;
building a learning model based on the identified known attributes and corresponding data items; and employing the learning model as an initial rendering of a model of the known attributes and features at least by ([0034] “Referring to FIG. 1, an exemplary method of creating initial data subject profiles for an identity graph is illustrated. At an optional first step 101, initial personal information of one or more data subjects may be received by the system to create one or more data subject profiles. Such personal information (and resulting profiles) may correspond to users, customers, employees or any other person whose personal information is stored by the organization (collectively referred to herein as “data subjects”). Moreover, the initial personal information may be used as a learning set for the system to learn what personal information looks like in a specific environment. The initial personal information may be manually entered into the system by a user (e.g., via a client application) and/or may be included in a file that is uploaded to the system”, [0151]-[0154] describes the initial creation of data records, [0155] describes the creation of predictive features from the initial creation of data records for use in a machine learning model);
identifying…a set of features … for a plurality of data items in the large data set, each feature of the set of features defining metadata about … and use of the plurality of data items at least by ([0032] “The term “personal information” may refer to any information or data that can be used on its own or with other information to identify, contact, or locate a single person, and/or to identify an individual in context. Such information may include any information that can be used to distinguish or trace an individual's identity. Specific, non-limiting examples of personal information types or “attributes” include, but are not limited to: name, home address, work address, email address, national identification number, social security number,…” [0103] “the features employed by the machine learning models may relate to one or more of: the values contained in the selected rows of the scanned data source, metadata associated with fields in the scanned data source, values contained in the identity data source, metadata associated with fields in the identity data source, information associated with personal information findings determined from the scanned data source and the identity data source, and/or information associated with personal information records created from such findings. Exemplary features are discussed in detail below”) and the set of features that define a context are the personal information attributes, such as name or social security number, for example, which define metadata about the use of the data (e.g. name and social security number identifies specific people);
determining, for each feature of the set of features, a source in which an attribute for the feature is stored at least by ([0040] “At step 104, the system may connect to one or more identity data sources and conduct a search for personal information contained therein, based on the stored personal information rules. As potential personal information is found in an identity data source, the system may create a personal information findings list of such information, including the value of each finding and/or metadata associated therewith, such as an associated attribute, the data source in which the personal information was found, the location where the personal information is located within the data source (e.g., collection, table, field, row, etc.), and/or a date when the personal information was found” [0058] “if a proximity search 205 results in the discovery of a proximate attribute, the location information of the proximate attribute may be used to update one or more personal information proximity rules so that subsequent searches may take advantage of this additional information. Generally, the location information may include, but is not limited to, the absolute location of the proximate attribute and/or the relative location of the proximate attribute to the original attribute. Additionally or alternatively, information relating to the type of proximate attribute may be used to update one or more attribute definition rules so that subsequent searches may look for this type of personal information” [0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.”);
identifying, for each feature of the set of features, the attribute based on referencing the determined source for the feature, wherein the attribute is indicative of a sensitivity of each of the plurality of data items at least by ([0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.”, [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630. In the illustrated embodiment, the field findings count 615 and field unique findings count 620 are shown to provide a strong indicator of whether the scanned data source field contains personal information. For example, if the field unique findings count 620 is close to the number of findings 615, then the scanned source field is likely to include personal information.”, [0140] “On the other hand, name similarity 625 may be a weaker indicator of whether a scanned source field includes personal information that corresponds to a given field in an identity data source. For example, even in instances where the scanned source field name 610 is similar or identical to the identity source field name 605, the data stored in the scanned source field will not necessarily hold meaningful personal information. This is shown, for example, in row 640, where the identity source field name 605 is nearly identical to the scanned source field name, but the model determines a confidence level of only 0.0389.”, [0141] “FIG. 
6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field.”) and the computed values for each attribute are the field findings count, field unique findings count, name similarity, and confidence level, which are all ultimately utilized to classify and label the scanned source fields as containing personal information or not containing personal information (“1” or “0”, respectively) by referencing the identity source field name, as shown in at least Fig. 6;
associating the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items…; and determining, based on the model…, whether each of the plurality of data items is a sensitive data item at least by ([0082]-[0087] disclose enrichment of the data sets in detail, [0127] summarizes predictive features which are used by the machine learning models to determine confidence levels for attributed fields and scanned fields such as the field findings count, field unique findings count, and name similarity as shown in Fig. 6, [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630”, [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field. For example, the system may indicate that a scanned source field contains personal information (and, specifically, the same type of personal information as a given attribute field) by including a “1” in the corresponding prediction column 635. And the system may indicate a classification of no personal information by including a “0” in such column. As explained below, such classification is based on a determination of whether the confidence level is greater than or equal to a predetermined minimum threshold.”) and the associating of the computed attributes with each data item is shown in at least Fig. 6, wherein the computed attributes are the calculated confidence values, and the model determines, based on the confidence values, whether the data within the scanned source field contains personal information similar to that of the identity source field (“1”) or no personal information (“0”), as shown in at least Fig. 6.
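For illustration only, the thresholding quoted from Enuka [0141] may be sketched as follows. This is a minimal sketch and not Enuka's implementation: the function name, the threshold value of 0.5, the field names, and the first confidence value are assumptions; only the confidence level 0.0389 (row 640) and the "1"/"0" labels come from the cited passages.

```python
# Illustrative sketch only; names and the threshold value are assumptions.
# Enuka [0141] states only that the threshold is "predetermined".
MIN_THRESHOLD = 0.5


def classify_field(confidence: float, min_threshold: float = MIN_THRESHOLD) -> int:
    """Return 1 (personal information) when confidence >= threshold, else 0."""
    return 1 if confidence >= min_threshold else 0


# Hypothetical rows: (identity source field name, scanned source field name, confidence)
rows = [
    ("email", "email_address", 0.97),  # assumed example values
    ("user_id", "userid", 0.0389),     # confidence level quoted for row 640
]

predictions = [classify_field(conf) for _, _, conf in rows]
```

Under the assumed threshold, the row with the near-identical field names is still labeled "0", which mirrors the point quoted from [0140] that name similarity alone is a weak indicator.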
Enuka fails to disclose “…features of each data item based on gathered contextual information; the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data; identifying, without a matching procedure for the data items in the data set, a set of features that define a context for a plurality of data items in the data set; …metadata about a form… of the plurality of data items; …the model defining metadata indicating a form and use of the plurality of data items”
However, Praveen teaches the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data at least by ([0015] “The transaction server 106 includes one or more storage devices storing the transactional data 104.” [0020] “The information masking system 102 improves results of the regular expression based masking by applying second layer processing. The second layer processing includes lookup table based masking. The lookup table based masking involves examining tokens in the transaction data 104 using a lookup table. The information masking system 102 can populate the lookup table with one or more front end computers designated as agents or gatherers. The front end computers are configured to fetch historical transaction data from one or more transaction servers 106 and populate a token database 108. The token database 108 can include a table storing tokens that historically indicate PII. For example, a token can include an account holder name, e.g., the word “McDonald.” The token database 108 can store this token in the table, and designate this table as a lookup table. The token can be hashed in the lookup table. The front end computers can scrape one or more tokens, e.g., the account holder's name, from the transaction server 106, or from an external site. The information masking system 102 can perform a lookup in the token database 108 for tokens in the transaction data 104 and determine whether a particular token may contain PII, and masks those tokens” [0038] “The second layer processing module 204 can assign a confidence score to tokens that are used in both PII context and non-PII context based on a proportion of historical transaction data which contained the token, e.g., “Thomas,” as PII versus the token as non-PII. The training data can indicate whether a token in a context is PII or is not PII”) and token DB 108, as shown in Fig. 
1, stores tokens that historically indicate PII and that can be used to look up whether received data likely contains PII by matching the received data to the stored tokens; further, the tokens stored in token DB 108 are external to the actual transaction data 104 (i.e., external to the data set), which is stored on the transaction server 106, as also shown in Fig. 1;
…features of each data item based on gathered contextual information; …a set of features that define a context… at least by ([0038] discloses the determining of the context of tokens as historically used);
…metadata about a form and use of the plurality of data items at least by ([0038] discloses the determining of metadata about the token form, such as “Thomas” and context or use of the tokens);
…the model defining metadata indicating a form and use of the plurality of data items at least by ([0038], [0046] “a modeling subsystem 210 can generate PII identifying data 214 from input transaction data 216, and provide the PII identifying data 214 to respective modules.” [0047] “The modeling subsystem 210 receives the input transaction data 216. The input transaction data 216 can include historical transaction data, simulated transaction data, or both. The input transaction data 216 can include transaction descriptions. The input transaction data 216 can include training data, e.g., truth data on whether a token is PII in a given context.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Praveen into the teaching of Enuka because both references disclose the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Enuka to further include a lookup table external to the data set as in Praveen in order to prevent duplicative storage of the full PII which could put it at further risk of being compromised.
Enuka and Praveen fail to entirely disclose “identifying, without a matching procedure for the data items in the data set, a set of features that define a context for a plurality of data items in the data set”.
However, Crabtree teaches the above limitation at least by ([0126] “This process appends data with temporal, geospatial (geoJSON formatted), information reliability, and contextual metadata as determined by the system's 3110 machine learning algorithms and ontological axioms configuration”) and machine learning algorithms are used to determine contextual metadata without using regular expression (regex) matching.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Crabtree into the teaching of Enuka and Praveen because the references disclose the processing of data and/or the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the appending of contextual metadata as in Crabtree “from each bit of ingested data and the insight from the knowledge graph to identify what type of risk it is and how impactful it is to the entity.” (Crabtree, [0054]).
As per claim 3, claim 1 is incorporated, Enuka fails to disclose “generating the training set by: identifying, for each data item, features that define a contextual aspect of the data, each feature tending to have a correlation with sensitivity or privacy of the data; and for each feature, receiving an attribute previously associated with a sensitivity of each of the plurality of data items”
However, Praveen teaches the above limitations at least by ([0037] “The second layer processing module 204 can populate the positive lookup table and negative lookup table with training data and past transaction data.” [0038] “To alleviate this problem, the second layer processing module 204 can identify the words, which are name tokens, that have historically been used in both non-PII context, e.g., which were used in the context of an organization, a store name, and in PII context, e.g., as a name of a person. The second layer processing module 204 can assign a confidence score to tokens that are used in both PII context and non-PII context based on a proportion of historical transaction data which contained the token, e.g., “Thomas,” as PII versus the token as non-PII. The training data can indicate whether a token in a context is PII or is not PII”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Praveen into the teaching of Enuka because the references similarly disclose the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Enuka to further include the generating of training data based on context data as in Praveen in order to give the machine learning model a better understanding of the use of the data for more accurate classification of sensitive data.
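For illustration only, the proportion-based confidence score quoted from Praveen [0038] may be sketched as follows. Praveen states only that the score is "based on a proportion" of historical data; the exact formula used here (the fraction of labeled occurrences in which the token was PII), the function name, and the sample history are assumptions.

```python
from collections import Counter


def token_confidence(history):
    """Map each token to the fraction of its labeled occurrences that were PII.

    history: iterable of (token, is_pii) pairs. The fraction is an assumed
    reading of "a proportion of historical transaction data" in Praveen [0038].
    """
    pii_counts = Counter()
    total_counts = Counter()
    for token, is_pii in history:
        total_counts[token] += 1
        if is_pii:
            pii_counts[token] += 1
    return {token: pii_counts[token] / total_counts[token] for token in total_counts}


# Hypothetical labeled history: "Thomas" seen as a person's name three times
# and as a non-PII name (e.g., part of a store name) once.
history = [
    ("Thomas", True),
    ("Thomas", True),
    ("Thomas", False),
    ("Thomas", True),
    ("Macy's", False),
]
scores = token_confidence(history)
```

With this assumed history, "Thomas" receives a score of 0.75 and "Macy's" a score of 0.0.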
As per claim 4, claim 3 is incorporated, Enuka further discloses:
wherein the sensitivity indicates a likelihood that each of the plurality of data items is indicative of a personal, unique or financial fact about an entity to which it pertains at least by ([0032] “As used herein, the term “personal information” may refer to any information or data that can be used on its own or with other information to identify, contact, or locate a single person, and/or to identify an individual in context. Such information may include any information that can be used to distinguish or trace an individual's identity.” [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630. In the illustrated embodiment, the field findings count 615 and field unique findings count 620 are shown to provide a strong indicator of whether the scanned data source field contains personal information. For example, if the field unique findings count 620 is close to the number of findings 615, then the scanned source field is likely to include personal information.” [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field. For example, the system may indicate that a scanned source field contains personal information (and, specifically, the same type of personal information as a given attribute field) by including a “1” in the corresponding prediction column 635. And the system may indicate a classification of no personal information by including a “0” in such column. As explained below, such classification is based on a determination of whether the confidence level is greater than or equal to a predetermined minimum threshold.”).
As per claim 5, claim 1 is incorporated, Enuka fails to explicitly disclose “further comprising generating a feature set for each data item of the plurality of data items, the feature set including an entry for each feature of the feature set and an attribute indicating a tendency that the data item defines sensitive or private information”
However, Praveen teaches the above limitations at least by ([0035] “The second layer processing module 204 can create and populate the lookup table before masking time. In some implementations, the second layer processing module 204 can create and populate a positive lookup table and a negative lookup table. The positive lookup table can include tokens of known PII, which the second layer processing module 204 will mask. The second layer processing module 204 can populate the positive lookup table with known PII words or phrases, e.g., names of account holders such as “Joe McDonald.” Once the second layer processing module 204 identifies one or more tokens from the transaction data that match one or more tokens in the positive lookup table, the second layer processing module 204 can mask those one or more tokens.” [0036] “The negative lookup table can include tokens that the second layer processing module 204 will avoid masking. These tokens are treated as stop words to prevent over-masking. The second layer processing module 204 can populate the negative lookup table with known words or phrases that, although similar to PII, are known not to be PII, e.g., names of stores such as “McDonald's” or “Macy's.” If, at masking time, the second layer processing module 204 finds a match for a token in the negative lookup table, the second layer processing module 204 can mark this token, e.g., by assigning a confidence score of −1 to the token, and prevents this token from being masked.” [0037] “The second layer processing module 204 can associate a respective confidence score with each token in each lookup table, and assign the confidence score to a matching token in a record of the transaction data 104.”) and the feature sets could be the positive and negative lookup tables that contain the tokens and confidence scores.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Praveen into the teaching of Enuka because the references similarly disclose the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Enuka to further include the generating of the feature set as in Praveen in order to allow for easy lookup of features of sensitive tokens for classification.
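For illustration only, the positive and negative lookup tables quoted from Praveen [0035]-[0037] may be sketched as follows. The table contents beyond the quoted examples, the positive confidence value of 0.9, the mask string, the case-insensitive matching, and the order in which the tables are consulted are all assumptions; only the confidence score of -1 for negative-table matches comes from the cited passage.

```python
# Illustrative sketch only; not Praveen's implementation.
POSITIVE_TABLE = {"joe mcdonald": 0.9}      # known PII token; score is assumed
NEGATIVE_TABLE = {"mcdonald's", "macy's"}   # stop words that must not be masked
MASK = "XXXX"                               # assumed mask string


def score_token(token):
    """Return -1.0 for a negative-table match, the stored score for a
    positive-table match, or None when the token is in neither table."""
    key = token.lower()  # case-insensitive matching is an assumption
    if key in NEGATIVE_TABLE:
        return -1.0
    return POSITIVE_TABLE.get(key)


def mask_tokens(tokens):
    """Mask tokens with a positive score; leave all other tokens unchanged."""
    masked = []
    for token in tokens:
        score = score_token(token)
        masked.append(MASK if score is not None and score > 0 else token)
    return masked


result = mask_tokens(["Joe McDonald", "McDonald's", "coffee"])
```

In this sketch the store name is preserved because the negative table assigns it a score of -1, while the account holder name is masked.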
As per claim 6, claim 1 is incorporated, Enuka further discloses:
identifying a source indicative of an attribute for each said feature; and retrieving the attribute; and storing the attribute in conjunction with each of the plurality of data items at least by ([0131] “At step 409, the system stores, transmits and/or displays the results of the scan, including location information corresponding to one or more locations in the scanned data source where personal information has been confirmed and/or classified according to attribute (e.g., field(s) and/or row(s) within such fields). In one embodiment, the scan results may include metadata, such as but not limited to: scanned data source information corresponding to the tables that were scanned, the number of rows scanned, the specific rows scanned, the number of findings detected, the number of personal information records created from such findings, field-to-field confidence levels, scanned field attribute classifications, and/or other information” [0154] “the system may identify personal information findings from the input data based on personal information rules. The system may further identify metadata associated with such findings, such as but not limited to, an attribute type, a field name (e.g., a name of a column in a database in which the personal information is located), a field value (which may be hashed for privacy reasons), a scan ID, data source information corresponding to the data source where the personal information is stored (e.g., name, type, location, access credentials, etc.) and/or location information corresponding to a location within the data source where the personal information is stored (e.g., table, column, row, collection, etc.). Upon identifying such information in an initial data record, the system may aggregate, encode and sort this information into a findings file”).
As per claim 7, claim 1 is incorporated, Enuka further discloses:
wherein referencing the source includes information about the source itself or information retrieved from the source at least by ([0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630. In the illustrated embodiment, the field findings count 615 and field unique findings count 620 are shown to provide a strong indicator of whether the scanned data source field contains personal information.” [0040] “the system may connect to one or more identity data sources and conduct a search for personal information contained therein, based on the stored personal information rules. As potential personal information is found in an identity data source, the system may create a personal information findings list of such information, including the value of each finding and/or metadata associated therewith, such as an associated attribute, the data source in which the personal information was found, the location where the personal information is located within the data source (e.g., collection, table, field, row, etc.), and/or a date when the personal information was found”) and referencing the source includes at least information retrieved from the source and information about the source itself.
As per claim 8, claim 1 is incorporated, Enuka further discloses:
wherein computing the value for the attribute further comprises determining the attribute based on the storage location of the data at least by ([0131] “At step 409, the system stores, transmits and/or displays the results of the scan, including location information corresponding to one or more locations in the scanned data source where personal information has been confirmed and/or classified according to attribute (e.g., field(s) and/or row(s) within such fields).” [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field. For example, the system may indicate that a scanned source field contains personal information (and, specifically, the same type of personal information as a given attribute field) by including a “1” in the corresponding prediction column 635. And the system may indicate a classification of no personal information by including a “0” in such column. As explained below, such classification is based on a determination of whether the confidence level is greater than or equal to a predetermined minimum threshold.”).
As per claim 10, claim 1 is incorporated, Enuka further discloses:
further comprising determining the attribute based on a string format on formatting characters embedded in the data at least by ([0106] “It will be appreciated that a finding may be determined when a value in the attribute field matches a value in the scanned field. The system may utilize various criteria to determine such matches. For example, the system may require that the attribute field value exactly matches the scanned field value. As another example, the system may require that the attribute field value matches only a substring of the scanned field value” [0108] “the system may utilize natural language processing and/or various string similarity algorithms to determine a match between an attribute field value and a scanned field value.”).
As per claim 11, claim 1 is incorporated, Enuka further discloses:
further comprising determining the attribute based on an access frequency of the data at least by ([0204] “The risk and rules component 1332 may be further adapted to calculate risk scores for each personal information record… Such calculations may be based on static parameters, such as personal information attributes and weights, and/or dynamic parameters, such as frequency of use and type of access (e.g., read/write, etc.)”).
As per claim 12, claim 5 is incorporated, Enuka further discloses:
further comprising aggregating each of the plurality of data items and the set of features for generating the enriched data set, the model responsive to the enriched data set at least by ([0082]-[0087] disclose enrichment of the data sets in detail [0127] summarizes predictive features which are used by the machine learning models to determine confidence levels for attributed fields and scanned fields such as the field findings count, field unique findings count, and name similarity as shown in Fig. 6, [0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.” [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630”, [0155] “At step 825, various predictive features are created from the preprocessed information. Such features may be provided to the machine learning model to determine predictive values (i.e., feature weights) of the features, a confidence level and a classification based on the confidence level.”).
As per claim 13, claim 3 is incorporated, Enuka further discloses:
further comprising training the model by receiving attributes based on correct recognition of sample data at least by ([0171] “A number of metrics may be calculated to assess the performance of the disclosed models, including, sensitivity (i.e., recall or true-positive rate) and precision (i.e., true-negative rate). As shown in Equation 1, below, sensitivity corresponds to the Y-axis of a receiver operating characteristic (“ROC”) curve, where each point corresponds to a threshold at which a prediction is made. Sensitivity provides the percentage of information that is correctly identified as a personal information attribute for some predictive threshold. It will be appreciated that a higher recall corresponds to a lower prediction threshold, which in turn reflects a preference to avoid false negatives over false positives.” [0180] “the user may select one or more rows of results and modify the confidence level 1210 for each row. For example, the results may show a discrepancy between the confidence level of underlying data 1220 and the confidence level of corresponding metadata 1210 (e.g., low versus high)” [0181] “the user may be able to modify the confidence level relating to the metadata 1210 via an update confidence level modal or popup 1215. Such feature 1215 may provide an option (e.g., a dropdown menu) to allow the user to select an updated confidence level 1217. Upon selecting an updated confidence level 1217, the system may store the selection and then automatically retrain a machine learning model to predict results according to the adjusted confidence level 1217.”).
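For illustration only, the relationship between sensitivity (recall) and the prediction threshold quoted from Enuka [0171] may be sketched as follows. The function name and the sample confidence values and labels are assumptions; Equation 1 of Enuka is not reproduced here, and the standard definition of recall, TP / (TP + FN), is used instead.

```python
def sensitivity(confidences, labels, threshold):
    """Recall (true-positive rate) at a given threshold: TP / (TP + FN)."""
    true_pos = sum(1 for c, y in zip(confidences, labels) if y == 1 and c >= threshold)
    false_neg = sum(1 for c, y in zip(confidences, labels) if y == 1 and c < threshold)
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical model outputs and ground-truth labels (1 = personal information).
confidences = [0.95, 0.80, 0.40, 0.0389, 0.60]
labels = [1, 1, 1, 0, 0]

recall_high_threshold = sensitivity(confidences, labels, 0.9)  # 1 of 3 positives found
recall_low_threshold = sensitivity(confidences, labels, 0.3)   # 3 of 3 positives found
```

On this assumed data, lowering the threshold from 0.9 to 0.3 raises recall from one third to 1.0, consistent with the quoted statement that a higher recall corresponds to a lower prediction threshold.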
Regarding claim 14, Enuka discloses:
A device for data sensitivity classification, the device comprising: a memory to store a training set, comprising data items and known attributes and features for each data item at least by ([0162] “FIG. 9 shows exemplary labeled training data 900 that may be provided to train the machine learning models on a number of supervised use cases (e.g., a minimum of 4,000 use cases). As shown, each row of the training data 900 may comprise an attribute field name 901 corresponding to an attribute field in an identity data source, a scanned field name 902 corresponding to a scanned field in a scanned data source for which a confidence level is determined, and a label 950 indicating whether the scanned field should be classified as containing personal information associated with the same attribute as that of the attribute field.” [0216]-[0217] disclose the memory) and the training set of data items is the labeled training data, as shown for example in Fig. 9, while the known attributes and features are the attributes, such as elements 901 to 950;
and a processor operatively coupled to the memory, the processor to: build a learning model based on the known attributes and features for each data item; employ the learning model as an initial rendering of a model of the known attributes and features at least by ([0034] “Referring to FIG. 1, an exemplary method of creating initial data subject profiles for an identity graph is illustrated. At an optional first step 101, initial personal information of one or more data subjects may be received by the system to create one or more data subject profiles. Such personal information (and resulting profiles) may correspond to users, customers, employees or any other person whose personal information is stored by the organization (collectively referred to herein as “data subjects”). Moreover, the initial personal information may be used as a learning set for the system to learn what personal information looks like in a specific environment. The initial personal information may be manually entered into the system by a user (e.g., via a client application) and/or may be included in a file that is uploaded to the system”, [0151]-[0154] describes the initial creation of data records, [0155] describes the creation of predictive features from the initial creation of data records for use in a machine learning model);
identify…a set of features … for a plurality of data items in the large data set, each feature of the set of features defining metadata about … and use of the plurality of data items at least by ([0032] “The term “personal information” may refer to any information or data that can be used on its own or with other information to identify, contact, or locate a single person, and/or to identify an individual in context. Such information may include any information that can be used to distinguish or trace an individual's identity. Specific, non-limiting examples of personal information types or “attributes” include, but are not limited to: name, home address, work address, email address, national identification number, social security number,…” [0103] “the features employed by the machine learning models may relate to one or more of: the values contained in the selected rows of the scanned data source, metadata associated with fields in the scanned data source, values contained in the identity data source, metadata associated with fields in the identity data source, information associated with personal information findings determined from the scanned data source and the identity data source, and/or information associated with personal information records created from such findings. Exemplary features are discussed in detail below”) and the set of features that define a context are the personal information attributes, such as name or social security number, for example, which define metadata about the use of the data (e.g. name and social security number identifies specific people);
determine, for each feature of the set of features, a source in which an attribute for the feature is stored at least by ([0040] “At step 104, the system may connect to one or more identity data sources and conduct a search for personal information contained therein, based on the stored personal information rules. As potential personal information is found in an identity data source, the system may create a personal information findings list of such information, including the value of each finding and/or metadata associated therewith, such as an associated attribute, the data source in which the personal information was found, the location where the personal information is located within the data source (e.g., collection, table, field, row, etc.), and/or a date when the personal information was found” [0058] “if a proximity search 205 results in the discovery of a proximate attribute, the location information of the proximate attribute may be used to update one or more personal information proximity rules so that subsequent searches may take advantage of this additional information. Generally, the location information may include, but is not limited to, the absolute location of the proximate attribute and/or the relative location of the proximate attribute to the original attribute. Additionally or alternatively, information relating to the type of proximate attribute may be used to update one or more attribute definition rules so that subsequent searches may look for this type of personal information” [0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.”);
identify, for each feature of the set of features, the attribute based on referencing the determined source for the feature, wherein the attribute is indicative of a sensitivity of each of the plurality of data items at least by ([0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.”, [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630. In the illustrated embodiment, the field findings count 615 and field unique findings count 620 are shown to provide a strong indicator of whether the scanned data source field contains personal information. For example, if the field unique findings count 620 is close to the number of findings 615, then the scanned source field is likely to include personal information.”, [0140] “On the other hand, name similarity 625 may be a weaker indicator of whether a scanned source field includes personal information that corresponds to a given field in an identity data source. For example, even in instances where the scanned source field name 610 is similar or identical to the identity source field name 605, the data stored in the scanned source field will not necessarily hold meaningful personal information. This is shown, for example, in row 640, where the identity source field name 605 is nearly identical to the scanned source field name, but the model determines a confidence level of only 0.0389.”, [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field.”) and the computed values for each attribute are the field findings count, field unique findings count, name similarity, and confidence level, which are all ultimately utilized to classify and label the scanned source fields as containing personal information or not containing personal information (“1” or “0”, respectively) by referencing the identity source field name as shown in at least Fig. 6;
associate the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items; and determine, based on the model…, whether each of the plurality of data items is a sensitive data item at least by ([0082]-[0087] disclose enrichment of the data sets in detail, [0127] summarizes predictive features which are used by the machine learning models to determine confidence levels for attributed fields and scanned fields such as the field findings count, field unique findings count, and name similarity as shown in Fig. 6, [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630”, [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field. For example, the system may indicate that a scanned source field contains personal information (and, specifically, the same type of personal information as a given attribute field) by including a “1” in the corresponding prediction column 635. And the system may indicate a classification of no personal information by including a “0” in such column. As explained below, such classification is based on a determination of whether the confidence level is greater than or equal to a predetermined minimum threshold.”) and the associating of the computed attributes with each data item is shown in at least Fig. 6, wherein the computed attributes are the calculated confidence values and determining, by the model, whether the data within the scanned source field name has similar personal information to that of the identity source field name (“1”) or no personal information (“0”) based on the confidence values as shown in at least Fig. 6.
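For illustration only, the thresholding described in the quoted passage of Enuka [0141] can be sketched as follows. This is a minimal sketch, not the reference's implementation: the class name `FieldMatch`, the function `classify`, the field names, the 0.97 confidence value, and the 0.5 threshold are all hypothetical; only the 0.0389 confidence value and the “1”/“0” labels come from the cited text.

```python
from dataclasses import dataclass


@dataclass
class FieldMatch:
    identity_field: str   # attribute field name in the identity data source
    scanned_field: str    # field name in the scanned data source
    confidence: float     # confidence level determined by the model


def classify(matches, min_threshold=0.5):
    """Label each match 1 (personal information) or 0 (none).

    The 0.5 default is an assumed value; the cited text only says the
    threshold is a predetermined minimum.
    """
    return [1 if m.confidence >= min_threshold else 0 for m in matches]


rows = [
    FieldMatch("user_id", "userid", 0.0389),  # similar names, low confidence
    FieldMatch("email", "contact", 0.97),     # hypothetical high-confidence row
]
labels = classify(rows)
```

Under these assumptions, a field whose name closely resembles an identity field can still be labeled “0” when its confidence level falls below the threshold.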
Enuka fails to disclose “the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data; identify, without a matching procedure for the data items in the data set, a set of features that define a context for a plurality of data items in the data set; …metadata about a form… of the plurality of data items; …the model defining metadata indicating a form and use of the plurality of data items”
However, Praveen teaches the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data at least by ([0015] “The transaction server 106 includes one or more storage devices storing the transactional data 104.” [0020] “The information masking system 102 improves results of the regular expression based masking by applying second layer processing. The second layer processing includes lookup table based masking. The lookup table based masking involves examining tokens in the transaction data 104 using a lookup table. The information masking system 102 can populate the lookup table with one or more front end computers designated as agents or gatherers. The front end computers are configured to fetch historical transaction data from one or more transaction servers 106 and populate a token database 108. The token database 108 can include a table storing tokens that historically indicate PII. For example, a token can include an account holder name, e.g., the word “McDonald.” The token database 108 can store this token in the table, and designate this table as a lookup table. The token can be hashed in the lookup table. The front end computers can scrape one or more tokens, e.g., the account holder's name, from the transaction server 106, or from an external site. The information masking system 102 can perform a lookup in the token database 108 for tokens in the transaction data 104 and determine whether a particular token may contain PII, and masks those tokens” [0038] “The second layer processing module 204 can assign a confidence score to tokens that are used in both PII context and non-PII context based on a proportion of historical transaction data which contained the token, e.g., “Thomas,” as PII versus the token as non-PII. The training data can indicate whether a token in a context is PII or is not PII”) and token DB 108, as shown in Fig. 1, stores tokens that historically indicate PII that can be used to look up whether received data likely contains PII by matching received data to the stored tokens; further, the tokens stored in token DB 108 are external to the actual transaction data 104 (external to the dataset), which are stored on the transaction server as shown also in Fig. 1;
…a set of features that define a context… at least by ([0038] discloses the determining of the context of tokens as historically used);
…metadata about a form and use of the plurality of data items at least by ([0038] discloses the determining of metadata about the token form, such as “Thomas” and context or use of the tokens);
…the model defining metadata indicating a form and use of the plurality of data items at least by ([0038], [0046] “a modeling subsystem 210 can generate PII identifying data 214 from input transaction data 216, and provide the PII identifying data 214 to respective modules.” [0047] “The modeling subsystem 210 receives the input transaction data 216. The input transaction data 216 can include historical transaction data, simulated transaction data, or both. The input transaction data 216 can include transaction descriptions. The input transaction data 216 can include training data, e.g., truth data on whether a token is PII in a given context.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Praveen into the teaching of Enuka because both references disclose the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Enuka to further include a lookup table external to the data set as in Praveen in order to prevent duplicative storage of the full PII which could put it at further risk of being compromised.
Enuka and Praveen fail to entirely disclose “identifying, without a matching procedure for the data items in the data set, a set of features that define a context for a plurality of data items in the data set”
However, Crabtree teaches the above limitation at least by ([0126] “This process appends data with temporal, geospatial (geoJSON formatted), information reliability, and contextual metadata as determined by the system's 3110 machine learning algorithms and ontological axioms configuration”) and machine learning algorithms are used to determine contextual metadata without using regex or regular expression matching.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Crabtree into the teaching of Enuka and Praveen because the references disclose the processing of data and/or the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the appending of contextual metadata as in Crabtree “from each bit of ingested data and the insight from the knowledge graph to identify what type of risk it is and how impactful it is to the entity.” (Crabtree, [0054]).
As per claim 15, claim 14 is incorporated, Enuka further discloses:
wherein, the training set includes known attributes for the at least one features of each data item…at least by ([0162] “FIG. 9 shows exemplary labeled training data 900 that may be provided to train the machine learning models on a number of supervised use cases (e.g., a minimum of 4,000 use cases). As shown, each row of the training data 900 may comprise an attribute field name 901 corresponding to an attribute field in an identity data source, a scanned field name 902 corresponding to a scanned field in a scanned data source for which a confidence level is determined, and a label 950 indicating whether the scanned field should be classified as containing personal information associated with the same attribute as that of the attribute field.”) and the training set of data items are the labeled training data, as shown for example in Fig. 9, while the known attributes and features are the attributes such as elements 901 to 950;
the training set operable for building the initial rendering of the model of the known attributes and features at least by ([0034] “Referring to FIG. 1, an exemplary method of creating initial data subject profiles for an identity graph is illustrated. At an optional first step 101, initial personal information of one or more data subjects may be received by the system to create one or more data subject profiles. Such personal information (and resulting profiles) may correspond to users, customers, employees or any other person whose personal information is stored by the organization (collectively referred to herein as “data subjects”). Moreover, the initial personal information may be used as a learning set for the system to learn what personal information looks like in a specific environment. The initial personal information may be manually entered into the system by a user (e.g., via a client application) and/or may be included in a file that is uploaded to the system”, [0151]-[0154] describes the initial creation of data records, [0155] describes the creation of predictive features from the initial creation of data records for use in a machine learning model).
Praveen further discloses:
… corresponding to at least one feature of each data item based on gathered contextual information at least by ([0038] discloses the determining of the context of tokens as historically used).
As per claim 16, claim 15 is incorporated, Enuka further discloses:
Wherein the training set includes attributes based on correct recognition of sample data at least by ([0171] “A number of metrics may be calculated to assess the performance of the disclosed models, including, sensitivity (i.e., recall or true-positive rate) and precision (i.e., true-negative rate). As shown in Equation 1, below, sensitivity corresponds to the Y-axis of a receiver operating characteristic (“ROC”) curve, where each point corresponds to a threshold at which a prediction is made. Sensitivity provides the percentage of information that is correctly identified as a personal information attribute for some predictive threshold. It will be appreciated that a higher recall corresponds to a lower prediction threshold, which in turn reflects a preference to avoid false negatives over false positives.” [0180] “the user may select one or more rows of results and modify the confidence level 1210 for each row. For example, the results may show a discrepancy between the confidence level of underlying data 1220 and the confidence level of corresponding metadata 1210 (e.g., low versus high)” [0181] “the user may be able to modify the confidence level relating to the metadata 1210 via an update confidence level modal or popup 1215. Such feature 1215 may provide an option (e.g., a dropdown menu) to allow the user to select an updated confidence level 1217. Upon selecting an updated confidence level 1217, the system may store the selection and then automatically retrain a machine learning model to predict results according to the adjusted confidence level 1217.”).
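For illustration only, the relationship between sensitivity (recall) and the prediction threshold noted in the quoted passage of Enuka [0171] can be sketched as follows. The reference's Equation 1 is not reproduced in this action, so the formula below is an assumption based on the standard definition of recall, TP / (TP + FN); the function name, scores, and labels are hypothetical sample values.

```python
def sensitivity_at_threshold(scores, labels, threshold):
    """Recall (true-positive rate) at a given prediction threshold.

    Assumes the standard definition TP / (TP + FN); a label of 1 marks
    a field that truly holds a personal information attribute.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical confidence levels and ground-truth labels.
scores = [0.9, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0]

recall_high_threshold = sensitivity_at_threshold(scores, labels, 0.5)
recall_low_threshold = sensitivity_at_threshold(scores, labels, 0.25)
```

With these sample values, lowering the threshold from 0.5 to 0.25 captures the third true positive, which is consistent with the quoted statement that a higher recall corresponds to a lower prediction threshold.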
As per claim 17, claim 14 is incorporated, Enuka further discloses:
wherein the data sensitivity indicates a likelihood that each of the data items is indicative of a personal, unique or financial fact about an entity to which it pertains at least by ([0032] “As used herein, the term “personal information” may refer to any information or data that can be used on its own or with other information to identify, contact, or locate a single person, and/or to identify an individual in context. Such information may include any information that can be used to distinguish or trace an individual's identity.” [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630. In the illustrated embodiment, the field findings count 615 and field unique findings count 620 are shown to provide a strong indicator of whether the scanned data source field contains personal information. For example, if the field unique findings count 620 is close to the number of findings 615, then the scanned source field is likely to include personal information.” [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field. For example, the system may indicate that a scanned source field contains personal information (and, specifically, the same type of personal information as a given attribute field) by including a “1” in the corresponding prediction column 635. And the system may indicate a classification of no personal information by including a “0” in such column. As explained below, such classification is based on a determination of whether the confidence level is greater than or equal to a predetermined minimum threshold.”).
As per claim 18, claim 14 is incorporated, Enuka fails to explicitly disclose “wherein the training set comprises a feature set for each of the data items, the feature set for each of the data items including an entry for each feature of the set and an attribute indicating a tendency that each of the data items defines sensitive or private information”
However, Praveen teaches the above limitations at least by ([0035] “The second layer processing module 204 can create and populate the lookup table before masking time. In some implementations, the second layer processing module 204 can create and populate a positive lookup table and a negative lookup table. The positive lookup table can include tokens of known PII, which the second layer processing module 204 will mask. The second layer processing module 204 can populate the positive lookup table with known PII words or phrases, e.g., names of account holders such as “Joe McDonald.” Once the second layer processing module 204 identifies one or more tokens from the transaction data that match one or more tokens in the positive lookup table, the second layer processing module 204 can mask those one or more tokens.” [0036] “The negative lookup table can include tokens that the second layer processing module 204 will avoid masking. These tokens are treated as stop words to prevent over-masking. The second layer processing module 204 can populate the negative lookup table with known words or phrases that, although similar to PII, are known not to be PII, e.g., names of stores such as “McDonald's” or “Macy's.” If, at masking time, the second layer processing module 204 finds a match for a token in the negative lookup table, the second layer processing module 204 can mark this token, e.g., by assigning a confidence score of −1 to the token, and prevents this token from being masked.” [0037] “The second layer processing module 204 can associate a respective confidence score with each token in each lookup table, and assign the confidence score to a matching token in a record of the transaction data 104.”) and the feature sets could be the positive and negative lookup tables that contain the tokens and confidence scores.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Praveen into the teaching of Enuka because the references similarly disclose the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Enuka to further include the generating of the feature set as in Praveen in order to allow for easy lookup of features of sensitive tokens for classification.
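For illustration only, the positive and negative lookup tables described in the quoted passages of Praveen [0035]-[0037] can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the names `POSITIVE_TABLE`, `NEGATIVE_TABLE`, `score_token`, and `mask_tokens`, the 0.9 score, the lowercase normalization, and the score of 0.0 for unknown tokens are hypothetical; the example tokens and the score of -1 for negative-table matches come from the cited text.

```python
# Positive table: tokens of known PII with an associated confidence score.
POSITIVE_TABLE = {"joe mcdonald": 0.9}  # 0.9 is an assumed score

# Negative table: tokens that resemble PII but are known not to be PII.
NEGATIVE_TABLE = {"mcdonald's", "macy's"}


def score_token(token):
    """Return the confidence score assigned to a token."""
    key = token.lower()  # normalization is an assumption of this sketch
    if key in NEGATIVE_TABLE:
        return -1.0      # marked so that the token is never masked
    return POSITIVE_TABLE.get(key, 0.0)  # unknown tokens score 0.0 (assumed)


def mask_tokens(tokens, mask_text="[MASKED]"):
    """Mask only tokens that match the positive lookup table."""
    return [mask_text if score_token(t) > 0 else t for t in tokens]


masked = mask_tokens(["Joe McDonald", "McDonald's", "coffee"])
```

Here the account holder name is masked, while the store name is preserved because it matches the negative table.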
As per claim 19, claim 14 is incorporated, Enuka further discloses:
Wherein the enriched data set includes, for each of the data items, an aggregation of the data item and the corresponding features, the model responsive to the enriched data set at least by ([0082]-[0087] disclose enrichment of the data sets in detail [0127] summarizes predictive features which are used by the machine learning models to determine confidence levels for attributed fields and scanned fields such as the field findings count, field unique findings count, and name similarity as shown in Fig. 6, [0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.” [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630”, [0155] “At step 825, various predictive features are created from the preprocessed information. Such features may be provided to the machine learning model to determine predictive values (i.e., feature weights) of the features, a confidence level and a classification based on the confidence level.”).
Regarding claim 20, Enuka discloses:
A non-transitory computer-readable medium having instructions stored thereon which, when executed by a processor, cause the processor to: gather a training set, comprising data items and known attributes and features; identify known attributes for corresponding features of each data item … at least by ([0162] “FIG. 9 shows exemplary labeled training data 900 that may be provided to train the machine learning models on a number of supervised use cases (e.g., a minimum of 4,000 use cases). As shown, each row of the training data 900 may comprise an attribute field name 901 corresponding to an attribute field in an identity data source, a scanned field name 902 corresponding to a scanned field in a scanned data source for which a confidence level is determined, and a label 950 indicating whether the scanned field should be classified as containing personal information associated with the same attribute as that of the attribute field.”) and the training set of data items are the labeled training data, as shown for example in Fig. 9, while the known attributes and features are the attributes such as elements 901 to 950;
build a learning model based on the identified known attributes and corresponding data items; and employ the learning model as an initial rendering of a model of the known attributes and features at least by ([0034] “Referring to FIG. 1, an exemplary method of creating initial data subject profiles for an identity graph is illustrated. At an optional first step 101, initial personal information of one or more data subjects may be received by the system to create one or more data subject profiles. Such personal information (and resulting profiles) may correspond to users, customers, employees or any other person whose personal information is stored by the organization (collectively referred to herein as “data subjects”). Moreover, the initial personal information may be used as a learning set for the system to learn what personal information looks like in a specific environment. The initial personal information may be manually entered into the system by a user (e.g., via a client application) and/or may be included in a file that is uploaded to the system”, [0151]-[0154] describes the initial creation of data records, [0155] describes the creation of predictive features from the initial creation of data records for use in a machine learning model);
identify…a set of features … for a plurality of data items in a large data set, each feature of the set of features defining metadata about … and use of the plurality of data items at least by ([0032] “The term “personal information” may refer to any information or data that can be used on its own or with other information to identify, contact, or locate a single person, and/or to identify an individual in context. Such information may include any information that can be used to distinguish or trace an individual's identity. Specific, non-limiting examples of personal information types or “attributes” include, but are not limited to: name, home address, work address, email address, national identification number, social security number,…” [0103] “the features employed by the machine learning models may relate to one or more of: the values contained in the selected rows of the scanned data source, metadata associated with fields in the scanned data source, values contained in the identity data source, metadata associated with fields in the identity data source, information associated with personal information findings determined from the scanned data source and the identity data source, and/or information associated with personal information records created from such findings. Exemplary features are discussed in detail below”) and the set of features that define a context are the personal information attributes, such as name or social security number, for example, which define metadata about the use of the data (e.g., name and social security number identify specific people);
determine, for each feature of the set of features, a source in which an attribute for the feature is stored at least by ([0040] “At step 104, the system may connect to one or more identity data sources and conduct a search for personal information contained therein, based on the stored personal information rules. As potential personal information is found in an identity data source, the system may create a personal information findings list of such information, including the value of each finding and/or metadata associated therewith, such as an associated attribute, the data source in which the personal information was found, the location where the personal information is located within the data source (e.g., collection, table, field, row, etc.), and/or a date when the personal information was found” [0058] “if a proximity search 205 results in the discovery of a proximate attribute, the location information of the proximate attribute may be used to update one or more personal information proximity rules so that subsequent searches may take advantage of this additional information. Generally, the location information may include, but is not limited to, the absolute location of the proximate attribute and/or the relative location of the proximate attribute to the original attribute. Additionally or alternatively, information relating to the type of proximate attribute may be used to update one or more attribute definition rules so that subsequent searches may look for this type of personal information” [0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.”);
identify, for each feature of the set of features, the attribute based on referencing the determined source for the feature, wherein the attribute is indicative of a sensitivity of each of the plurality of data items at least by ([0138] “Referring to FIG. 6, an exemplary table 600 depicting predictive results for matching attribute fields to data source fields is illustrated. As shown, the output table 600 comprises the following labels: identity source field name 605, scanned source field name 610, field findings count 615, field unique findings count 620, name similarity 625, confidence level 630, and classification or prediction 635.”, [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630. In the illustrated embodiment, the field findings count 615 and field unique findings count 620 are shown to provide a strong indicator of whether the scanned data source field contains personal information. For example, if the field unique findings count 620 is close to the number of findings 615, then the scanned source field is likely to include personal information.”, [0140] “On the other hand, name similarity 625 may be a weaker indicator of whether a scanned source field includes personal information that corresponds to a given field in an identity data source. For example, even in instances where the scanned source field name 610 is similar or identical to the identity source field name 605, the data stored in the scanned source field will not necessarily hold meaningful personal information. This is shown, for example, in row 640, where the identity source field name 605 is nearly identical to the scanned source field name, but the model determines a confidence level of only 0.0389.”, [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field.”) and the computed values for each attribute are the field findings count, field unique findings count, name similarity, and confidence level, which are all ultimately utilized to classify and label the scanned source fields as containing personal information or not containing personal information (“1” or “0”, respectively) by referencing the identity source field name as shown in at least Fig. 6;
associate a respective attribute of the identified attributes with each data item in the plurality of data items to generate an enriched data set including the attributes for each data item in the plurality of data items; and determine, based on the model…, whether each of the plurality of data items is a sensitive data item at least by ([0082]-[0087] disclose enrichment of the data sets in detail, [0127] summarizes predictive features which are used by the machine learning models to determine confidence levels for attributed fields and scanned fields such as the field findings count, field unique findings count, and name similarity as shown in Fig. 6, [0139] “As discussed above, the machine learning model employs a number of features to compare fields in a scanned data source to fields in one or more identity data sources to determine a confidence level 630”, [0141] “FIG. 6 further shows that the machine learning model may classify and label 635 each of the scanned source fields based on the confidence level 630 determined for such field. For example, the system may indicate that a scanned source field contains personal information (and, specifically, the same type of personal information as a given attribute field) by including a “1” in the corresponding prediction column 635. And the system may indicate a classification of no personal information by including a “0” in such column. As explained below, such classification is based on a determination of whether the confidence level is greater than or equal to a predetermined minimum threshold.”) and the associating of the computed attributes with each data item is shown in at least Fig. 6, wherein the computed attributes are the calculated confidence values and determining, by the model, whether the data within the scanned source field name has similar personal information to that of the identity source field name (“1”) or no personal information (“0”) based on the confidence values as shown in at least Fig. 6.
Enuka fails to disclose “…features of each data item based on gathered contextual information; the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data; identify, without a matching procedure for the data items in the data set, a set of features that define a context for a plurality of data items in the data set; …metadata about a form… of the plurality of data items; …the model defining metadata indicating a form and use of the plurality of data items”
However, Praveen teaches the attributes external to the enriched data set and indicative of a greater or lesser likelihood that a data item contains sensitive or private data at least by ([0015] “The transaction server 106 includes one or more storage devices storing the transactional data 104.” [0020] “The information masking system 102 improves results of the regular expression based masking by applying second layer processing. The second layer processing includes lookup table based masking. The lookup table based masking involves examining tokens in the transaction data 104 using a lookup table. The information masking system 102 can populate the lookup table with one or more front end computers designated as agents or gatherers. The front end computers are configured to fetch historical transaction data from one or more transaction servers 106 and populate a token database 108. The token database 108 can include a table storing tokens that historically indicate PII. For example, a token can include an account holder name, e.g., the word “McDonald.” The token database 108 can store this token in the table, and designate this table as a lookup table. The token can be hashed in the lookup table. The front end computers can scrape one or more tokens, e.g., the account holder's name, from the transaction server 106, or from an external site. The information masking system 102 can perform a lookup in the token database 108 for tokens in the transaction data 104 and determine whether a particular token may contain PII, and masks those tokens” [0038] “The second layer processing module 204 can assign a confidence score to tokens that are used in both PII context and non-PII context based on a proportion of historical transaction data which contained the token, e.g., “Thomas,” as PII versus the token as non-PII. The training data can indicate whether a token in a context is PII or is not PII”) and token DB 108, as shown in Fig. 1, stores tokens that historically indicate PII and that can be used to look up whether received data likely contains PII by matching received data to the stored tokens; further, the tokens stored in token DB 108 are external to the actual transaction data 104 (external to the data set), which is stored on the transaction server, as also shown in Fig. 1;
…features of each data item based on gathered contextual information; …a set of features that define a context… at least by ([0038] discloses the determining of the context of tokens as historically used);
…metadata about a form and use of the plurality of data items at least by ([0038] discloses the determining of metadata about the token form, such as “Thomas” and context or use of the tokens);
…the model defining metadata indicating a form and use of the plurality of data items at least by ([0038], [0046] “a modeling subsystem 210 can generate PII identifying data 214 from input transaction data 216, and provide the PII identifying data 214 to respective modules.” [0047] “The modeling subsystem 210 receives the input transaction data 216. The input transaction data 216 can include historical transaction data, simulated transaction data, or both. The input transaction data 216 can include transaction descriptions. The input transaction data 216 can include training data, e.g., truth data on whether a token is PII in a given context.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Praveen into the teaching of Enuka because both references disclose the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Enuka to further include a lookup table external to the data set as in Praveen in order to prevent duplicative storage of the full PII which could put it at further risk of being compromised.
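For illustration only, the hashed lookup table and proportion-based confidence score described in the quoted passages of Praveen ([0020], [0038]) can be sketched as follows. This is a minimal sketch under stated assumptions and not Praveen's implementation: the use of SHA-256, the lowercasing of tokens, the 0.0 score for unseen tokens, and all function names and sample data are hypothetical.

```python
import hashlib
from collections import defaultdict


def token_key(token):
    # Assumption: tokens are lowercased and hashed with SHA-256 so the
    # lookup table never stores the raw token text.
    return hashlib.sha256(token.lower().encode("utf-8")).hexdigest()


def build_lookup(history):
    """Build a hashed lookup table from (token, is_pii) observations.

    Each entry maps a hashed token to the proportion of historical
    observations in which that token appeared as PII.
    """
    counts = defaultdict(lambda: [0, 0])  # [pii_count, total_count]
    for token, is_pii in history:
        entry = counts[token_key(token)]
        entry[0] += int(is_pii)
        entry[1] += 1
    return {key: pii / total for key, (pii, total) in counts.items()}


def pii_confidence(lookup, token):
    # Assumption: a token never seen in the historical data scores 0.0.
    return lookup.get(token_key(token), 0.0)


# Hypothetical historical observations: "Thomas" appears as PII in 3 of 4.
history = [
    ("Thomas", True), ("Thomas", True), ("Thomas", True),
    ("Thomas", False), ("McDonald", True),
]
lookup = build_lookup(history)
```

Because the table is keyed by hash and kept apart from the transaction data being examined, the sketch also reflects the point relied upon above, namely that the stored tokens are external to the data set.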
Enuka and Praveen fail to entirely disclose “identify, without a matching procedure for the data items in the data set, a set of features that define a context for a plurality of data items in the data set”.
However, Crabtree teaches the above limitation at least by ([0126] “This process appends data with temporal, geospatial (geoJSON formatted), information reliability, and contextual metadata as determined by the system's 3110 machine learning algorithms and ontological axioms configuration”) and machine learning algorithms are used to determine contextual metadata without using regex or regular expression matching.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Crabtree into the teaching of Enuka and Praveen because the references disclose the processing of data and/or the identification of data sensitivity. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the appending of contextual metadata as in Crabtree “from each bit of ingested data and the insight from the knowledge graph to identify what type of risk it is and how impactful it is to the entity.” (Crabtree, [0054]).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Enuka (US 2020/0050966) in view of Praveen (US 2018/0285599) and Crabtree (US 2021/0019674) and further in view of Stockdale (US 2019/0260784).
As per claim 9, claim 1 is incorporated. Enuka, Praveen, and Crabtree fail to disclose “further comprising computing the value for the attribute based on privileges applied to the data”.
However, Stockdale teaches the above limitation at least by ([0006] “The clustering module also clusters the data values with other data values having similar characteristics using at least one machine-learning model trained on known data fields with identified privacy levels used in the network to infer a privacy level associated with that data field. A privacy level is utilized to indicate whether a data value in each of the data fields should be anonymized or remain public. A permission module determines a privacy status of each data field by comparing the privacy level to a permission threshold. An aliasing module applies an alias transform to one or more data values in the set of data fields with a privacy alias to anonymize that data value in that data field based on the privacy status i) assigned by the permission module, ii) manually entered by a system administrator in the graphical user interface, and iii) any combination of both.” [0048] “For example, the clustering module can apply one or more clustering techniques to the input data to associate the data value with a similar data value for a known personally identifiable data field. The clustering module adjusts the privacy level of the data field based on the proximity to the data values of known personally identifiable data fields. The clustering module can thus infer a privacy level associated with the data field to indicate whether the data field likely contains sensitive information; and thus, should not be public. 
The privacy level can be utilized to indicate whether that data value in that data field should be anonymized.” [0053] “The clustering module sets a default privacy status for all data fields so that the default privacy level triggers anonymization of all data values that have data identifiable to a network entity.”) and the determining of the attribute is the determining of the privacy status based on a comparison of the set privacy level for the data (privileges applied to the data) to a threshold.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Stockdale into the teaching of Enuka, Praveen, Crabtree because the references similarly disclose the processing of sensitive data. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the determining of the attribute based on data privileges as in Stockdale in order to identify data sensitivity based on privileges that are applied to the data to be further incorporated into the model.

Response to Arguments
The following is in response to the amendment filed on 07/15/22.
Applicant’s arguments have been carefully and respectfully considered but are not persuasive.
Regarding 35 USC 101, on pg. 8, applicant argues that defining metadata about a form and use and a source in which an attribute or feature is stored are not mental processes.
In response to the preceding argument, examiner respectfully submits that features defining metadata about a form and use of data can be already known or observed by the user and perhaps written down on a piece of paper. Further, the limitation “determining, for each feature of the set of features, a source in which an attribute for the feature is stored” encompasses a user judging a source of where an attribute for a feature is stored, which could be information that is known to the user or observed.
Regarding 35 USC 101, on pg. 9, applicant argues that the newly amended limitations provide improvements to machine learning techniques.
In response to the preceding argument, examiner respectfully submits that forgoing the use of regex or regular expression matching procedures and, instead, utilizing machine learning would not provide an improvement to machine learning techniques.
Regarding 35 USC 101, on pg. 9, applicant argues that previous approaches provide challenges in efficiently processing data and that the claimed invention provides a technological improvement.
In response to the preceding argument, examiner respectfully submits that at least invocation of the model on live, non-training data is not recited or even suggested by the claims. Therefore, it is unclear how the claimed invention would tie to the alleged improvement.
Regarding 35 USC 101, on pg. 15, applicant argues that the improvements to the technology are associated with training techniques that are specific to a technical field and only address issues within training a machine learning model.
In response to the preceding argument, examiner respectfully submits that the claimed invention does not improve machine learning models or the way in which they are trained, as suggested.

Applicant’s arguments with respect to the prior art rejections have been considered but are moot because they do not apply to all of the references being used in the current rejection.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM P BARTLETT whose telephone number is (469)295-9085.  The examiner can normally be reached on M-Th 11:30-8:30, F 11-3.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed can be reached on (571)272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WILLIAM P BARTLETT/
Examiner, Art Unit 2169

/USMAAN SAEED/
Supervisory Patent Examiner, Art Unit 2169