DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections – 35 USC § 101
35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. 

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claimed invention is directed to determining sensitive and non-sensitive data, classifying the data, and using machine learning to classify misclassified data.
Claims 1-20 do fall within at least one of the four categories of patent eligible subject matter because the claims recite a machine (i.e., non-transitory computer-readable medium and system) and process (i.e., a method). 
Although claims 1-20 fall under at least one of the four statutory categories, it should be determined whether the claims recite a judicial exception.
Under Step 2A, Prong One, the claims are analyzed to determine whether they are directed to a judicial exception.
Claim 1 recites collecting and classifying customer data as sensitive or non-sensitive, determining whether data classified as non-sensitive is sensitive data, and scrubbing the sensitive data.
Claim 9 recites obtaining data, classifying data, extracting features from data, predicting data as sensitive data, and scrubbing a value.

The limitations of obtaining and classifying data cover a “Mental Process” but for the recitation of generic computer components. The claims also recite the use of machine learning to classify the data. That is, other than reciting a computer and the use of a machine learning classifier, nothing in the claim elements precludes the steps from practically being performed in the human mind (classifying data as sensitive or non-sensitive). For example, a person can evaluate data, classify information that needs to be protected (personally identifying information) as sensitive data, and re-evaluate the data to make sure no sensitive information is accessed. The machine learning is used to perform steps that can be performed by a human, i.e., the process of collecting data and analyzing the data, which itself is the abstract idea. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.

If a claim recites a judicial exception, it should be determined whether the judicial exception is integrated into a practical application of that exception. Here, the judicial exception is not integrated into a practical application. The claims include the additional element of a processor for obtaining and classifying data. The processor and the machine learning in the steps are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of obtaining and identifying data), such that they amount to no more than mere instructions to apply the exception using a generic computer component. The machine learning used to classify or determine whether data is sensitive or non-sensitive is nothing more than generating learned functions, and the steps could further be performed by a human. Accordingly, these additional elements do not integrate the abstract idea into a practical application.
Under Step 2B, the claims are evaluated to determine whether the additional elements, individually or in combination, are sufficient to ensure that the claims amount to significantly more than the exception. The claims, as indicated above, recite the additional element of a computer. However, the element is recited at a high level of generality and, given its broadest reasonable interpretation, is simply a generic computer performing the generic computer functions of receiving, storing, processing, and transmitting data. As noted above, the additional elements provide only a general linking to a particular technological environment or field of use. The claims appear to automate mental tasks, such as analyzing data and labeling or classifying the data, by applying a processor and a machine learning program. The claims do not invoke any inventive programming or require any specialized computer hardware or other inventive computer components, i.e., a particular machine, nor do they indicate that the claimed invention is implemented using anything other than generic computer components to perform generic computer functions. Therefore, the claims do not amount to significantly more than the abstract idea itself and are not patent eligible.
The combination of these additional elements is no more than mere instructions to apply the exception using a generic device. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea. 
Dependent claims 2-8, 10-15, and 17-20 merely add further details of the abstract elements recited in independent claims without including an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5, 6, and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Aissi (US 2014/0068706 A1) in view of Das et al. (US 2018/0191780 A1) and further in view of Johnstone et al. (US 2017/0270437 A1).

	Claims 1, 8:
Aissi teaches one or more processors; and a memory; one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by the one or more processors, the one or more programs including instructions that (see fig. 4):
	collecting customer data (data asset, such as user’s personal information, home address, email address or financial information such as account number) (see [0032]); 
classify customer data through a first classification process, the first classification
process indicating whether a segment of the customer data includes sensitive data or non-

associated with a source of the customer data and the second name associated with a field in
the customer data (data asset which may include sensitive information associated with a user, such as, user’s personal information, … classifying as high sensitive, sensitive, important or non sensitive, … identifying or recognizing a type of data based on a characteristic or property (attribute) of the data …) (see fig. 5, [0032]-[0036], [0082]-[0084]) and scrub the sensitive data from the customer data (sensitive data type may be encrypted or may not be needed and may be scrubbed from the system) (see [0091]-[0098]); 
	Aissi does not teach when the first classification process classifies the customer data as having non-sensitive data, utilize a machine learning classifier to determine if the segment of customer data classified as having non-sensitive data, is sensitive data; and when the machine learning classifier classifies the segment of customer data as containing sensitive data. 
	Das teaches determining a data classification policy and identifying parameters that are considered to be sensitive or non-sensitive parameters (financial information, health information, individually identifiable information such as user name, IP address, or other information or other sensitive parameters) and evaluating non-sensitive policy parameters (see [0043]-[0044]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include Das’s evaluation of non-sensitive policy parameters in Aissi’s asset protection in order to evaluate the privacy policy. Aissi failed to teach that the collected customer data is generated upon occurrence of a plurality of events during execution of a software product.
Johnstone teaches data collected during execution of a software product (gathering resource usage information, … the data may include device usage, network usage, firewall usage ... data access, application usage and other resource usage … data organized into various categories, including activities performed by the user or by an application associated with the employee (user) and third party application access) (see [0062]-[0068]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include Johnstone’s application usage in Aissi’s asset protection in order to protect sensitive data collected from the use of applications, which includes personally identifying information of users.
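
For illustration only, the two-pass flow discussed for claims 1 and 8 (a first classification based on words in the names, a second classifier applied only to segments first marked non-sensitive, then scrubbing) can be sketched as follows. This is a minimal sketch and is not taken from the application or from Aissi, Das, or Johnstone. The term list, the function names, the event layout, and the email-pattern check standing in for a trained machine learning classifier are all hypothetical.

```python
import re

# Hypothetical policy terms; not drawn from any cited reference.
SENSITIVE_TERMS = {"email", "password", "address", "account"}

def tokenize(name):
    """Split an identifier such as 'userEmail' or 'login_failed' into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Za-z][a-z]*|\d+", name)]

def first_pass(event_name, property_name):
    """First classification: flag the property if either name contains a policy term."""
    words = set(tokenize(event_name)) | set(tokenize(property_name))
    return bool(words & SENSITIVE_TERMS)

def second_pass(value):
    """Placeholder for a trained classifier (assumption): flags email-shaped values."""
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", str(value)) is not None

def scrub_event(event):
    """Return a copy of the event with values classified as sensitive replaced."""
    cleaned = {"event": event["event"], "properties": {}}
    for name, value in event["properties"].items():
        sensitive = first_pass(event["event"], name)
        if not sensitive:
            # Only data first classified as non-sensitive reaches the second pass.
            sensitive = second_pass(value)
        cleaned["properties"][name] = "<scrubbed>" if sensitive else value
    return cleaned

# Hypothetical telemetry event used only to exercise the sketch.
event = {
    "event": "login_failed",
    "properties": {"userEmail": "a@b.com", "contact": "jane@example.com", "retryCount": 3},
}
result = scrub_event(event)
```

In this sketch the `userEmail` property is caught by the first pass, while `contact` is caught only by the second pass because its name carries no policy term.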

Claim 2:
Aissi teaches that the classifier uses words in the first name, words from the second name, and words representing a type of a value of the property name to classify the segment of customer data (data type may be identified based on any attribute or combination of data types and classified as highly sensitive, sensitive, important and not sensitive) (see fig. 5, [0082]-[0084], [0088], [0090]).

Claim 3:
Aissi teaches when the first classification process classifies the customer data as containing sensitive data, scrub the sensitive data from the customer data (sensitive data type may be encrypted or may not be needed and may be scrubbed from the system) (see fig. 6B, [0091]-[0098]).
Claim 5:
Aissi/Das teaches extracting features from the customer data, the features including words in the first name, words in the second name and words that describe a type of a value associated with the second name (see Aissi [0082]-[0084]); and generate a feature vector 
Claim 6:
Aissi/Das teaches generating a policy based on the extracted features; and wherein the first classification process uses the policy to detect sensitive data (the data protection module searches for data assets and identifies the data assets based on security and privacy attributes, and the identified data assets are classified based on policy that may be set by one or more entities) (see [0036], [0037], [0069]; Das [0035]-[0038], [0043]-[0044]).

 

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Aissi (US 2014/0068706 A1) in view of Das et al. (US 2018/0191780 A1) and further in view of Roth et al. (US 2015/0019858 A1).
Claim 4:
Aissi teaches scrubbing sensitive data but failed to teach generating a sandbox process to scrub the sensitive data. Roth teaches data in sandbox or quarantine (see [0075], [0090]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include Roth’s sandbox in Aissi’s data protection in order to provide additional protection by placing the data in an isolated secure container.

Claims 9, 11, 12, 13 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Cervantez et al. (US 10,979,461 B1) in view of Aissi (US 2014/0068706 A1) and further in view of Johnstone et al. (US 2017/0270437 A1).
Claim 9:
Cervantez teaches obtaining customer data from telemetric data generated upon occurrence of a plurality of events, the customer data including an event name of a select event of the plurality of events and one or more properties with the select event, wherein a property has a property name and a value, the event name identifying the event triggering collection of the telemetric consumer data (access to data stored in storage containers to create training data by labeling or tagging samples of customer data, … the labels indicating different type of data labeled with a label or value or code or identifier) (see col. 5 lines 4-41, col. 7 lines 1-46);
Cervantez teaches classifying the customer data through a first classification process as non-sensitive and sensitive data (data object can be labeled with a label such as “medical data”, “highly sensitive data” and different type of data with which the samples of the customer data can be labeled may span a spectrum of sensitivity, e.g., second data can be labeled as publicly accessible data as “non-PII data” or “not sensitive data”) (see col. 7 lines 1-46);
upon the first classification process indicating that the customer data is non-sensitive data, extracting features from each event of the plurality of customer data, including words in a name associated with at least one property, words in a name associated with the event and type of value (training data may be created by labeling or tagging samples of customer data maintained by storage containers, the labels may indicate different types of data that correspond to different sensitivity levels, including “highly sensitive data”, or a value, code or identifier, or “non sensitive data”) (see col. 7 lines 1-46); and 
machine learning model can comprise a classifier that is tasked with classifying unknown input, … the class label, in this case, may correspond to different sensitivity levels, and may be indicative of the sensitivity such as by including “sensitive” in the label (e.g., highly sensitive data, sensitive data, not sensitive data) (see col. 8 line 25 to col. 9 line 17). Cervantez failed to teach scrubbing sensitive data from the customer data. 
Aissi teaches after the customer data is classified as sensitive data scrubbing the value of the at least one property from the customer data (data asset which may include sensitive information associated with a user, such as, user’s personal information, … classifying as high sensitive, sensitive, important or non sensitive, … identifying or recognizing a type of data based on a characteristic or property (attribute) of the data …) (see fig. 5, [0032]-[0036], [0082]-[0084]); (sensitive data type may be encrypted or may not be needed and may be scrubbed from the system) (see [0091]-[0098]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Aissi’s scrubbing of sensitive data in Cervantez’s classification system, in order to protect sensitive data that might be stored in a less secure storage medium.
Cervantez failed to teach the collected customer data is generated upon occurrence of a plurality of events during execution of a software product. 
Johnstone teaches data collected during execution of a software product (gathering resource usage information, … the data may include device usage, network usage, firewall usage ... data access, application usage and other resource usage … data organized into various categories, including activities performed by the user or by an application associated with the employee (user) and third party application access) (see [0062]-[0068]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include Johnstone’s application usage in Cervantez’s profile data so that the information can be classified in order to protect the personally identifying information of the user.
Claim 11:
Cervantez teaches upon the machine learning classifier predicting the at least one property as sensitive data, updating the first classification process with the extracted features (the class label may correspond to a classification of unknown data as a type of data) (see col. 2 lines 40-67, col. 8 lines 25-60).
Claim 12:
Cervantez teaches using one or more policies to classify the at least one property as sensitive data, wherein a policy is based on a combination of words in usage patterns of the classified sensitive data (policies for identifying data as, e.g., medical data as highly sensitive data or data that does not include sensitive data as non-PII or not sensitive data) (see col. 6 line 24 to col. 7 line 46).

Claim 13:
Cervantez/Aissi teaches, upon the machine learning classifier classifying the property as sensitive data, generating a new policy based on the extracted features (once trained, the machine learning can classify the data and evaluate the sufficiency of the existing data security) (see Cervantez col. 2 lines 50-67); Aissi (the data protection module searches for data assets and identifies the data assets based on security and privacy attributes, and the identified data assets are classified based on policy that may be set by one or more entities) (see [0036], [0037], [0069]).

Claim 15:
Aissi teaches when the first classification process classifies the customer data as containing sensitive data, scrub the sensitive data from the customer data, wherein the scrubbing includes obfuscating, deleting, or converting the value (sensitive data type may be encrypted or may not be needed and may be scrubbed from the system) (see fig. 6B, [0091]-[0098]).


Claims 7 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Aissi (US 2014/0068706 A1) in view of Das et al. (US 2018/0191780 A1) and further in view of Official Notice.
Claims 7, 10:
Aissi failed to teach wherein the machine learning classifier is trained using logistic regression with a Lasso penalty. However, Official Notice is taken that it is old and well known in the art of machine learning to use logistic regression with a Lasso penalty. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use logistic regression with a Lasso penalty in Aissi’s machine learning in order to use an algorithm well suited to classification.
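
As background for the Official Notice above, logistic regression with a Lasso (L1) penalty can be sketched as follows. This minimal sketch trains by proximal gradient descent with soft-thresholding, which is one common way to fit an L1-penalized model. The function names, hyperparameters, and toy data are assumptions for illustration and do not come from the application or from any cited reference.

```python
import math

def soft_threshold(w, t):
    """Proximal step for the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def train_l1_logistic(X, y, lam=0.1, lr=0.5, epochs=500):
    """Fit weights w and bias b by minimizing mean log loss + lam * ||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step on the smooth loss, then soft-threshold (bias is not penalized).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return 1 if the modeled probability exceeds 0.5, else 0."""
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

# Toy data (assumption): feature 0 determines the label, feature 1 is uninformative.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [1, 1, 0, 0]
w, b = train_l1_logistic(X, y)
```

The L1 penalty tends to drive the weights of uninformative features to exactly zero, which is why it is often used when the feature vector is built from many candidate words.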

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Cervantez, in view of Aissi further in view of Johnstone and further in view of Roth et al. (US 2015/0019858 A1).

Claim 14:
Cervantez/Aissi teaches scrubbing sensitive data but failed to teach generating a sandbox process to scrub the sensitive data. Roth teaches data in sandbox or quarantine (see [0075], .

Claims 16 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Cervantez et al. (US 10,979,461 B1) and further in view of Johnstone et al. (US 2017/0270437 A1).
Claim 16:
Cervantez teaches obtain a plurality of training data (data) from telemetric data generated upon occurrence of a plurality of events, the training data including an event name of a select event of the plurality of events and one or more properties with the select event, wherein a property has a property name and a value, the event name identifying the event triggering collection of the telemetric consumer data (access to data stored in storage containers to create training data by labeling or tagging samples of customer data, … the labels indicating different type of data labeled with a label or value or code or identifier) (see col. 5 lines 4-41, col. 7 lines 1-46);
Cervantez teaches classify each of the one or more properties of each event name and the corresponding property value with a label, wherein the label identifies a property as sensitive data or non-sensitive data (data object can be labeled with a label such as “medical data”, “highly sensitive data” and different type of data with which the samples of the customer data can be labeled may span a spectrum of sensitivity, e.g., second data can be labeled as publicly accessible data as “non-PII data” or “not sensitive data”) (see col. 7 lines 1-46);
data object that matches the pattern of fabricated PII, file names can be used to label samples of customer data e.g., “tradesecrets.docx” or other techniques such as identification of key words/terms/phrases using natural language processing to create the labeled training data) (see col. 8 lines 7-24); and 
train a classifier with the extracted features to learn to associate a label with a combination of words extracted from a given event name, and a given property name of a given event (machine learning model can comprise a classifier that is tasked with classifying unknown input, …) (see col. 8 line 25 to col. 9 line 17). Cervantez failed to teach the collected customer data is generated upon occurrence of a plurality of events during execution of a software product. 
Johnstone teaches data collected during execution of a software product (gathering resource usage information, … the data may include device usage, network usage, firewall usage ... data access, application usage and other resource usage … data organized into various categories, including activities performed by the user or by an application associated with the employee (user) and third party application access) (see [0062]-[0068]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include Johnstone’s application usage in Cervantez’s profile data so that the information can be classified in order to protect the personally identifying information of the user.

Claim 18:
the data object may be labeled with a label or value, code or identifier) (see col. 7 line 1-67, 
Claim 19:
Cervantez teaches wherein the extracted features include words most frequently found in the training data (labeling samples of the customer data, … medical data labeled as “medical data”, or “highly sensitive data” …) (see col. 11 line 50 to col. 12 line 58).
Claim 20:
Cervantez teaches utilizing the classifier to detect an event having sensitive data given a combination of words extracted from a target event name of the target event, target property name of a given event name, a target value of the target property value (classifying the data as a type of data … the first type of data may represent, for example, medical data which may correspond to a highest sensitivity level, … other data may represent data that is not sensitive … Cervantez also teaches that the value, code or identifier can be used to indicate the sensitivity level) (see col. 12 lines 33-57).

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Cervantez in view of Johnstone and further in view of Official Notice.
Claim 17:
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Regarding Applicant’s argument concerning the rejection under 35 U.S.C. 101, the step of collecting or obtaining data is data gathering, which is insignificant extra-solution activity and is also part of the abstract idea. Further, the additional element (or combination of elements) used for collecting or obtaining data of events during execution of a software product is purely conventional. Using a computer to collect or obtain data is one of the most basic functions of a computer. A generic machine learning module or software is utilized to perform the abstract idea of classifying data, which is a well-understood, routine, conventional activity. Thus, there is no more than the conceptual idea of obtaining data and classifying the data by conventional means. All the computer functions are generic, routine, conventional computer activities that are performed only for their conventional use. Regarding Enfish, McRO, etc., the current claims do not show any improvement in computer technology or computer functionality, including any directed to the machine learning software. Classifying the collected data as sensitive or non-sensitive and scrubbing the sensitive data does not provide an improvement to the technology. As 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  
Williamson et al. (US 10,810,317 B2) teaches automatically parsing data, determining which portions of the data sources contain sensitive data, and performing corrective action, using a data classifier.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YEHDEGA RETTA whose telephone number is (571)272-6723. The examiner can normally be reached Monday-Thursday 8am-6pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kambiz Abdi, can be reached at 571-272-6702. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/YEHDEGA RETTA/Primary Examiner, Art Unit 3688