DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Rule et al. (Rule), US Patent Application Publication No. US 2021/0092224 A1, and further in view of Antonatos et al. (Antonatos), US Patent Application Publication No. US 2020/0320406 A1.

As to independent claim 1, Rule discloses a system that performs data classification for personally identifiable information (PII) data, the system comprising: 
a database system that stores attribute data and corresponding metadata (paragraph [0006]: a plurality of records (attribute data) stored at a database, wherein each record is associated with a phone call (metadata) and includes at least one request generated based on a transcript of the phone call using natural language processing); 
an interactive user interface configured to receive user input via a communication network (paragraph [0011]: a service provider can receive a plurality of calls (user input) from a plurality of callers); and 
a computer processor, coupled to the database system and the communication network (Figure 2 and paragraph [0053]: processor 206 is connected with a secondary storage device (database) and a network connection device), configured to perform the steps of: 
receiving data relating to one or more attributes (paragraph [0011]: record and analyze the call, wherein the record can be a file, folder, media file, document, etc., and include information such as a time for an incoming call, a phone number, an account (metadata)); 
identifying corresponding metadata associated with the one or more attributes (paragraph [0011]: record and analyze the call, wherein the record can be a file, folder, media file, document, etc, and include information such as a time for an incoming call, a phone number, an account, (metadata)); and 
classifying the one or more attributes (paragraphs [0011]-[0012]: generating a transcript for each call, dividing the transcript into small segments and matching these segments to known phonemes through a complex statistical model to determine what the caller was saying and outputting it as text); 
wherein the classifying is based on statistical techniques and natural language processing (paragraph [0012]: classifying a transcript using a complex statistical model).
Rule, however, does not explicitly disclose classifying the one or more attributes into non-PII data and PII data based on the identified corresponding metadata and further classifying the PII data into one of a plurality of protection groups, each protection group identifying access permissions. 
In the same field of endeavor, Antonatos discloses applying one or more data security rules, policies, and/or requirements on data read to prevent an unauthorized user from accessing selected data/raw data (e.g., classified/private data) by transforming sensitive data according to the one or more data security rules, policies, and/or requirements (paragraph [0017]). Antonatos further discloses that classified/private data is detected using a machine learning operation, such as natural language processing and/or an artificial intelligence operation, to learn data that may be determined to be classified as private, personal, sensitive, and/or proprietary, and that the selected portion of data determined to be classified/private data may be filtered and/or anonymized (paragraphs [0017], [0063], [0064] and [0077]).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system of Rule to include classifying the one or more attributes into non-PII data and PII data based on the identified corresponding metadata and further classifying the PII data into one of a plurality of protection groups, each protection group identifying access permissions, as taught by Antonatos, for the purpose of preventing an unauthorized user from accessing the classified/private data.
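For illustration only, the two-stage classification recited in this limitation (non-PII versus PII based on metadata, then assignment of PII to a protection group that identifies access permissions) can be sketched as follows. The sketch is not drawn from Rule, Antonatos, or the application: the group names, roles, keyword table, and the functions classify and can_access are all hypothetical, and a simple keyword lookup stands in for the statistical and natural language processing techniques recited in the claim.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical protection groups; each identifies the roles permitted to read it.
PROTECTION_GROUPS = {
    "restricted": {"compliance"},
    "confidential": {"compliance", "support"},
}

# Hypothetical metadata keywords standing in for a trained classifier.
PII_KEYWORDS = {
    "ssn": "restricted",
    "account": "restricted",
    "phone": "confidential",
    "email": "confidential",
}


@dataclass
class Attribute:
    name: str
    metadata: str  # free-text description associated with the attribute


def classify(attr: Attribute) -> Tuple[bool, Optional[str]]:
    """Return (is_pii, protection_group) based only on the attribute's metadata."""
    for token in attr.metadata.lower().split():
        group = PII_KEYWORDS.get(token)
        if group is not None:
            return True, group
    return False, None


def can_access(role: str, attr: Attribute) -> bool:
    """Non-PII is readable by any role; PII only by roles in its protection group."""
    is_pii, group = classify(attr)
    if not is_pii:
        return True
    return role in PROTECTION_GROUPS[group]
```

Under these assumptions, an attribute whose metadata reads "customer phone number" falls in the hypothetical "confidential" group, while one described as "call duration seconds" is treated as non-PII.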

As to dependent claim 2, Rule discloses wherein the computer processor is configured to perform the step of applying tokenization to the one or more attributes to identify a sequence of words (paragraph [0012]).

As to dependent claim 3, Rule discloses wherein the computer processor is configured to perform the step of applying a stemming process to the one or more attributes to reduce inflected words to a stem form (paragraph [0015]).

As to dependent claim 4, Rule discloses wherein the computer processor is configured to perform the step of applying a grouping of words as a single item (paragraph [0012]).

As to dependent claim 5, Rule discloses wherein the grouping comprises a lemmatization process (paragraph [0015]).

As to dependent claim 6, Rule discloses wherein the computer processor is configured to perform the step of: applying a weighting technique that represents an importance associated with a word (paragraphs [0013]-[0015]).

As to dependent claim 7, Rule discloses wherein the weighting technique comprises term frequency-inverse document frequency (TF-IDF) weight (paragraph [0015]).

As to dependent claim 8, Rule discloses wherein the term frequency-inverse document frequency (TF-IDF) weight comprises a first term that measures how frequently a term appears in a document (paragraph [0013]). 

As to dependent claim 9, Rule discloses wherein the term frequency-inverse document frequency (TF-IDF) weight comprises a second term that measures how important a term is (paragraph [0013]). 
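For reference, the two terms addressed in claims 8 and 9 can be illustrated with one common textbook formulation of TF-IDF: the first term is the fraction of a document's tokens equal to the term, and the second is the logarithm of the number of documents divided by the number of documents containing the term. This formulation is an assumption made for illustration; neither the claims nor the cited paragraphs of Rule are quoted here as specifying it, and the function names are hypothetical.

```python
import math
from typing import List


def term_frequency(term: str, document: List[str]) -> float:
    """First term: how frequently `term` appears in one tokenized document."""
    if not document:
        return 0.0
    return document.count(term) / len(document)


def inverse_document_frequency(term: str, corpus: List[List[str]]) -> float:
    """Second term: how important `term` is, i.e. how rare it is across the corpus."""
    containing = sum(1 for doc in corpus if term in doc)
    if containing == 0:
        return 0.0
    return math.log(len(corpus) / containing)


def tf_idf(term: str, document: List[str], corpus: List[List[str]]) -> float:
    """TF-IDF weight as the product of the two terms."""
    return term_frequency(term, document) * inverse_document_frequency(term, corpus)
```

With this formulation, a word that appears in every document receives a weight of zero, while a word concentrated in a single document receives a comparatively high weight.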

As to dependent claim 10, Rule discloses wherein the computer processor is configured to perform the step of: applying a Synthetic Minority Oversampling Technique to increase a dataset in a balanced manner (paragraph [0033]). 

As to independent claim 11, Rule discloses a method that performs data classification for personally identifiable information (PII) data, the method comprising the steps of: 
receiving data relating to one or more attributes (paragraph [0011]: record and analyze the call, wherein the record can be a file, folder, media file, document, etc., and include information such as a time for an incoming call, a phone number, an account (metadata)); 
identifying corresponding metadata associated with the one or more attributes (paragraph [0011]: record and analyze the call, wherein the record can be a file, folder, media file, document, etc, and include information such as a time for an incoming call, a phone number, an account, (metadata)); and 
classifying the one or more attributes (paragraphs [0011]-[0012]: generating a transcript for each call, dividing the transcript into small segments and matching these segments to known phonemes through a complex statistical model to determine what the caller was saying and outputting it as text); 
wherein the classifying is based on statistical techniques and natural language processing (paragraph [0012]: classifying a transcript using a complex statistical model).
Rule, however, does not explicitly disclose classifying the one or more attributes into non-PII data and PII data based on the identified corresponding metadata and further classifying the PII data into one of a plurality of protection groups, each protection group identifying access permissions. 
In the same field of endeavor, Antonatos discloses applying one or more data security rules, policies, and/or requirements on data read to prevent an unauthorized user from accessing selected data/raw data (e.g., classified/private data) by transforming sensitive data according to the one or more data security rules, policies, and/or requirements (paragraph [0017]). Antonatos further discloses that classified/private data is detected using a machine learning operation, such as natural language processing and/or an artificial intelligence operation, to learn data that may be determined to be classified as private, personal, sensitive, and/or proprietary, and that the selected portion of data determined to be classified/private data may be filtered and/or anonymized (paragraphs [0017], [0063], [0064] and [0077]).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system of Rule to include classifying the one or more attributes into non-PII data and PII data based on the identified corresponding metadata and further classifying the PII data into one of a plurality of protection groups, each protection group identifying access permissions, as taught by Antonatos, for the purpose of preventing an unauthorized user from accessing the classified/private data.

As to dependent claim 12, Rule discloses applying tokenization to the one or more attributes to identify a sequence of words (paragraph [0012]).

As to dependent claim 13, Rule discloses applying a stemming process to the one or more attributes to reduce inflected words to a stem form (paragraph [0015]).

As to dependent claim 14, Rule discloses applying a grouping of words as a single item (paragraph [0012]).

As to dependent claim 15, Rule discloses wherein the grouping comprises a lemmatization process (paragraph [0015]).

As to dependent claim 16, Rule discloses applying a weighting technique that represents an importance associated with a word (paragraphs [0013]-[0015]).

As to dependent claim 17, Rule discloses wherein the weighting technique comprises term frequency-inverse document frequency (TF-IDF) weight (paragraph [0015]).

As to dependent claim 18, Rule discloses wherein the term frequency-inverse document frequency (TF-IDF) weight comprises a first term that measures how frequently a term appears in a document (paragraph [0013]).

As to dependent claim 19, Rule discloses wherein the term frequency-inverse document frequency (TF-IDF) weight comprises a second term that measures how important a term is (paragraph [0013]). 

As to dependent claim 20, Rule discloses applying a Synthetic Minority Oversampling Technique to increase a dataset in a balanced manner (paragraph [0033]).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHAU T NGUYEN whose telephone number is (571) 272-4092. The examiner can normally be reached M-Th, 6:30-5:00 (P.T.).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar Paula, can be reached at (571) 272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/CHAU T NGUYEN/Primary Examiner, Art Unit 2177