DETAILED ACTION
This communication is in response to the Application filed on 07/31/2020. Claims 1-20 are pending and have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-11, 14-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Qiao (US 2018/0365560) in view of Santos (“Learning Character-level Representations for Part-of-Speech Tagging”).
As to claim 1, Qiao teaches a method comprising: 
detecting, by a processor, terms in a corpus of raw text extracted from a collection of documents (see [0041], SID program 110A and 110B loads positive training samples where such is derived from a text document and see [0023], processor); 
extracting, by the processor, for each of the terms, a surrounding sentence, the surrounding sentence including a reference to a data subject, to thereby form a group of target sentences (see [0041], where training data comprises sentences that include context related sensitive inputs); 
training, by the processor, a neural network model to compute an output that indicates a likelihood of a given sentence containing personal information using the plurality of matrices as inputs ([0046], where neural network model is trained using positive samples and see [0040], where output of the trained LSTM will be a probability value that the token represents sensitive information).
Qiao does show, in [0036] and in Figure 2 (word embedding vectors 226), the generation of vectors, including the generation of embeddings for each sentence. The Examiner notes that this results in the generation of a vector which, in combination with the vectors for the other terms, results in a matrix.
Thus, Santos is being cited to explicitly disclose generating, by a processor, a matrix of feature information for each target sentence of the group of target sentences to form a plurality of matrices (see sect. 2.1, where a sentence comprises N words and a vector is created for each word, and sect. 2.1.1, where word w is transformed into a word embedding, which is a matrix (see equation 1)).
Qiao and Santos are in the same field of endeavor of word embedding with respect to natural language data, and therefore are analogous art. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the PII detection as taught by Qiao with the matrix as taught by Santos in order to capture semantic and syntactic information (see Santos sect. 2.1, paragraph under heading).
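For illustration only, the observation that per-word vectors combine into one matrix per sentence can be sketched as follows. This is a minimal sketch and is not code from Qiao or Santos: the names `toy_embedding` and `sentence_matrix`, the dimension of 4, and the hash-based vectors are hypothetical stand-ins for a learned embedding table.

```python
import hashlib

EMBEDDING_DIM = 4  # hypothetical dimension, chosen only for the example


def toy_embedding(token, dim=EMBEDDING_DIM):
    """Return a deterministic vector for a token.

    Assumption: a hash-derived vector stands in for a learned word embedding.
    """
    digest = hashlib.sha256(token.lower().encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def sentence_matrix(sentence):
    """Stack one vector per whitespace-separated token into an N x dim matrix."""
    return [toy_embedding(token) for token in sentence.split()]


# One matrix per target sentence gives the "plurality of matrices".
target_sentences = ["John lives in Boston", "She called him"]
matrices = [sentence_matrix(s) for s in target_sentences]
```

Each row of a matrix corresponds to one token, so a sentence of N tokens yields N rows of equal width.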
As to claim 15, apparatus claim 15 and method claim 1 are related as apparatus and the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, claim 15 is similarly rejected under the same rationale as applied above with respect to method claim 1. Furthermore, Qiao teaches a memory containing machine-readable medium comprising machine-executable code having stored thereon instructions for detecting personal information (see [0023], computer program product including a computer readable storage medium); and a processor coupled to the memory, the processor configured to execute the machine executable code to cause the processor to (see [0023], where processor carries out the instructions stored on the medium): form a plurality of tokens from the target sentence (see [0038], where training data is divided into tokens in which each token is a word); the matrix of feature information including a sequence of vectors for the plurality of tokens (see [0038], where the structure of feature vector 202 is described with respect to each token).

As to claim 8, Qiao teaches a non-transitory machine-readable medium having stored thereon instructions for performing a method of detecting personal information, the non-transitory machine-readable medium comprising machine-executable code which, when executed by at least one machine, causes the at least one machine (see [0023], where processor carries out the instructions stored on the medium) to:
extract raw text from a collection of documents to form a corpus of raw text (see [0036], where database 116 stores information such as training data which may be a table or set of NL inputs and see [0038], where training data is further divided); 
detect terms in the corpus of raw text (see [0038], division of training data into tokens); 
extract, for each of the terms, a surrounding sentence, the surrounding sentence including a reference to a data subject, to thereby form a group of target sentences (see [0041], where training data comprises sentences that include context related sensitive inputs); 
train a neural network model to compute an output that indicates a likelihood of a given sentence containing personal information using the plurality of matrices as inputs ([0046], where neural network model is trained using positive samples and see [0040], where output of the trained LSTM will be a probability value that the token represents sensitive information).
Qiao does show, in [0038] and in Figure 2 (word embedding vectors 226), the generation of vectors, including the generation of embeddings for each sentence. The Examiner notes that this results in the generation of a vector which, in combination with the vectors for the other terms, results in a matrix.
Santos is being cited to explicitly disclose generate a matrix of feature information for each target sentence of the group of target sentences to form a plurality of matrices (see sect. 2.1, where a sentence comprises N words and a vector is created for each word, and sect. 2.1.1, where word w is transformed into a word embedding, which is a matrix (see equation 1)), wherein the matrix of feature information for a first target sentence of the group of target sentences includes a vector for each of a plurality of tokens identified from the first target sentence (see sect. 2.1, where, for a sentence of N words, every word is converted to a vector).
Qiao and Santos are in the same field of endeavor of word embedding with respect to natural language data, and therefore are analogous art. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the PII detection as taught by Qiao with the matrix as taught by Santos in order to capture semantic and syntactic information (see Santos sect. 2.1, paragraph under heading).

As to claims 2 and 9, Qiao in view of Santos teaches all of the limitations of claims 1 and 8, above.
Furthermore, Qiao teaches further comprising determining, by the processor, in response to the output indicating that the given sentence contains the personal information, that a document that contains the given sentence is a sensitive document (see [0021], where sensitive information detection based on neural network is used to identify sensitive information such as terms or phrases or entities from a section of text such as sentences or paragraphs). 

As to claims 3 and 10, Qiao in view of Santos teaches all of the limitations of claims 1 and 8, above.
Furthermore, Qiao teaches wherein the extracting comprises: extracting, by the processor, a sentence that contains a first term as the surrounding sentence in response to the sentence (see [0041], where training data comprises sentences that include context related sensitive inputs) including at least one item selected from a list consisting of: a name of the data subject (see [0041], where IBM and AT&T are names comprising sensitive information in the phrase), a pronoun referencing the data subject, and a direct reference to a person.

As to claims 4, 11, 17, and 19, Qiao in view of Santos teaches all of the limitations of claims 1, 8, and 15, above.
Furthermore, Qiao teaches wherein the generating comprises: generating, by the processor, a plurality of features for each of a plurality of tokens formed for a first target sentence of the group of target sentences, wherein the plurality of features includes at least one item selected from a list consisting of: a part of speech tag (see Figure 2, where feature vector generated from POS 220 (claim 17 of instant application claims this)), a dependency parsing tag, a word embedding vector (see Figure 2, word embedding vector 226 (claim 19 of instant application claims this)), a data subject tag, and a negation-hypothetical tag.

As to claim 5, Qiao in view of Santos teaches all of the limitations of claim 1, above.
Furthermore, Qiao teaches forming, by the processor, a plurality of tokens for a first target sentence of the group of target sentences, wherein a token of the plurality of tokens is either a word or a special character in the target sentence (see [0036], [0038], where a token may represent a word, phrase, abbreviation or alphanumeric natural language inputs).

As to claims 7, 14, and 16, Qiao in view of Santos teaches all of the limitations as in claim 15, above.
Furthermore, Qiao teaches wherein the neural network model is a sequence-based recurrent neural network model (see [0043], where a bidirectional LSTM recurrent neural network is used).

As to claim 20, Qiao in view of Santos teaches all of the limitations of claims 1, 8, and 15, above.
Furthermore, Qiao teaches wherein the output is a probability indicator having a value between 0 and 1 (see [0040], where a probability value representing a confidence level is described). It is well known that a probability is a value which lies between 0 and 1, as by definition it is the number of times a specific event occurs divided by the total number of events.

Claims 6, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Qiao in view of Santos as applied to claims 1, 8, and 15 above, and further in view of Muffat (US 2020/0250139).
As to claims 6 and 12, Qiao in view of Santos teaches all of the limitations of claims 1 and 8, above.
However, Qiao in view of Santos does not specifically disclose wherein the training comprises: applying, by the processor, a first set of gated recurrent units and a second set of gated recurrent units of the neural network model to a sequence of vectors in the matrix of feature information for a first target sentence of the group of target sentences in a forward direction and a backward direction, respectively.
Muffat does disclose wherein the training comprises: applying, by the processor, a first set of gated recurrent units and a second set of gated recurrent units of the neural network model to a sequence of vectors in the matrix of feature information for a first target sentence of the group of target sentences in a forward direction and a backward direction, respectively (see Figure 6, BI-GRU in 612, where each term is represented by a BI-GRU as shown, and where the up and down arrows to the next set of terms show a forward direction and a backward direction).
Qiao, Santos, and Muffat are in the same field of endeavor of word embedding with respect to natural language data, and therefore are analogous art. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the PII detection as taught by Qiao and Santos with the tags as taught by Muffat in order to be able to address personal data protection and privacy regulations in terms of management of sensitive personal data (see Muffat [0005]).
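For illustration only, applying one set of gated recurrent units in a forward direction and a second set in a backward direction over a sequence of vectors can be sketched as follows. This is a simplified sketch under stated assumptions, not the architecture of Muffat's Figure 6: the names `GRUCell` and `bidirectional_gru` are hypothetical, the weights are random rather than trained, bias terms are omitted, and the update rule shown is one common GRU convention.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Minimal GRU without biases; weights are random, not trained (assumption)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, h = candidate state.
        self.W = {g: random_matrix(hidden_dim, input_dim, rng) for g in "zrh"}
        self.U = {g: random_matrix(hidden_dim, hidden_dim, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], reset_h))]
        # Blend the previous state with the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, vectors):
        h = [0.0] * self.hidden_dim
        states = []
        for x in vectors:
            h = self.step(x, h)
            states.append(h)
        return states


def bidirectional_gru(vectors, forward_cell, backward_cell):
    """Run one cell left-to-right, the other right-to-left, and concatenate per token."""
    forward_states = forward_cell.run(vectors)
    backward_states = backward_cell.run(vectors[::-1])[::-1]
    return [f + b for f, b in zip(forward_states, backward_states)]


rng = random.Random(0)
forward_cell = GRUCell(input_dim=2, hidden_dim=3, rng=rng)
backward_cell = GRUCell(input_dim=2, hidden_dim=3, rng=rng)
token_vectors = [[0.1, 0.2], [0.3, -0.4], [0.5, 0.6]]
outputs = bidirectional_gru(token_vectors, forward_cell, backward_cell)
```

Because the backward pass starts at the last token, the backward half of the first token's output reflects the later tokens, while the forward half does not.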

As to claim 18, Qiao in view of Santos teaches all of the limitations of claim 15, above.
However, Qiao in view of Santos does not specifically disclose wherein a vector in the sequence of vectors for a corresponding token of the plurality of tokens includes a dependency parsing tag.
Muffat does disclose wherein a vector in the sequence of vectors for a corresponding token of the plurality of tokens includes a dependency parsing tag (see [0065] and [0066], where a dependency tree is used by the GCN to generate an additional embedding, which is the dependency tree of the sentence, and further per [0072], where the POS tagger can assign tags such as ADJ (adjective) and sentence tree 714 is created).
Qiao, Santos, and Muffat are in the same field of endeavor of word embedding with respect to natural language data, and therefore are analogous art. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the PII detection as taught by Qiao and Santos with the tags as taught by Muffat in order to be able to address personal data protection and privacy regulations in terms of management of sensitive personal data (see Muffat [0005]).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Qiao in view of Santos as applied to claim 8 above, and further in view of Du (“Explicit Interaction Model towards Text Classification”).
As to claim 13, Qiao in view of Santos teaches all of the limitations of claim 8, above.
However, Qiao in view of Santos does not specifically disclose wherein the machine-executable code further causes the at least one machine, as part of the training, to: transform a sequence of vectors in the matrix of feature information for the first target sentence of the group of target sentences into a sigmoid function using a plurality of layers, the plurality of layers including at least one item selected from a group consisting of: a first set of gated recurrent units, a second set of gated recurrent units, a max pooling layer, an average pooling layer, a concatenation layer, and a normalization layer.
Du does teach wherein the machine-executable code further causes the at least one machine, as part of the training, to: transform a sequence of vectors in the matrix of feature information for the first target sentence of the group of target sentences into a sigmoid function using a plurality of layers (see Figure 2, where each word is inputted into an encoder; see page 6360, right column, under Gated Recurrent Units, where each word in the text is mapped to an embedding (vector); and see page 6361, right column, last paragraph, where sigmoid and softmax are used in the aggregation step), the plurality of layers including at least one item selected from a group consisting of: a first set of gated recurrent units (see page 6361, right column, Interaction Layer, 2nd paragraph, where a GRU is used as an encoder), a second set of gated recurrent units, a max pooling layer, an average pooling layer, a concatenation layer, and a normalization layer (see Figure 2, Interaction layer as a concatenation layer).
Qiao, Santos, and Du are in the same field of endeavor of word embedding with respect to natural language data, and therefore are analogous art. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the PII detection as taught by Qiao and Santos with the sigmoid and layer as taught by Du in order to be able to achieve finer classification at a fine-grained word level (see Du, page 6360, left column, first 5 lines), which would benefit the PII detection of Qiao in view of Santos, which uses input text in order to classify personal information.
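For illustration only, several of the layer types recited in claim 13 (a max pooling layer, an average pooling layer, a concatenation layer, and a sigmoid output) can be combined as sketched below. This is a minimal sketch and is not Du's model: the function names, the example state vectors, and the weights are hypothetical values chosen only to show the data flow.

```python
import math


def max_pool(states):
    """Element-wise maximum over the sequence of per-token vectors."""
    return [max(column) for column in zip(*states)]


def average_pool(states):
    """Element-wise mean over the sequence of per-token vectors."""
    return [sum(column) / len(column) for column in zip(*states)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sentence_score(states, weights, bias=0.0):
    """Concatenate both pooled vectors, apply a linear layer, then a sigmoid."""
    pooled = max_pool(states) + average_pool(states)  # list concatenation
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return sigmoid(logit)


# Hypothetical per-token state vectors (for example, recurrent-layer outputs).
states = [[0.2, -0.1], [0.7, 0.4], [-0.3, 0.9]]
weights = [0.5, -0.25, 1.0, 0.75]  # hypothetical, untrained weights
score = sentence_score(states, weights)
```

The sigmoid maps any real-valued input to a value strictly between 0 and 1, which is why such an output can be read as a probability indicator.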

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Raphael (US 2021/0064781) is cited to disclose detection of sensitive data (see [0118]). Heckel (US 10,169,315) is cited to disclose detection and removal of PII using neural networks (see Figure 4). Medalion (US 2021/0125615) is cited to disclose use of a BiLSTM in order to detect PII (see [0088]).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PARAS D SHAH whose telephone number is (571)270-1650. The examiner can normally be reached Monday-Thursday 7:30AM-2:30PM, 5PM-7PM (EST), Friday 8AM-noon (EST).

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached on 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Paras D Shah/Primary Examiner, Art Unit 2659

03/08/2022