Detailed Action
This action is in response to the application filed on October 19, 2020.

Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Information Disclosure Statement
1.	The information disclosure statements (IDS) submitted on 10/19/20, 3/31/22 and 5/16/22 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the Examiner.

Claim Rejections – 35 USC § 101
2.	35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. 
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 (See MPEP 2106) Claims 1-20 are directed to a method which belongs to a statutory class.
Step 2A, Prong One: Claim 1 recites “A computer implemented method for protecting sensitive information in documents, comprising: providing an inverted text index for a set of documents; using the inverted text index for evaluating one or more statistical measures of an index token of the inverted text index; using the one or more statistical measures for selecting a set of candidate tokens that may contain sensitive information; extracting metadata from the inverted text index descriptive of the set of candidate tokens, wherein the extracted metadata comprises at least a token type of the index token and a document identifier of a document containing the index token; associating the set of candidate tokens with respective token metadata, wherein the token metadata of the token comprises the extracted metadata of the token,” which is a process that, under its broadest reasonable interpretation, covers performance of the limitations as a mental process, but for the recitation of generic computer components.

This can be done, for example, by a person associating tokens with documents, comparing the tokens, and covering, hiding, or masking some tokens. Nothing in the claim element precludes the steps from practically being performed in the human mind. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation by mental process, but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.

This judicial exception, i.e., the mental process, is not integrated into a practical application. In particular, the claims only recite the additional elements “tokenizing the at least one document, resulting in one or more document tokens; comparing the one or more document tokens with the set of candidate tokens; selecting a set of document tokens to be masked based on the comparison; selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata; masking the at least part of the set of document tokens in the one or more documents, resulting in one or more masked documents.” These elements are recited at a high level of generality (i.e., as a generic processor performing a generic computer function), which amounts to no more than mere instructions to apply the exception using a generic computer component. The processing environments perform a generic function of computing/processing queries. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: Claims 1-20 are also analyzed considering all the additional elements recited in them to determine whether any claim element or combination of elements amounts to significantly more than the judicial exception. In claims 1-20, “receiving a request of at least one document; and providing the one or more masked documents” are recited at a high level of generality. These limitations are thus insignificant extra-solution activity (see MPEP § 2106.05(g)). Limitations that the courts have found not to be enough to qualify as "significantly more" when recited in a claim with a judicial exception include: i. Adding the words "apply it" (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, e.g., a limitation indicating that a particular function such as creating and maintaining electronic records is performed by a computer, as discussed in Alice Corp., 134 S. Ct. at 2360, 110 USPQ2d at 1984 (see MPEP § 2106.05(f)).

Claims 1-20, taken as a whole, as an ordered combination of steps, and considering the additional elements, do not provide meaningful limitations to transform the abstract idea into a practical application that is significantly more than the abstract idea itself.

Therefore, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.  The additional elements amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The claims are not patent eligible.

	All dependent claims have been analyzed for each of the steps stated above. Dependent claims are not patent eligible for the same reasons as applied above.

Claim Rejections - 35 USC § 103
3.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


4.	Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Cachin et al. (US 2018/0218164 A1) in view of Salgado et al. (US 2008/0239365 A1).

5.	Regarding claim 1, Cachin discloses “A computer implemented method for protecting sensitive information in documents, comprising:  providing an inverted text index for a set of documents; using the inverted text index for evaluating one or more statistical measures of an index token of the inverted text index;” (See Fig. 3-Fig. 6 and abstract) (Produces mask-update data, dependent on the current and new encryption keys, and sends the mask-update data to the data-user computer. The mask-update data permits updating, at the data-user computer, of masked data items produced with the current encryption key into masked data items produced with the new encryption key.)

“using the one or more statistical measures for selecting a set of candidate tokens that may contain sensitive information;” (See abstract and [002]) (Data masking is provided by, for at least one predetermined data item in data to be sent, applying a one-way function to that data item to produce a first value, producing a masked data item by encrypting the first value via a deterministic encryption scheme using a current encryption key for a current epoch, and replacing that data item by the masked data item. Data masking is used when data which includes sensitive information needs to be copied to a less-trusted environment. The purpose of data masking is to de-sensitize the data, so as to hide “mask” sensitive data items, such that the data as a whole remains useful for its intended purpose.)

But, Cachin does not explicitly disclose “extracting metadata from the inverted text index descriptive of the set of candidate tokens, wherein the extracted metadata comprises at least a token type of the index token and a document identifier of a document containing the index token;” 

However, Salgado teaches “extracting metadata from the inverted text index descriptive of the set of candidate tokens, wherein the extracted metadata comprises at least a token type of the index token and a document identifier of a document containing the index token;” (See [0024]-[0025]) (When an original document has been scanned, the OCR module extracts textual information from the image content of the print job. The lexicon 38 may be a finite state device which serves as a dictionary whereby specific textual elements in the print job data stream or OCR'd text may be identified for masking.)

But, Cachin does not explicitly disclose “associating the set of candidate tokens with respective token metadata, wherein the token metadata of the token comprises the extracted metadata of the token; receiving a request of at least one document; tokenizing the at least one document, resulting in one or more document tokens; comparing the one or more document tokens with the set of candidate tokens;”

However, Salgado teaches “associating the set of candidate tokens with respective token metadata, wherein the token metadata of the token comprises the extracted metadata of the token; receiving a request of at least one document; tokenizing the at least one document, resulting in one or more document tokens; comparing the one or more document tokens with the set of candidate tokens;” (See [005], [0020], [0025]) (The lexicon/tokenization which serves as a dictionary whereby specific textual elements in the print job data stream or OCR'd text may be identified for masking. The lexicon may include words, phrases and the like, including person names, place names and other specific text elements of interest to the user. The lexicon may also be structured to identify lexical equivalents, such as abbreviations, lemma forms, and the like of user-selected textual elements. The lexicon may also cluster textual elements according to category, whereby a specific category of textual elements may be selected for masking. Masking text in a rendered copy of an original document is provided. The apparatus includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked. During the process of masking, the print job stream may be converted to a meta format which is subsequently returned to the native format prior to being transmitted to the output device.)

But, Cachin does not explicitly disclose “selecting a set of document tokens to be masked based on the comparison; selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata; masking the at least part of the set of document tokens in the one or more documents, resulting in one or more masked documents; and providing the one or more masked documents.”

However, Salgado teaches “selecting a set of document tokens to be masked based on the comparison; selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata; masking the at least part of the set of document tokens in the one or more documents, resulting in one or more masked documents; and providing the one or more masked documents.” (See Fig. 2 and [0011], [0029]) (Provides a user with the ability to print a document with text or other information masked to permit display of a scanned document with masks over sensitive or irrelevant areas of text. The system identifies all the date text in the document by comparing the data stream with the lexicon 38 and may automatically tag the date text for masking. Alternatively, the date text may be highlighted or otherwise accented whereby a user can select one or more of the accented text elements for masking.)

It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the cited references since, according to Salgado [005], an apparatus for masking text in a rendered copy of an original document is provided. The apparatus includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked.
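For reference only, the tokenizing, comparing, selecting, and masking limitations recited in claim 1 may be illustrated by the following minimal sketch. It is not taken from the application, Cachin, or Salgado. The candidate store `CANDIDATES`, the helper names `is_sensitive` and `mask_document`, the `max_docs` rule, and the `****` mask string are all hypothetical and are assumed solely for illustration.

```python
import re

# Hypothetical candidate store: token -> metadata (token type, document ids).
CANDIDATES = {
    "4111111111111111": {"token_type": "number", "doc_ids": [7]},
    "alice": {"token_type": "text", "doc_ids": [3, 7]},
    "invoice": {"token_type": "text", "doc_ids": [1, 2, 3, 7]},
}

TOKEN_RE = re.compile(r"\w+")


def is_sensitive(meta, max_docs=2):
    # Assumed rule: a candidate found in only a few documents is treated as sensitive.
    return len(meta["doc_ids"]) <= max_docs


def mask_document(text, candidates, mask="****"):
    """Tokenize the text, compare each token with the candidates, and mask matches."""

    def replace(match):
        token = match.group(0)
        meta = candidates.get(token.lower())  # comparison with the candidate set
        if meta is not None and is_sensitive(meta):  # selection by token metadata
            return mask
        return token

    return TOKEN_RE.sub(replace, text)


print(mask_document("Invoice for Alice, card 4111111111111111.", CANDIDATES))
# prints: Invoice for ****, card ****.
```

In this sketch "Invoice" matches a candidate token but is left unmasked because its metadata shows it in four documents, which illustrates the two separate selecting steps of the claim.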

Regarding claim 2, Cachin in view of Salgado discloses “The method of claim 1, further comprising: determining a topic metadata of the set of candidate tokens, the topic metadata of the set of candidate tokens comprising a topic of the set of candidate tokens or a topic of a document containing the set of candidate tokens, wherein the token metadata of the token further comprises the topic metadata.” (See Salgado: [0024]-[0025]) (When an original document has been scanned, the OCR module extracts textual information from the image content of the print job. The lexicon 38 may be a finite state device which serves as a dictionary whereby specific textual elements in the print job data stream or OCR'd text may be identified for masking.)


Regarding claim 3, Cachin in view of Salgado discloses “The method of claim 2, further comprising: determining a token category of each token of the set of candidate tokens; inputting the token categories to an information governance tool; and receiving as output the topic metadata.” (See Salgado: [005], [0020], [0025]) (The lexicon/tokenization which serves as a dictionary whereby specific textual elements in the print job data stream or OCR'd text may be identified for masking. The lexicon may include words, phrases and the like, including person names, place names and other specific text elements of interest to the user. The lexicon may also be structured to identify lexical equivalents, such as abbreviations, lemma forms, and the like of user-selected textual elements. The lexicon may also cluster textual elements according to category, whereby a specific category of textual elements may be selected for masking. Masking text in a rendered copy of an original document is provided. The apparatus includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked. During the process of masking, the print job stream may be converted to a meta format which is subsequently returned to the native format prior to being transmitted to the output device.)

Regarding claim 4, Cachin in view of Salgado discloses “The method of claim 1, wherein: the statistical measure of the index token comprises one or more of a number of documents of the set of documents containing the index token, a frequency of occurrence of the index token in the set of documents, or a frequency of occurrence of a token type of the index token in the set of documents; and selecting the set of candidate tokens comprises comparing the statistical measure with a predefined threshold.” (See Cachin: abstract and [002]) (Data masking is provided by, for at least one predetermined data item in data to be sent, applying a one-way function to that data item to produce a first value, producing a masked data item by encrypting the first value via a deterministic encryption scheme using a current encryption key for a current epoch, and replacing that data item by the masked data item. Data masking is used when data which includes sensitive information needs to be copied to a less-trusted environment. The purpose of data masking is to de-sensitize the data, so as to hide “mask” sensitive data items, such that the data as a whole remains useful for its intended purpose.)
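For reference only, the statistical measure and threshold comparison recited in claim 4 may be illustrated by the following minimal sketch, which is not taken from the application or from the cited references. It assumes the measure is the number of documents containing the token and that tokens at or below the threshold are selected; the claim does not state the direction of the comparison. The names `build_inverted_index`, `select_candidates`, `max_doc_freq`, and the sample `DOCS` are hypothetical.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return index


def select_candidates(index, max_doc_freq):
    """Select tokens whose document frequency is at or below the threshold.

    Assumption: rare tokens are the ones that may contain sensitive information.
    Each selected token is stored with its token type and document identifiers.
    """
    candidates = {}
    for token, doc_ids in index.items():
        if len(doc_ids) <= max_doc_freq:
            candidates[token] = {
                "token_type": "number" if token.isdigit() else "text",
                "doc_ids": sorted(doc_ids),
            }
    return candidates


DOCS = {
    1: "Patient record for Bob",
    2: "Patient record 12345",
    3: "Patient record summary",
}
candidates = select_candidates(build_inverted_index(DOCS), max_doc_freq=1)
print(sorted(candidates))
# prints: ['12345', 'bob', 'for', 'summary']
```

Common words such as "for" also pass this simple threshold, which is why the claim's later selection according to token metadata is a separate step.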

Regarding claim 5, Cachin in view of Salgado discloses “The method of claim 4, wherein the token type comprises one or more of a text type or number type.” (See Salgado: [005], [0020], [0025]) (The lexicon/tokenization which serves as a dictionary whereby specific textual elements in the print job data stream or OCR'd text may be identified for masking. The lexicon may include words, phrases and the like, including person names, place names and other specific text elements of interest to the user. The lexicon may also be structured to identify lexical equivalents, such as abbreviations, lemma forms, and the like of user-selected textual elements. The lexicon may also cluster textual elements according to category, whereby a specific category of textual elements may be selected for masking. Masking text in a rendered copy of an original document is provided. The apparatus includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked. During the process of masking, the print job stream may be converted to a meta format which is subsequently returned to the native format prior to being transmitted to the output device.)


Regarding claim 6, Cachin in view of Salgado discloses “The method of claim 1, further comprising: storing the set of candidate tokens in association with the token metadata in a storage system; using an updated inverted text index for evaluating one or more statistical measures of an updated index token of the updated inverted text index; using the one or more statistical measures for selecting an updated set of candidate tokens that may contain sensitive information; extracting updated metadata from the updated inverted text index descriptive of the updated set of candidate tokens, wherein the extracted updated metadata comprises at least a token type of the updated index token and a document identifier of a document containing the updated index token; and updating the storage system accordingly, wherein the updated storage system is used for selecting updated document tokens that are masked.” (See Fig. 3-Fig. 6 and abstract, [002]) (Produces mask-update data, dependent on the current and new encryption keys, and sends the mask-update data to the data-user computer. The mask-update data permits updating, at the data-user computer, of masked data items produced with the current encryption key into masked data items produced with the new encryption key. Data masking is provided by, for at least one predetermined data item in data to be sent, applying a one-way function to that data item to produce a first value, producing a masked data item by encrypting the first value via a deterministic encryption scheme using a current encryption key for a current epoch, and replacing that data item by the masked data item. Data masking is used when data which includes sensitive information needs to be copied to a less-trusted environment. The purpose of data masking is to de-sensitize the data, so as to hide “mask” sensitive data items, such that the data as a whole remains useful for its intended purpose.)

See also Salgado: (See [005], [0020], [0025]) (The lexicon/tokenization which serves as a dictionary whereby specific textual elements in the print job data stream or OCR'd text may be identified for masking. The lexicon may include words, phrases and the like, including person names, place names and other specific text elements of interest to the user. The lexicon may also be structured to identify lexical equivalents, such as abbreviations, lemma forms, and the like of user-selected textual elements. The lexicon may also cluster textual elements according to category, whereby a specific category of textual elements may be selected for masking. Masking text in a rendered copy of an original document is provided. The apparatus includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked. During the process of masking, the print job stream may be converted to a meta format which is subsequently returned to the native format prior to being transmitted to the output device.)

Regarding claim 7, Cachin in view of Salgado discloses “The method of claim 1, wherein: selecting the at least part of the set of document tokens that comprises sensitive information according to the associated token metadata comprises running a classifier on the token metadata and classifying the set of document tokens as sensitive or not sensitive tokens; and the selection is performed based on the classification.” (See Salgado: [0061]) (The system and method provide the capability of leaving access to sensitive but otherwise interesting documents, whether printed or scanned, to readers by removing the sensitive information on the document itself. In some instances, it can also improve the readability of certain documents by masking irrelevant information, in order not to distract the user.)

Regarding claim 8, Cachin in view of Salgado discloses “The method of claim 1, further comprising: determining a domain represented by a content of the requested document, wherein the set of documents represents the determined domain and excludes the requested document.” (See Salgado: [005]) (An apparatus for masking text in a rendered copy of an original document includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked.)

Regarding claim 9, Cachin in view of Salgado discloses “The method of claim 1, wherein the set of documents comprises the requested document.” (See Salgado: [005]) (An apparatus for masking text in a rendered copy of an original document includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked.)

Regarding claim 10, Cachin in view of Salgado discloses “The method of claim 1, wherein the requested document is an unstructured document.” (See Salgado: [005]) (An apparatus for masking text in a rendered copy of an original document includes a text modification system which is configured to receive a print job from an application and modify the print job in accordance with a print job description, whereby when rendered on an output device, a selected text element is masked.)

As per claim 11, this claim is rejected based on rationale given above for rejected claim 1 and is similarly rejected. “A computer program product for protecting sensitive information in documents, the computer program product comprising: one or more non-transitory computer-readable storage media and program instructions stored on the one or more non-transitory computer-readable storage media capable of performing a method,” ([0021]-[0022]) (The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium or media having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.)

As per claim 12, this claim is rejected based on rationale given above for rejected claim 2 and is similarly rejected.

As per claim 13, this claim is rejected based on rationale given above for rejected claim 3 and is similarly rejected.

As per claim 14, this claim is rejected based on rationale given above for rejected claim 4 and is similarly rejected.

As per claim 15, this claim is rejected based on rationale given above for rejected claim 5 and is similarly rejected.

As per claim 16, this claim is rejected based on rationale given above for rejected claim 1 and is similarly rejected.

As per claim 17, this claim is rejected based on rationale given above for rejected claim 2 and is similarly rejected.

As per claim 18, this claim is rejected based on rationale given above for rejected claim 3 and is similarly rejected.

As per claim 19, this claim is rejected based on rationale given above for rejected claim 4 and is similarly rejected.

As per claim 20, this claim is rejected based on rationale given above for rejected claim 5 and is similarly rejected.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tracy McGhee whose telephone number is (313)446-6581.  The examiner can normally be reached on Mon-Thu, 8:00 - 4:30.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached on 571-272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Tracy McGhee/
Patent Examiner
Art Unit 2154

/HOSAIN T ALAM/
Supervisory Patent Examiner, Art Unit 2154