Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant's Response
In Applicant's Response dated 1/18/2022, Applicant did not amend the Claims and argued against all objections and rejections set forth in the previous Office Action.
All objections and rejections not reproduced below are withdrawn. 
The prior art rejections of the Claims under 35 U.S.C. 103 previously set forth are maintained.
	The examiner appreciates the applicant noting where the support for the amendments is located in the specification. 
The Application was filed on 4/2/2019.
Claims 1-18 are pending for examination. Claim 1 is the independent claim.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention, as a whole, would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 3, 10, 15 are rejected under 35 U.S.C. 103 as being unpatentable over Thayer; Nicholas D. et al. US Pub. No. 2018/0285591 (Thayer) in view of Hemachandran; Bharath Kumar US Pub. No. 2018/0232407 (Hemachandran). 
 
Claim 1: 
	Thayer teaches: 
A system [¶ 0075] (system), comprising:
a non-transitory computer readable storage medium having computer program code stored thereon [¶ 0076] (medium), the computer program code, when executed by one or more processors implemented on a computer machine presents a redaction server [abstract, ¶ 0002, 10, 13, 15, 17] (server) [abstract, ¶ 0013, 18, 24-27, 31, 36, 40-41, 47, 51-52, 54-59] (redaction) comprising:
…
a candidates generator that generates a list of words for redaction [¶ 0034, 44] (determining sensitive values contained in a document is generating “a list of words for redaction”) [¶ 0036, 40, 47, 51] (the method describes multiple “words” because there is a loop where an obfuscation value is generated for each sensitive word) [¶ 0057, 63] (the determined sensitive values are later described as a pre-defined list for obfuscation, and the list can be dynamically updated by the system) [¶ 0020] (obfuscation criterion based on a data field, tag that identifies a type of data value) [¶ 0020] (identify obfuscation values is a “list”) [¶ 0039] (selected sensitive value is also a “list”) [¶ 0057, 63] (pre-defined list for obfuscation, list dynamically updated by the system) [¶ 0057] (identify the data fields that contain obfuscation values from a pre-defined list) [¶ 0063] (identify the data column names that contain the obfuscation values) [¶ 0073] (location identifiers of the obfuscation values) [¶ 0036, 40, 47, 51; Fig. 2, element 212; Fig. 3, element 314] (a loop where an obfuscation value is generated for each sensitive word means there is more than one word, or “words”) from the structured, semi-structured, and unstructured data [¶ 0019] (structured, unstructured, semi-structured) [¶ 0034, 44] (techniques used by the obfuscator/de-obfuscator may depend on the type or structure of the document, for unstructured documents parse and semantically analyze the text or data, may use a combination of techniques where the document contains an unstructured section and a structured section, this would be a semi-structured document) [¶ 0020] (with a structured document the obfuscator/de-obfuscator can select an obfuscation criterion based on a data field);
a replacement engine that redacts by replacing one or more words from the list of words with one or more of a replacement word, random characters, and random numbers [¶ 0021, 36, 47] (random set of alphanumeric characters as the obfuscation value) [¶ 0036-37] (replace credit card number with random fake credit card number, replace zip code numbers) [abstract, ¶ 0013, 18, 24-27, 31, 36, 40-41, 47, 51-52, 54-59] (redaction).

Thayer teaches processing structured and unstructured data differently but does not discuss how the data is identified. 
Thayer fails to teach, but Hemachandran teaches: 
a parser that analyzes documents to identify structured, semi-structured, and unstructured data from a document [¶ 0037] (data type identification module determines the type of the incremental heterogeneous data as structured data, semi-structured data, quasi-structured data and unstructured data);

	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of redacting data in Thayer with the method of analysis of data in Hemachandran, with a reasonable expectation of success. 
	The motivation for doing so would have been “ensuring consistency between source and destination” [Hemachandran: ¶ 0038].

Claim 3: 
	Thayer teaches: 
The system of claim 1, wherein the candidates generator uses semi-structured metadata to generate the list of words [¶ 0027, 57] (metadata in document). 

Claim 10: 
	Thayer teaches: 
The system of claim 1, wherein a redaction server generates a redacted document from the document [¶ 0052] (communicate the redacted document for storage).

Claim 15: 
	Thayer teaches: 
The system of claim 10, further comprising one or more of data storage, NLP metadata storage, semi-structured metadata storage, and replacement metadata storage [¶ 0023, 40-41, 52, 57, 71, 76-77, 87] (storage). 
Carasso also teaches: [¶ 0041, 52, 57, 71] (storage).

Claims 2, 5, 7, 8, 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Thayer; Nicholas D. et al. US Pub. No. 2018/0285591 (Thayer) in view of Hemachandran; Bharath Kumar US Pub. No. 2018/0232407 (Hemachandran) in view of Lucas; Michael Ryan et al. US Pub. No. 2020/0126663 (Lucas).
Claim 2: 
	Thayer and Hemachandran teach all the elements shown above.
	Thayer and Hemachandran fail to teach, but Lucas teaches:
	The system of claim 1, wherein the candidates generator uses natural language processing metadata to generate the list of words [¶ 0056, 61, 88, 97-102, 112, 113, 118, 119, 126, 141-155, 163, 182, 220, 231, 232, 236, 242] (natural language processing).
	Lucas also teaches: [¶ 0043, 247, 257, 282] (deidentification and removal of protected health information) [¶ 0182] (stop words) [¶ 0007, 09, 49, 56, 59, 66, 72, 74, 81, 127, 249] (structured, unstructured) [¶ 0127] (semi-structured) [¶ 0163, 177] (candidate list). 
	
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of redacting data in Thayer with the method of analysis of data in Hemachandran with the method of analysis of health records in Lucas, with a reasonable expectation of success. 
	The motivation for doing so would have been to mine and uncover information to reduce the amount of gaps in data accessibility [Lucas: ¶ 0007].

Claim 5: 
	Lucas teaches: 
The system of claim 1, wherein the document is a PDF document [¶ 0049, 53, 109, 115, 117, 217, 231, 241] (PDF).

Claim 7: 
	Lucas teaches: 
The system of claim 1, wherein the replacement engine uses parse trees to validate a redacted document for grammatical integrity, the redacted document having the one or more words from the list of words replaced [¶ 0127-129, 161-164, 174; Fig. 7] (parse tree).

Claim 8: 
	Lucas teaches: 
The system of claim 7, further comprising an information extraction system that trains a machine learning model using the redacted document [¶ 0012, 97, 99, 100, 112, 166, 223, 227-229, 273] (used for training data for machine learning).

Claim 11: 
	Lucas teaches: 
The system of claim 10, further comprising a machine learning system that uses the redacted document for training a model [¶ 0012, 97, 99, 100, 112, 166, 223, 227-229, 273] (used for training data for machine learning).

Claim 12: 
	Lucas teaches: 
The system of claim 11, further comprising an information extraction system that uses the model trained with the redacted document [¶ 0009-12, 0064-68; Fig. 3] (extracting information) [¶ 0012, 97, 99, 100, 112, 166, 223, 227-229, 273] (used for training data for machine learning).

Claim 13: 
	Lucas teaches: 
The system of claim 12, wherein the information extraction system processes one or more unredacted customer documents to identify relevant data using the model [¶ 0011, 45, 76, 92, 97, 98, 168, 170, 174, 191, 226-228, 239, 257, 259, 262] (verify accuracy).

Claim 14: 
	Lucas teaches: 
The system of claim 12, wherein the redacted document is used to debug the information extraction system [¶ 0053, 84, 88, 93, 115, 178, 182-183, 188, 229, 237-240, 252-255, 263] (error detection).

Claims 4, 6, 9, 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Thayer; Nicholas D. et al. US Pub. No. 2018/0285591 (Thayer) in view of Hemachandran; Bharath Kumar US Pub. No. 2018/0232407 (Hemachandran) in view of Lucas; Michael Ryan et al. US Pub. No. 2020/0126663 (Lucas) in view of Carasso; David et al. US 2016/0224804 (Carasso).
Claim 6: 
	Thayer, Hemachandran, and Lucas teach all the elements shown above.
Lucas teaches: [¶ 0049, 53, 109, 115, 117, 217, 231, 241] (PDF)
	Thayer, Hemachandran, and Lucas fail to teach, but Carasso teaches:
The system of claim 5, further comprising a PDF evaluator that determines whether the replacement word uses a same space within the document as a replaced word from the list of words [¶ 0154; Fig. 16] (match length). 

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of redacting data in Thayer with the method of analysis of data in Hemachandran with the method of analysis of health records in Lucas with the method of anonymization in Carasso, with a reasonable expectation of success. 
	The motivation for doing so would have been “processing large volumes of machine-generated data in an intelligent manner and effectively presenting the results of such processing” [Carasso: ¶ 0002, 76].

Carasso; David et al. US 2016/0224804 (Carasso) also teaches the elements of claim 1 as follows:
 [¶ 0052-59; Fig. 2] (data ingesting and indexing is “parsing” a document) [¶ 0101, 168] (the method is performed by a server computing machine coupled to the client computing machine) [¶ 0033, 40, 65, 75, 94] (structured) [¶ 0051] (semi-structured) [¶ 0002, 33, 38-40, 42] (unstructured)
 [¶ 0062-63, 72, 89] (listing of matching events, set of results) [¶ 0065, Fig. 4] (events 416, 417, 418 are a “list of words for redaction” ) [¶ 0158; Fig. 19] (replacement using list) [¶ 0071, Fig. 6A] (search results for “buttercupgames”, the list of events returned would be a “list of words for redaction”; the search results are fed into the anonymization method of Figs. 9 and 21) [¶ ] (using the event search query as input into the anonymizer) [¶ 0170, Fig. 21] (receive data at block 2114) [¶ 0102, Fig. 9] (search query or the results produced from a search query executed using an event processing system (EPS) such as system 100 of FIG. 1) [¶ 0033, 40, 65, 75, 94] (structured) [¶ 0051] (semi-structured), [¶ 0002, 33, 38-40, 42] (unstructured); and
 [¶ 0156-158, 162-165; Figs. 18-19] (random character replacement, replace text, alphanumeric) 
[¶ 0046, 53, 102, 120-121] (metadata for field, metadata for event)
[¶ 0098, 0129, 151, 155, 158] (output) [¶ 0160-167; Fig. 20] (anonymization output settings) [¶ 0168-170; Fig. 21] (output data set or data repository of anonymized data)
[¶ 0046, 102] (metadata stored) [¶ 0053] (storing timestamp as metadata)

Claim 4: 
	Carasso teaches: 
The system of claim 1, wherein the replacement engine uses replacement metadata to generate the replacement word, the replacement metadata including dictionaries and words to ignore [¶ 0158; Fig. 19] (replacement using list, list is a dictionary) [¶ 0142] (dictionary of stop words) [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words).
Examiner’s Note: Stop words, or a stop list, are common in text processing; typical words that are not processed include “a”, “an”, “the”, etc.
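
For illustration only, the stop list described in the note above can be sketched as follows. This is a minimal sketch and is not taken from Carasso or from Applicant's specification; the names `STOP_WORDS` and `filter_candidates` and the sample sentence are hypothetical.

```python
# Illustrative sketch only. STOP_WORDS and filter_candidates are hypothetical
# names and do not come from any cited reference.
STOP_WORDS = {"a", "an", "the"}


def filter_candidates(words, stop_words=STOP_WORDS):
    """Return the words that remain after dropping stop words (case-insensitive)."""
    return [w for w in words if w.lower() not in stop_words]


candidates = filter_candidates(["The", "patient", "John", "saw", "a", "doctor"])
print(candidates)
```

Words on the stop list are simply skipped, so they are never considered for replacement; all other words remain candidates.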

Claim 9: 
	Carasso teaches: 
The system of claim 1, wherein the replacement engine compares a list of keywords against the list of words and does not replace keywords that appear on the list of words [¶ 0158; Fig. 19] (replacement using list, list is a dictionary) [¶ 0142] (dictionary of stop words) [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words).

Claim 16: 
	Carasso teaches: 
The system of claim 10, further comprising a user interface for setting one or more of a threshold for less frequent words, a parse tree overlap threshold, and a documents overlap threshold [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words, threshold slider for stop word frequency of occurrence) [¶ 0103-104] (range of values) [¶ 0127, 160, Fig. 20] (max events, maximum number of events to be generated to the output data set).

Claim 17: 
	Carasso teaches: 
The system of claim 10, wherein the redacted document has all text obfuscated except for keywords [¶ 0111, 113] (all unknown text of an event should be anonymized with random characters, and that all unspecified fields should be anonymized with random characters).

Claim 18: 
	Carasso teaches: 
The system of claim 10, wherein the redacted document obfuscates confidential keywords and ignores other keywords [¶ 0142] (all stop words, dictionary of stop words are keywords that are not redacted or anonymized) [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words).

Alternate Rejection: 
Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Muffat; Christopher et al. US Pub. No. 2020/0250139 (Muffat).
Claim 1: 
	Muffat teaches: 
A system [¶ 0035] (system), comprising:
a non-transitory computer readable storage medium having computer program code stored thereon [¶ 0003, 33, 72] (stored documents would be on a “medium”), the computer program code, when executed by one or more processors implemented on a computer machine presents a redaction server [¶ 0047, 75, 93-95] (personal data masking is equivalent to “redaction”) comprising:
a parser that analyzes documents to identify structured, semi-structured, and unstructured data from a document [¶ 0006-07, 33, 54, 57, 59, 123] (entity recognition and extraction in the structured, semi-structured and unstructured documents);
a candidates generator that generates a list of words for redaction [¶ 0092] (list of words to be masked) [¶ 0075] (list of matches) [¶ 0050] (list of personal identification information (PIIS)) [¶ 0046-47; Fig. 2A, elements 212 and 214] (replacing the personal identification information (PII) with a generic term; for example, “John Smith” is replaced with “<full name>”) from the structured, semi-structured, and unstructured data [¶ 0006-07, 33, 54, 57, 59, 123] (entity recognition and extraction in the structured, semi-structured and unstructured documents) [¶ 0072] (blacklist, part of speech (PoS) tagger, stop words);
a replacement engine that redacts by replacing one or more words from the list of words with one or more of a replacement word, random characters, and random numbers [¶ 0073, 93] (replace) [¶ 0046-47, 72, 75, 85, 90, 92-93] (extract and mask PII). 
Muffat does not use all the same terminology as the claims; however, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention that the terms used are equivalent. For example, “masking” in Muffat is equivalent to “redaction” in the claims.
In addition, Muffat does not describe a redaction server; however, the use of servers for processing, including parsing and redaction, was well known in the art as of the effective filing date of the application, so any of the systems described in Muffat could be implemented on a server.

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Please See PTO-892: Notice of References Cited.

Evidence of the level of skill of an ordinary person in the art for Claims 1-18:
	Mankovskii; Serge US 10,061,773 teaches: parsing the structured, semi-structured and/or unstructured documents to extract data and information; determining whether the first parser is able to parse the nested document and, if not, using a different parser.
Csurka; Gabriela et al. US 20160132277 teaches: replaces characters with randomly chosen characters using an identical or similar font, thereby allowing the text to be illegible, while taking approximately the same space on the page. 
Buford; John F. et al. US 20190104124 teaches: [¶ 0064] maintain the grammatical construct of the message; anonymization of data; natural language processing; neural network; randomly generated, or may be selected from a dictionary of relevant words, phrases, or numbers; [¶ 0065] context is maintained (e.g., names are replaced with names, places with places).
Goodsitt; Jeremy et al. US 2020/0012892 (Goodsitt) teaches: replace the sensitive portion of the actual data with the synthetic portion; training for machine learning; structured, unstructured, semi-structured. 
Erez; Elad et al. US 20190289034 teaches: pseudonymization (also referred to as data de-identification, anonymization, or obfuscation) is a method of protecting sensitive data by replacing original (also referred to as actual or real) data with fictitious but realistic looking data. 
Horst; Henning et al. US 20190362093 teaches: pseudonymised; replacing the original data string is generated on the one hand by random characters stored in a look-up table or replacement table; string to be replaced is equal to the size of the alphabet. 
Mandpe; Ashvini Sakharam et al. US 20170364523 teaches: phonetically similar masked data; customer sensitive data is replaced with fictitious values. 
Thomson; David et al. US 10573312 teaches: analyzed by the grammar engine to determine which words should be capitalized and how to add punctuation, replace word, anonymize; select a word at random, same number of words; interface to adjust settings; match rule, number of match errors cannot exceed y % (e.g., 25%). 
Gigliotti; Samuel S. et al. US 8700991 teaches: FIG. 4 is a flow diagram of one embodiment of a method for obfuscating content using random text.
Rui; Su Ying et al. US 20090138766 teaches: [¶ 0042] dummy document can be replaced with a random generated string with the same size, in the same character set, and even with the same number of "words", same length.
Hayashi; Daisuke et al. US 10032046 teaches: replace confidential information and other potentially confidential information with other unassociated words (e.g., jargon), random numbers, or characters. 

Response to Arguments
Applicant's arguments filed 1/18/2022 have been fully considered but they are not persuasive.

35 USC 103 Rejection: Thayer in view of Hemachandran: 
The applicant argues that “Thayer’s obfuscator/de-obfuscator 108 follows predefined instructions/rules (based on the obfuscation criterion discussed above) to locate and highlight (tag) fields in the document where obfuscation may be required. The obfuscator of Thayer does not generate a list of words to be redacted, nor it is meant to do so based on Thayer’s disclosure as a whole.” (response pages 6-7). 
The examiner respectfully disagrees. 
Thayer uses the term “list”, but also describes structures that fall under the broad definition of the term “list”. 
Claims are given the broadest reasonable interpretation (BRI) consistent with the specification. An applicant is entitled to be their own lexicographer, but to rebut the presumption that claim terms are to be given their ordinary and customary meaning, the applicant must clearly set forth in the specification, at the time of filing, a definition of the term that differs from its ordinary and customary meaning (see MPEP 2111).
Applicant’s specification discusses a “list” but does not define the structure of the list, stating only that: “A candidates generator generates a list of words for redaction from the structured, semi-structured, and unstructured data. A replacement engine replaces one or more words from the list of words with one or more of a replacement word, random characters, and random numbers.” (See published specification ¶ [0005, 19]) (emphasis by examiner).
Since applicant does not define what a “list” is or give any examples of what the term “list” means, we look to the plain meaning of the word. 
Merriam-Webster defines list as “the total number to be considered or included” or “a simple series of words or numerals (such as the names of persons or objects)” (https://www.merriam-webster.com/dictionary/list). 
Dictionary.com defines list as “a series of names or other items written or printed together in a meaningful grouping or sequence so as to constitute a record” (https://www.dictionary.com/browse/list). 
Now that we understand the meaning of the term “list”, we can interpret the claims and the prior art. Under the broadest reasonable interpretation of the claims, a “list” is taught in Thayer in at least three different ways.
The first way Thayer teaches “list” is: Thayer discloses determining sensitive values contained in a document; this is generating “a list of words for redaction” [¶ 0034, 44]. Even though Thayer does not use the term list here, it is clear that “determining sensitive values contained in a document” meets the Merriam-Webster definition of list as “the total number to be considered or included” as well as the Dictionary.com definition of list as “a series of names or other items written or printed together in a meaningful grouping or sequence so as to constitute a record”.
Thayer later describes the result of “determining sensitive values contained in a document” as a pre-defined list; it is pre-defined in this case because the system had previously determined the sensitive words in a document [¶ 0057, 63] (pre-defined list for obfuscation, list dynamically updated by the system).
A list could be a single item or in this case a single word, but the claim recites a “list of words”. We know that the method in Thayer includes multiple “words” because there is a loop where an obfuscation value is generated for each sensitive word [¶ 0036, 40, 47, 51] (Fig. 2, element 212; Fig. 3, element 314).
The second way Thayer teaches “list” is: a collection of associated values is a “list” in the broadest reasonable sense [¶ 0040, 51]. 
The third way Thayer teaches “list” is: mapping the determined sensitive values to be redacted could be interpreted as the “list” recited in the claims [¶ 0023, 39-40, 49-51, 73].
Any one of these three could be interpreted, in the broadest reasonable interpretation, as the “list” recited in the claims. They fit under the Merriam-Webster definition of “the total number to be considered or included” as well as the Dictionary.com definition of “a series of names or other items written or printed together in a meaningful grouping or sequence so as to constitute a record”. 
Therefore, Thayer does in fact disclose the recited “list”.

The applicant argues that Thayer does not teach the other elements because Thayer does not teach a “list” (response page 7).
The examiner respectfully disagrees.
Thayer does in fact teach a list and therefore teaches the other elements of the claims as shown in the rejection above.

The applicant argues that “Hemachandran does not cure the deficiencies of Thayer” (response page 7). 
The examiner respectfully disagrees. 
Hemachandran does not need to cure the deficiencies of Thayer, because Thayer teaches a list, as shown above, which means there are not any deficiencies in Thayer.

35 USC 103 Rejection: Muffat: 
The applicant argues “that ‘masking’ is not equivalent to ‘redacting’ (see https://www.pkware.com/blog/encryption-tokenization-masking-and-redaction-choosing-the-right-approach)” and that “there are discrete differences between redaction and masking in the art. And for this reason, ‘masking’ is not equivalent to ‘redacting’ as the Examiner alleges” (response page 8).
The examiner respectfully disagrees. 
The prior art is less than settled about what the differences between “redaction” and “masking” are. 
For example, Satori claims that “In many cases, data redaction is considered to be a sub-type of data masking” (https://satoricyber.com/data-masking/the-fundamentals-of-data-redaction/). Elsewhere, John English claims that “‘Masking’ is a specialized and more focused version of redacting” (https://www.quora.com/What-is-the-difference-between-masking-and-redacting). So, one source claims that redaction is a subset of masking and the other claims that masking is a subset of redaction.
Applicant’s specification does not discuss the differences between masking and redaction and uses the term “mask” only once, in relation to the prior art: “Redacted document 100 has a mask (black) over the confidential content” (published specification ¶ [0022]). The specification discusses “redaction” at length, including that “present system handles redactions of PDF documents and also replaces confidential information in the PDF by taking into account the width of the original characters of the confidential information” (published specification ¶ [0024]) and that the “system first finds the data to be redacted and then replaces confidential data with non-confidential data” (published specification ¶ [0025]).
The applicant’s use of the term “redaction” appears to be more in line with what the industry calls “masking” (see published specification ¶ [0047-48, 55-56], Figs. 6A-6B). Even applicant’s provided definition from pkware when making their argument against Muffat discusses “masking” as “Sensitive information is replaced by random characters in the same format as the original data” (response page 8).  
Definitions matter, but what is important to keep in mind here is that the claims are read in light of the specification and of what the prior art teaches as a whole, not in light of the usage of a single person or company, which the examiner has shown can differ greatly. Whether it is called “masking” or “redaction”, Muffat describes the same process as recited in the claims and disclosed in the specification.
Muffat discloses masking as replacing the personal identification information (PII) with a generic term. For example, “John Smith” is replaced with “<full name>” (see Fig. 2A, elements 212 and 214, ¶ [0046-47]).
The Applicant’s specification discloses exactly this where the name “Ramesh” is replaced with “Kjhgfd” (see published specification ¶ [0047-48, 55-56], Figs. 6A-6B).
Regardless of the term used, whether it is called “masking” or “redaction”, the result is the same and Muffat discloses the claimed invention.
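
The two replacement styles compared above can be illustrated with a short sketch. It is offered for illustration only, under the assumption of plain string replacement; it is not code from Muffat or from Applicant's specification, and the function names `mask_with_label` and `replace_with_random` are hypothetical.

```python
import random
import string


def mask_with_label(text, word, label):
    """Replace every occurrence of `word` with a generic label such as '<full name>'."""
    return text.replace(word, label)


def replace_with_random(text, word, rng):
    """Replace `word` with random letters of the same length.

    The first letter is capitalized when the original word starts with a capital.
    """
    letters = "".join(rng.choice(string.ascii_lowercase) for _ in word)
    if word[:1].isupper():
        letters = letters.capitalize()
    return text.replace(word, letters)


rng = random.Random(0)  # seeded only so repeated runs give the same output
print(mask_with_label("John Smith signed the form.", "John Smith", "<full name>"))
print(replace_with_random("Ramesh signed the form.", "Ramesh", rng))
```

In both functions the sensitive word is located and then replaced with non-sensitive text; only the form of the replacement differs.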

The applicant argues that “performing entity recognition (e.g., identifying personal data entities) from structured, semi-structured and unstructured records is not the same as ‘identify structured, semi-structured, and unstructured data from a document’” (response pages 8-9). 
The examiner respectfully disagrees. 
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
The applicant does not provide any reasoning as to why identifying personal data entities from structured, semi-structured and unstructured records is not the same as “identify structured, semi-structured, and unstructured data from a document”. If the system recognizes an entity as structured, semi-structured, or unstructured, it has identified it as “structured, semi-structured, and unstructured data from a document”.

The applicant argues that “According to Muffat, the purported PII list is not intended for redaction, as alleged by the Examiner.” (response pages 9-10). 
The examiner respectfully disagrees. 
The exact reason that Muffat identifies PII is for redaction, or masking. 
Muffat discloses masking as replacing the personal identification information (PII) with a generic term. For example, “John Smith” is replaced with “<full name>” (see Fig. 2A, elements 212 and 214, ¶ [0046-47]).
The Applicant’s specification discloses exactly this where the name “Ramesh” is replaced with “Kjhgfd” (see published specification ¶ [0047-48, 55-56], Figs. 6A-6B).
Therefore, Muffat discloses the claimed as well as the disclosed invention.

35 USC 103 Rejection: Lucas: 
The applicant argues that “Lucas does not cure the deficiencies of Thayer and Hemachandran” (response page 10). 
The examiner respectfully disagrees. 
Lucas does not need to cure the deficiencies of Thayer and Hemachandran, because Thayer and Hemachandran teach all the elements of the claims, as shown above, which means there are not any deficiencies in Thayer and Hemachandran.

35 USC 103 Rejection: Carasso: 
The applicant argues that Carasso fails to teach the elements of claim 1 (response pages 11-14). 
The examiner respectfully disagrees. 
Carasso teaches each limitation as set forth in the rejection of claim 6. That mapping is provided as information for the applicant to consider; accordingly, the arguments are not responded to, even though the examiner disagrees with their basis and reasoning.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN J SMITH whose telephone number is (571)270-3825.  The examiner can normally be reached on Monday - Friday 11:00 - 7:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott Baderman can be reached on (571)272-3644.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Benjamin Smith/
Examiner, Art Unit 2144
Direct Phone: 571-270-3825
Direct Fax: 571-270-4825
Email: benjamin.smith@uspto.gov