Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant's Response
In Applicant's Response dated 11/2/2020, Applicant amended the Claims and argued against all objections and rejections set forth in the previous Office Action.
All objections and rejections not reproduced below are withdrawn. 
The rejection of the Claims under 35 U.S.C. 101 previously set forth is maintained. 
The prior art rejections of the Claims under 35 U.S.C. 102 and 103 previously set forth are withdrawn.
	The examiner appreciates the applicant noting where the support for the amendments is located in the specification. 
The Application was filed on 4/2/2019.
Claims 1-18 are pending for examination. Claim 1 is the independent claim.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


A “system” claim must necessarily include hardware. The “redaction server” does not necessarily include hardware (published specification ¶ [0026]). 

Claim 1:
	Claim 1 recites a “system” solely comprising a “parser”, a “candidates generator”, and a “replacement engine”.  An interpretation of the “parser”, “candidates generator”, and “replacement engine” is software: the specification states that the invention is directed toward the “field of computer software and systems” (¶ [0001]), and the office policy is to interpret “system” as being software.  Thus, for purposes of examination, the examiner interprets the recited “system” to be software per se.
	Accordingly, the “system” is software per se and is not a “process,” a “machine,” a “manufacture”, or a “composition of matter”, as defined in 35 U.S.C. 101.
Claims 2-18 merely recite additional features of the “parser”, “candidates generator”, and “replacement engine”.  Thus, Claims 2-18 do not further define the recited “system” as being within a statutory process, machine, manufacture or composition of matter.

	This rejection can be overcome by amending the claim to explicitly recite an actual hardware component, such as a “processor” or “memory.”
	For example: 
Claim 1: 
A system, comprising:
a processor, 
a memory, 
a parser … 

Appropriate action is required. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 3, 4, 9, 10, 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Carasso; David et al. US 2016/0224804 (Carasso).

Claim 1: 
	Carasso teaches: 
A system [¶ 0029, 48; Fig. 1] (system), comprising:
a parser [¶ 0052-59; Fig. 2] (data ingesting and indexing is “parsing” a document) of a redaction server [¶ 0101, 168] (the method is performed by a server computing machine coupled to the client computing machine) that analyzes documents to identify structured [¶ 0033, 40, 65, 75, 94] (structured), semi-structured [¶ 0051] (semi-structured), and unstructured data [¶ 0002, 33, 38-40, 42] (unstructured) from a document;
a candidates generator of the redaction server that generates a list of words for redaction from the structured, semi-structured, and unstructured data [¶ 0158; Fig. 19] (replacement using list) [¶ 0071, Fig. 6A] (search results for “buttercupgames”, the list of events returned would be a “list of words for redaction”; the search results are fed into the anonymization method of Figs. 9 and 21) [¶ ] (using the event search query as input into the anonymizer) [¶ 0170, Fig. 21] (receive data at block 2114) [¶ 0102, Fig. 9] (search query or the results produced from a search query executed using an event processing system (EPS) such as system 100 of FIG. 1) [¶ 0062-63] (listing of matching events, set of results) [¶ 0065, Fig. 4] (events 416, 417, 418 are a “list of words for redaction”) [¶ 0033, 40, 65, 75, 94] (structured) [¶ 0051] (semi-structured) [¶ 0002, 33, 38-40, 42] (unstructured); and
a replacement engine of the redaction server that replaces one or more words from the list of words with one or more of a replacement word, random characters, and random numbers [¶ 0156-158, 162-165; Figs. 18-19] (random character replacement, replace text, alphanumeric).
Carasso does not use some of the same terminology, but the differences are obvious variations of what Carasso teaches. 
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention that Carasso teaches the claimed invention. 
For example, Carasso does not use the term “redact” but teaches the anonymization of data. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention that “redaction” is an obvious variation of anonymization. 
Carasso also teaches that “the method is performed by a server computing machine coupled to the client computing machine over one or more networks” [¶ 0101, 168] but does not use the term “redaction server.” 
Carasso also teaches data ingesting and indexing instead of “parsing”, and a list of events for anonymization instead of a “list of words for redaction”. 
These are all obvious variations of what is recited in the claims.

Claim 3: 
	Carasso teaches: 
The system of claim 1, wherein the candidates generator uses semi-structured metadata to generate the list of words [¶ 0046, 53, 102, 120-121] (metadata for field, metadata for event).

Claim 4: 
	Carasso teaches: 
The system of claim 1, wherein the replacement engine uses replacement metadata to generate the replacement word, the replacement metadata including dictionaries and words to ignore [¶ 0158; Fig. 19] (replacement using list, list is a dictionary) [¶ 0142] (dictionary of stopwords) [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words).
Examiner’s Note: Stop words, or a stop list, are common in text processing; common words that are not processed include “a”, “an”, “the”, etc.
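For illustration only, the stop-word behavior noted above can be sketched as follows. This is a hypothetical sketch and is not taken from Carasso or from the application; the names STOP_WORDS and anonymize, and the choice of lowercase random letters, are assumptions made for the example.

```python
import random
import string

# Hypothetical stop list (assumption); a real list would be user-configured.
STOP_WORDS = {"a", "an", "the"}


def anonymize(text, stop_words=STOP_WORDS, rng=None):
    """Replace every word not on the stop list with random letters of equal length."""
    rng = rng or random.Random()
    out = []
    for word in text.split():
        if word.lower() in stop_words:
            # Stop words are left untouched.
            out.append(word)
        else:
            # Every other word becomes a random string of the same length.
            out.append("".join(rng.choice(string.ascii_lowercase) for _ in word))
    return " ".join(out)
```

Calling anonymize("the user opened an account") would leave “the” and “an” in place and replace the remaining three words with random letters of the same length.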

Claim 9: 
	Carasso teaches: 
The system of claim 1, wherein the replacement engine compares a list of keywords against the list of words and does not replace keywords that appear on the list of words [¶ 0158; Fig. 19] (replacement using list, list is a dictionary) [¶ 0142] (dictionary of stop words) [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words).

Claim 10: 
	Carasso teaches: 
The system of claim 1, wherein a redaction server generates a redacted document from the document [¶ 0098, 0129, 151, 155, 158] (output) [¶ 0160-167; Fig. .

Claim 15: 
	Carasso teaches: 
The system of claim 10, further comprising one or more of data storage, NLP metadata storage, semi-structured metadata storage, and replacement metadata storage [¶ 0046, 102] (metadata stored) [¶ 0053] (storing timestamp as metadata).

Claim 16: 
	Carasso teaches: 
The system of claim 10, further comprising a user interface for setting one or more of a threshold for less frequent words, a parse tree overlap threshold, and a documents overlap threshold [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words, threshold slider for stop word frequency of occurrence) [¶ 0103-104] (range of values) [¶ 0127, 160, Fig. 20] (max events, maximum number of events to be generated to the output data set).

Claim 17: 
	Carasso teaches: 
The system of claim 10, wherein the redacted document has all text obfuscated except for keywords [¶ 0111, 113] (all unknown text of an event should be anonymized).

Claim 18: 
	Carasso teaches: 
The system of claim 10, wherein the redacted document obfuscates confidential keywords and ignores other keywords [¶ 0142] (all stop words, dictionary of stop words are keywords that are not redacted or anonymized) [¶ 0141-143, 145; Fig. 12] (interface for specifying stop words, list of stop words).

Claims 2, 5-8, 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Carasso; David et al. US 2016/0224804 (Carasso) in view of Lucas; Michael Ryan et al. US Pub. No. 2020/0126663 (Lucas).
Claim 2: 
	Carasso teaches all the elements shown above.  
	Carasso fails to teach, but Lucas teaches: 
	The system of claim 1, wherein the candidates generator uses natural language processing metadata to generate the list of words [¶ 0056, 61, 88, 97-102, 112, 113, 118, 119, 126, 141-155, 163, 182, 220, 231, 232, 236, 242] (natural language processing).
	Lucas also teaches: [¶ 0043, 247, 257, 282] (deidentification and removal of protected health information) [¶ 0182] (stop words) [¶ 0007, 09, 49, 56, 59, 66, 72, 74, 81, 127, 249] (structured, unstructured) [¶ 0127] (semi-structured). 

	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of anonymizing data in Carasso with the method of analysis of health records in Lucas, with a reasonable expectation of success. 
	The motivation for doing so would have been to mine and uncover information to reduce the amount of gaps in data accessibility [Lucas: ¶ 0007].

Claim 5: 
	Lucas teaches: 
The system of claim 1, wherein the document is a PDF document [¶ 0049, 53, 109, 115, 117, 217, 231, 241] (PDF).

Claim 6: 
	Lucas teaches: [¶ 0049, 53, 109, 115, 117, 217, 231, 241] (PDF)
Carasso teaches: 
The system of claim 5, further comprising a PDF evaluator that determines whether the replacement word uses a same space within the document as a replaced word from the list of words [¶ 0154; Fig. 16] (match length). 

Claim 7: 
	Lucas teaches: 
The system of claim 1, wherein the replacement engine uses parse trees to validate a redacted document for grammatical integrity, the redacted document having the one or more words from the list of words replaced [¶ 0127-129, 161-164, 174; Fig. 7] (parse tree).

Claim 8: 
	Lucas teaches: 
The system of claim 7, further comprising an information extraction system that trains a machine learning model using the redacted document [¶ 0012, 97, 99, 100, 112, 166, 223, 227-229, 273] (used for training data for machine learning).

Claim 11: 
	Lucas teaches: 
The system of claim 10, further comprising a machine learning system that uses the redacted document for training a model [¶ 0012, 97, 99, 100, 112, 166, 223, 227-229, 273] (used for training data for machine learning).

Claim 12: 
	Lucas teaches: 
The system of claim 11, further comprising an information extraction system that uses the model trained with the redacted document [¶ 0009-12, 0064-68; Fig. 3] (extracting information) [¶ 0012, 97, 99, 100, 112, 166, 223, 227-229, 273] (used for training data for machine learning).

Claim 13: 
	Lucas teaches: 
The system of claim 12, wherein the information extraction system processes one or more unredacted customer documents to identify relevant data using the model [¶ 0011, 45, 76, 92, 97, 98, 168, 170, 174, 191, 226-228, 239, 257, 259, 262] (verify accuracy).

Claim 14: 
	Lucas teaches: 
The system of claim 12, wherein the redacted document is used to debug the information extraction system [¶ 0053, 84, 88, 93, 115, 178, 182-183, 188, 229, 237-240, 252-255, 263] (error detection). 

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Please See PTO-892: Notice of References Cited.

Evidence of the level of skill of a person of ordinary skill in the art for Claims 1-18: 


Buford; John F. et al. US 20190104124 teaches: [¶ 0064] maintain the grammatical construct of the message; anonymization of data; natural language processing; neural network; randomly generated, or may be selected from a dictionary of relevant words, phrases, or numbers; [¶ 0065] context is maintained (e.g., names are replaced with names, places with places). 
Goodsitt; Jeremy et al. US 2020/0012892 (Goodsitt) teaches: replace the sensitive portion of the actual data with the synthetic portion; training for machine learning; structured, unstructured, semi-structured. 
Erez; Elad et al. US 20190289034 teaches: pseudonymization (also referred to as data de-identification, anonymization, or obfuscation) is a method of protecting sensitive data by replacing original (also referred to as actual or real) data with fictitious but realistic looking data. 
Horst; Henning et al. US 20190362093 teaches: pseudonymised; replacing the original data string is generated on the one hand by random characters stored in a look-up table or replacement table; string to be replaced is equal to the size of the alphabet. 
Mandpe; Ashvini Sakharam et al. US 20170364523 teaches: phonetically similar masked data; customer sensitive data is replaced with fictitious values. 
Thomson; David et al. US 10573312 teaches: analyzed by the grammar engine  to determine which words should be capitalized and how to add punctuation, replace 
Gigliotti; Samuel S. et al. US 8700991 teaches: FIG. 4 is a flow diagram of one embodiment of a method for obfuscating content using random text.
Rui; Su Ying et al. US 20090138766 teaches: [¶ 0042] dummy document can be replaced with a random generated string with the same size, in the same character set, and even with the same number of " words", same length.
Hayashi; Daisuke et al. US 10032046 teaches: replace confidential information and other potentially confidential information with other unassociated words (e.g., jargon), random numbers, or characters. 

Response to Arguments
Applicant's arguments filed 11/2/2020 have been fully considered but they are not persuasive. 

35 USC 101 Rejection: 
The applicant argues that the amendments overcome the rejection (response page 5). 
The examiner respectfully disagrees. 
A “system” claim must necessarily include hardware. The “redaction server” does not necessarily include hardware (published specification ¶ [0026]). 
	This rejection can be overcome by amending the claim to explicitly recite an actual hardware component, such as a “processor” or “memory.”

Claim 1: 
A system, comprising:
a processor, 
a memory coupled to the processor, 
a parser … 


Carasso Reference: 
The applicant argues that Carasso fails to teach the “list of words” recited in the claims (response pages 5-6). 
The examiner respectfully disagrees. 
Carasso is a system for anonymizing event data. The user performs a search for event data such as “buttercupgames”, [Fig. 6A], the returned events all contain the searched for term. This returned result is a “list of words”. The search results are then input into the anonymization system [Figs. 9 and 21]. Carasso specifically references using the event search query as input into the anonymizer (see [¶ 0170, Fig. 21] receive search data at block 2114) (see also [¶ 0102, Fig. 9] search query or the results produced from a search query executed using an event processing system (EPS) such as system 100 of FIG. 1). 
See also: [¶ 0158; Fig. 19] (replacement using list) [¶ 0071, Fig. 6A] (search results for “buttercupgames”, the list of events returned would be a “list of words”; the search results are fed into the anonymization method of Figs. 9 and 21) [¶ 0062-63] (listing of matching events, set of results) [¶ 0065, Fig. 4] (events 416, 417, 418 are a “list”). 


Under the broadest reasonable interpretation consistent with the specification, a “list of words” can include zero items, one item, or more than one item. In software programming, a “list” can be a data structure with zero or more items in it. A “list of words” data structure can therefore hold zero or more items, even though the claim uses the plural “words”. Since a “list of words” can be a data structure with a single item, in this case a single word, Carasso reads on the claimed invention by generating a single word for redaction. 
See: https://english.stackexchange.com/questions/2370/a-list-with-only-one-item 
See also: https://www.dictionary.com/browse/list https://web.archive.org/web/20170512152507/https://www.dictionary.com/browse/list 
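For illustration only, the point that a “list” data structure may hold zero, one, or several items can be sketched as follows. This is a hypothetical sketch, not code from Carasso or from the application; the function name words_for_redaction and the sample words other than “buttercupgames” are assumptions.

```python
from typing import List


def words_for_redaction(candidates: List[str]) -> List[str]:
    """Return the non-empty candidates as a list; it may hold zero, one, or many words."""
    return [word for word in candidates if word]


# All three results are the same kind of object: a list of words.
empty = words_for_redaction([])
single = words_for_redaction(["buttercupgames"])
several = words_for_redaction(["alice", "bob"])
```

In this sketch the type of the result does not change with the number of items it holds, which is the sense in which a one-item result is still a “list”.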

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
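A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.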
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN J SMITH whose telephone number is (571)270-3825.  The examiner can normally be reached on Monday - Friday 11:00 - 7:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott Baderman can be reached on (571)272-3644.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.





/Benjamin Smith/Examiner, Art Unit 2144
Direct Phone: 571-270-3825
Direct Fax: 571-270-4825
Email: benjamin.smith@uspto.gov


/SCOTT T BADERMAN/Supervisory Patent Examiner, Art Unit 2144