DETAILED ACTION
Response to Amendment
1.	The amendment filed on 12/15/2020 has been entered. Claims 33-35, 47, and 71 have been amended. No new claims have been added or cancelled. Accordingly, claims 33-77 and 79-104 are pending in this office action.
 



Claim Rejections - 35 USC § 102
2.	The following is a quotation of the appropriate paragraphs of pre-AIA  35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

Claims 33-77 and 79-104 are rejected under pre-AIA 35 U.S.C. 102(e) as being anticipated by US 2006 0089924 (hereinafter Raskutti).


As for claim 33, Raskutti discloses: A computer-implemented method comprising determining a structure of a body of characters that have meaning, the structure comprising a structure of records and fields of the records or a tree structure of the body of characters, the structure of the body of characters being determined by a combination of automatic computer-based processes applied to the body of characters and interaction with a user (See paragraphs 0058-0066 and 0105-0108 note the system will use n-grams to perform feature extraction further see paragraph 
, the interaction with the user comprising displaying portions of the body of characters to the user in a display that represents at least a portion of the structure of the body of characters that has been determined by the automatic computer processes (See paragraphs 0066 and 0071-0073 note once the automatic process is complete the user can highlight words and/or phrases displayed within the user interface),
receiving from the user an indication of at least one record or at least one field of records that are part of the structure of the body of characters, and interpreting the body of characters according to the structure for use in processing or analyzing the body of data (See paragraphs 0070-0074 note the user moving messages/document/clusters is an indication of how the data should be classified and the system interprets the body of characters/message in accordance with the subject matter of the cluster).

	As for claim 34, the rejection of claim 33 is incorporated, and Raskutti further discloses: in which the automatic processes include determining possible boundaries between records based on a character or characters that appear repeatedly in the body of characters (See paragraphs 0059-0060 note the system checks for token frequencies).

	As for claim 35, the rejection of claim 33 is incorporated, and Raskutti further discloses: in which the automatic processes include determining possible boundaries between fields based on features of characters in the body of characters (See paragraphs 0061-0063).

	As for claim 36, the rejection of claim 33 is incorporated, and Raskutti further discloses: in which the display shows successive possible records of the body of characters on successive lines of the display (See paragraph 0067 note sentences are successive lines).

	As for claim 36, the rejection of claim 33 is incorporated, and Raskutti further discloses: in which the display shows possible boundaries between fields (See paragraphs 0061-0063).

	As for claim 37, the rejection of claim 33 is incorporated, and Raskutti further discloses: in which the user indications include changes in the boundaries of records or the boundaries of at least one of the fields (See paragraph 0057 note the user can change the boundaries of the cluster).

	As for claim 39, the rejection of claim 33 is incorporated, and Raskutti further discloses: in which the interaction comprises enabling the user to navigate through an entire body of characters that includes at least millions of records without requiring persistent storage of any portion of the body of characters in a structure that conforms to the records and fields of the structure (See paragraph 0002 note enormous size of documents can include billions).

As for claim 40, the rejection of claim 33 is incorporated, and Raskutti further discloses: in which the interpretation of the body of characters for use in processing or analyzing is done without requiring persistent storage of any portion of the body of characters in a structure that conforms to the records and fields of the structure (See paragraphs 0070-0074 note the user 

	
	Claim 41 is a computer-implemented method substantially corresponding to the method of claim 33 and is thus rejected for the same reasons as set forth in the rejection of claim 33.



As for claim 42, the rejection of claim 41 is incorporated, and Raskutti further discloses: in which the body of characters is stored in memory (See paragraphs 0073-0075 note a term dictionary is built based on the documents).

As for claim 43, the rejection of claim 41 is incorporated, and Raskutti further discloses: in which the derived structure is determined by a combination of automatic computer-based processes and interaction with a user of a user interface on which portions of the body of characters are displayed (See paragraphs 0070-0074 note the user moving messages/document/clusters is an indication of how the data should be classified and the system interprets the body of characters/message in accordance with the subject matter of the cluster).

As for claim 44, the rejection of claim 41 is incorporated, and Raskutti further discloses: in which the derived structure is applied to successive portions of the body of characters and 

Claim 45 is a computer-implemented method substantially corresponding to the method of claim 33 and is thus rejected for the same reasons as set forth in the rejection of claim 33.

As for claim 46, the rejection of claim 45 is incorporated, and Raskutti further discloses: in which the interpreting includes removing ambiguities and determining semantics of the body of characters (See paragraph 0058 note spelling errors create ambiguities).


Claim 47 is a method claim substantially corresponding to the method of claim 33 and is thus rejected for the same reasons as set forth in the rejection of claim 33.

As for claim 48, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the characters have been derived from binary information by applying a standard character decoding (See paragraph 0045 note the system uses zeros and positive values to show which clusters the characters belong to).

As for claim 49, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the features comprise fields (See paragraph 0097).


As for claim 51, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the features comprise data formats (See paragraph 0111).

As for claim 52, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the meaning comprises an interpretation (See paragraph 0091).

As for claim 53, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the groups of characters comprise values of fields (See paragraph 0106).

As for claim 54, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the collections of groups of characters comprise records (See paragraph 0097).

As for claim 55, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the information provided to the user comprises an identification of putative records of the body of characters (See paragraph 0109 note negative is punitive).



As for claim 57, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the information provided to the user comprises frequencies of values appearing in the body of characters (See paragraph 0059).

As for claim 58, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the information provided to the user comprises ranges of values appearing in the body of characters (See paragraph 0075).

As for claim 59, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the information provided to the user includes multiple putative records or portions of records of the body of characters (See paragraph 0109).

As for claim 60, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the information provided to the user is displayed in an interactive user interface (See paragraph 0057).

As for claim 61, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the information provided to the user is displayed without requiring that any portion of the body of characters be stored persistently (See paragraph 0057).

As for claim 62, the rejection of claim 47 is incorporated, and Raskutti further discloses: delivering the body of characters as a stream of the collections of groups of characters from memory (See paragraph 0071).

As for claim 63, the rejection of claim 47 is incorporated, and Raskutti further discloses: displaying any arbitrary portion of the body of characters as collections of groups of characters in response to actions by the user in the interactive user interface, without requiring that the portion of the body of characters be stored (See paragraph 0057).

As for claim 64, the rejection of claim 47 is incorporated, and Raskutti further discloses: enabling a user to initiate an analysis of at least a portion of the body of characters and, based on the initiation, analyzing the portion of the body of characters (See paragraph 0057).

As for claim 65, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the analysis comprises at least one of a frequency analysis of values associated with the field, a frequency analysis of frequencies of values, filtering of data, or an analysis of repetitions in the data (See paragraph 0075).

As for claim 66, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the body of data comprises data expressed in an XML or other hierarchical format (See paragraph 0065).



As for claim 68, the rejection of claim 47 is incorporated, and Raskutti further discloses: enabling the user to apply transformations to the body of characters (See paragraph 0109).

As for claim 69, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the transformations comprise regular expression matching (See paragraphs 0078-0082).

As for claim 70, the rejection of claim 47 is incorporated, and Raskutti further discloses: blending the body of characters with another body of characters and displaying the body of characters and the other body of characters to the user in a consistent display arrangement (See paragraph 0068).

As for claim 71, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the body of characters and the other body of characters each comprises information from which a key can be derived, the keys of the bodies of characters are different, the key of one of the bodies of characters has a relationship with the key of another one of the bodies of characters, the method comprising determining a relationship among the keys of the body of data and the other body of data (See paragraphs 0071 and 0088).

As for claim 72, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the number of collections of groups of characters comprises at least millions, and the 

As for claim 73, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the arbitrary portion can have any arbitrary location in the body of characters (See paragraphs 0065-0069).

As for claim 74, the rejection of claim 47 is incorporated, and Raskutti further discloses: in which the processing is applied only to portions of collections of groups of characters that are within a frame of the interactive user interface (See paragraph 0108).

Claims 75-85 are computer-implemented method claims substantially corresponding to the methods of claims 33-74 and are thus rejected for the same reasons as set forth in the rejections of those claims.

Claims 86-103 are computer-implemented method claims substantially corresponding to the methods of claims 33-74 and are thus rejected for the same reasons as set forth in the rejections of those claims.

Claim 104 is a computer-implemented method claim substantially corresponding to the methods of claims 33-74 and is thus rejected for the same reasons as set forth in the rejections of those claims.

Response to Arguments
3.	Applicant's arguments filed 12/15/2020 have been fully considered but they are not persuasive. 
Applicant argues:
	The applicant respectfully disagrees. In the applicant’s amended claim 33, a determination is made of a structure of a body of characters in which the structure comprises a structure of records and fields of the records or a tree structure of the body of characters.
The cited portions of the Raskutti reference did not describe and would not have made obvious determining a structure of a body of characters. The n-grams referred to in the cited portions related to the content of a given document and did not relate to the structure of the document.
In paragraph 0071 of the Raskutti reference, the described structure (clusters) was of a “collection of documents” and the structure captured “the appropriate conceptual relations.” The structure was related to the content relationships of the documents to one another rather than to the structure of the documents themselves. The user could “create and delete clusters, edit cluster descriptions, and move messages/clusters to other clusters” and could “(i) highlight words and/or phrases in documents to indicate the cluster topic, (ii) create sub-clusters within very large and diverse clusters (using the clusterer 12), and (iii) create a filter for a cluster/category, using the messages under that cluster as examples (by using filter module 14).” These user functions related to the content relationships among the documents and not to the structure of the documents.


Examiner responds:
	Examiner is not persuaded. Examiner is entitled to give claim limitations their broadest reasonable interpretation in light of the specification. Interpretation of Claims-Broadest Reasonable Interpretation: During patent examination, the pending claims must be ‘given the broadest reasonable interpretation consistent with the specification.’ Applicant always has the opportunity to amend the claims during prosecution, and broad interpretation by the examiner reduces the possibility that the claim, once issued, will be interpreted more broadly than is justified. In re Prater, 162 USPQ 541, 550-51 (CCPA 1969). In this case, while the clusters may be comprised of documents, the document set is tokenized and then given values representative of the tokens for similarity comparisons and repetition analysis; moreover, labels are given to the categories for the purposes of classifying/filtering the extracted information. Therefore, the tokenization results in documents that can be compared. Paragraph 0066 states that tokenization allows structured documents and previously unstructured documents to have a structure imposed on the data as a normalization technique.

Applicant argues:
	The applicant has amended claim 33 to recite that display of the portions of the body of characters “represents at least a portion of the structure of the body of characters that has been determined by the automatic computer-based processes.” Regarding the display of information to the user of the Raskutti technology, paragraphs 0071 through 0073 said only that “The editor/browser 16 provides the ability to manually alter the automatically created groups for greater coherence and usability. It provides a visual user interface to allow browsing of the cluster hierarchy and a number of editing functions.” What was displayed apparently did 

Examiner responds:
	Examiner is not persuaded. Paragraph 0072 states that a training model can learn to automatically filter documents based on previous selections.  These filters are generated automatically based on what the machine has learned.
















Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ELIYAH STONE HARPER whose telephone number is (571)272-0759.  The examiner can normally be reached on Monday-Friday 10:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone can be reached on (571)270-3750.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.





/Eliyah S. Harper/
Primary Examiner, Art Unit 2166
March 13, 2021