DETAILED ACTION
Claims 1-7, 9, 11-17, 19, 21, 22 are pending in this action.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 25 Oct 21 has been entered.

Response to Argument
Applicants’ arguments were considered, but are unpersuasive.
Applicants’ remarks contain two arguments.  

The first, with respect to the independent claims, concerns obtaining feedback on documents based on content within the documents (presumably intended to avoid interpretations covering external metadata, such as the filename and file size that a file system may capture), and creating a new topic.  Remarks at 10.


Applicants characterize the claimed selection of content within a document as not reading on how Kleinberger handles keyword selection.  Kleinberger keeps a separate keyword file (element 46), used in a manner analogous to an inverted index.  The examiner seemed to remember this as being correct.  After a fuller review following the filing of the RCE, the examiner can confirm that applicants’ characterization is indeed how Kleinberger’s basic system works.  See Col 7 line 66-Col 8 line 16.

Unfortunately, Kleinberger also lists, as an additional feature to the basic TOC/keywording system, that analysis may also involve determining the similarity of sentences in the text to one presented in a query.  See Kleinberger starting at the end of Col 20 through Col 21; see also id. at Col 14 lines 27-29 (expressly disclosing that the additional features modify the basic analysis).  Thus, while applicants’ characterization of how Kleinberger works at the most basic level is correct, the additional features, expressly disclosed to supplement the basic model, meet the amended claim limitations.

Thus, Kleinberger, as a whole, still teaches the amendment.

The second argument introduces newly added claims 21, 22, which specify that the subset of the content within the document is the title.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 2, 7, 9, 11, 12, 17, 19, 21, 22 are rejected under 35 U.S.C. 103 as being unpatentable over Lewis et al. (US 9,110,984 B1) hereinafter Lewis, in view of Kleinberger (US 4,972,349) hereinafter Kleinberger.

NOTE: for Kleinberger, Col 14 lines 27-29 expressly disclose the ADDITIONAL FEATURES section is supplemental to the earlier embodiments, and thus the usage of these

With respect to claim 1, Lewis discloses a method (Title) comprising:
obtaining a plurality of electronic documents (Col 1 lines 50-54, problem to be solved assumes that one has obtained a large plurality of electronic documents);



determining a similarity between a first topic of the plurality of topics and a second topic of the plurality of topics (Col 6 lines 5-8, merger of sufficiently related clusters), the first topic associated with a first set of the plurality of electronic documents (Col 6 lines 1-7 determining cluster topic and appropriate label);

refining the plurality of topics based on the similarity of the first topic and the second topic (Col 6 lines 5-8, merger of sufficiently related clusters), the refining including associating the first set of the plurality of electronic documents with the second topic (Col 5 line 64-Col 6 line 10, esp. lines 7-8) and removing the first topic from the plurality of topics (implicit, if the cluster no longer exists since it has been merged, at least one topic is no longer necessary);

building a document-classifier model by applying machine learning to the plurality of electronic documents with at least one electronic document associated with each of the refined plurality of 
[…]

designating the second topic as relevant based on the personal categories of the document-classifier model (Col 13 line 60-Col 14 line 50, esp. Col 14 lines 6-13, 27-35, teaches custom selection of features used to cluster documents), […]

Of note, Lewis does not expressly articulate, as distinct steps:

obtaining a candidate electronic document; and



However, these two steps essentially amount to taking the trained classifier and classifying documents with it.  These correspond to Fig. 5 elements 560, 570.
The written description states that “an electronic document” encompasses both documents in the training set and documents exterior to the training set.  See Written Description [000100] (stating “In block 560, an electronic document may be obtained. The electronic document may or may not be included in the multiple electronic documents obtained in block 510. In some embodiments, the electronic document obtained may be the same as, or similar to, the electronic documents 138 of Fig. 1.”).

Therefore, the earlier citations to obtain a document and create the classification model also demonstrate these two steps.

Lewis does not, however, teach
receiving a search request for electronic documents related to a search term; 
[…] the second topic being associated with the search term; 
designating an additional topic as irrelevant based on the personal categories of the document-classifier model, the additional topic being associated with the search term; and 
returning an electronic document associated with the second topic as responsive to the search request;

providing the electronic document for presentation on a display based on the classification of the electronic document;
obtaining feedback relative to the electronic document, the feedback including a selection of a subset of content within the electronic document;
identifying a third topic based on the feedback;
updating the topic-extraction model to include the third topic based on the feedback; and
analyzing the plurality of electronic documents to obtain a second plurality of topics using the updating topic-extraction model including the third topic.


Kleinberger teaches
receiving a search request for electronic documents related to a search term (Fig. 9 shows a “search request”);
designating the second topic as relevant based on the personal categories of the document-classifier model (Col 13 lines 58-64, expansion by designation of TOC element; Col 14 lines 4-15, basis for expansion is user find/load text being sought), the second topic being associated with the search term (Apples are related to fruit);
designating an additional topic as irrelevant based on the personal categories of the document-classifier model (Col 13 lines 32-36, hiding or ignoring keywords; lines 40-43, hiding keywords is based on their being “irrelevant for the purpose at hand”), the additional topic being associated with the search term (Col 13 lines 41-42, hidden keywords would otherwise influence the analysis, so are implicitly associated); and

providing the electronic document for presentation on a display based on the classification of the electronic document (Id.  Fig. 4 shows TOC uses classifications to organize returned texts);
obtaining feedback relative to the electronic document (Col 13 lines 58-64, user may designate group of texts for expansion), the feedback including a selection of a subset of content within the electronic document (Col 20 line 65-Col 21 line 67, sentence similarity comparison.  Note that Col 20 line 65 mentions that it is an “identical” technique of comparison.  This is in reference to the “similar keywords” discussion in Col 18 line 35-Col 20 line 62.);
identifying a third topic based on the feedback (Background Col 4 lines 27-31, TOC organized by categories and sub-categories; lines 40-50, user may repeat analytical process on the subcategories to create further subcategories, and the process may be iterative.);
updating the topic-extraction model to include the third topic based on the feedback (id.); and
analyzing the plurality of electronic documents to obtain a second plurality of topics using the updating topic-extraction model including the third topic (id., wrt iterations).

Lewis and Kleinberger are directed to document categorization.  It would have been obvious to those of ordinary skill in the art at the time of filing to combine the teachings of the references (1) in order to provide the user control over group/cluster management with the ease of use that comes from a user interface; and (2) because Lewis expressly teaches combination of its technology with known clustering techniques, and the Kleinberger right-group/down-group analysis is an earlier-published (and therefore known) model of clustering by keywords.
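For illustration only, the refining step recited in claim 1 (associating the first set of documents with the second topic and removing the first topic) can be sketched as follows.  This is a minimal sketch and is not drawn from Lewis or Kleinberger; the function name merge_topics, the dictionary layout, and the sample topic and document names are all hypothetical.

```python
def merge_topics(topic_docs, first, second):
    """Fold topic `first` into topic `second` (hypothetical helper).

    topic_docs maps a topic label to the set of document ids
    associated with it. The input mapping is left unmodified.
    """
    # Copy so the caller's mapping is not changed.
    merged = {topic: set(docs) for topic, docs in topic_docs.items()}
    # Associate the first topic's documents with the second topic,
    # and remove the first topic from the plurality of topics.
    merged[second] |= merged.pop(first)
    return merged


# Hypothetical sample data, assumed for illustration.
topics = {"apples": {"doc1", "doc2"}, "fruit": {"doc3"}, "cars": {"doc4"}}
refined = merge_topics(topics, "apples", "fruit")
```

After the merge, the sample documents formerly under "apples" appear under "fruit", and "apples" no longer exists as a topic.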



With respect to claims 2, 12, Kleinberger teaches
the first topic is associated with a first term obtained from a first electronic document of the plurality of electronic documents (Fig. 9, Apples),
the second topic is associated with a second term obtained from a second electronic document of the plurality of electronic documents (Fig. 9, Red),
wherein the refining the plurality of topics (Col 14 lines 9-15, can further expand.  Col 10 lines 13-44 shows the expansion process is iterative.) further includes:
	obtaining a group of terms including the first term and the second term (per examples as above, expansion includes terms.  Fig. 9 shows expansion is a category-subcategory relationship, thus the expansion implies that the documents are inclusive of the first and second terms.);
	adding a third topic to the refined plurality of topics, the third topic based on the group of terms, the third topic associated with the first electronic document and the second electronic document (implicit to iterative nature of expansion of right groups.).

With respect to claims 7, 17, Lewis discloses providing the electronic document for presentation on a display based on the classification of the electronic document (Col 9 line 63-Col 10 line 6, presenting a cluster of text files to an operator for review and labeling). 

With respect to claims 9, 19, Kleinberger teaches

obtaining a confirmation relating to the second topic (Col 12 lines 60-64, user requests an expansion of a group.  This capability of selection and expansion encompasses the second group.); 
updating the topic-extraction model to include the second topic (Fig. 9 shows sample output; second topic must have been included to be displayed); and 
analyzing the plurality of electronic documents to obtain a second plurality of topics using the updated topic-extraction model including the second topic (Col 12 lines 25-66, esp. lines 33-44, right group analysis is part of recursive process where keywords may be used to dictate new subclassifications.).

With respect to claims 21, 22, Kleinberger teaches the subset of content comprises a title of the electronic document (Col 21 lines 51-56, the “sentence similarity” comparison region may be or include the title.).

Claims 3-6, 13-16 are rejected under 35 U.S.C. 103 as being unpatentable over Lewis and Kleinberger as applied to claims 1, 11, in view of Kawatani et al. (US 2004/0093557 A1) hereinafter Kawatani.
With respect to claims 3, 13, Lewis teaches 
obtaining a first term vector of numbers representing a first term associated with the first topic (Col 9 lines 38-44, determining word/phrase content frequency for two particular documents); 
comparing the first term vector to the second term vector (Col 9 lines 32-37, used to determine content); 

Lewis and Kleinberger do not teach the specifics of topic similarity based on term vectors.  However, Kawatani teaches the use of documents in a document set to evaluate whether or not common topics exist (Kawatani [0006] technique A.  Also, generally, list of topic comparisons are automated techniques).

Thus, the combination of Lewis and Kawatani (and Kleinberger) teaches determining the similarity between the first topic and the second topic based on the comparison between the first topic vector and the second topic vector (Lewis shows two frequency vectors representing topics, Kawatani technique A shows analysis to show if there are common topics) indicating that a similarity between the first topic vector and the second topic vector exceeds a threshold (threshold implicit to determine when to merge in automated merger decisions.  Example of use of setting a threshold similarity is in Kawatani [0011]).

Lewis, Kleinberger, and Kawatani are directed to document categorization.  It would have been obvious to those of ordinary skill in the art at the time of filing to combine the teachings of the references in order to automate the decision to merge groups.
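For illustration only, a threshold comparison of two term-frequency vectors of the general kind discussed above can be sketched as follows.  Cosine similarity is used here as an assumed measure; neither Lewis nor Kawatani is represented as using this particular formula.  The function names, the sample texts, and the default threshold of 0.5 are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Word-frequency vector for a text (whitespace tokenization assumed)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def should_merge(a, b, threshold=0.5):
    """Automated merge decision: similarity must exceed the threshold."""
    return cosine_similarity(a, b) > threshold


# Hypothetical sample texts, assumed for illustration.
first = term_vector("red apples and green apples")
second = term_vector("green apples and pears")
third = term_vector("engine oil change")
```

With these samples, should_merge(first, second) holds because the two vectors share most of their terms, while should_merge(first, third) does not because the vectors share no terms.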

With respect to claims 4, 14, the combination of Lewis and Kleinberger does not teach

obtaining a second topic vector representing a plurality of associations between a second plurality of terms and the second topic; 
comparing the first topic vector to the second topic vector; and 
determining the similarity between the first topic and the second topic based on the comparison between the first topic vector and the second topic vector indicating that a similarity between the first topic vector and the second topic vector exceeds a threshold.  

Kawatani teaches
obtaining a first topic vector representing a plurality of associations between a first plurality of terms and the first topic ([0006], [0009], one of the pair of documents “in the case of two documents”.  Abstract, sentences/documents are modeled by vectors.); 
obtaining a second topic vector representing a plurality of associations between a second plurality of terms and the second topic ([0006], [0009], the other of the pair of documents “in the case of two documents”); 
comparing the first topic vector to the second topic vector ([0009] comparison of documents.  Per abstract, this requires comparison of vectors); and 
determining the similarity between the first topic and the second topic based on the comparison between the first topic vector and the second topic vector indicating that a similarity between the first topic vector and the second topic vector exceeds a threshold (Fig. 4 elements 48, 49, and conditional statement).  



With respect to claims 5, 15, respectively dependent upon claims 4, 14, Kawatani teaches the determining the similarity between the first topic and the second topic further comprises: obtaining a first term vector of numbers representing a first term associated with the first topic ([0006], [0009], one of the pair of documents “in the case of two documents”.  Abstract, sentences/documents are modeled by vectors.); 
obtaining a second term vector of numbers representing a second term associated with the second topic ([0006], [0009], one of the pair of documents “in the case of two documents”.  Abstract, sentences/documents are modeled by vectors.); and 
comparing the first term vector to the second term vector, wherein the determining the similarity between the first topic and the second topic is further based on the comparison between the first term vector and the second term vector indicating that a similarity between the first term vector and the second term vector exceeds a threshold (Fig. 4 elements 48, 49, and conditional statement).

With respect to claims 6, 16, Lewis teaches selecting another electronic document of the plurality of electronic documents associated with the second topic to build the document-classifier model, the other electronic document selected based on a degree of association between the other electronic document and the second topic exceeding a threshold (As above, Col 6 lines 5-8, merger of sufficiently related clusters). 

However, note that Lewis makes the merger decision a manual one made by a human, and human decisions typically do not rely on thresholds.  Kleinberger itself does not disclose a merger, and discloses manual oversight and input into group management, so it also does not teach thresholds.  Kawatani teaches that it is known for automated decisions to be based on similarity thresholds (Kawatani [0011]).

Lewis, Kleinberger, and Kawatani are directed to document categorization.  It would have been obvious to those of ordinary skill in the art at the time of filing to combine the teachings of the references in order to automate the decision to merge groups.
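For illustration only, the selection recited in claims 6, 16 (choosing documents whose degree of association with a topic exceeds a threshold) can be sketched as follows.  This is a minimal sketch under stated assumptions: the function name, the association scores, and the threshold value are hypothetical and are not taken from any cited reference.

```python
def select_training_documents(association, topic, threshold):
    """Return ids of documents whose association with `topic` exceeds `threshold`.

    association maps a document id to a dict of topic -> score.
    A document with no score for the topic is treated as 0.0 (an assumption).
    """
    return sorted(
        doc
        for doc, scores in association.items()
        if scores.get(topic, 0.0) > threshold
    )


# Hypothetical association scores, assumed for illustration.
association = {
    "doc1": {"fruit": 0.9, "cars": 0.1},
    "doc2": {"fruit": 0.4},
    "doc3": {"fruit": 0.7},
}
selected = select_training_documents(association, "fruit", threshold=0.6)
```

Only the selected documents would then be supplied to whatever training routine builds the document-classifier model.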

Remarks
All portions of all references cited in the course of prosecution of this application, in this or any previous office action, are hereby employed in support of the current rejections for clarity and to preserve their viability as evidence upon any future appeal.

Earlier Markush suggestion withdrawn.
The examiner recalls that he had suggested claiming the list of exemplary documents listed in the instant written description in a Markush group, and then pointing out that “title” in those contexts is not synonymous with “filename” to a PHOSITA familiar with the format of most of those document types, thus avoiding potential rejection over the DOS “dir” command.



Other review of Kleinberger and the instant application, made of record
Kleinberger expressly notes that its “sentences” are actually its earlier described bag of words model, instead of POS tagging or other kinds of grammar analysis.  See Col 21 lines 44-51 (although the request may or may not be a grammatical natural language sentence, the process for analysis is analogous to the keyword list analysis).  
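For illustration only, the distinction between a bag of words comparison and grammatical analysis can be sketched as follows.  This sketch is not Kleinberger's procedure; the Jaccard overlap measure, the use of a word set rather than word counts, the function names, and the sample sentences are assumptions chosen to show that word order and sentence structure play no role in such a comparison.

```python
def bag_of_words(text):
    """Set of lowercased words with surrounding punctuation stripped."""
    words = {w.strip(".,;:!?\"'").lower() for w in text.split()}
    return words - {""}


def overlap(query, sentence):
    """Jaccard overlap of two bags of words (assumed similarity measure)."""
    a, b = bag_of_words(query), bag_of_words(sentence)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Same words in a different order: treated as identical by this comparison.
same_words = overlap("red apples are sweet", "Sweet are apples, red.")
# Only one of three distinct words is shared.
partial = overlap("red apples", "green apples")
```

Because the comparison sees only which words are present, a scrambled sentence scores the same as the original, which is the behavior that POS tagging or other grammar analysis would not exhibit.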

The examiner briefly searched the primary reference Lewis for the terms syntax, grammar, POS, and sentence, and found no hits.  Unfortunately, the same can be said of the applicants’ written description.  The examiner has not reread applicants’ written description for its document analysis section (just ran keyword searches), so if there is indeed disclosure for sentence structure analysis (as opposed to document structure analysis, which Kleinberger alludes to), that might be something that can be further discussed during an interview.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON G LIAO whose telephone number is (571)270-3775. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle can be reached on 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JASON G LIAO/
Primary Examiner, Art Unit 2156
16 Nov 21