DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
The application was received on 10/29/20.  Claims 1-20 are pending in the application.  
Claims 4, 5, 16, and 20 are rejected under 35 U.S.C. 112.
Claim(s) 1-6 and 11-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Murray (US 2011/0225159).
Claims 7-9 are rejected under 35 U.S.C. 103 as being unpatentable over Murray, and further in view of Tacchi (US 9715495).
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Murray, and further in view of Gaussier et al. (US 2003/0101187).

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4, 5, 16, and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and 
Claims 4, 16, and 20 recite the limitation "the list of words" in the "counted" limitation.  There is insufficient antecedent basis for this limitation in the claims.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-6 and 11-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Murray (US 2011/0225159).

With respect to claim 1, Murray teaches a computer-implemented method for identifying latent themes in textual data, the method comprising: 
receiving a plurality of documents (Murray, pa 0060, extract text documents from the domain corpus); 
preprocessing document text for each document among the plurality of documents (Murray, pa 0060, kPOOL parses the documents 15 into terms 17, by performing part-of-speech (POS) tagging, adaptive stop word removal, porter stemming, and term reduction, to the terms in the documents); 

determining one or more document clusters among the plurality of preprocessed documents based on the calculated similarity of each pair of documents among the plurality of preprocessed documents (Murray, pa 0100, The pairing process will result in clusters of documents comprising no less than two documents); and 
extracting one or more topics in each document cluster among the determined one or more document clusters (Murray, pa 0057, kPOOL may correlate the relevant documents 31 to the hierarchy of clustered documents 11, to retrieve information regarding the topics the relevant documents fall into.).
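
For illustration only, the preprocessing limitation mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is neither Murray's kPOOL implementation nor the method disclosed in the application: the stop-word list and the `naive_stem` and `preprocess` helpers are hypothetical, and the crude suffix stripping merely stands in for the Porter stemming that Murray describes at pa 0060.

```python
import re

# Hypothetical stop-word list, chosen for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "are"}


def naive_stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, lowercase, drop stop words, and stem one document's text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]
```

Under these assumptions, `preprocess("The cats are chasing the dogs")` drops the stop words and reduces the remaining tokens to their stripped forms.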

With respect to claim 2, Murray teaches the computer-implemented method of claim 1, further comprising: reporting the extracted one or more topics (Murray, pa 0057, A fisheye view 33 displays the topics that correlate to the relevant documents 31.).

With respect to claim 3, Murray teaches the computer-implemented method of claim 1, wherein preprocessing the document text for each document comprises: performing one or more of: lemmatizing the document text for each document, stemming the document text for each document, tokenizing the document text for each 

With respect to claim 4, Murray teaches the computer-implemented method of claim 1, wherein calculating a similarity of each pair of preprocessed documents comprises: counting instances of each word of the list of words appearing in the preprocessed document text for each document; and calculating the similarity of each pair of documents based on the counted instances (Murray, pa 0087, kPOOL measures the vector, or "node" similarities of documents in the reformed term-to-document matrix 23, shown in FIG. 8 and removes non-outliers.).

With respect to claim 5, Murray teaches the computer-implemented method of claim 4, wherein the similarity of each pair of documents is calculated as a cosine similarity (Murray, pa 0089, kPOOL measures the cosine similarity between each document 15 in the document pool 123.).
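
For illustration only, the counting and cosine-similarity limitations of claims 4 and 5 can be sketched as below. This is a minimal sketch, not Murray's term-to-document matrix processing: the function names are hypothetical, and the vocabulary is assumed to be the set of all tokens across the documents, since the claim's "list of words" has no antecedent (see the rejection under 35 U.S.C. 112 above).

```python
import math
from collections import Counter


def count_vector(tokens, vocabulary):
    """Count how many times each vocabulary word appears in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarities(docs):
    """Map each pair (i, j) with i < j to the cosine similarity of docs i and j."""
    # Assumption: the word list is every distinct token in the corpus.
    vocabulary = sorted({token for doc in docs for token in doc})
    vectors = [count_vector(doc, vocabulary) for doc in docs]
    return {
        (i, j): cosine_similarity(vectors[i], vectors[j])
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
    }
```

Two documents with identical word counts score approximately 1, and two documents that share no words score 0.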

With respect to claim 6, Murray teaches the computer-implemented method of claim 1, wherein determining one or more document clusters comprises iteratively performing until a desired number of document clusters is determined (Murray, pa 0010, 
searching the calculated similarities of each pair of documents for a maximum similarity between pairs of documents (Murray, pa 0087, similar nodes are paired); 
averaging the similarities of the pair of documents having the maximum similarity into a new document cluster (Murray, pa 0100, The pairing process will result in clusters of documents comprising no less than two documents); 
remove the individual documents of the pair of documents having the maximum similarity from the plurality of documents (Murray, pa 0097, the matching pair 157 is removed 159 from the document pool 123. Removing 159 the matching pair 157 from the document pool 123 assures that neither document that comprises the pair, will be matched with another pair at this point in the clustering process.); 
adding the new document cluster to the plurality of documents; and calculating a similarity of the new cluster and each document among the plurality of preprocessed documents (Murray, pa 0102, Thus, the gradient equation (6) listed above, allows kPOOL to determine the effect of each document within the cluster on the similarity measure of the entire cluster. The gradient equation (6) is applied to each cluster, and each document within the cluster.).
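
For illustration only, the iterative steps recited in claim 6 can be sketched as below. This is a minimal sketch of one possible reading and is neither Murray's pairing process nor its gradient equation (6): the function and parameter names are hypothetical, and the similarity of a new cluster to each remaining item is assumed to be the mean of the merged pair's similarities to that item.

```python
def cluster_documents(similarities, n_docs, target_clusters):
    """Merge the most similar pair repeatedly until target_clusters remain.

    similarities maps (i, j) with i < j to a similarity score for every pair
    of the n_docs documents. Returns the clusters as sorted lists of indices.
    """
    def pair(a, b):
        return (a, b) if a < b else (b, a)

    sims = dict(similarities)
    clusters = {i: [i] for i in range(n_docs)}
    next_id = n_docs  # identifier assigned to the next merged cluster

    while len(clusters) > target_clusters and sims:
        # Search the similarities for the pair with the maximum similarity.
        a, b = max(sims, key=sims.get)
        # Remove the two members of that pair from the pool.
        merged = clusters.pop(a) + clusters.pop(b)
        # Assumption: average the pair's similarities to each remaining item.
        new_sims = {
            pair(next_id, c): (sims[pair(a, c)] + sims[pair(b, c)]) / 2
            for c in clusters
        }
        # Drop stale entries that still mention either removed member.
        sims = {k: v for k, v in sims.items() if a not in k and b not in k}
        sims.update(new_sims)
        # Add the new cluster back to the pool.
        clusters[next_id] = merged
        next_id += 1

    return sorted(sorted(members) for members in clusters.values())
```

Here `target_clusters` plays the role of the "desired number of document clusters" in the claim; a caller could equally supply it from user input.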

With respect to claim 11, Murray teaches the computer-implemented method of claim 1, further comprising: identifying one or more outlier documents among the plurality of documents and one or more topics in the one or more outlier documents 

With respect to claims 12-16, the limitations are essentially the same as those in claims 1, 3, 4, and 6, and thus are rejected for the same reasons.

With respect to claims 17-20, the limitations are essentially the same as those in claims 1, 3, 4, and 6, and thus are rejected for the same reasons.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 7-9 are rejected under 35 U.S.C. 103 as being unpatentable over Murray, and further in view of Tacchi (US 9715495).

With respect to claim 7, Murray in view of Tacchi teaches the computer-implemented method of claim 1, as discussed above.  Murray doesn't expressly discuss wherein extracting one or more topics in each document cluster comprises performing Latent Dirichlet Allocation (LDA) within each cluster.

It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Murray with the teachings of Tacchi because it provides a user-friendly mechanism to adjust the relationship graph such that certain topics are not over or under emphasized (Tacchi, Col. 2 Li. 46-62).

With respect to claim 8, Murray in view of Tacchi teaches the computer-implemented method of claim 7, wherein extracting one or more topics in each document cluster further comprises: counting instances of each topic word identified by the LDA appearing in the preprocessed document text for each document and each cluster; and calculating the similarity of each document and each cluster based on the counted instances of each topic word (Tacchi, Col. 7 Li. 11-13, some embodiments may execute a form of Latent Dirichlet Allocation. In some cases, a number of topics to be ascertained may be supplied, e.g., by a user indicating that 2, 3, 5, or 50 topics are to be ascertained).

With respect to claim 9, Murray in view of Tacchi teaches the computer-implemented method of claim 1, as discussed above.  Murray doesn't expressly discuss 
Tacchi teaches wherein a number of extracted topics in each document cluster is a user-specified maximum number of topics (Tacchi, Col. 7 Li. 11-13, some embodiments may execute a form of Latent Dirichlet Allocation. In some cases, a number of topics to be ascertained may be supplied, e.g., by a user indicating that 2, 3, 5, or 50 topics are to be ascertained).
It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Murray with the teachings of Tacchi because it provides a user-friendly mechanism to adjust the relationship graph such that certain topics are not over or under emphasized (Tacchi, Col. 2 Li. 46-62).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Murray, and further in view of Gaussier et al. (US 2003/0101187).

With respect to claim 10, Murray in view of Gaussier teaches the computer-implemented method of claim 1, as discussed above.  Murray doesn't expressly discuss wherein a number of determined clusters is a user-specified maximum number of clusters.
Gaussier teaches wherein a number of determined clusters is a user-specified maximum number of clusters (Gaussier, pa 0029, the parameter β may be a value that controls the complexity of an objective function to optimize through the number of clusters and the computation of the parameter value itself. The value of β … may also be provided by a user through an input/output device such as keyboard 110.).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRITTANY N ALLEN whose telephone number is (571)270-3566. The examiner can normally be reached M-F 9 am - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 
/BRITTANY N ALLEN/           Primary Examiner, Art Unit 2169