DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.  A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
	A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
	The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
	Claims 1-20 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-20 of US Patent 11,093,557.  Although the conflicting claims are not identical, they are not patentably distinct from each other for the following reasons:
	The claims of US Patent 11,093,557 contain every element of claims 1-20 of the instant application 17/403562 and thus anticipate or render obvious the claims of the instant application. The claims of the instant application 17/403562 are therefore not patentably distinct from the earlier patent claims and as such are unpatentable under obviousness-type double patenting. A later patent/application claim is not patentably distinct from an earlier claim if the later claim is anticipated by the earlier claim.
“A later patent claim is not patentably distinct from an earlier patent claim if the later claim is obvious over, or anticipated by, the earlier claim. In re Longi, 759 F.2d at 896, 225 USPQ at 651 (affirming a holding of obviousness-type double patenting because the claims at issue were obvious over claims in four prior art patents); In re Berg, 140 F.3d at 1437, 46 USPQ2d at 1233 (Fed. Cir. 1998) (affirming a holding of obviousness-type double patenting where a patent application claim to a genus is anticipated by a patent claim to a species within that genus).” ELI LILLY AND COMPANY v. BARR LABORATORIES, INC., United States Court of Appeals for the Federal Circuit, ON PETITION FOR REHEARING EN BANC (DECIDED: May 30, 2001).
	“The dependent claims are anticipated or rendered obvious by the species of the patented invention. Cf., Titanium Metals Corp. v. Banner, 778 F.2d 775, 227 USPQ 773 (Fed. Cir. 1985) (holding that an earlier species disclosure in the prior art defeats any generic claim). This court's predecessor has held that, without a terminal disclaimer, the species claims preclude issuance of the generic application. In re Van Ornum, 686 F.2d 937, 944, 214 USPQ 761, 767 (CCPA 1982); Schneller, 397 F.2d at 354. Accordingly, absent a terminal disclaimer, the dependent claims were properly rejected under the doctrine of obviousness-type double patenting.” (In re Goodman (CA FC) 29 USPQ2d 2010 (12/3/1993)).

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 9 and 19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claims 9 and 19 recite “a second cluster that is approximately the same size as the first cluster.”  This limitation is not supported by the specification as originally filed.  The specification provides no details explaining how one of ordinary skill in the art would implement such a limitation.


The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites a plurality of unrelated limitations with improper antecedent basis, which adds a great deal of confusion to the claim.  Specifically:
- “keyword phrases and topics extracted”;
- “each extracted keyword phrase”;
- “each of the extracted keyword phrases”;
- “a number of extracted keyword phrases”;
- “the extracted keyword phrase”;
- “an associated extracted keyword phrase” (which is never used, thus its purpose in the claim is unclear and confusing);
- “each topic”;
- “that respective topic”;
- “the topic”.
It is not clear whether the recited keyword phrase(s) and the plurality of recited topic(s) are meant to be the same or different entities.  For example, “each extracted keyword phrase” is unrelated to the initially recited “the extracted keyword phrases” and is unrelated to “each of the extracted keyword phrases.”  Further, “each topic” is unrelated to the initially recited “topics extracted.”  Further, in the limitation “associated with that respective topic”, it is not clear which topic “that respective topic” refers to (i.e., “topic”, “the topic”, or “each topic”).
	Claim 1 recites the limitation “the topic”.  There is insufficient antecedent basis for this limitation in the claim.  It is not clear if “the topic” refers to “each topic” or “that respective topic.”
	Claim 1 further recites - "the respective company".  There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “each of the extracted keyword phrases.”  It is not clear if “the extracted keyword phrases” refers to “each extracted keyword phrase” or “a number of extracted keyword phrases.”
Claim 1 further recites “scores relating each extracted keyword phrase to the respective company”.  It is not clear whether “each extracted keyword phrase” refers to the previously recited “each extracted keyword phrase”, to “each of the extracted keyword phrases”, or to another, new and unrelated keyword phrase.
Claim 1 further recites the limitation “the extracted keyword phrase” (singular).  There is insufficient antecedent basis for this limitation in the claim.  For the purposes of examination, it is assumed that “the extracted keyword phrases” (plural) was meant.
Claim 1 further recites - “the plurality of companies.”  There is insufficient antecedent basis for this limitation in the claim.
Claim 1 further recites “clustering technique to the extracted keyword phrases for each respective company”.  It is not clear if “each respective company” refers to the respective company recited initially or to a new, unrelated respective company.
Claim 1 further recites “the clusters of companies”.  It is not clear if “companies” refers to “the plurality of companies” or to some other, unrelated companies.
Further, it is not clear what the “respective business tags” are recited with respect to, and thus they are interpreted to be any business tags.
Further, the claim recites “generating a count for keyword phrases and topics extracted”.  However, the “count” is never used and the purpose of generating the count is unclear.  Furthermore, it is not clear how the count differs from the “term-frequency (TF)”, which is essentially another count.
Further, “the plurality of topic spaces” should be corrected to “the at least one of the plurality of topic spaces” for consistency and to avoid further confusion.
As a further note, the claim determines a plurality of topic spaces based on the TF-IDF vectors.  However, the topic spaces are never actually used or output (as they are optional in the subsequent cluster formation and output).  Thus, it is not clear what the purpose of determining the TF-IDF vectors and the plurality of topic spaces is when such topic spaces are never actually used.
Independent claim 13 recites analogous deficiencies and is rejected based on the same reasoning.  Due to the enormous amount of confusion in the independent claims, the dependent claims have not been checked for proper antecedent basis and should be further corrected accordingly.
Due to the 35 USC § 112 rejections, the claims have been treated on their merits as best understood by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over applicant’s admitted prior art Mishor et al. (US 2012/0203584) in view of Love et al. (US 9,740,368).

Regarding claim 1, Mishor teaches a computer-implemented method, wherein one or more computing devices comprising storage and a processor are programmed to perform steps comprising: 
generating a count ([0024]-[0025], [0037], [0043], [0092] “count the frequency of appearance of terms in the analyzed information”) for keyword phrases and topics extracted from a corpus of documents, the topics being associated with the extracted keyword phrases or a portion of the extracted keyword phrases ([0018], [0021], [0038], [0040]-[0041]); 
determining document frequencies (DF) for each extracted keyword phrase across the corpus of documents ([0024]-[0025], [0093]); 
applying a term-frequency (TF) - inverse-document-frequency (IDF) (TF-IDF) transformation to each of the extracted keyword phrases to generate a respective plurality of TF-IDF vectors  ([0101], [0106], [0135]); 
determining a strength of each topic ([0071]) based on a number of extracted keyword phrases associated with that respective topic ([0092]-[0093]);

generating relevance scores ([0096] see “relevance parameter”) relating each extracted keyword phrase to the respective company based on a strength of each of the extracted keyword phrases (“have high degree of semantic similarity to terms extracted”; [0056]-[0057] “compute semantic similarities between a textual term included in a cluster and a new term”; “a similarity or relatedness measure, score or value”; [0070]-[0071] “a score, weight or relevance”; “cumulative significance values or scores of the related textual terms”), 
applying a representation learning technique ([0028], [0030], [0032], [0083]) to the plurality of TF-IDF vectors ([0101], [0106], [0135]) and the relevance scores ([0066]-[0065], [0074]) to generalize each respective company into at least one of a plurality of topic spaces ([0044]-[0043], [0050]); 
segmenting the plurality of companies into clusters by applying a clustering technique to the extracted keyword phrases for each respective company or to the plurality of topic spaces  ([0044], [0074]); and outputting the clusters of companies with respective business tags  ([0059], [0070]).
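
For orientation only, the count, document-frequency (DF), and TF-IDF terminology used in the limitations mapped above can be illustrated with a minimal Python sketch. The sketch is not taken from Mishor, Love, or the instant application; the function name `tf_idf_vectors`, the toy corpus, and the unsmoothed weighting idf = log(N / df) are assumptions chosen purely for illustration.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Return (vocab, vectors): one TF-IDF vector per document.

    docs: list of documents, each a list of already-extracted keyword phrases.
    Assumed weighting: tf = raw count in the document, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each phrase.
    df = Counter(phrase for doc in docs for phrase in set(doc))
    vocab = sorted(df)
    vectors = []
    for doc in docs:
        counts = Counter(doc)  # per-document count of each phrase
        vectors.append([counts[p] * math.log(n_docs / df[p]) for p in vocab])
    return vocab, vectors

# Hypothetical two-document corpus of extracted keyword phrases.
docs = [
    ["cloud storage", "data backup", "cloud storage"],
    ["data backup", "payment processing"],
]
vocab, vectors = tf_idf_vectors(docs)
```

Under this assumed weighting, a phrase that appears in every document (here "data backup") receives a weight of zero, while a phrase concentrated in one document is weighted up.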

Mishor does not explicitly teach, however Love discloses determining a strength of each topic based on a number of extracted keyword phrases associated with that respective topic (C17L1-25); 
determining an edge weight based on a linkage of the topic with an associated extracted keyword phrase (C5L37-40); generate relevance scores relating each extracted keyword phrase to the respective company based on the strength of each topic (C9L47-53), and the edge weight for each topic, the strength of each of the extracted keyword phrases being equal to a TF-IDF vector associated with the extracted keyword phrase (C8L1-15, C16L55-67- C17L1-11, C19L10-31). 
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Mishor to include determining a strength of each topic and determining an edge weight as disclosed by Love.  Doing so would yield results closer to the user's intent (Love [0009]).

Regarding claim 13, Mishor teaches a system, comprising: a processor configured to: generate a count for keyword phrases and topics extracted from a corpus of documents, the topics being associated with the extracted keyword phrases or a portion of the extracted keyword phrases; determine document frequencies (DF) for each extracted keyword phrase across the corpus of documents; apply a term-frequency (TF) - inverse-document-frequency (IDF) (TF-IDF) transformation to each of the extracted keyword phrases to generate a respective plurality of TF-IDF vectors; determine a strength of each topic based on a number of extracted keyword phrases associated with that respective topic; determine an edge weight based on a linkage of the topic with an associated extracted keyword phrase; generate relevance scores relating each extracted keyword phrase to the respective company based on a strength of each of the extracted keyword phrases, the strength of each topic, and the edge weight for each topic, the strength of each of the extracted keyword phrases being equal to a TF-IDF vector associated with the extracted keyword phrase; apply a representation learning technique to the plurality of TF-IDF vectors and the relevance scores to generalize each respective company into at least one of a plurality of topic spaces; create segments of the plurality of companies into clusters by applying a clustering technique to the extracted keyword phrases for each respective company or to the plurality of topic spaces; and an output configured to transmit the clusters of companies with respective business tags to another computing device, network, or system.
Claim 13 recites substantially the same limitations as claim 1, and is rejected for substantially the same reasons.

Regarding claims 2 and 14, Mishor as modified teaches the method and the system further comprising generating similarity between two of the extracted keyword phrases based on a distance metric between the two extracted keyword phrases and determining the strength of each topic based on the number of extracted keyword phrases and the similarity between the two extracted keyword phrases (Love C6L52-67, C12L40-49).

Regarding claim 3, Mishor as modified teaches the method of claim 2, wherein the distance metric includes a distance between the two extracted keyword phrases as either cosine distance or Euclidian distance (Mishor [0101], Love C16L30-32, C19L30-31, 53-54).
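
For reference, the two distance metrics named in this limitation can be sketched as follows. This is a generic illustration and is not drawn from the cited references or the application; the function names and the choice to represent each keyword phrase as a plain list of numbers are assumptions.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """1 minus the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)
```

Cosine distance ignores vector magnitude (two vectors pointing the same way have distance near zero), whereas Euclidean distance does not.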

Regarding claims 4 and 15, Mishor as modified teaches the method and the system, further comprising generating similarity between two of the extracted keyword phrases based on a positive point-wise mutual information (PPMI) matrix of the two extracted keyword phrases to context words (Mishor [0041]) and determining the strength of each topic based on the number of extracted keyword phrases and the similarity between the two extracted keyword phrases  (Mishor [0071], [0100] “use Statistical corpus-based methods, e.g., Pointwise Mutual Information (PMI)”, [0101], Love C18L1-10).
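
For reference, positive point-wise mutual information (PPMI) is commonly computed as max(0, log2(P(w,c) / (P(w) P(c)))) from keyword-to-context co-occurrence counts. The following minimal sketch shows that common formulation; it is not taken from Mishor, Love, or the application, and the function name `ppmi` and the toy counts are assumptions.

```python
import math
from collections import Counter

def ppmi(pair_counts):
    """Map each (word, context) pair to its PPMI score.

    pair_counts: dict mapping (word, context) -> co-occurrence count.
    """
    total = sum(pair_counts.values())
    word_totals = Counter()
    ctx_totals = Counter()
    for (w, c), n in pair_counts.items():
        word_totals[w] += n
        ctx_totals[c] += n
    scores = {}
    for (w, c), n in pair_counts.items():
        if n == 0:
            scores[(w, c)] = 0.0  # log of zero is undefined; treat as no association
            continue
        # P(w,c) / (P(w) P(c)) simplifies to n * total / (row total * column total).
        pmi = math.log2(n * total / (word_totals[w] * ctx_totals[c]))
        scores[(w, c)] = max(0.0, pmi)  # negative associations are clipped to zero
    return scores

# Hypothetical counts of keyword phrases against context words.
counts = {
    ("cloud storage", "server"): 4,
    ("cloud storage", "invoice"): 1,
    ("payment processing", "server"): 1,
    ("payment processing", "invoice"): 4,
}
scores = ppmi(counts)
```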

Regarding claims 5 and 16, Mishor as modified teaches the method and the system, further comprising segregating the context words by regions of distances away from a central keyword phrase (Love C6L58-62, C7L25-38, C8L63-67, C10L2-4).
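
One possible reading of "segregating the context words by regions of distances away from a central keyword phrase" is grouping context tokens into distance bands around the keyword. The sketch below illustrates that reading only; it is not asserted to be the method of the application or of any cited reference, and the function name, the band boundaries, and the token list are hypothetical.

```python
def context_by_region(tokens, center, boundaries=(2, 5)):
    """Group context tokens into distance bands around tokens[center].

    boundaries=(2, 5) yields two regions: distance 1-2 and distance 3-5.
    Tokens farther away than the last boundary are ignored.
    """
    regions = [[] for _ in boundaries]
    for j, tok in enumerate(tokens):
        dist = abs(j - center)
        if dist == 0:
            continue  # skip the central keyword itself
        for r, bound in enumerate(boundaries):
            if dist <= bound:
                regions[r].append(tok)
                break
    return regions

# Hypothetical token sequence with the keyword at index 3.
tokens = ["a", "b", "c", "KEY", "d", "e", "f", "g"]
regions = context_by_region(tokens, center=3, boundaries=(1, 3))
```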

Regarding claims 6 and 17, Mishor as modified teaches the method and the system, further comprising generating a co-occurrence matrix (Mishor [0043], [0083], Love C19L34-41, 52-54) of the two extracted keyword phrases to context words by counting the occurrences of each pair of (w, c), wherein w is the extracted keyword phrase and c is a context word within a specific zone  (Mishor [0100] “use Statistical corpus-based methods, e.g., Pointwise Mutual Information (PMI)”, [0114], [0116]).
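
The counting of (w, c) pairs recited in this limitation can be illustrated with a short sketch. It assumes, for illustration only, that the "specific zone" is a symmetric window of token positions around the keyword and that each keyword phrase has been reduced to a single token; neither assumption comes from the cited references or the application, and the function name is hypothetical.

```python
from collections import Counter

def cooccurrence_counts(tokens, keywords, window=2):
    """Count (w, c) pairs: w is a keyword, c a context token within `window` positions of w."""
    counts = Counter()
    for i, w in enumerate(tokens):
        if w not in keywords:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(w, tokens[j])] += 1
    return counts

# Hypothetical token sequence; "cloud" stands in for an extracted keyword phrase.
tokens = ["cloud", "storage", "and", "cloud", "storage"]
counts = cooccurrence_counts(tokens, keywords={"cloud"}, window=1)
```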

Regarding claim 7, Mishor as modified teaches the method of claim 1, further comprising segmenting the plurality of companies into a first cluster and a second, overlapping cluster  (Mishor [0044]-[0045], [0048], [0056], [0060], [0062], [0065] “clustering module or unit may be to maximize the similarity between textual terms in the same cluster and minimize the similarity between textual terms in different clusters”, [0066]-[0067], Love C4L56-60, C5L43-50).

Regarding claims 8 and 18, Mishor as modified teaches the method and the system further comprising segmenting the plurality of companies into a first cluster and a second, non-overlapping cluster (Mishor [0044]-[0045], [0048], [0056], [0060], [0062], [0065] “clustering module or unit may be to maximize the similarity between textual terms in the same cluster and minimize the similarity between textual terms in different clusters”, [0066]-[0067]).

Regarding claims 9 and 19, Mishor as modified teaches the method and the system, further comprising segmenting the plurality of companies into a first cluster and a second cluster that is larger than the first cluster (Mishor [0068], [0080]-[0081], also see merging and splitting clusters [0131], Love C4L13-19).

Regarding claim 10, Mishor as modified teaches the method of claim 1, further comprising segmenting the plurality of companies into a first cluster and a second cluster that is approximately the same size as the first cluster (Love L34-24-25, 45-49).

Regarding claim 11, Mishor as modified teaches the method of claim 1, further comprising segmenting the plurality of companies into a first cluster and a second cluster, and extracting keywords for each of the first cluster and the second cluster (Mishor [0044]-[0045], [0048], [0056], [0060], [0062], [0065], [0066]-[0067], Love C5L37-47, C6L53-67, C9L40-52, C30L34-67).

Regarding claim 12, Mishor as modified teaches the method of claim 11, further comprising generating the relevance scores relating to the first cluster and the second cluster based on a strength of each of the extracted keywords for the first cluster and the second cluster, respectively, and outputting the relevance scores relating to the first cluster and the second cluster (Mishor [0044]-[0045], [0048], [0056], [0060], [0062], [0065], [0066]-[0067], [0068], [0080]-[0081], Love C5L37-47, C6L53-67, C9L40-52, C30L34-67).

Regarding claim 20, Mishor as modified teaches the system of claim 13, wherein: the processor is further configured to: create segments of the plurality of companies into a first cluster and a second cluster, extract keywords for each of the first cluster and the second cluster, and generate the relevance scores relating to the first cluster and the second cluster based on a strength of each of the extracted keywords for the first cluster and the second cluster, respectively, and the output is further configured to output the relevance scores relating to the first cluster and the second cluster (Mishor [0044]-[0045], [0048], [0056], [0060], [0062], [0065], [0066]-[0067], [0068], [0080]-[0081], Love C5L37-47, C6L53-67, C9L40-52, C30L34-67).

Claims 1-3 and 13-14 are alternatively rejected under 35 U.S.C. 103 as being unpatentable over applicant’s admitted prior art Mishor et al. (US 2012/0203584) in view of Agarwal (US 2016/0140231).

Regarding Claim 1, Mishor teaches claim 1 as indicated above.
Mishor does not explicitly teach, however Agarwal discloses determining an edge weight based on a linkage of the topic with an associated extracted keyword phrase ([0047]-[0048]); generate relevance scores relating each extracted keyword phrase to the respective company based on the strength of each topic, and the edge weight for each topic, the strength of each of the extracted keyword phrases being equal to a TF-IDF vector associated with the extracted keyword phrase ([0054], [0074]-[0075]). 
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Mishor to include determining a strength of each topic and determining an edge weight as disclosed by Agarwal.  Doing so would provide efficient sharing of information  (Agarwal [0009]).

Claim 13 recites substantially the same limitations as claim 1, and is rejected for substantially the same reasons.

Regarding claims 2 and 14, Mishor as modified teaches the method and the system further comprising generating similarity between two of the extracted keyword phrases based on a distance metric between the two extracted keyword phrases and determining the strength of each topic based on the number of extracted keyword phrases and the similarity between the two extracted keyword phrases (Agarwal [0054], [0074]).

Regarding claim 3, Mishor as modified teaches the method of claim 2, wherein the distance metric includes a distance between the two extracted keyword phrases as either cosine distance or Euclidian distance (Agarwal [0051]).

Claims 5-6 and 16-17 are alternatively rejected under 35 U.S.C. 103 as being unpatentable over Mishor as modified and in further view of Shmueli (US 2017/0109344).

Regarding claims 5 and 16, Mishor as modified does not explicitly teach, but Shmueli discloses the method and the system, further comprising segregating the context words by regions of distances away from a central keyword phrase ([0032]-[0035], [0037], [0044]-[0046]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Mishor as modified to include segregating the context words by regions of distances away from a central keyword phrase as disclosed by Shmueli.  Doing so would help determine "senses" in which a word appears and associating the words with particular occurrences in the input sequence (Shmueli [0001]).

Regarding claims 6 and 17, Mishor as modified teaches the method and the system, further comprising generating a co-occurrence matrix (Mishor [0043], [0083]) of the two extracted keyword phrases to context words by counting the occurrences of each pair of (w, c), wherein w is the extracted keyword phrase and c is a context word within a specific zone  (Mishor [0100] “use Statistical corpus-based methods, e.g., Pointwise Mutual Information (PMI)”, [0114], [0116], Shmueli [0034]-[0035], [0045]).

Claim 10 is alternatively rejected under 35 U.S.C. 103 as being unpatentable over Mishor as modified and in further view of Houle (US 2004/0139067).

Regarding claim 10, if Mishor as modified does not explicitly teach, Houle discloses the method of claim 1, further comprising segmenting the plurality of companies into a first cluster and a second cluster that is approximately the same size as the first cluster (Houle [0155]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Mishor as modified to include clusters that are approximately the same size as disclosed by Houle.  Doing so would allow users to assess the degree of cohesion and prominence of clusters at a glance (Houle [0036]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is indicated on PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to POLINA G PEACH whose telephone number is (571)270-7646. The examiner can normally be reached Monday-Friday, 9:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner can be reached on 571-270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/POLINA G PEACH/
Primary Examiner, Art Unit 2165
December 19, 2022