Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for the amendments was provided by Stuart Shapiro (Reg. # 40,169) on November 17, 2021.

The application has been amended as follows:

1.	(Currently amended) A computer-implemented method for summarizing a plurality of texts, the method comprising:			
	generating a vector space based on a first set of vectors, wherein each vector includes one or more feature scores equal to or greater than a predefined value and determined from a frequency of tokens of the plurality of texts;
	executing non-hierarchical clustering using the vector space to generate a first plurality of clusters;		
	generating a second set of vectors including quantities of characters in tokens of first representative texts, wherein each first representative text is a selected text from a corresponding cluster of the first plurality of clusters;

summarizing the plurality of texts by determining a second representative text for each of the clusters included in the second plurality of clusters, wherein each second representative text is a selected text from a corresponding cluster of the second plurality of clusters, and wherein the second representative text for each of the clusters included in the second plurality of clusters forms a summary of the plurality of texts; and
displaying, on a display, a visualization of the second plurality of clusters, the summary of the plurality of texts, and an element for dynamically changing a threshold value applied to the hierarchical clustering that alters a number of clusters in the second plurality of clusters, wherein when the element dynamically changes the threshold value, the visualization is automatically changed to reflect the altered number of clusters in the second plurality of clusters.

2.	(Original) The computer-implemented method of claim 1, wherein executing the hierarchical clustering generates a tree diagram.

3.	(Currently amended) The computer-implemented method of claim 2, wherein determining the second representative text for each of the clusters in the second plurality of clusters further comprises:
	applying [[a]] the threshold value to the tree diagram. 

4.	(Canceled)	



6.	(Currently amended) The computer-implemented method of claim 1, wherein 
the threshold value includes one of the number of clusters in the second plurality of clusters and a distance between the clusters in the second plurality of clusters.

7.	(Currently amended) The computer-implemented method of claim 6, wherein, as a result of the execution of the hierarchical clustering, the display further displays one or more of: a tree diagram; each second representative text in the summary of the plurality of texts.

8.	(Original)	The computer-implemented method of claim 1, further comprising:
	prior to the execution of the non-hierarchical clustering, determining a number of clusters that will be included in the first plurality of clusters when the non-hierarchical clustering generates the first plurality of clusters. 

9.	(Original)	The computer-implemented method of claim 8, wherein the number of clusters that will be generated in the non-hierarchical clustering is determined according to a number of texts included in the plurality of texts.

10.	(Previously presented)	  The computer-implemented method of claim 9, wherein arrays are generated based on the quantities of characters in the tokens of the first representative texts to generate the second set of vectors when the number of clusters that will be generated in the non-hierarchical clustering is equal to or larger than a predefined number. 

11.	(Previously presented)	  The computer-implemented method of claim 1, wherein arrays are generated based on the quantities of characters in the tokens of the first representative texts to generate the second set of vectors when the first plurality of clusters includes a number of clusters equal to or larger than a predefined number.

12.	(Previously presented) The computer-implemented method of claim 1, further comprising:
	sending an alert to a user or displaying an alert on a display when the second representative text for one or more of the clusters in the second plurality of clusters has a predefined term.

13.	(Previously presented)  The computer-implemented method of claim 1, wherein arrays are generated based on the quantities of characters in the tokens of the first representative texts to generate the second set of vectors, and the aligning a dimension of each of the vectors of the second set of vectors comprises:
	truncating one or more array elements in each array of the arrays by a predefined number of array elements from a beginning of the array; or 
	padding a tail of each array of the arrays so that a number of digits in each array becomes the predefined number of array elements.

14.	(Previously presented)  The computer-implemented method of claim 1, wherein a number of clusters included in the first plurality of clusters or the second plurality of clusters is determined automatically or by a user.

15.	(Currently amended) A system comprising:
	a processor; and
	a memory storing a program, which, when executed on the processor, summarizes a plurality of texts, the processor configured to perform operations comprising:
			generating a vector space based on a first set of vectors, wherein each vector includes one or more feature scores equal to or greater than a predefined value and determined from a frequency of tokens of the plurality of texts;
			executing non-hierarchical clustering using the vector space to generate a first plurality of clusters;						
			generating a second set of vectors including quantities of characters in tokens of first representative texts, wherein each first representative text is a selected text from a corresponding cluster of the first plurality of clusters;
			aligning a dimension of each of the vectors of the second set of vectors and executing hierarchical clustering using the second set of vectors to generate a second plurality of clusters; [[and]]
summarizing the plurality of texts by determining a second representative text for each of the clusters included in the second plurality of clusters, wherein each second representative text is a selected text from a corresponding cluster of the second plurality of clusters, and wherein the second representative text for each of the clusters included in the second plurality of clusters forms a summary of the plurality of texts; and
displaying, on a display, a visualization of the second plurality of clusters, the summary of the plurality of texts, and an element for dynamically changing a threshold value applied to the hierarchical clustering that alters a number of clusters in the second plurality of clusters, wherein when the element dynamically changes the threshold value, the visualization is automatically changed to reflect the altered number of clusters in the second plurality of clusters.

16.	(Previously presented) The system of claim 15, wherein executing the hierarchical clustering generates a tree diagram.

17.	(Currently amended)	  The system of claim 16, wherein determining the second representative text for each of the clusters in the second plurality of clusters further comprises one or more of:
	applying [[a]] the threshold value to the tree diagram


18.	(Currently amended)	A computer program product for summarizing a plurality of texts, the computer program product comprising one or more computer readable storage media collectively having program instructions embodied therewith that are executable by at least one processor to cause the at least one processor to:
	generate a vector space based on a first set of vectors, wherein each vector includes one or more feature scores equal to or greater than a predefined value and determined from a frequency of tokens of the plurality of texts;
	execute non-hierarchical clustering using the vector space to generate a first plurality of clusters;		
	generate a second set of vectors including quantities of characters in tokens of first representative texts, wherein each first representative text is a selected text from a corresponding cluster of the first plurality of clusters;
	align a dimension of each of the vectors of the second set of vectors and execute hierarchical clustering using the second set of vectors to generate a second plurality of clusters; [[and]]
	summarize the plurality of texts by determining a second representative text for each of the clusters included in the second plurality of clusters, wherein each second representative text is a selected text from a corresponding cluster of the second plurality of clusters, and wherein the second representative text for each of the clusters included in the second plurality of clusters forms a summary of the plurality of texts; and
display, on a display, a visualization of the second plurality of clusters, the summary of the plurality of texts, and an element for dynamically changing a threshold value applied to the hierarchical clustering that alters a number of clusters in the second plurality of clusters, wherein when the element dynamically changes the threshold value, the visualization is automatically changed to reflect the altered number of clusters in the second plurality of clusters.

19.	(Original)	The computer program product according to claim 18, wherein executing the hierarchical clustering generates a tree diagram.
 
20.	(Currently amended)	  The computer program product according to claim 19, wherein in determining the second representative text for each of the clusters in the second plurality of clusters, the program instructions are executable by the at least one processor to cause the at least one processor to
	apply [[a]] the threshold value to the tree diagram



Allowable Subject Matter
Claims 1-3 and 5-20 are allowed.
Regarding the computer readable storage medium recited in claim 18 and its dependent claims, reference is made to applicants’ disclosure, paragraph [0143]: “A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transparent media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.”
Thus, the computer readable storage medium is construed to be a non-transitory medium.
The following is an examiner’s statement of reasons for allowance:
The prior art searched and made of record fails to anticipate or make obvious the claimed invention.  Specifically, the prior art searched and made of record fails to teach the amended limitation(s) in the independent claims in combination with the other elements of the independent claims.  Further, the amended limitation(s) in the independent claims, in combination with the other elements of the independent claims, provide a scope that goes beyond the abstract and amount to significantly more than a generic computer implementation of an otherwise abstract process.


The prior art made of record is considered pertinent to applicant's disclosure but fails to anticipate or make obvious the claimed invention.
Sakai et al. (U.S. Pre-Grant Publication No. 2002/0016798, hereinafter referred to as Sakai) teaches a computer-implemented method for summarizing a plurality of texts, the method comprising:
Generating a vector space based on a first set of vectors,
Sakai teaches extracting words from the unclassified text and calculating a frequency vector of each word (Para. [0031]), and teaches generating a vector space based on the set of vectors by teaching categories comprising frequency vectors for corresponding text (Para. [0031]).
wherein each vector includes one or more feature scores determined from a frequency of tokens of the plurality of texts;
Sakai teaches “a similarity degree between the frequency vector and the representative vector of each category is calculated by unit of the same word index” and using a threshold to determine if the text is classified to the one category (Para. [0031]) thereby teaching a score in the form of a similarity degree determined from the extracted words from unclassified text.
Executing non-hierarchical clustering using the vector space to generate a first plurality of clusters;
Sakai teaches non-hierarchical clustering to generate a plurality of clusters based on categorization of frequency vectors to categorize unclassified text information (Para. [0031]).
Generating a second set of vectors including quantities of characters in tokens of first representative texts,
Sakai teaches generating a second set of vectors from the arrays by teaching converting the vector of extracted text information into a bit vector of words (Para. [0032]) thereby teaching generating the second set of vectors including quantities of characters in tokens of extracted text information.
wherein each first representative text is a selected text from a corresponding cluster of the plurality of clusters;
Sakai teaches receiving “text information set as object of category decision from the control unit” (Para. [0032] & Fig. 6 Element S61) thereby teaching representative texts each being selected from a cluster/category of a plurality of categories.
aligning a dimension of each of the vectors of the second set of vectors and
Sakai teaches “each text is converted to a bit vector of words” where “m units of text are objects of clustering and the number of different words extracted from all texts is n…. accordingly, an n-dimensional vector is composed” (Para. [0032]) thereby teaching aligning a dimension of each of the vectors.
executing hierarchical clustering using the second set of vectors to generate a second plurality of clusters; and
Sakai teaches performing hierarchical clustering to generate a second plurality of clusters (Para. [0034]).
determining a second representative text for each of the clusters included in the second plurality of clusters.
Sakai teaches determining multiple representative texts for a cluster when there is more than one text in the cluster, after generating the hierarchical clustering result (Para. [0034] & Fig. 9).
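
For illustration only, the bit-vector conversion quoted from Sakai (Para. [0032]) in the mapping above may be sketched as follows. The sketch is not taken from Sakai or from applicant's disclosure; the function name build_bit_vectors, the lowercase whitespace tokenizer, and the alphabetical ordering of the vocabulary are assumptions made for the example.

```python
def build_bit_vectors(texts):
    """Convert m texts into m bit vectors of dimension n.

    n is the number of different words extracted from all texts.
    Tokenizing on whitespace is an assumption made for this sketch.
    """
    tokenized = [text.lower().split() for text in texts]
    # The shared vocabulary fixes the dimension n for every vector.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    position = {word: i for i, word in enumerate(vocabulary)}

    vectors = []
    for tokens in tokenized:
        vector = [0] * len(vocabulary)
        for word in tokens:
            vector[position[word]] = 1  # 1 when the word occurs in the text
        vectors.append(vector)
    return vocabulary, vectors
```

Because every vector is built over the same vocabulary, all m vectors share the same dimension n, which is the sense in which the mapping above reads the passage as aligning a dimension.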

Sakai teaches all of the elements of the claimed invention as recited above except:
wherein each vector includes one or more feature scores equal to or greater than a predefined value;
wherein each second representative text is a selected text from a corresponding cluster of the second plurality of clusters, and wherein the second representative text for each of the clusters included in the second plurality of clusters forms a summary of the plurality of texts; and
displaying, on a display, a visualization of the second plurality of clusters, the summary of the plurality of texts, and an element for dynamically changing a threshold value applied to the hierarchical clustering that alters a number of clusters in the second plurality of clusters, wherein when the element dynamically changes the threshold value, the visualization is automatically changed to reflect the altered number of clusters in the second plurality of clusters.

Ukrainczyk et al. (U.S. Pre-Grant Publication No. 2002/0022956, hereinafter referred to as Ukrainczyk) teaches:
wherein each vector includes one or more feature scores equal to or greater than a predefined value;
Ukrainczyk teaches using feature scores and a predefined value to filter out features with lower scores from the vector (Para. [0072]). Therefore, Ukrainczyk teaches reducing a vector, using the scores, to the features within the vector whose scores are equal to or greater than a predefined standard.
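
For illustration only, filtering a vector down to the features whose scores are equal to or greater than a predefined value may be sketched as follows. The sketch is not taken from Ukrainczyk; the function name filter_features and the representation of a vector as a mapping from feature to score are assumptions made for the example.

```python
def filter_features(feature_scores, predefined_value):
    """Keep only the features whose score meets the predefined value.

    feature_scores maps a feature (for example, a token) to its score.
    This representation is an assumption made for the sketch.
    """
    return {
        feature: score
        for feature, score in feature_scores.items()
        if score >= predefined_value  # "equal to or greater than"
    }
```

With hypothetical scores of 0.8, 0.5, and 0.1 and a predefined value of 0.5, the first two features are kept and the third is removed.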


Mizutani (U.S. Pre-Grant Publication No. 2014/0324865) teaches:
Wherein determining the number of clusters in the second plurality of clusters and the second representative text for each of the clusters in the second plurality of clusters further comprises applying a threshold to the tree diagram.
Mizutani teaches applying a threshold to a tree diagram of a hierarchical cluster to determine when to add nodes to the tree (Para. [0051]).
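
For illustration only, the effect of a threshold value on the number of clusters produced by hierarchical clustering may be sketched as follows. The sketch is neither Mizutani's method nor applicant's implementation. It assumes single-linkage agglomerative clustering with Euclidean distance, and the function name cluster_with_threshold is hypothetical. Merging stops once the closest pair of clusters is farther apart than the threshold, which corresponds to cutting the tree diagram at that height.

```python
import math


def cluster_with_threshold(vectors, threshold):
    """Merge the closest clusters until their distance exceeds the threshold.

    Returns the clusters as lists of indices into vectors.
    Single linkage and Euclidean distance are assumptions of this sketch.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(vectors[i], vectors[j]) for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                distance = linkage(clusters[x], clusters[y])
                if best is None or distance < best[0]:
                    best = (distance, x, y)
        distance, x, y = best
        if distance > threshold:
            break  # the cut: no remaining pair is close enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters
```

Raising the threshold allows more merges and therefore yields fewer clusters; lowering it yields more clusters.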

Fahy (U.S. Pre-Grant Publication No. 2002/0052692) teaches analyzing biological data, such as gene arrays, for sets of test subjects in order to group the test subjects into clusters and order the clusters into a hierarchy based on similarities and differences of biological data corresponding to the test subjects.

Doan et al. (U.S. Pre-Grant Publication No. 2017/0242891) teaches non-hierarchically clustering records and determining a representative set of records by selecting a representative record from each cluster.

Acharya et al. (U.S. Pre-Grant Publication No. 2007/0271287) teaches clustering records including category data by representing the data as a plurality of clusters and generating a hierarchy of clusters based on the clusters.

Chen et al. (U.S. Patent No. 6,728,752) teaches quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarities between documents, and clustering documents according to those similarities.

Dange et al. (U.S. Patent No. 8,402,027) teaches clustering a plurality of observations included in a data set by selecting a subset of variables from a set of variables.

Yang et al. (U.S. Pre-Grant Publication No. 2014/0164376) teaches assigning strings to clusters utilizing one or more clustering techniques including hierarchical clustering.

Mizuguchi et al. (U.S. Pre-Grant Publication No. 2012/0109963) teaches clustering a data group associated with a hierarchical classification, and generating a classification group obtained by extracting a classification satisfying a condition defined in advance from classifications corresponding to respective data in a cluster. The reference further teaches re-updating the classification hierarchy and the classifications of the data group by changing a threshold value for the clustering which is used to determine inclusion relationship and same-meaning relationship (Para. [0090]).

Castillo et al. (U.S. Pre-Grant Publication No. 2011/0029523) teaches clustering target objects into an optimal number of clusters, and selecting a representative target object from each optimal cluster of target objects to form a test set of target objects (Para. [0047] & Fig. 1).

Guerraz et al. (U.S. Pre-Grant Publication No. 2007/0239745) teaches probabilistic clustering using probabilistic model parameters indicative of word counts, ratios, or frequencies characterizing classes of the clustering system.

Li (Patent No. 8,914,366) teaches generating clusters by a first clustering process, each cluster including one or more related records, and applying a second clustering process to the received clusters in order to determine tendencies for duplicate records associated with a single entity to be clustered into multiple clusters.

Deolalikar et al. (U.S. Pre-Grant Publication No. 2014/0037214) teaches clustering a plurality of feature vectors using a hierarchical clustering algorithm to provide a plurality of clusters and a cluster similarity measure for each cluster representing the quality of the cluster.

McKeown et al. (U.S. Pre-Grant Publication No. 2004/0203970) teaches pre-processing documents to group documents into document clusters and categories and generating summaries for specific classes of multiple document clusters by determining a relationship of the documents in a cluster and selecting one of the document summarization engines for use in generating a summary of the cluster.

Non-Patent Literature Liu et al., "Improved Hierarchical K-means Clustering Algorithm without Iteration Based on Distance Measurement", 2014, IFIP Advances in Information and Communication Technology, Volume 432 Pages 38-46. (Year: 2014) teaches a new clustering algorithm iHK that compares the distance between cluster centers for determining if clusters should be merged.

Non-Patent Literature Lu et al. "Hierarchical Initialization approach for K-Means clustering" Pattern Recognition Letters, Volume 29, Issue 6, April 2008, Pages 787-795. (Year: 2008) teaches finding better initial cluster centers by treating the clustering problem as a weighted clustering problem prior to performing k-means clustering.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT F MAY whose telephone number is (571)272-3195.  The examiner can normally be reached on Monday-Friday 9:30am to 6pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached on 571-272-3978.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/R. F. M./
Examiner, Art Unit 2154
11/17/2021

/HOSAIN T ALAM/Supervisory Patent Examiner, Art Unit 2154