DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Claims 1, 3-6, and 9-15 have been amended and claim 2 has been canceled.  Claims 1 and 3-15 are pending.

Claim Construction / Objection
The independent claims recite the limitation “absolute importance of a document regardless of the network.”  However, the precise meaning of the term “absolute importance” is not clear.
The applicant’s specification provides an example of determining an “influence degree of a journal” by calculating “the impact factor” using a mathematical formula (see paragraph [0033], published version).  However, it is not clear from the context of the paragraph whether the impact factor or the influence degree is meant to disclose the claimed “absolute importance” (i.e., whether the absolute importance is meant to be calculated by the formula).
The applicant is reminded that during prosecution before the USPTO, claims are to be given their broadest reasonable interpretation and are interpreted in light of the specification.  However, the limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Therefore, the example formula, which does not explicitly relate to the claimed “absolute importance,” is not read into the independent claims.  Still, “absolute value” is a well-known mathematical term.  Thus, it is further not clear whether the absolute importance (absolute value or number) is meant to refer to a mathematical definition.  For example, in mathematics, the absolute value of a number refers to the distance of the number from the origin of a number line. It is represented by two vertical lines, |a|, and defines the magnitude of any integer ‘a’; it is also known as the modulus of a. The absolute value of any integer, whether positive or negative, is a non-negative real number, regardless of its sign. The absolute value of a number may be thought of as its distance from zero. An absolute value is also defined for the complex numbers, the quaternions, ordered rings, fields and vector spaces. The absolute value is closely related to the notions of magnitude, distance, and norm in various mathematical and physical contexts (see https://en.wikipedia.org/wiki/Absolute_value).
Still, such a definition is not provided in the applicant’s specification, and it is unclear whether such a definition is intended by the specification.
Therefore, based on the above, the “absolute importance” is given its plain meaning as a real, positive number, such as a magnitude, distance, etc.
The applicant is advised to explicitly indicate the intended calculation of the “absolute importance” to avoid undue interpretations.
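For reference only, the general mathematical definition of absolute value discussed above may be written as follows. This is the standard textbook definition, not a formula taken from the applicant’s specification, and the symbol a is used only as in the discussion above.

```latex
\[
|a| =
\begin{cases}
a,  & a \ge 0 \\
-a, & a < 0
\end{cases}
\qquad \text{so that } |a| \ge 0 \text{ for every real number } a .
\]
```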



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-7, and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Bryden (US 2014/0059089) in view of Masuyama et al. (US 2009/0169110), in view of Canright et al. (US 2009/0296600), and in further view of Laroco, Jr. et al. (US 8,122,026).

Regarding claim 1, Bryden teaches a cluster analysis method in which a computer classifies a plurality of documents into clusters according to content of the documents and generates display data indicating a relationship between documents, the cluster analysis method comprising: 
calculating similarity between content of one document and content of another document ([0168], [0184], [0187]); 
generating a network in which a document or a cluster is set as a node based on calculated similarity and similar nodes are connected by an edge, and classifying similar documents into clusters ([0110], [0118], [0122], [0190]); 
calculating, by an algorithm, a first index 
calculating a second index that is different from the first index in the network and indicates low estimated access and/or low priority may be moved to lower-cost lower-performance storage and/or servers”, “groups have high estimated frequencies and/or access priorities, their data may be automatically moved so as to ensure fast access” [0174] “files could be marked as having a degree of increased priority”, [0175]); and 
generating display data:
regarding a document, first display data indicating 


Bryden does not explicitly teach, however Canright discloses a first index calculation step of calculating a first index indicating centrality of a document in the network ([0026]) and
a display data generation step of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index ([0068], [0071], [0089]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden to include an index indicating centrality and a display of cluster size as disclosed by Canright.  Doing so would provide a good visualization which presents a regional view of a network (Canright [0007]).

Bryden does not explicitly teach, however Masuyama discloses a display data generation step of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index, an expression of a gauge having a shape corresponding to a shape of the object according to the second index and a length of the gauge ([0266], [0280], [0286]-[0287], [0296], [0455]), an expression according to a type of the cluster, and an expression according to magnitude of similarity between documents ([0262], F10, 15-43) and 
the display data in which an object of the first index is represented by a circle, and a gauge of the second index is represented by an arc concentric with the circle of the first index and a length of the arc ([0369], [0455]-[0457]). 
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden to include the display of cluster data as disclosed by Masuyama.  Doing so would enable analysis of the general positioning of a document with respect to other document groups and of the character of the overall document group (Masuyama [0002]).
NOTE that the length of the arc is likewise disclosed by Yanai (US 2011/0231129) in paragraphs [0116]-[0117] and further obviates the teachings of Bryden.
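For illustration only, the claimed display expression (a circle sized by the first index, with a concentric arc gauge whose length follows the second index) can be sketched as below. This is a minimal sketch under stated assumptions and is not drawn from Masuyama, Yanai, or the applicant’s specification; the function name node_glyph, the parameters base_radius and max_second, and the 2-unit gauge offset are hypothetical.

```python
import math


def node_glyph(first_index, second_index, max_second, base_radius=10.0):
    """Return the circle radius and the concentric arc gauge for one node.

    Assumption: the circle grows linearly with the first index, and the arc
    sweeps a fraction of a full turn equal to second_index / max_second.
    """
    radius = base_radius * (1.0 + first_index)  # node size follows the first index
    if max_second <= 0:
        fraction = 0.0
    else:
        # Clamp to [0, 1] so the gauge never exceeds a full circle.
        fraction = min(max(second_index / max_second, 0.0), 1.0)
    sweep = 2.0 * math.pi * fraction  # arc angle in radians
    gauge_radius = radius + 2.0  # arc shares the circle's center, drawn just outside it
    return {
        "radius": radius,
        "sweep_rad": sweep,
        "arc_length": gauge_radius * sweep,
    }


glyph = node_glyph(first_index=0.5, second_index=20, max_second=40)
```

In this sketch a node with half the maximum second index receives a half-circle gauge, independent of how large its circle is drawn.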

Bryden does not explicitly teach, however Laroco, Jr. discloses a second index that indicates absolute importance of a document regardless of the network (C1L522-59, C8L10-42).  It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden to include absolute importance as disclosed by Laroco, Jr.  Doing so would help disambiguate references to entities in documents (Laroco, Jr. C1L9-10).
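For illustration only, the sequence of steps recited in claim 1 (similarity, network generation, a first index indicating centrality, and a second index that does not depend on the network) can be sketched as below. This is a minimal sketch under stated assumptions and does not represent the method of the applicant or of any cited reference: Jaccard word overlap stands in for the similarity calculation, degree centrality stands in for the first index, and a hypothetical citation count stands in for the second index. The sample documents, the 0.3 threshold, and all function names are invented for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two documents as word-set overlap (an assumed measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def build_network(docs, threshold=0.3):
    """Connect documents whose similarity meets the threshold (edge weight = similarity)."""
    edges = {}
    for u, v in combinations(sorted(docs), 2):
        s = jaccard(docs[u], docs[v])
        if s >= threshold:
            edges[(u, v)] = s
    return edges


def degree_centrality(docs, edges):
    """First index: fraction of the other nodes to which each node is connected."""
    n = len(docs)
    degree = {d: 0 for d in docs}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {d: (degree[d] / (n - 1) if n > 1 else 0.0) for d in docs}


# Hypothetical documents and citation counts, for illustration only.
docs = {
    "A": "graph clustering of documents",
    "B": "clustering of documents by topic",
    "C": "graph clustering by topic",
    "D": "protein folding simulation",
}
citations = {"A": 12, "B": 3, "C": 0, "D": 40}

edges = build_network(docs)
first_index = degree_centrality(docs, edges)  # depends on the network
second_index = citations  # supplied from outside the network
```

In this sketch document D is isolated in the network (first index of zero) yet has the highest second index, which shows how an index can be independent of the network structure.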

Regarding claim 5, Bryden as modified teaches the cluster analysis method according to claim 1, wherein the document is published on an academic journal (Bryden [0187]), and the second index is calculated according to citation of the document (Bryden [0190], [0202]).

Regarding claim 6, Bryden as modified teaches the cluster analysis method according to claim 1, wherein the document is described on a website acquired by web search (Bryden [0197], [0216]-[0217]) up to a predetermined number of items (Bryden [0110], [0162], [0177]).

Regarding claim 7, Bryden as modified teaches the cluster analysis method according to claim 6, wherein the second index is calculated according to number of accesses to the website (Bryden [0167]-[0168], [0171], [0173], [0177]).

Claims 3-4, 8-11, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Bryden (US 2014/0059089) as modified and in further view of Victoroff et al. (US 10,956,790).

Regarding claim 3, Bryden as modified teaches the cluster analysis method according to claim 1, wherein the document has at least one of a title, a gist, and a text as a constituent thereof (Bryden [0194], [0187], [0189]), and the generating display data further comprises: 
extracting a word having a high appearance frequency included in at least one of a title, a gist, and a text of a document belonging to one cluster (Bryden [0140], [0236], [0248], [0261], Masuyama [0141]-[0157]).
Bryden as modified does not explicitly teach, however Victoroff discloses generating second display data for displaying the word in a size according to the appearance frequency (C21L30-34, F12-13). 
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden as modified to include a size according to the appearance frequency as disclosed by Victoroff.  Doing so would indicate the high importance of the word in data recognition (Victoroff C31L1-16).
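For illustration only, displaying extracted words in a size according to appearance frequency can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Victoroff, Bryden, Masuyama, or the applicant’s specification; the function name word_sizes, the linear scaling, and the point-size bounds min_pt and max_pt are hypothetical.

```python
from collections import Counter


def word_sizes(texts, top_n=5, min_pt=10, max_pt=40):
    """Map the top_n most frequent words to font sizes scaled by frequency.

    Assumption: sizes are interpolated linearly between min_pt and max_pt.
    """
    counts = Counter(w for t in texts for w in t.lower().split())
    top = counts.most_common(top_n)  # ordered from most to least frequent
    if not top:
        return {}
    hi, lo = top[0][1], top[-1][1]
    span = hi - lo
    sizes = {}
    for word, count in top:
        scale = (count - lo) / span if span else 1.0
        sizes[word] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


sizes = word_sizes(["graph graph graph cluster", "cluster node"], top_n=3)
```

Because the returned dictionary preserves insertion order, iterating over it also yields the words in order of appearance frequency.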

Regarding claim 4, Bryden as modified teaches the cluster analysis method according to claim 1, wherein the document has at least one of a title, a gist, and a text as a constituent thereof, and the generating display data further comprises: 
extracting words having a high appearance frequency included in at least one material selected from the group consisting of a title, a gist, and a text of a document belonging to one cluster  (Bryden [0140], [0236], [0248], [0261], Masuyama [0141]-[0157]), and generating second display data for displaying the words in order according to the appearance frequency (Victoroff C18L15-25, C21L30-34, F2, 4, 12-13).

Regarding claim 8, Bryden as modified teaches the cluster analysis method according to claim 6 or 7, wherein a word with a high appearance frequency included in the document is extracted (Bryden [0140], [0236], [0248], [0261], Masuyama [0141]-[0157]), and second display data for displaying the word in a size according to the appearance frequency is generated (Victoroff C18L15-25, C21L30-34, F2, 4, 12-13).

Regarding claim 9, Bryden as modified teaches the cluster analysis method according to claim 6, wherein words with a high appearance frequency included in the document are extracted (Bryden [0140], [0236], [0248], [0261], Masuyama [0141]-[0157]), and second display data for displaying the words in order according to the appearance frequency is generated (Victoroff C18L15-25, C21L30-34, F2, 4, 12-13).

Regarding claim 10, Bryden as modified teaches the cluster analysis method according to claim 1, further comprising:  designating a word from those having a high appearance frequency included in the document (Bryden [0140], [0236], [0248], [0261], Masuyama [0141]-[0157]); excluding the document including the designated word from the target of analysis and performing the analysis again (Victoroff C30L5-15, 19-22).

Regarding claim 11, Bryden as modified teaches the cluster analysis method according to claim 1, further comprising: designating a word from those having a high appearance frequency included in the document (Bryden [0140], [0236], [0248], [0261], Masuyama [0141]-[0157]) and generating first display data for highlighting, on a network, a node indicating a document or a cluster including the designated word (Victoroff C29L65-67, C30L64-67, C31L10-16).

Regarding claim 13, Bryden as modified teaches the cluster analysis method according to claim 1, wherein the display data generation step comprises:
expressing an expression according to magnitude of similarity between the documents by thickness of a line connecting documents (Canright [0070], [0088]-[0089]) and displaying the network in enlarged and reduced manners, and generating the first display data by increasing or decreasing number of displayed lines according to the enlarged and reduced display (Victoroff C17L46-47, C30L24-37).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Bryden (US 2014/0059089) as modified and in further view of Wnek (US 2006/0242190).

Regarding claim 12, Bryden as modified does not explicitly teach, however Wnek discloses the cluster analysis method according to claim 1, wherein the display data generation step comprises determining an arrangement of documents on the network by using a dynamic model so that a plurality of documents are not displayed in an overlapping manner ([0116]-[0117]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden to include an arrangement in which a plurality of documents are not displayed in an overlapping manner as disclosed by Wnek.  Doing so would identify more useful documents and would lead to better clusters (Wnek [0136]).
NOTE alternatively, claim 12 is rejected in view of (US 9,710,544) C6L44-55 or (US 6,298,174) C5L18-21.

Regarding claim 14, Bryden teaches a cluster analysis system that classifies a plurality of documents into clusters according to content of the documents and generates display data indicating a relationship between documents, the cluster analysis system comprising: a storage memory; and a processor, the processor comprising: 
a similarity calculation unit that calculates similarity between content of one document and content of another document ([0168], [0184], [0187]); 
a cluster classification unit that generates a network in which a document is set as a node based on calculated similarity and similar nodes are connected by an edge, and classifies similar documents into clusters ([0168], [0184], [0187]); 
a first index calculation unit that calculates a first index 
a second index calculation unit that calculates a second index that is different from the first index in the network and indicates 
a display data generation unit that generates, regarding a document, 

Bryden does not explicitly teach, however Canright discloses a first index calculation step of calculating a first index indicating centrality of a document in the network ([0026]) and
a display data generation step of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index ([0068], [0071], [0089]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden to include an index indicating centrality and a display of cluster size as disclosed by Canright.  Doing so would provide a good visualization which presents a regional view of a network (Canright [0007]).

Bryden does not explicitly teach, however Masuyama discloses a display data generation step of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index, an expression of a gauge having a shape corresponding to a shape of the object according to the second index and a length of the gauge ([0266], [0280], [0286]-[0287], [0296], [0455]), an expression according to a type of the cluster, and an expression according to magnitude of similarity between documents ([0262], F10, 15-43). 
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden to include the display of cluster data as disclosed by Masuyama.  Doing so would enable analysis of the general positioning of a document with respect to other document groups and of the character of the overall document group (Masuyama [0002]).

Bryden does not explicitly teach, however Laroco, Jr. discloses a second index that indicates absolute importance of a document regardless of the network (C1L522-59, C8L10-42).  It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Bryden to include absolute importance as disclosed by Laroco, Jr.  Doing so would help disambiguate references to entities in documents (Laroco, Jr. C1L9-10).

Regarding claim 15, Bryden teaches a non-transitory computer-readable medium having stored thereon computer-executable instructions for cluster analysis program that, upon execution, causes a computer to classify a plurality of documents into clusters according to content of the documents and generate display data indicating a relationship between documents, the computer-executable instructions including: 
a similarity calculation step of calculating similarity between content of one document and content of another document; a cluster classification step of generating a network in which a document is set as a node based on calculated similarity and similar nodes are connected by an edge, and classifying similar documents into clusters; a first index calculation step of calculating a first index indicating centrality of a document in the network; a second index calculation step of calculating a second index that is different from the first index in the network and indicates absolute importance of a document regardless of the network; and a display data generation step of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index, an expression of a gauge having a shape corresponding to a shape of the object according to the second index and a length of the gauge, an expression according to a type of the cluster, and an expression according to magnitude of similarity between documents.
Claim 15 recites substantially the same limitations as claim 14, and is rejected for substantially the same reasons.

Claims 1 and 14-15 are alternatively rejected under 35 U.S.C. 103 as being unpatentable over Damaraju et al. (US 9,911,211) in view of Pitkow et al. (US 6,038,574).

Regarding claim 1, Damaraju teaches a cluster analysis method in which a computer classifies a plurality of documents into clusters according to content of the documents and generates display data indicating a relationship between documents, the cluster analysis method comprising: 
calculating similarity between content of one document and content of another document (C11L61-64, C14L2-5); 
generating a network in which a document or a cluster is set as a node based on calculated similarity and similar nodes are connected by an edge, and classifying similar documents into clusters (C3L60-63, C4L9-10, F2-4); 
calculating, by an algorithm, a first index (C16L61-64 “feature vectors may be subjected by some embodiments to various types of dimensional reduction, like indexing, random indexing”, C17L57-60 “decomposing a matrix into vectors, and translating the vectors into an index indicating which vector scalars have a nonzero value and corresponding indications of those values”, C18L40-42 “high dimensional sparse vectors may be reduced in dimension with random indexing”) indicating centrality of a document in the network (C6L61-62, C10L35-47, C23L35-56, C25L35-44)(see NOTE); 
calculating a second index that is different from the first index in the network (C16L62-67, C17L5-17, 57-60 “decomposing a matrix into vectors, and translating the vectors into an index indicating which vector scalars have a nonzero value and corresponding indications of those values”, C12L1-7 “vector space has dimensions corresponding to attributes of the underlying entities … a single graph within a vector space has multiple different visual representations and different visual representation spaces, in which the dimensions of the visual representation spaces represent different visual attributes”) and indicates absolute importance of a document regardless of the network (C9L14-15 “node icons correspond to nodes that are significant in virtue of metadata … node icons may be significant in virtue of edge properties of the respective node, like a particularly high score of a topic of a cluster to which cluster of that respective node connects, a number of edges connected to that node, a median number of edge weights of edges connect to that node … edges may indicate an amount of edges of nodes in connected clusters extending between the clusters, such as … cross citation”, C12L17 “designate the most relevant document”, wherein the weighted metadata, which indicates the significance of the node, is an importance and is not based on the network.  The weight is an average and is therefore non-negative and thus absolute (see C10L19-20 “an amount of edges extending therebetween, and average weight of edges extending therebetween”)) (see NOTE); and 
generating display (F2-7) data:
regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index (F3, C6L39-67), an expression of a gauge having a shape corresponding to a shape of the object according to the second index and a length of the gauge  (C9L1-10, 31-33), an expression according to a type of the cluster (C15L43-48), and an expression according to magnitude of similarity between documents (C17L26-34) and 
the display data in which an object of the first index is represented by a circle, and a gauge of the second index is represented by an arc concentric with the circle of the first index and a length of the arc (F2-7, C6L38-67, C9L1-10, C26L35-37).

NOTE Damaraju teaches that various feature vectors are extracted from documents.  Such vectors are arranged into different matrices (co-occurrence matrix, term-document matrix, topic matrix, n-gram matrix, etc.) and undergo various dimensional reductions, like indexing or random indexing, based on which the documents are clustered (C18L35-36 “Documents may be clustered according to their corresponding vectors in the concept space”).  Thus, various indexes are generated based on the feature vectors.  
For example, the “feature vector may correspond to one n-gram” (C16L44-45), “with various n-grams in the feature vectors (like at an intersection in an adjacency matrix” (C5L25-26).  The n-grams with various scores are used to “determine the text associated with the cluster, such as a measure of central tendency or by identifying outliers. Examples of such measures of central tendency include a mean, mode, and median” (C10L40-45).  Therefore, the feature vector, which comprises centrality based on the n-gram, generates a dimension corresponding to the centrality index.
The feature vectors can be based on other attributes of the document, such as “a path from one vector to the other vector where every link and the path is a core vector and is it within a threshold distance of one another” (C19L14-15).  Such a path indicates a number of connections to the node, which indicates the significance of the node (document) and is analogous to the “absolute importance of a document regardless of the network”. Based on the feature vectors (which indicate the number of connections to the node), the significance of the node (document) is determined.  The feature vectors of such nodes provide “a second index that is different from the first index in the network and indicates absolute importance of a document regardless of the network”.
However, if Damaraju does not explicitly teach “a second index that is different from the first index in the network and indicates absolute importance of a document regardless of the network”, Pitkow discloses the same in C4L41-55, C5L6-20, C7L35-44 (wherein Euclidean distance is the absolute importance).  It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Damaraju to include the second index as disclosed by Pitkow.  Doing so yields insight into the implicit semantic structures of the Web (Pitkow C5L4-5).
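For illustration only, the Euclidean distance referred to in the parenthetical above is, by its general mathematical definition, a non-negative quantity, which is consistent with the plain meaning given to “absolute” in the claim construction. The sketch below is a minimal example under stated assumptions and is not taken from Pitkow or Damaraju; the function name euclidean and the sample vectors are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical two-dimensional vectors, for illustration only.
d_forward = euclidean((0.0, 0.0), (3.0, 4.0))
d_reverse = euclidean((3.0, 4.0), (0.0, 0.0))
```

The distance is the same in either direction and is zero only when the two vectors coincide.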

Regarding claim 14, Damaraju teaches a cluster analysis system that classifies a plurality of documents into clusters according to content of the documents and generates display data indicating a relationship between documents, the cluster analysis system comprising: 
a storage memory; and a processor (F1), the processor comprising:
a similarity calculation unit that calculates similarity between content of one document and content of another document (C11L61-64, C14L2-5); 
a cluster classification unit that generates a network in which a document is set as a node based on calculated similarity and similar nodes are connected by an edge, and classifies similar documents into clusters (C3L60-63, C4L9-10); 
a first index calculation unit that calculates a first index (C16L61-64 “feature vectors may be subjected by some embodiments to various types of dimensional reduction, like indexing, random indexing”, C17L57-60 “decomposing a matrix into vectors, and translating the vectors into an index indicating which vector scalars have a nonzero value and corresponding indications of those values”, C18L40-42 “high dimensional sparse vectors may be reduced in dimension with random indexing”) indicating centrality of a document in the network (C6L61-62, C10L41-45); 
a second index calculation unit that calculates a second index that is different from the first index in the network (C16L62-67, C17L5-17, 57-60 “decomposing a matrix into vectors, and translating the vectors into an index indicating which vector scalars have a nonzero value and corresponding indications of those values”, C12L1-7 “vector space has dimensions corresponding to attributes of the underlying entities … a single graph within a vector space has multiple different visual representations and different visual representation spaces, in which the dimensions of the visual representation spaces represent different visual attributes”) and indicates absolute importance of a document regardless of the network (C9L14-15 “node icons correspond to nodes that are significant in virtue of metadata … node icons may be significant in virtue of edge properties of the respective node, like a particularly high score of a topic of a cluster to which cluster of that respective node connects, a number of edges connected to that node, a median number of edge weights of edges connect to that node … edges may indicate an amount of edges of nodes in connected clusters extending between the clusters, such as … cross citation”, wherein the weighted metadata is an importance and is not based on the network.  The weight is an average and is therefore non-negative and thus absolute (see C10L19-20 “an amount of edges extending therebetween, and average weight of edges extending therebetween”)); and 
a display data generation unit that generates, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index, an expression of a gauge having a shape corresponding to a shape of the object according to the second index (C6L39-67) and a length of the gauge (C9L1-10, 31-33), an expression according to a type of the cluster (C15L43-48), and an expression according to magnitude of similarity between documents (C17L26-34).

Regarding claim 15, Damaraju teaches a non-transitory computer-readable medium having stored thereon computer-executable instructions for cluster analysis program that, upon execution, causes a computer to classify a plurality of documents into clusters according to content of the documents and generate display data indicating a relationship between documents, the computer-executable instructions including: 
a similarity calculation step of calculating similarity between content of one document and content of another document; a cluster classification step of generating a network in which a document is set as a node based on calculated similarity and similar nodes are connected by an edge, and classifying similar documents into clusters; a first index calculation step of calculating a first index indicating centrality of a document in the network; a second index calculation step of calculating a second index that is different from the first index in the network and indicates absolute importance of a document regardless of the network; and a display data generation step of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index, an expression of a gauge having a shape corresponding to a shape of the object according to the second index and a length of the gauge, an expression according to a type of the cluster, and an expression according to magnitude of similarity between documents.
Claim 15 recites substantially the same limitations as claim 1, and is rejected for substantially the same reasons.


Response to Arguments
	Applicant’s arguments, filed 04/26/2022, in regard to the presently amended claims, have been fully considered and are addressed in the updated rejections to the claims above.
	Please note that an alternative rejection of the independent claims is provided in order to advance prosecution.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to POLINA G PEACH whose telephone number is (571)270-7646. The examiner can normally be reached Monday-Friday, 9:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner can be reached on 571-270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/POLINA G PEACH/
Primary Examiner, Art Unit 2165
May 2, 2022