Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

1.	This action is responsive to the communication filed on 3/8/22.  Claims 1-2, 5-6 and 10-16 have been amended. Claims 17-20 have been cancelled. Claims 21-24 have been added. Claims 1-16 and 21-24 are pending.
	Applicants' arguments filed 3/8/22 have been fully considered but they are not deemed to be persuasive.  Rejections and/or objections not reiterated from previous office actions are hereby withdrawn.  The following rejections and/or objections are either reiterated or newly applied.  They constitute the complete set presently being applied to the instant application.

Claim Rejections - 35 USC § 103
2.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
3.	This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was 
4.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

5.	The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
6.	Claims 1, 7-8, 10 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chi in view of Burges, and further in view of Tsioutsiouliklis et al (U.S. 20070198603 A1 hereinafter, “Tsioutsiouliklis”).
7.	With respect to claim 1,
a computer-implemented method for clustering web pages of a website application, the method comprising:
obtaining user flow data associated with a first browsing session at the website, the user flow data including a plurality of web page identifiers (Chi [0070] – [0075] e.g. [0070] In addition to visualizing the structure and topology of a web site, an embodiment of the present invention also displays information relevant to the web site's design.  For example, analysts may view actual user paths as they travel through different web pages. [0071] A technique for predicting user paths is disclosed in co-pending U.S. patent application Ser. No. 09/540,976 entitled "System and Method for Predicting Web User Flow By Determining Association Strength Of Hypermedia Links," incorporated above. [0074] In addition to viewing actual and predicted user paths and goals, an embodiment of the present invention allows viewing of this information together.  Viewing both actual and predicted information allows analysts and developers to determine whether a design based on predicted patterns is actually being followed. [0075] Each of the above-described techniques generate a usage log.  Also, actual user logs can be generated from session logs, or cookies.  Once a log has been assembled, frequently traveled user paths may be generated for layout on a visual display [as obtaining user flow data associated with a first browsing session (e.g. session log) at the website (e.g. web site), the user flow data including a plurality of web page identifiers (Pirolli Fig. 1 e.g. web pages P0 to P6)]. NOTE: U.S. patent application Ser. No. 09/540,976 corresponds to Pirolli et al. (US 6,671,711 B1, hereinafter “Pirolli”));
generating a web page record for each of the web page identifiers, each web page record including one or more words of the corresponding web page identifier (Pirolli col. 4 lines 14-42 Fig. 1 e.g. The unique content items in the documents are indexed, step 302, as shown in FIG. 6B.  For example, in FIG. 1, there are eight unique items: "Java" 140 (contained in documents P1, P2, P3 and P5), "API" 142 (contained in document P3), "Sun" 144 (contained in document P0), "Home" 146 (contained in documents P0 and P4), "coffee" 148 (contained in document P5), "support" 150 (contained in document P2), "Petes" 152 (contained in document P4) and "Tea" 154 (contained in document P6).  These eight content items are indexed as follows: 0: Java, 1: API, 2: Sun, 3: Home, 4: Coffee, 5: Support, 6: Petes and 7: Tea.  These indexed items are shown in FIG. 6B, along with their associated unique content item numbers [as generating a web page record (e.g. Web pages P0 to P6 in Fig. 1) for each of the web page identifiers, each web page record including one or more words (e.g. “Sun” & “home” for web page P0 to “Tea” for web page P6 in Fig. 1) of the corresponding web page identifier]); and
constructing a first directed graph representative of the first browsing session, wherein directed edges of the first directed graph are transitions between web pages and one or more nodes of the first directed graph include the identified clusters of web page identifiers (Pirolli col. 3 lines 10-23 Fig. 1 e.g. FIG. 1 is a block diagram 100 illustrating the structural linkage and content of a collection of hypermedia linked documents.  Documents P0, P1, P2, P3, P4, P5 and P6, are indexed and shown as 102, 104, 106, 108, 110, 112 and 114.  Documents P0-P6 are linked as shown by hypermedia links 120, 122, 124, 126, 128, 130 and 132.  The hypermedia links may be any type of linked from one document to another, including hypertext links.  An example of the kind of document shown in P0-P6 (102-114) is a web site.  Content items 144-154 are located in documents P0-P6 as shown.  The content of documents associated with these hypermedia links is usually presented to the user by some proximal cue such as a snippet of text or a graphic [as constructing a first directed graph representative of the first browsing session (e.g. Fig. 1), wherein directed edges of the first directed graph are transitions between web pages and one or more nodes of the first directed graph are the identified clusters of web page identifiers]).
Although Chi substantially teaches the claimed invention, Chi does not explicitly indicate
receiving two or more clusters of web page identifiers, wherein the two or more clusters of web page identifiers were output from a machine learning clustering process;
for each of the web page records, identifying a cluster of web page identifiers  by mapping the web page record to one of the two or more clusters of web page identifiers using the machine learning clustering process;
for each of the web page records, identifying one or more valuable webpage identifiers;
include the identified clusters of web page identifiers and the valuable web page identifiers.
Burges teaches the limitations by stating
generating a web page record for each of the web page identifiers, each web page record including one or more words of the corresponding web page identifier (Burges [0003] e.g. each web page is represented as a vector in Euclidian space, and each element in the vector indicates the recurrence of some word [as generating a web page record for each of the web page identifiers, each web page record including one or more words of the corresponding web page identifier]);
receiving two or more clusters of web page identifiers, wherein the two or more clusters of web page identifiers were output from a machine learning clustering process (Burges [0004] – [0007], [0027] e.g. [0007] A collection of web pages is regarded as a directed graph, in which the nodes of the graph are the web pages and directed edges are hyperlinks.  Web pages are also represented by content, or by other features, to obtain a similarity graph over the web pages, where nodes again denote the web pages and the links or edges between each pair of nodes is weighted by a corresponding similarity between those two nodes.  A random walk is defined for each graph, and a mixture of the random walks is obtained for the set of graphs.  The collection of web pages is then analyzed based on the mixture to obtain a web page analysis result.  The web page analysis result can be, for example, clustering of the web pages to discover web communities, classifying or categorizing the web pages, or spam detection indicating whether a given web page is spam or content. [0027] For instance, clustering nodes of a graph can be viewed as cutting the links, joining different node clusters in the graph.  Clustering the nodes into two clusters (for example) can thus be thought of as a "min cut" problem in which the particular links that are to be cut are identified by finding those links with a smallest total weight.  Therefore, the graph is divided into clusters in such a way that cuts are not made between nodes that are linked by highly weighted links, but instead cuts are made between nodes that are linked by links having a relatively low weight.  Clustering the nodes in the graphs to define web communities or web page classes or categories is indicated by blocks 124 and 126 in FIG. 1 and performing this analysis is indicated by blocks 158, 160 and 162 in FIG. 2 [as receiving two or more clusters (e.g. clusters) of web page identifiers (e.g. web pages – nodes: vector spaces), wherein the two or more clusters of web page identifiers were output from a machine learning clustering process (e.g. machine/multiview learning)]);
for each of the web page records (Burges [0003] e.g. each web page is represented as a vector in Euclidian space, and each element in the vector indicates the recurrence of some word), identifying a cluster of web page identifiers (e.g. graph – cluster of web pages/nodes) by mapping the web page record to one of the two or more clusters of web page identifiers using the machine learning clustering process (Burges [0007], [0027] e.g. [0007] A collection of web pages is regarded as a directed graph, in which the nodes of the graph are the web pages and directed edges are hyperlinks.  Web pages are also represented by content, or by other features, to obtain a similarity graph over the web pages, where nodes again denote the web pages and the links or edges between each pair of nodes is weighted by a corresponding similarity between those two nodes.  A random walk is defined for each graph, and a mixture of the random walks is obtained for the set of graphs.  The collection of web pages is then analyzed based on the mixture to obtain a web page analysis result.  The web page analysis result can be, for example, clustering of the web pages to discover web communities, classifying or categorizing the web pages, or spam detection indicating whether a given web page is spam or content. [0027] For instance, clustering nodes of a graph can be viewed as cutting the links, joining different node clusters in the graph.  Clustering the nodes into two clusters (for example) can thus be thought of as a "min cut" problem in which the particular links that are to be cut are identified by finding those links with a smallest total weight.  Therefore, the graph is divided into clusters in such a way that cuts are not made between nodes that are linked by highly weighted links, but instead cuts are made between nodes that are linked by links having a relatively low weight.  Clustering the nodes in the graphs to define web communities or web page classes or categories is indicated by blocks 124 and 126 in FIG. 1 and performing this analysis is indicated by blocks 158, 160 and 162 in FIG. 2.);
for each of the web page records, identifying one or more valuable webpage identifiers;
constructing a first directed graph representative of the first browsing session, wherein directed edges of the first directed graph are transitions between web pages and one or more nodes of the first directed graph include the identified clusters of web page identifiers and the valuable web page identifiers (Burges [0002], [0007], Claim 10 e.g. [0002] There are a wide variety of different applications that can make use of a system ... Spam detection applications can also be regarded as web page classification by classifying any given page as spam, or as a legitimate content page. [0007] A collection of web pages is regarded as a directed graph, in which the nodes of the graph are the web pages and directed edges are hyperlinks.  Web pages are also represented by content, or by other features, to obtain a similarity graph over the web pages, where nodes again denote the web pages and the links or edges between each pair of nodes is weighted by a corresponding similarity between those two nodes.  A random walk is defined for each graph, and a mixture of the random walks is obtained for the set of graphs.  The collection of web pages is then analyzed based on the mixture to obtain a web page analysis result.  The web page analysis result can be, for example, clustering of the web pages to discover web communities, classifying or categorizing the web pages, or spam detection indicating whether a given web page is spam or content. [Claim 10] wherein grouping the web pages into groups comprises: grouping the web pages into a first group indicative of a spam web page and a second group indicative of a content web page [as
for each of the web page records, identifying one or more valuable webpage identifiers (e.g. spam web pages);
constructing a first directed graph representative of the first browsing session, wherein directed edges of the first directed graph are transitions between web pages and one or more nodes of the first directed graph include the identified clusters (e.g. groups) of web page identifiers (e.g. legitimate content web pages) and the valuable web page identifiers (e.g. spam web pages)]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Chi and Burges, to provide the ability of displaying generalized graph structures and access patterns, such as World Wide Web sites, actual usage patterns, and predicted usage patterns, so that the important relationships are exposed (Chi [0008]).
Although the combination of Chi and Burges substantially teaches the claimed invention, it does not explicitly indicate
wherein the valuable web page identifiers are not sent to the machine learning clustering process.
Tsioutsiouliklis teaches the limitations by stating
for each of the web page records, identifying one or more valuable webpage identifiers, wherein the valuable web page identifiers are not sent to the machine learning clustering process; and
constructing a first directed graph representative of the first browsing session, wherein directed edges of the first directed graph are transitions between web pages and one or more nodes of the first directed graph include the identified clusters of web page identifiers and the valuable web page identifiers (Tsioutsiouliklis Abstract, [0035], [0053] e.g. Techniques are ... Web pages may be represented as nodes within a graph.  Links between web pages may be represented as directed edges between the nodes. [0035] Referring again to FIG. 2, in block 210, specified action is taken with respect to the nodes identified in block 208.  For example, in one embodiment of the invention, all references to web pages that are represented by suspicious nodes are automatically eliminated from further inclusion in any set of search results generated by an Internet search engine. Alternatively, in one embodiment of the invention, the identities of the web pages represented by suspicious nodes are logged.  Logs that identify such web pages may be examined manually by human inspectors.  The inspectors may browse the web pages, manually determine whether the web pages actually have been artificially manipulated, and take appropriate action. [0053] A machine-learning mechanism also may be supplied a set of web page or other entities that are known to be legitimate.  The machine-learning mechanism may be informed that this set represents a legitimate set. Thus, embodiments of the invention may implement machine-learning mechanisms to continuously refine definitions of high-quality ... [as for each of the web page records (e.g. web pages), identifying one or more valuable webpage identifiers (e.g. suspicious nodes), wherein the valuable web page identifiers are not sent (e.g. eliminated from further inclusion in any set of search results) to the machine learning (e.g. machine-learning) clustering (e.g. graph) process; and
constructing a first directed graph representative of the first browsing session, wherein directed edges of the first directed graph (e.g. graph) are transitions between web pages and one or more nodes of the first directed graph include the identified clusters of web page identifiers (e.g. a set of web page or other entities that are known to be legitimate) and the valuable web page identifiers (e.g. suspicious nodes - logs)]).
 Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Chi, Burges and Tsioutsiouliklis, to provide the ability of displaying generalized graph structures and access patterns, such as World Wide Web sites, actual usage patterns, and predicted usage patterns, so that the important relationships are exposed (Chi [0008]). 
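For illustration only, the graph-construction limitation of claim 1 as discussed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not code from any cited reference: the function name `build_session_graph`, the `cluster_of` mapping (standing in for the output of a clustering process), and the `valuable` set are all hypothetical.

```python
from collections import Counter

def build_session_graph(session, cluster_of, valuable):
    """Collapse one browsing session into a directed graph.

    session    -- ordered list of web page identifiers visited (hypothetical input)
    cluster_of -- dict mapping an identifier to a cluster label (assumed to come
                  from a separate clustering step)
    valuable   -- set of identifiers kept as their own nodes and never clustered
    Returns a Counter keyed by (source_node, target_node) with transition counts.
    """
    def node(page):
        # Valuable pages bypass the cluster lookup entirely.
        return page if page in valuable else cluster_of[page]

    nodes = [node(p) for p in session]
    edges = Counter()
    for src, dst in zip(nodes, nodes[1:]):
        edges[(src, dst)] += 1  # one directed edge per observed transition
    return edges
```

In this sketch a node is either a cluster label or a valuable identifier, and each directed edge counts the transitions observed between consecutive page visits.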
8.	With respect to claim 7,
	Chi further discloses wherein all of the nodes of the first directed graph are the identified clusters of web page identifiers (Chi [0070] – [0075] e.g. [0070] In addition to visualizing the structure and topology of a web site, an embodiment of the present invention also displays information relevant to the web site's design.  For example, analysts may view actual user paths as they travel through different web pages. [0071] A technique for predicting user paths is disclosed in co-pending U.S. patent application Ser. No. 09/540,976 entitled "System and Method for Predicting Web User Flow By Determining Association Strength Of Hypermedia Links," incorporated above. [0074] In addition to viewing actual and predicted user paths and goals, an embodiment of the present invention allows viewing of this information together.  Viewing both actual and predicted information allows analysts and developers to determine whether a design based on predicted patterns is actually being followed. [0075] Each of the above-described techniques generate a usage log.  Also, actual user logs can be generated from session logs, or cookies.  Once a log has been assembled, frequently traveled user paths may be generated for layout on a visual display).
9.	With respect to claim 8,
	Burges further discloses classifying the first browsing session as legitimate or fraudulent using the first directed graph and a machine learning classifier (Burges [0002], [0007], [0025], [0028] e.g. [0002] Spam detection applications can also be regarded as web page classification by classifying any given page as spam, or as a legitimate content page. [0028] The Markov mixture of the random walks can also be used, however, to perform spam detection.  One web page analysis system 106 employs an objective function for classification that forces the classification function to change as slowly as possible on densely connected subgraphs.  In addition, the objective function receives training pages that are labeled as spam or content pages, and forces the function to fit (within a predetermined closeness) the given labels of those pages as well as possible.  Then, each of the nodes in the graph are assigned a value by the classification function, and if the value falls on one side of a predetermined threshold, the pages can be considered spam, while if it fails to meet the threshold, the pages are considered content pages (or vice versa, depending on the classification function used).  This is also discussed in greater detail below.  Performing analysis to generate an indication of spam detection is indicated by block 128 in FIG. 1 and block 164 in FIG. 2).
10.	Claim 10 is the same as claim 1 and is rejected for the same reasons as applied hereinabove.
11.	Claim 16 is the same as claim 1 and is rejected for the same reasons as applied hereinabove.

12.	Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Chi in view of Burges and Tsioutsiouliklis, and further in view of Miltonberger (U.S. 20100094768 A1 hereinafter, “Miltonberger”).
13.	With respect to claim 9,
responsive to classifying of the first browsing session as fraudulent: automatically terminating the first browsing session.
Miltonberger teaches the limitations by stating responsive to classifying of the first browsing session as fraudulent: automatically terminating the first browsing session (Miltonberger [0046] – [0047], [0203], [0207] – [0209], [0296] e.g. The sequence of related online events comprises a session login event and a termination event, and can include one or more activity events).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Chi, Burges, Tsioutsiouliklis and Miltonberger, to provide the ability of displaying generalized graph structures and access patterns, such as World Wide Web sites, actual usage patterns, and predicted usage patterns, so that the important relationships are exposed (Chi [0008]). 

14.	Claims 2-6, 11-15 and 21-24 are rejected under 35 U.S.C. 103 as being unpatentable over Chi in view of Burges and Tsioutsiouliklis, and further in view of Poola et al (U.S. 20080010291 A1 hereinafter, “Poola”).
15.	With respect to claim 2,
Chi further discloses
obtaining general user flow data associated with a plurality of browsing sessions at the website, each instance of the general user flow data including the plurality of web page identifiers;
generating a plurality of web page records, one for each of the web page identifiers of the plurality of browsing sessions, each web page record including one or more words of the corresponding web page identifier (Chi [0070] – [0075] e.g. [0070] In addition to visualizing the structure and topology of a web site, an embodiment of the present invention also displays information relevant to the web site's design.  For example, analysts may view actual user paths as they travel through different web pages. [0071] A technique for predicting user paths is disclosed in co-pending U.S.  patent application Ser.  No. 09/540,976 entitled "System and Method for Predicting Web User Flow By Determining Association Strength Of Hypermedia Links," incorporated above. [0074] In addition to viewing actual and predicted user paths and goals, an embodiment of the present invention allows viewing of this information together.  Viewing both actual and predicted information allows analysts and developers to determine whether a design based on predicted patterns is actually being followed. [0075] Each of the above-described techniques generate a usage log.  Also, actual user logs can be generated from session logs, or cookies.  Once a log has been assembled, frequently traveled user paths may be generated for layout on a visual display).

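Although the combination of Chi, Burges and Tsioutsiouliklis substantially teaches the claimed invention, it does not explicitly indicate
calculating a plurality of clusters using a clustering algorithm and at least one initial clustering algorithm parameter, wherein each web page record of the plurality of web page records is assigned to one the plurality of clusters;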
evaluating the plurality of clusters generated using the at least one initial clustering algorithm parameter based on a performance metric; and
tuning the at least one initial algorithm parameter based on a result of the evaluating and re-calculating the plurality of clusters, wherein the machine learning clustering process uses the tuned at least one initial algorithm parameter value.
Poola teaches the limitations by stating
calculating a plurality of clusters using a clustering algorithm and at least one initial clustering algorithm parameter, wherein each web page record of the plurality of web page records is assigned to one the plurality of clusters;
evaluating the plurality of clusters generated using the at least one initial clustering algorithm parameter based on a performance metric; and
tuning the at least one initial algorithm parameter based on a result of the evaluating and re-calculating the plurality of clusters, wherein the machine learning clustering process uses the tuned at least one initial algorithm parameter value (Poola [0035] – [0036] e.g. Clustering Pages Based on Page Features [0036] According to one embodiment, web page clustering, referred to herein as "CLiP" (CLustering Pages), ...).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Chi, Burges, Tsioutsiouliklis and Poola, to provide the ability of displaying generalized graph structures and access patterns, such as World Wide Web sites, actual usage patterns, and predicted usage patterns, so that the important relationships are exposed (Chi [0008]). 
16.	With respect to claim 3,
	Poola further discloses wherein the clustering algorithm is a K-means clustering algorithm (Poola [0072], [0076] – [0080], [0093], [0096] e.g. [0096] Once the distances of structural similarity (e.g., do) are computed, the radius of influence is computed for each sample page, at block 604.  Stated otherwise, for the sample pages from each of the plurality of groups, a radius of influence for each sample page is computed based on the distance of structural similarity between the features within the sample ...).
17.	With respect to claim 4,
	Poola further discloses wherein the web page identifiers are URLs (Uniform Resource Locators) (Poola [0038] – [0039] e.g. [0038] (A) URL Normalization [0039] Each URL 202 input into URL normalization ...).
18.	With respect to claim 5,
	Poola further discloses generating pre-processed URLs by removing noise elements from each of the URLs, wherein the noise elements include elements that are numeric, random, hashed or encrypted, or elements that are longer that a first predetermined length with an entropy higher than a predetermined entropy parameter (Poola [0035] e.g. [0035] (e) Noise section removal by identifying static content or template sections of the website as sections composed of hyperlinks).
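For illustration only, the noise-removal limitation recited in claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not code from Poola: the function names, the use of character-level Shannon entropy, and the thresholds `max_len` and `max_entropy` are hypothetical, and the "random, hashed or encrypted" categories are approximated here only through the length-and-entropy test.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Character-level Shannon entropy of s, in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def is_noise(element, max_len=12, max_entropy=3.0):
    """Treat a URL element as noise if it is purely numeric, or if it is
    longer than max_len and its entropy exceeds max_entropy.
    Both thresholds are illustrative assumptions."""
    if element.isdigit():
        return True
    return len(element) > max_len and shannon_entropy(element) > max_entropy

def preprocess_url(path):
    """Drop noise elements from a slash-separated URL path."""
    parts = [p for p in path.split("/") if p]
    return "/".join(p for p in parts if not is_noise(p))
```

Under these assumed thresholds, `preprocess_url("account/12345/a3f9c2e1b7d4a8f06c5e/settings")` keeps only `account/settings`, because the second element is numeric and the third is long with high entropy.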
19.	With respect to claim 6,
	Poola further discloses
splitting each URL into one or more potential text parts;
splitting each potential text part into the one or more words;
removing words from the one or more words that are longer that a second predetermined length or shorter than a third predetermined length;
storing the one or more words that remain in the record; and
associating a web page identifier with the record (Poola [0038] – [0042] e.g. [0038] (A) URL Normalization [0039] Each URL 202 input into URL normalization 204 may be retrieved from crawler storage 114 (FIG. 1).  URL normalization 204 tokenizes URLs 202 into multiple tokens based on pattern changes.  URL normalization 204 is based on "level" information derived from the URLs.  URL normalization 204 and variation computation 208 are considered scalable processes because these processes do not require parsing web pages in order to cluster structurally similar pages within a domain.  [0040] URL Levels and Level Delimiters [0041] It is desirable to build the cluster hierarchy 210 by clustering pages at levels that demonstrate the least, or less, variation relative to other levels.  As depicted in FIG. 2, variation computation 208 generates a multi-level cluster hierarchy 210.  In cluster hierarchy 210, each of blocks 1-16 represents a cluster of pages determined by the CURL process, where leaf node clusters are depicted as bold blocks.  According to embodiments, levels of a URL are determined using one or more of the following "token delimiters": (a) static token delimiters (e.g., standard, unlearned URL delimiters); (b) learned token delimiters (delimiters learned from the set of URLs under consideration); and/or (c) unit change denominations.  Some levels are separated by static delimiters, such as the following symbols: `/`, `?`, or `&`.  Sublevels of each level are also ...).
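For illustration only, the URL-splitting steps recited in claim 6 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not Poola's URL normalization: the function name `url_to_record`, the delimiter set, and the length bounds `min_len` and `max_len` are hypothetical.

```python
import re

def url_to_record(url, min_len=3, max_len=15):
    """Build a web page record from a URL.

    1. Split the URL into potential text parts on static delimiters.
    2. Split each part into alphabetic words.
    3. Drop words shorter than min_len or longer than max_len (assumed bounds).
    4. Store the remaining words and associate the identifier with the record.
    """
    parts = [p for p in re.split(r"[/?&=]", url) if p]
    words = []
    for part in parts:
        # Any run of non-letters (digits, '-', '_', '.') separates words.
        words.extend(w.lower() for w in re.split(r"[^A-Za-z]+", part) if w)
    kept = [w for w in words if min_len <= len(w) <= max_len]
    return {"identifier": url, "words": kept}
```

Under these assumed bounds, `url_to_record("/shop/view-item?id=42&ref=home_page")` stores the words `shop`, `view`, `item`, `ref`, `home` and `page`; the two-letter word `id` is dropped and the numeric value `42` yields no words.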
20.	Claims 11-15 are the same as claims 2-6 and are rejected for the same reasons as applied hereinabove.
21.	Claims 21-24 are the same as claims 2-6 and are rejected for the same reasons as applied hereinabove.

Response to Arguments
22.	Applicant’s remarks and arguments presented on 3/8/22 have been fully considered but they are moot in view of the new grounds of rejection presented in this office action.

Conclusion
23.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SyLing Yen whose telephone number is 571-270-1306.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone, can be reached at 571-270-3750.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the receptionist whose telephone number is 571-272-2100. 




March 15, 2022