DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The amendments were received on 8/4/2021.  Claims 1-7 and 21-33 are pending where claims 1-7 were previously presented, claims 8-20 were cancelled, and claims 21-33 are newly added.

35 USC § 101
The applicant provided amendments and arguments to address the 35 USC 101 rejections.  In view of the amendments, the respective 35 USC 101 rejections have been withdrawn.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-7 and 24-33 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al [US 2005/0165753 A1] in view of Stankiewicz et al [US 2012/0117092 A1], Bekkerman [US 2013/0110498 A1], and Ukrainczyk et al [US 2002/0022956 A1].
With regard to claim 24, Chen teaches a method performed by a computing system that executes a search engine, the method comprising: receiving, at the search engine executing on the computing system, a query from a client computing device that is in network communication with the computing system (see paragraphs [0037]-[0039]; the system receives a user query to search for websites/webpages);
returning a ranked list of webpages to the client computing device based upon the query, wherein the ranked list of webpages comprises a webpage that belongs to a website, and further wherein a position of the webpage in the ranked list of webpages is based upon a topical authority score assigned to the website with respect to a topic, the topical authority score is representative of an authoritativeness of the website with respect to the topic (see paragraphs [0040], [0042], [0050], and [0066]; see Figure 8; 
Chen teaches determining keywords/topics for webpages/websites and assigning a weight/authoritativeness value to those sites/pages but does not appear to explicitly teach wherein the topical authority score for the website is computed by way of acts comprising: identifying key phrase candidates in webpages that belong to the website; converting the identified key phrase candidates into feature vectors; using a classifier that was trained on a set of key phrase pairs having labels indicating whether two key phrases are duplicates of one another, removing duplicate feature vectors to produce remaining feature vectors for remaining key phrase candidates; and assigning scores to the remaining key phrase candidates based upon the remaining feature vectors, wherein a key phrase candidate in the remaining key phrase candidates is the topic, and further wherein the topical authority score assigned to the website is based upon a score in the scores assigned to the key phrase candidate.
Stankiewicz teaches identifying key phrase candidates in webpages that belong to the website; converting the identified key phrase candidates into feature vectors (see paragraphs [0003] and [0081]; the system can utilize phrase extraction where phrases are represented as feature vectors that provide additional information about each phrase to assist in determining the best candidate keywords/phrases).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the crawler and website/webpage classification system of Chen by phrase-based identification and feature vector creation as taught by Stankiewicz in order to allow the classification 
Chen in view of Stankiewicz teach wherein the topical authority score for the website is computed by way of acts comprising: identifying key phrase candidates in webpages that belong to the website; converting the identified key phrase candidates into feature vectors (see Chen, paragraphs [0043], [0049], [0068]; see Stankiewicz, paragraphs [0003] and [0081]; instead of searching the webpages for keywords, the system can also have particular keywords determined and extracted ahead of time).
Chen in view of Stankiewicz do not appear to explicitly teach using a classifier that was trained on a set of key phrase pairs having labels indicating whether two key phrases are duplicates of one another, removing duplicate feature vectors to produce remaining feature vectors for remaining key phrase candidates; and assigning scores to the remaining key phrase candidates based upon the remaining feature vectors, wherein a key phrase candidate in the remaining key phrase candidates is the topic, and further wherein the topical authority score assigned to the website is based upon a score in the scores assigned to the key phrase candidate.
Bekkerman teaches using a classifier that was trained on a set of key phrase pairs having labels indicating whether two key phrases are duplicates of one another, removing duplicate feature vectors to produce remaining feature vectors for remaining 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the crawler and website/webpage classification system of Chen in view of Stankiewicz by incorporating phrase-based classification while allowing means to reduce the total number of identified phrases as taught by Bekkerman in order to allow the classification system to be able to incorporate user guidance when needed with labels while helping to ensure that similar phrases can be mapped together for classification purposes to not only reduce the total number of features for consideration but also to help identify the importance of particular phrases for a topic thus helping to capture semantics associated with phrases to help improve identification of topics for the input document while also reducing the size of the feature space that has to be considered/evaluated.
Chen in view of Stankiewicz and Bekkerman do not appear to explicitly teach assigning scores to the remaining key phrase candidates based upon the remaining feature vectors, wherein a key phrase candidate in the remaining key phrase candidates is the topic, and further wherein the topical authority score assigned to the website is based upon a score in the scores assigned to the key phrase candidate.
Ukrainczyk teaches assigning scores to the key phrase candidates based upon the feature vectors (see paragraph [0075]; the system assigns scores based upon the feature vector).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the crawler and website/webpage classification system of Chen in view of Stankiewicz and Bekkerman by incorporating means to utilize a concept/topic weighting scheme for the features as taught by Ukrainczyk in order to allow particular concepts to have different weights for different terms/features such that terminology that is relevant to more than one topic can still be assessed for the various respective concepts/topics even if providing low weight thus helping to ensure that all information is considered when determining the appropriate topics for the documents/websites.
Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach assigning scores to the remaining key phrase candidates based upon the remaining feature vectors, wherein a key phrase candidate in the remaining key phrase candidates is the topic, and further wherein the topical authority score assigned to the website is based upon a score in the scores assigned to the key phrase candidate (see Ukrainczyk, paragraph [0075]; see Stankiewicz, paragraph [0081]; see Chen, paragraphs [0036] and [0043]; see Bekkerman paragraphs [0026], [0062], [0065], and [0070]; the system can evaluate and score the key phrase candidates such that some top number is selected and returned as the labels/concepts/topics for the website where the system also ensures a weight for the concepts to indicate the topical authority of the document/website for each associated topic/concept).

With regard to claim 25, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach wherein the remaining key phrase candidates are topics, and further 

With regard to claim 26, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach wherein the classifier is a binary classifier (see Chen, paragraphs [0067] and [0043]; see Bekkerman, paragraph [0092]; the system can utilize classifiers to help determine if websites/documents belong in a particular category/concept including utilizing binary classifiers).

With regard to claim 27, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach wherein a regression model is employed to assign the scores to the remaining key phrase candidates (see Stankiewicz, paragraph [0024]; regression model can be used as means to train a classifier to determine weights for the keywords/keyphrases).

With regard to claim 28, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach wherein converting the identified key-phrase candidates include at least one of the following features: language statistics, including Term Frequency (TF) and Inverted Document Frequency (IDF); a ratio of overlap with page title and page URL; relative position on the web page; one-hot encoding; features of surrounding words; case encoding; and stopword features (see Stankiewicz, paragraph [0032]; TF and IDF can be utilized as part of the features).
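For illustration only, the language-statistics features recited in claim 28 can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Stankiewicz or of the applicant: the function names, the pre-tokenized page representation, and the smoothed form of the document-frequency weight are all hypothetical choices made for the example.

```python
import math

def term_frequency(phrase, page_tokens):
    """Fraction of candidate windows on the page that match the phrase exactly."""
    n = len(phrase)
    if n == 0 or len(page_tokens) < n:
        return 0.0
    windows = len(page_tokens) - n + 1
    hits = sum(1 for i in range(windows) if page_tokens[i:i + n] == phrase)
    return hits / windows

def inverse_document_frequency(phrase, pages):
    """Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1."""
    df = sum(1 for tokens in pages if term_frequency(phrase, tokens) > 0)
    return math.log((1 + len(pages)) / (1 + df)) + 1.0

def feature_vector(phrase, page_tokens, pages):
    """Hypothetical three-element feature vector: [TF, IDF, TF * IDF]."""
    tf = term_frequency(phrase, page_tokens)
    idf = inverse_document_frequency(phrase, pages)
    return [tf, idf, tf * idf]

# Toy website of three pre-tokenized pages (illustrative data only).
pages = [
    ["search", "engine", "ranking"],
    ["search", "engine"],
    ["cooking", "recipes"],
]
vector = feature_vector(["search", "engine"], pages[0], pages)
```

A phrase that appears on most pages of the site receives a lower document-frequency weight than one concentrated on a few pages, which is the general intuition behind pairing the two statistics.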

With regard to claim 29, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach prior to the duplicates being removed through use of the classifier, filtering the feature vectors using deduplication rules (see Bekkerman, paragraphs [0077] and [0081]; the system utilizes means to identify duplicates and remove them as well as have means to determine near duplicates too).

With regard to claim 30, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach wherein the deduplication rules include at least one of: a first rule that combines candidates with different case; a second rule that combines candidates which are the same entity in full name and abbreviation; a third rule that deduplicates candidates that overlap with each other on language statistics; or a fourth rule that drops candidates starting or ending with stopwords or containing curse words (see Bekkerman, paragraphs [0065], [0077] and [0081]; the system can standardize the phrase case to help identify duplicates or near duplicates).
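For illustration only, two of the deduplication rules recited in claim 30 can be sketched as follows: the first rule (combining candidates that differ only in case) and the stopword portion of the fourth rule. This is a minimal sketch under stated assumptions and does not represent the implementation of Bekkerman or of the applicant; the function name and the stopword list are hypothetical.

```python
# Illustrative stopword list; a real system would use a fuller list.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for"}

def apply_dedup_rules(candidates):
    """Return candidates with case-variants merged and stopword-edged phrases dropped."""
    kept = []
    seen = set()
    for phrase in candidates:
        words = phrase.split()
        if not words:
            continue
        lowered = [w.lower() for w in words]
        # Fourth rule (stopword part only): drop phrases that start or end with a stopword.
        if lowered[0] in STOPWORDS or lowered[-1] in STOPWORDS:
            continue
        # First rule: candidates that differ only in case collapse to the first one seen.
        key = " ".join(lowered)
        if key in seen:
            continue
        seen.add(key)
        kept.append(phrase)
    return kept

survivors = apply_dedup_rules(
    ["Search Engine", "search engine", "the search engine", "ranking of", "Topical Authority"]
)
```

Rule-based filtering of this kind is inexpensive, which is one plausible reason to apply it before the classifier-based duplicate removal of claim 29.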

With regard to claim 31, this claim is substantially similar to claim 24 and is rejected for similar reasons as discussed above.

With regard to claims 32 and 33, these claims are substantially similar to claims 25 and 26 and are rejected for similar reasons as discussed above.

With regard to claims 1-7, these claims are substantially similar to claims 24-30 and are rejected for similar reasons as discussed above.



Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al [US 2005/0165753 A1] in view of Stankiewicz et al [US 2012/0117092 A1], Bekkerman [US 2013/0110498 A1], and Ukrainczyk et al [US 2002/0022956 A1] in further view of Banerjee et al [US 2014/0074816 A1].
With regard to claim 21, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach all the claim limitations of claim 1 as discussed above.
Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk teach extracting key phrases but do not appear to explicitly teach wherein extracting the key phrase candidates from the webpages that belong to the website comprises using grammar rules to extract the key phrase candidates from the webpages, wherein the grammar rules are based upon queries previously submitted to the search engine.
Banerjee teaches wherein the grammar rules are based upon queries previously submitted to the search engine (see paragraph [0030] and [0031]; the system can utilize a query log to parse/tag the queries to identify phrases and phrase sequences and use the most dominant ones to be the reference sequences or grammar rules).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify the crawler, tagger, and parser of the website/webpage classification system of Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk by incorporating means to utilize past user queries to identify phrases as taught by Banerjee in order to allow the system to be flexible in its phrase grammar rules, so that the top 100 reference sequences of a phrase grammar rule can reflect the most commonly or repeatedly used language patterns for phrases in queries and thereby best identify information that could be useful to the majority of system users by finding phrases with similar structure.
Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk in further view of Banerjee teach wherein extracting the key phrase candidates from the webpages that belong to the website comprises using grammar rules to extract the key phrase candidates from the webpages (see Banerjee, paragraph [0031]; see Chen, paragraphs [0043], [0049], [0068], [0074]; see Stankiewicz, paragraph [0003] and [0081]; the topical keywords/keyphrases can be based on query logs where the phrases are based on the most common sequence of parts-of-speech format that users are using in order to search/parse documents to find contextually relevant topical information).

With regard to claim 22, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk in further view of Banerjee teach wherein the grammar rules are based upon parts of speech assigned to terms in the queries (see Banerjee, paragraph [0031]; parts of speech are used to tag the user queries).
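For illustration only, the derivation of grammar rules from parts of speech assigned to query terms can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Banerjee or of the applicant: the toy part-of-speech lexicon, the tag names, and the function names are hypothetical, and a real system would use a trained tagger.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_POS = {
    "cheap": "ADJ", "fast": "ADJ",
    "laptop": "NOUN", "flights": "NOUN", "car": "NOUN", "insurance": "NOUN",
    "buy": "VERB",
}

def tag_sequence(query):
    """Map each query term to a part-of-speech tag; unknown terms get 'X'."""
    return tuple(TOY_POS.get(term.lower(), "X") for term in query.split())

def top_reference_sequences(queries, k):
    """Return the k most frequent tag sequences observed in the query log."""
    counts = Counter(tag_sequence(q) for q in queries)
    return [seq for seq, _ in counts.most_common(k)]

# Illustrative query log only.
query_log = ["cheap flights", "fast laptop", "car insurance", "buy laptop", "cheap laptop"]
rules = top_reference_sequences(query_log, 1)
```

In this toy log the adjective-noun pattern occurs three times and is therefore selected as the single dominant reference sequence.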

With regard to claim 23, Chen in view of Stankiewicz, Bekkerman, and Ukrainczyk in further view of Banerjee teach wherein the grammar rules comprise a sequence of parts of speech (see Banerjee, paragraph [0031]; reference sequences are sequences of tags), and further wherein extracting the key phrase candidates from the webpages that belong to the website comprises: assigning parts of speech to terms in the webpages; and extracting a sequence of terms in a webpage as a key phrase candidate due to the sequence of terms being assigned respective parts of speech that matches the sequence of parts of speech identified in the grammar rules (see Banerjee, paragraphs [0031]-[0033]; see Chen, paragraphs [0043], [0049], [0068]; see Stankiewicz, paragraphs [0003] and [0081]; the system can receive webpages/websites, parse a webpage, assign tags, and determine whether the tags match particular tag sequences as defined by the grammar rules).
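For illustration only, the matching step recited in claim 23 can be sketched as follows: terms on a webpage are assigned parts of speech, and any run of terms whose tags match a grammar rule is extracted as a key phrase candidate. This is a minimal sketch under stated assumptions, not the implementation of any cited reference or of the applicant; the toy lexicon, tag names, and function name are hypothetical.

```python
# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
PAGE_POS = {
    "cheap": "ADJ",
    "flights": "NOUN", "car": "NOUN", "insurance": "NOUN",
}

def extract_candidates(page_text, grammar_rules):
    """Extract term sequences whose tag sequence equals one of the grammar rules."""
    tokens = page_text.split()
    tags = [PAGE_POS.get(t.lower(), "X") for t in tokens]  # unknown terms get 'X'
    found = []
    for rule in grammar_rules:
        n = len(rule)
        # Slide a window of the rule's length across the page.
        for i in range(len(tokens) - n + 1):
            if tuple(tags[i:i + n]) == tuple(rule):
                found.append(" ".join(tokens[i:i + n]))
    return found

candidates = extract_candidates(
    "we compare cheap flights and car insurance",
    [("ADJ", "NOUN"), ("NOUN", "NOUN")],
)
```

Results are grouped by rule in the order the rules are supplied, so the adjective-noun match is listed before the noun-noun match.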

Response to Arguments
Applicant’s arguments (see the first paragraph on page 8 through the last paragraph on page 12) with respect to the 35 USC 101 rejections have been fully considered and are persuasive.  The 35 USC 101 rejections of the claims have been withdrawn.  The applicant provided amendments and arguments to address the 35 USC 

Applicant’s arguments (see the first paragraph on page 13 through the last paragraph on page 16) with respect to the rejection(s) of claim(s) under 35 USC 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Chen and Ukrainczyk.  As seen from the 35 USC 103 rejections, new references were found in view of the newly cited claim amendments that, when combined, would appear to teach or fairly suggest the claim limitations as recited.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARC S SOMERS whose telephone number is (571)270-3567. The examiner can normally be reached M-F 11-8 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela Reyes, can be reached at (571)270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MARC S SOMERS/
Primary Examiner, Art Unit 2159
3/4/2022