Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status
Claims 1-39 are pending.  

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-10, 13-23, 26-36 and 39 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception, i.e., an abstract idea, which is not integrated into a practical application and does not include additional elements that amount to significantly more than the judicial exception.
STEP 1:
The claims are directed to a method/process, which is one of the statutory categories of subject matter.
STEP 2A – Prong 1:
The claims recite a judicial exception, i.e., an abstract idea because at least one of the method claims can be performed by mental process(es), i.e., thinking that can be performed in the human mind assisted by pen and paper if necessary.  


The limitation “clustering the news articles according to content of the news articles to form a set of ranked clusters” can be performed by a mental process in the human mind and thus is directed to an abstract idea. 
STEP 2A- Prong 2:
The claim does not recite additional elements that integrate the judicial exception into a practical application.  Simply appending, at a high level of generality, well-understood, routine, conventional activities previously known to the industry to the judicial exception does not integrate the judicial exception into a practical application.  A generic computer, as in a computer-implemented method, is simply linked to a particular technological environment or field of use, i.e., clustering news articles.  Furthermore, artificial intelligence having a deep clustering algorithm is simply “used” to cluster news items.  No special attribute(s) or functioning of the artificial intelligence having a deep clustering algorithm is claimed; in fact, sorting news articles in chronological order can be done by a plurality of school children.  Clearly, the generic computer does not improve the technological field.
	According to the above considerations, it is concluded that a draftsman can simply append a computer/artificial intelligence system to the claimed method steps.
STEP 2B:

	The courts have recognized the following functions as well-understood, routine and conventional functions when they are claimed in a merely generic manner, e.g., at a high level of generality.
The following are pertinent to the instant invention:
	Receiving or transmitting data over a network
	Data gathering 
	Arranging a hierarchy of groups
	Performing repetitive calculations  
Claim 2 recites:
The computer-implemented method of claim 1, wherein the deep learning clustering algorithm comprises the steps of: grouping similar news articles into clusters using a named-entity-based clustering algorithm; determining a representative news for each cluster; identifying similar clusters based on similarities between the representative news for a cluster pair; and merging the similar clusters.
The above does not correct the deficiencies of claim 1. 

Claim 3 recites:
The computer-implemented method of claim 2, wherein the news articles are clustered iteratively in chronological order within a news cycle.
The above does not correct the deficiencies of claim 1.
Claim 4 recites:
The computer-implemented method of claim 2, wherein the named-entity-based clustering algorithm further comprises the steps of: extracting named entities from news articles using a statistical named entity recognition algorithm; determining similarities between a news article and each exiting cluster; determining a closest cluster based on the similarities; responsive to the similarity between the news article and the closest cluster exceeding a similarity threshold, assigning the news article to the closest cluster; and re-determining the representative news for the closest cluster.
The above does not correct the deficiencies of claim 1.
Claim 5 recites:
The computer-implemented method of claim 4, wherein the named-entity-based clustering algorithm further comprises the steps of: responsive to the similarity between the news article and the closest news cluster not exceeding the similarity threshold, creating a singleton cluster; and assigning the news article to the singleton cluster.
The above does not correct the deficiencies of claim 1.
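For reference, the assignment steps recited in claims 4 and 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the Jaccard measure over entity sets, the default threshold of 0.5, the rule for re-determining the representative, and the names `jaccard` and `assign_article` are all hypothetical choices that the claims do not specify.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of named entities (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def assign_article(entities, clusters, threshold=0.5):
    """Assign an article, given as a set of entity strings, to its closest cluster.

    Each cluster is a dict with 'members' (list of entity sets) and
    'representative' (one entity set). Returns the index of the cluster used.
    """
    best_idx, best_sim = None, -1.0
    for i, cluster in enumerate(clusters):
        sim = jaccard(entities, cluster["representative"])
        if sim > best_sim:
            best_idx, best_sim = i, sim

    if best_idx is not None and best_sim > threshold:
        cluster = clusters[best_idx]
        cluster["members"].append(entities)
        # Re-determine the representative; picking the member with the most
        # entities is an assumption made only for this sketch.
        cluster["representative"] = max(cluster["members"], key=len)
        return best_idx

    # Similarity does not exceed the threshold: create a singleton cluster.
    clusters.append({"members": [entities], "representative": entities})
    return len(clusters) - 1
```

Calling `assign_article` once per article, in chronological order, reproduces the iterative flow described in claim 3 under the same assumptions.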
Claim 6 recites:
The computer-implemented method of claim 4, wherein the deep learning clustering algorithm further comprises the steps of: determining a lower dimensional representation for the representative news for each cluster using a deep contextualized word representation model; 
The above does not correct the deficiencies of claim 1.
Claim 7 recites:
The computer-implemented method of claim 6, further comprising: responsive to the pairwise cosine similarity of the cluster pair being greater than or equal to a pre-specified threshold, merging the cluster pair together into a merged cluster; and re-determining the representative news of the merged cluster.
The above does not correct the deficiencies of claim 1.
Claim 8 recites:
The computer-implemented method of claim 7, wherein clustering the news clusters to form the set of ranked clusters further comprises: ranking news articles within each cluster sequentially by publication date and news ranking score to form the set of ranked clusters.
The above does not correct the deficiencies of claim 1.
Claim 9 recites:
The computer-implemented method of claim 7, wherein clustering the news clusters to form the set of ranked clusters further comprises: ranking clusters sequentially by update date, clustering ranking score, and cluster size.
The above does not correct the deficiencies of claim 1.
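The sequential rankings recited in claims 8 and 9 amount to multi-key sorts. A minimal sketch follows, assuming descending order (newest and highest-scoring first) and hypothetical field names `published`, `score`, `updated` and `articles`; the claims do not state the sort direction or any data layout.

```python
from datetime import date


def rank_articles(articles):
    """Rank articles within a cluster by publication date, then ranking score."""
    return sorted(articles, key=lambda a: (a["published"], a["score"]), reverse=True)


def rank_clusters(clusters):
    """Rank clusters by update date, then ranking score, then cluster size."""
    return sorted(
        clusters,
        key=lambda c: (c["updated"], c["score"], len(c["articles"])),
        reverse=True,
    )


example_articles = [
    {"id": "a", "published": date(2021, 5, 1), "score": 0.2},
    {"id": "b", "published": date(2021, 5, 2), "score": 0.1},
    {"id": "c", "published": date(2021, 5, 2), "score": 0.9},
]
ranked_ids = [a["id"] for a in rank_articles(example_articles)]
```

Tuple keys make each later criterion a tie-breaker for the earlier ones, which is one reading of "sequentially" in the claim language.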
Claim 10 recites:
The computer-implemented method of claim 6, wherein the pairs of clusters are not merged if the pairwise cosine similarity of the cluster pair is less than a pre-specified threshold.
The above does not correct the deficiencies of claim 1.

Claim 13 recites:
The computer-implemented method of claim 2, further comprising: prior to clustering the news articles, removing irrelevant news based on a popularity of the news source of the news article, wherein removing irrelevant news based on the popularity of the news source of the news article further comprises: building a mapping from news sources to domains, including grouping news articles by news sources and extracting domains from uniform resource locators (URLs) of the news articles; for each news source, sequentially looking up the domains in a database of website popularity by frequency until a match is made; and removing news articles published by news sources that rank below a popularity threshold in the database of website popularity.
The above does not correct the deficiencies of claim 1.
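The popularity filter recited above can be sketched as follows. This is a minimal sketch under stated assumptions: the popularity database is modeled as a plain dict mapping a domain to a rank (1 being most popular), sources with no matching domain are removed, and the names `filter_by_popularity`, `popularity_rank` and `max_rank` are hypothetical.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def filter_by_popularity(articles, popularity_rank, max_rank):
    """Keep articles whose news source ranks at or above max_rank.

    articles: list of dicts with 'source' and 'url' keys.
    popularity_rank: dict mapping domain -> rank (lower is more popular).
    """
    # Build the source -> domains mapping, counting how often each domain occurs.
    domains_by_source = defaultdict(Counter)
    for article in articles:
        domain = urlparse(article["url"]).netloc.lower()
        if domain.startswith("www."):
            domain = domain[4:]
        domains_by_source[article["source"]][domain] += 1

    # For each source, look up its domains by descending frequency
    # until one is found in the popularity database.
    source_rank = {}
    for source, counts in domains_by_source.items():
        for domain, _ in counts.most_common():
            if domain in popularity_rank:
                source_rank[source] = popularity_rank[domain]
                break

    # Sources with no match are treated as unranked and removed (assumption).
    return [
        a for a in articles
        if source_rank.get(a["source"], float("inf")) <= max_rank
    ]
```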
Claims 14-23, 26-36 and 39 are rejected on the same basis as the above. 

Claim Objections
Claim 4 is objected to because of the following informalities:  “existing” is misspelled as “exiting”.  Appropriate correction is required.

Allowable Subject Matter
Claims 11, 12, 24, 25, 37 and 38 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims and overcoming the rejection under 35 U.S.C. 101.
11.  The computer-implemented method of claim 2, further comprising:
prior to clustering the news articles, removing duplicate news articles using an artificial intelligence system, wherein removing duplicate news articles further comprises: 

determining Levenshtein distances and token sort ratios for titles, descriptions, and content for pairs of the news articles based on the LSH signatures;
determining whether a pair of news articles is a duplicate based on a Support Vector Machine model of the Levenshtein distances and token sort ratios for the titles, descriptions, and content for pairs of the news articles; and responsive to determining that the pair of news articles is a duplicate, removing the news article of the pair of news articles that has an earlier publication date.
12. The computer-implemented method of claim 2, further comprising: identifying main entities in each article using a machine learning model that independently derives features from a title, a description, and a content of each article, wherein identifying the main entities includes: phrase matching of the title to identify one or more main entities; and natural language processing of the description and content to identify the one or more main entities, including n-gram modeling that counts a number of tokens that match with each n-gram of an entity name, and then weighs the counts exponentially; and prior to clustering the news articles, using an artificial intelligence system to remove irrelevant news articles based on a relevance of an article to one or more main entities identified in the article.
24. The subscription-based news system of claim 15, wherein the number of processors further execute the program instructions: prior to clustering the news articles, to remove duplicate news articles using an artificial intelligence system, wherein removing duplicate news articles further comprises: determining Local Sensitive Hashing (LSH) signatures of titles, descriptions, and contents for news articles within a subscription; determining Levenshtein distances and token 
25. The subscription-based news system of claim 15, wherein the number of processors further execute the program instructions: to identify main entities in each article using a machine learning model that independently derives features from a title, a description, and a content of each article, wherein identifying the main entities includes: phrase matching of the title to identify one or more main entities; and natural language processing of the description and content to identify the one or more main entities, including n-gram modeling that counts a number of tokens that match with each n-gram of an entity name, and then weighs the counts exponentially; and prior to clustering the news articles, to use an artificial intelligence system to remove irrelevant news articles based on a relevance of an article to one or more main entities identified in the article.
37. The computer program product of claim 28, further comprising: program code for removing duplicate news articles using an artificial intelligence system prior to clustering the news articles, wherein removing duplicate news articles further comprises: determining Local Sensitive Hashing (LSH) signatures of titles, descriptions, and contents for news articles within a subscription; determining Levenshtein distances and token sort ratios for titles, descriptions, and content for pairs of the news articles based on the LSH signatures; determining whether a pair of news articles is a duplicate based on a Support Vector Machine model of the Levenshtein 
responsive to determining that the pair of news articles is a duplicate, removing the news article of the pair of news articles that has an earlier publication date.

38. The computer program product of claim 28, further comprising: program code for identifying main entities in each article using a machine learning model that independently derives features from a title, a description, and a content of each article, wherein identifying the main entities includes: phrase matching of the title to identify one or more main entities; and natural language processing of the description and content to identify the one or more main entities, including n-gram modeling that counts a number of tokens that match with each n-gram of an entity name, and then weighs the counts exponentially; and prior to clustering the news articles, using an artificial intelligence system to remove irrelevant news articles based on a relevance of an article to one or more main entities identified in the article.
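The pairwise features named in claims 11, 24 and 37 (Levenshtein distance and token sort ratio) can be sketched as follows. This is a minimal sketch under stated assumptions: the token sort ratio is computed here as a normalized Levenshtein similarity over alphabetically sorted, lower-cased tokens, which is one common definition and not necessarily the one used in the application. The LSH candidate selection and the Support Vector Machine classifier are not shown.

```python
def levenshtein(s, t):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(s) < len(t):
        s, t = t, s
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def token_sort_ratio(a, b):
    """Similarity in [0, 1] after sorting the tokens of each string (assumed form)."""
    sa = " ".join(sorted(a.lower().split()))
    sb = " ".join(sorted(b.lower().split()))
    longest = max(len(sa), len(sb))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(sa, sb) / longest
```

Under these assumptions, two titles that differ only in word order score 1.0 on the token sort ratio while still having a nonzero Levenshtein distance, which is why the two features are complementary inputs to a duplicate classifier.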

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over King (US 2014/0006403) in view of Chavez (US 2018/0251230).
Regarding claim 1, King discloses: 

	King [0001] Most academics and numerous others routinely attempt to discover useful information by reading large quantities of unstructured text. The corpus of text under study may be literature to review, news stories to understand, medical information to decipher, blog posts, comments, product reviews, or emails to sort, audio-to-text summaries of speeches to comprehend. The purpose is to discover useful information from this array of unstructured text. This is a time-consuming task and the information is increasing at a very fast rate, with the quantity of text equivalent to that in Library of Congress being produced in emails alone every ten minutes.

clustering the news articles according to content of the news articles to form a set of ranked clusters, 
King [0037] Two specific clusterings 304 and 306, each corresponding to one point as indicated by arrows 308 and 310, respectively, in the central space appear to the left and right of the figure. In these clusterings, labels have been added manually for clarification. Clustering 1 (304), creates clusters of "Reagan Republicans" (Reagan and the two Bushes) and all others. Clustering 2 (306) groups the presidents into two clusters organized chronologically. 

wherein the clustering is performed using an artificial intelligence system having a deep learning clustering algorithm that sorts the news articles in chronological order, 
King discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Chavez discloses:
[0042] At an operation 504, method 500 can filter, by one or more processors, information received by the pilot to ensure accuracy and timeliness. As an example, the artificial intelligence monitor can eliminate aircraft information that has a timestamp later a predetermined threshold. Alternatively, the artificial intelligence monitor can only provide alerts for values that exceed a certain threshold (such as engine temperature, etc.) or fall below certain thresholds (such as fuel level, etc.), but exclude values that are within predetermined thresholds. Operation 504 may be performed by one or more physical processors configured to execute a machine-readable instruction component, in accordance with one or more implementations. The method 500 then proceeds to operation 506.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify King to obtain above limitation based on the teachings of Chavez for the purpose of ranking news articles [0014] according to timestamping. 

while also considering historical news clusters formed in a previous news cycle.
King [0037] Two specific clusterings 304 and 306, each corresponding to one point as indicated by arrows 308 and 310, respectively, in the central space appear to the left and right of the figure. In these clusterings, labels have been added manually for clarification. Clustering 1 (304), creates clusters of "Reagan Republicans" (Reagan and the two Bushes) and all others. Clustering 2 (306) groups the presidents into two clusters organized chronologically. 

Claims 2 and 3 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of King and Chavez and further in view of Smith (US 2020/0301997) and further in view of Petroni (US 2019/0012374).  
Regarding claim 2, the combination of King and Chavez discloses the elements of the claimed invention as noted but does not disclose wherein the deep learning clustering algorithm comprises the steps of: grouping similar news articles into clusters using a named-entity-based clustering algorithm; determining a representative news for each cluster; identifying similar clusters based on similarities between the representative news for a cluster pair.
However, Smith discloses:
	Smith [0029] In some embodiments, the fuzzy cohort component 140 can utilize a clustering algorithm to cluster entity stage(s) and/or provenance chain(s) into individual classes based on similarity in order to group similar journeys into a group of similar journeys. For example, the clustering algorithm can be a hierarchical algorithm, a k-means algorithm, a distribution-based algorithm, and/or a density-based algorithm. Similarity can be based upon a user-defined similarity threshold, for example, to group journeys that are 99.00% similar.  
Smith [0030] In some embodiments, the clustering algorithm can utilize weights when clustering similar entity stage(s) and/or provenance chain(s) into particular group(s). For example, a feature of provenance chain data (e.g., temporal and/or locational) can have an associated weight, with the associated weight based on significance in inferring similarities. That is, features having a higher associated weight are more significant in inferring similarities and features having a lower associated weight are less significant in inferring similarities. A weight of zero is indicative of the associated feature not being utilized for purposes of clustering.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King and Chavez to obtain above limitation based on the teachings of Smith for the purpose of the fuzzy cohort component 140 utilizing a 
The combination of King and Chavez discloses the elements of the claimed invention as noted but does not disclose merging the similar clusters.  However, Petroni discloses:
Petroni [0094] If the decision is to merge two similar clusters, continuing onto step 322, the cluster module 128 merges the clusters and stores the merged event detected cluster is stored into cluster data store 143. For example, if social media information is the same as a previously detected event, the social media information is then merged with the previously detected event.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King and Chavez to obtain above limitation based on the teachings of Petroni for the purpose of merging two similar clusters. 

Regarding claim 3, the combination of King, Chavez, Smith and Petroni discloses wherein the news articles are clustered iteratively in chronological order within a news cycle. 
Smith [0030] In some embodiments, the clustering algorithm can utilize weights when clustering similar entity stage(s) and/or provenance chain(s) into particular group(s). For example, a feature of provenance chain data (e.g., temporal and/or locational) can have an associated weight, with the associated weight based on significance in inferring similarities. That is, features having a higher associated weight are more significant in inferring similarities and features having a lower associated weight are less significant in inferring similarities. A weight of zero is indicative of the associated feature not being utilized for purposes of clustering.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith and Petroni and further in view of Rubenczyk (US 2016/0098404) and further in view of Savir (US 10,235,452).
Regarding claim 4, the combination of King, Chavez, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose wherein the named-entity-based clustering algorithm further comprises the steps of: extracting named entities from news articles using a statistical named entity recognition algorithm; determining similarities between a news article and each exiting cluster; determining a closest cluster based on the similarities; responsive to the similarity between the news article and the closest cluster exceeding a similarity threshold, assigning the news article to the closest cluster.  However, Rubenczyk discloses:
	Rubenczyk [0060] At block 203, a cluster is created. The cluster includes the pair that has the highest level of similarity. For example if the pair AX is the pair with the highest level of similarity then a new cluster that is comprised of user A and old cluster X is generated. The highest level of similarity may be the determined by the smallest calculated distance. The process of generating the cluster is explained in greater detail in FIG. 2c. In some embodiments, if there are two or more pairs with higher level of similarity only one pair is chosen.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith and Petroni to obtain above limitation based on the teachings of Rubenczyk for the purpose of choosing a pair with the highest level of similarity.  

 re-determining the representative news for the closest cluster.

Savir col 5, lines 20-35 (66) The collaborative filtering module 122 obtains the topic clusters from the clustering module 120 via the topic cluster ingestion module 130 and processes the received information and the topic clusters in the expert similarity and ranking generator 138 to identify the subject matter expert. For example, this may involve determining the topic clusters that are most closely associated with the received information of the current communication and then identifying particular subject matter experts having experience or other expertise that is most similar to the identified topic cluster or clusters. In such an arrangement, the communication is determined to be associated with one or more topic clusters and then the subject matter expert is identified based on the one or more topic clusters.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith and Petroni to obtain above limitation based on the teachings of Savir for the purpose of obtaining the topic of the clusters. 

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith, Petroni, Rubenczyk and Savir and further in view of Contractor (US 10,083,230). 
Regarding claim 5, the combination of King, Chavez, Smith, Petroni, Rubenczyk and Savir discloses the elements of the claimed invention as noted but does not disclose responsive to the similarity between the news article and the closest news cluster not exceeding the similarity 
	Contractor claim 2. The method of claim 1, further comprising: in response to determining that a number of the retrieved data elements is greater than a threshold, creating the new cluster having a label of the selected feature.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk and Savir to obtain above limitation based on the teachings of Contractor for the purpose of labelling a cluster. 
Note: Singleton interpreted per paragraph 121 of the specification.   

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith, Petroni, Rubenczyk and Savir and further in view of Clifton (US 2021/0117867) and further in view of Stowell (US 2020/0019870).   
Regarding claim 6, the combination of King, Chavez, Smith, Petroni, Rubenczyk discloses the elements of the claimed invention as noted but does not disclose wherein the deep learning clustering algorithm further comprises the steps of:
determining a lower dimensional representation for the representative news for each cluster using a deep contextualized word representation model
However, Clifton discloses:
Clifton abstract, Methods and apparatus for subtyping subjects based on phenotypic information are disclosed. In one arrangement, a data receiving unit receives a subject data unit for each of a plurality of subjects. Each subject data unit represents a plurality of different 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk to obtain above limitation based on the teachings of Clifton for the purpose of using a deep learning algorithm to derive a lower dimensional representation of each subject data unit and a clustering algorithm to detect clusters of the resulting lower dimensional representations.

The combination of King, Chavez, Smith, Petroni, Rubenczyk discloses the elements of the claimed invention as noted but does not disclose wherein identifying similar clusters further comprises: determining a pairwise cosine similarity between the lower dimensional representation of the representative news of the cluster pair.  However, Stowell discloses:
	Stowell [0020] In another example embodiment, the clustering can be performed using a clustering heuristic by calculating a pairwise cosine distance between two vector representations of the plurality of vector representations that have not yet been clustered, merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value, removing the two vector representations from the plurality of vector representations if the pairwise cosine distance is below the threshold value, calculating a cluster vector representation for the cluster as the mean of all vector representations in the cluster, reinserting the cluster vector representation 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk to obtain above limitation based on the teachings of Stowell for the purpose of merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton and Stowell and further in view of Gupta (US 9,785,699).  
Regarding claim 7, the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton and Stowell discloses responsive to the pairwise cosine similarity of the cluster pair 
Stowell [0020] In another example embodiment, the clustering can be performed using a clustering heuristic by calculating a pairwise cosine distance between two vector representations of the plurality of vector representations that have not yet been clustered, merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value, removing the two vector representations from the plurality of vector representations if the pairwise cosine distance is below the threshold value, calculating a cluster vector representation for the cluster as the mean of all vector representations in the cluster, reinserting the cluster vector representation into the plurality of vector representations, and repeating the clustering heuristic for a set number of iterations. The threshold value can be 0.25. The number of iterations can be 3.
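The heuristic quoted from Stowell [0020] can be sketched as follows. This is a minimal sketch, assuming plain Python lists as vector representations and a greedy first-match pairing order, which the quoted passage does not specify; the default threshold of 0.25 and the 3 iterations are the example values from the passage, and the function names are hypothetical. Cosine distance is taken as one minus cosine similarity.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def cluster_heuristic(vectors, threshold=0.25, iterations=3):
    """Greedy pairwise merging; returns a list of clusters (lists of vectors)."""
    # Each item is (representative vector, member vectors).
    items = [(list(v), [list(v)]) for v in vectors]
    for _ in range(iterations):
        next_items, used = [], set()
        for i in range(len(items)):
            if i in used:
                continue
            partner = None
            for j in range(i + 1, len(items)):
                if j not in used and cosine_distance(items[i][0], items[j][0]) < threshold:
                    partner = j
                    break
            if partner is None:
                next_items.append(items[i])
            else:
                # Merge the pair; the cluster vector is the mean of all members.
                members = items[i][1] + items[partner][1]
                mean = [sum(col) / len(members) for col in zip(*members)]
                next_items.append((mean, members))
                used.add(partner)
        items = next_items
    return [members for _, members in items]
```

Because cosine distance is one minus cosine similarity, a distance below 0.25 corresponds to a similarity above 0.75 in this sketch.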

The combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton and Stowell does not disclose: being greater than or equal to a pre-specified threshold, merging the cluster pair together into a merged cluster.  However, Gupta discloses:
	Gupta claim 17, The computing device as described in claim 14, wherein the operations further include determining additional similarity scores based on comparing facial representations of faces in the group with at least one facial representation of faces in the other group; and wherein the merging is based on at least a threshold percentage of the additional similarity scores being at or above a merging threshold. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton and Stowell to obtain above limitation based on the teachings of Gupta for the purpose of merging when at least a threshold percentage of the additional similarity scores is at or above a merging threshold.

and re-determining the representative news of the merged cluster.
Savir col 5, lines 20-35 (66) The collaborative filtering module 122 obtains the topic clusters from the clustering module 120 via the topic cluster ingestion module 130 and processes the received information and the topic clusters in the expert similarity and ranking generator 138 to identify the subject matter expert. For example, this may involve determining the topic clusters that are most closely associated with the received information of the current communication and then identifying particular subject matter experts having experience or other expertise that is most similar to the identified topic cluster or clusters. In such an arrangement, the communication is determined to be associated with one or more topic clusters and then the subject matter expert is identified based on the one or more topic clusters.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta and further in view of Joshi (US 2016/0092581) and further in view of Li (US 2018/0121555).
Regarding claim 8, the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose wherein clustering the news clusters to form the set of ranked clusters further comprises: ranking news articles within each cluster sequentially by publication date.  However, Joshi discloses: 
Joshi [0057] The ranking component 608 may be configured to rank the set of filtered content items to create a ranked set of filtered content items (e.g., ranked from most relevant to least relevant; ranked based upon popularity in a category; ranked based upon the number of “likes” a content item received on a social network, etc.). The filtered content items may be ranked based upon a ranking metric. The ranking metric may be configured to rank the set of filtered content items based upon at least one of a user interest (e.g., favorite actor, favorite director, favorite genera, etc.), a publication date (e.g., ranking the most recent television content items first), or the global popularity of a content item. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta to obtain above limitation based on the teachings of Joshi for the purpose of ranking filtered content items based on publication date.  


	Li [0072] If at step 1106, the link-based approach does not result in a cluster correspondence determination, the method proceeds to step 1108, where the posting is processed to determine whether or not it refers to a same event referenced by an existing cluster of postings using a semantic class-based approach by the semantic class-based clustering module 132. At step 1110, if the semantic class-based approach results in a cluster correspondence determination, then the method proceeds to step 1112, where, as discussed above, the existing cluster of postings is modified to include the social media posting in the cluster database 136. If at step 1110, the semantic class-based approach also does not result in a cluster correspondence determination, the method proceeds to step 1114, where a new cluster of postings is created to include the social media posting in the cluster database 136. At step 1116, a novelty score is calculated for the new cluster. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta to obtain above limitation based on the teachings of Li for the purpose of creating a novelty score for the new cluster.   

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta and further in view of Ho (US 2018/0196808) and further in view of Chen (US 10,496,691) and further in view of Xu (US 11,244,115).
Regarding claim 9, the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose ranking clusters sequentially by update date.  However, Ho discloses: 
Ho [0038] In some embodiments, document database 120 stores supplemental information (e.g., metadata) concerning various documents in the document database. A non-exhaustive set of examples of such information includes document identifier (document ID), author, access control list, document size, timestamps (e.g., timestamps for one or more of creation date, revision history, last updated time, last accessed time, etc.), and document type (e.g., word processor document, spreadsheet, presentation file, etc.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta to obtain above limitation based on the teachings of Ho for the purpose of storing supplemental information (e.g., metadata) concerning various documents in the document database.  

The combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose clustering ranking score.  However, Chen discloses:
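Chen col 13, lines 30-45, FIG. 4 illustrates a flow diagram of an example process 400 for scoring possible clusters, in accordance with disclosed subject matter. Process 400 may be performed during a clustering process and is not dependent on the clustering process selected. In generating a cluster score, the system may first calculate a coverage score that measures the number of top-rated or popular search items in each cluster (405). In one implementation, top-rated search items may be the items that are ultimately downloaded/purchased after a query. In one implementation, top-rated search items may be the items with a highest relevancy or user-provided ranking. In one implementation, the coverage score may be a percentage of the items in the cluster that are considered top-ranking. A higher score thus indicates high coverage (a better indication of quality). The system generates a respective cluster score for each cluster in the cluster result (e.g., a clustering level).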
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta to obtain above limitation based on the teachings of Chen for the purpose of ranking such that a higher score thus indicates high coverage (a better indication of quality).

The combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose cluster size.  However, Xu discloses:  
	Xu claim 4, The system as set forth in claim 2, wherein each cluster has a size, and wherein the one or more processors further perform operations of: ranking the clusters by size;  and for one or more of the largest clusters, issuing an alert related to vehicle component failure data in the one or more largest clusters.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta to obtain above limitation based on the teachings of Xu for the purpose of ranking the clusters by size. 

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta.   
Regarding claim 10, the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton, Stowell and Gupta discloses wherein the pairs of clusters are not merged if the pairwise cosine similarity of the cluster pair is less than a pre-specified threshold.
	Gupta claim 17, The computing device as described in claim 14, wherein the operations further include determining additional similarity scores based on comparing facial representations of faces in the group with at least one facial representation of faces in the other group; and wherein the merging is based on at least a threshold percentage of the additional similarity scores being at or above a merging threshold. 
Note: See claim 7 for the motivation statement.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith and Petroni and further in view of Trevisiol (US 2016/0299989) and further in view of Nagaraj (US 2012/0207075) and further in view of Geng (US 9,276,956) and further in view of Baughman (US 2021/0034784) and further in view of Wang (US 2018/0260484).
Regarding claim 13, the combination of King, Chavez, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose prior to clustering the news articles, removing irrelevant news based on a popularity of the news source of the news article, wherein removing irrelevant news based on the popularity of the news source of the news article further comprises: 

building a mapping from news sources to domains, 
	Trevisiol [0041] In the illustrated example, the two most common source categories are “search” and “social”. The “search” source category includes sites where users are able to submit image search and navigational queries. The “social” source category includes social network websites, such as Facebook, which constitute very popular access points since users are highly interested in photos shared by friends. The fact that many sessions come from the news domain is indicative that an image is often considered as appealing or significant as the actual text of the article. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith and Petroni to obtain above limitation based on the teachings of Trevisiol for the purpose of accessing a popular news domain. 

including grouping news articles by news sources  
The combination of King, Chavez, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Nagaraj discloses:
	Nagaraj [0003]  Such bundling of different datagram packets (e.g., UDP packets) within conglomerate files for broadcast may be organized according to the type of information or application (e.g., grouping news source UDP packets within a conglomerate news file for broadcast), or according to demographics (e.g., grouping UDP packets for applications of interest 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith and Petroni to obtain above limitation based on the teachings of Nagaraj for the purpose of grouping news sources.  

extracting domains from uniform resource locators (URLs) of the news articles; 
The combination of King, Chavez, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Geng discloses:
	Geng abstract A method for detecting a phishing website includes extracting a domain name from a target URL of a web page under investigation
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith and Petroni to obtain above limitation based on the teachings of Geng for the purpose of extracting a domain name from a URL.

for each news source, sequentially looking up the domains in a database of website popularity by frequency until a match is made; 
The combination of King, Chavez, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Baughman discloses:
	Baughman [0049] Processing proceeds to operation S275 where, responsive to the determination at operation S270 that the first article data set 304A is a real world article, labelling mod 308 adds a tag to the first article data set to label the article as being a real world 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith and Petroni to obtain above limitation based on the teachings of Baughman for the purpose of analyzing the popularity of a sports domain. 

and removing news articles published by news sources that rank below a popularity threshold in the database of website popularity.
The combination of King, Chavez, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Wang discloses:
	Wang [0024] Optionally or alternatively, division of a plurality of pieces of news into a plurality of news clusters further includes the following steps: (a) a K-means clustering operation is performed on the K1 news cluster; (b) performing a filtering process on the K1 news cluster, where the filtering process comprises at least one of the following operations: (i) removing the news in each news cluster that has a centroid similarity with the news cluster lower than the third threshold THs2, and (ii) removing the news cluster that has a number of news pieces (M2) fewer than the fourth threshold THm2.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith and Petroni to obtain above limitation based on the teachings of Wang for the purpose of removing a news cluster that is below a threshold.  

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Smith in view of King and further in view of Chavez. 
Regarding claim 14, Smith discloses: 
a bus system; 
a storage device connected to the bus system, wherein the storage device stores program instructions; and 
a number of processors connected to the bus system, wherein the number of processors execute the program instructions: 
	Smith Figure 6
to ingest news articles from a plurality of news sources; and 

King [0001] Most academics and numerous others routinely attempt to discover useful information by reading large quantities of unstructured text. The corpus of text under study may be literature to review, news stories to understand, medical information to decipher, blog posts, comments, product reviews, or emails to sort, audio-to-text summaries of speeches to comprehend. The purpose is to discover useful information from this array of unstructured text. This is a time-consuming task and the information is increasing at a very fast rate, with the quantity of text equivalent to that in Library of Congress being produced in emails alone every ten minutes.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Smith to obtain above limitation based on the teachings of King for the purpose of discovering useful information from an array of unstructured text.   

to cluster the news articles according to content of the news articles to form a set of ranked clusters, 
King [0037] Two specific clusterings 304 and 306, each corresponding to one point as indicated by arrows 308 and 310, respectively, in the central space appear to the left and right of the figure. In these clusterings, labels have been added manually for clarification. Clustering 1 (304), creates clusters of "Reagan Republicans" (Reagan and the two Bushes) and all others. Clustering 2 (306) groups the presidents into two clusters organized chronologically. 


Smith discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Chavez discloses:
Chavez [0042] At an operation 504, method 500 can filter, by one or more processors, information received by the pilot to ensure accuracy and timeliness. As an example, the artificial intelligence monitor can eliminate aircraft information that has a timestamp later a predetermined threshold. Alternatively, the artificial intelligence monitor can only provide alerts for values that exceed a certain threshold (such as engine temperature, etc.) or fall below certain thresholds (such as fuel level, etc.), but exclude values that are within predetermined thresholds. Operation 504 may be performed by one or more physical processors configured to execute a machine-readable instruction component, in accordance with one or more implementations. The method 500 then proceeds to operation 506.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Smith to obtain above limitation based on the teachings of Chavez for the purpose of ranking news articles according to timestamping. 

while also considering historical news, clusters formed in a previous news cycle.
King [0037] Two specific clusterings 304 and 306, each corresponding to one point as indicated by arrows 308 and 310, respectively, in the central space appear to the left and right of the figure. In these clusterings, labels have been added manually for clarification. Clustering 1 (304), creates clusters of "Reagan Republicans" (Reagan and the two Bushes) and all others. Clustering 2 (306) groups the presidents into two clusters organized chronologically.  

Claims 15 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King and Chavez and further in view of Petroni.
Regarding claim 15, the combination of Smith, King and Chavez discloses: 
wherein the number of processors execute the program instructions of the deep learning clustering algorithm: to group similar news articles into clusters using a named entity-based clustering algorithm; to determine a representative news for each cluster; to identify similar clusters based on similarities between the representative news for a cluster pair; 
	Smith [0029] In some embodiments, the fuzzy cohort component 140 can utilize a clustering algorithm to cluster entity stage(s) and/or provenance chain(s) into individual classes based on similarity in order to group similar journeys into a group of similar journeys. For example, the clustering algorithm can be a hierarchical algorithm, a k-means algorithm, a distribution-based algorithm, and/or a density-based algorithm. Similarity can be based upon a user-defined similarity threshold, for example, to group journeys that are 99.00% similar.  
Smith [0030] In some embodiments, the clustering algorithm can utilize weights when clustering similar entity stage(s) and/or provenance chain(s) into particular group(s). For example, a feature of provenance chain data (e.g., temporal and/or locational) can have an associated weight, with the associated weight based on significance in inferring similarities. That is, features having a higher associated weight are more significant in inferring similarities and features having a lower associated weight are less significant in inferring similarities. A weight of zero is indicative of the associated feature not being utilized for purposes of clustering.
That is, Smith's fuzzy cohort component 140 utilizes a clustering algorithm to cluster entity stage(s) and/or provenance chain(s) into individual classes based on similarity in order to group similar journeys into a group of similar journeys.  
The combination of Smith, King and Chavez discloses the elements of the claimed invention as noted but does not disclose merging the similar clusters.  However, Petroni discloses:
Petroni [0094] If the decision is to merge two similar clusters, continuing onto step 322, the cluster module 128 merges the clusters and stores the merged event detected cluster is stored into cluster data store 143. For example, if social media information is the same as a previously detected event, the social media information is then merged with the previously detected event.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King and Chavez to obtain above limitation based on the teachings of Petroni for the purpose of merging two similar clusters. 

Regarding claim 16, the combination of Smith, King, Chavez and Petroni discloses wherein the news articles are clustered iteratively in chronological order within a news cycle.
Smith [0030] In some embodiments, the clustering algorithm can utilize weights when clustering similar entity stage(s) and/or provenance chain(s) into particular group(s). For example, a feature of provenance chain data (e.g., temporal and/or locational) can have an associated weight, with the associated weight based on significance in inferring similarities. That is, features having a higher associated weight are more significant in inferring similarities and features having a lower associated weight are less significant in inferring similarities.

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez and Petroni and further in view of Rubenczyk and further in view of Savir.  
Regarding claim 17, the combination of Smith, King, Chavez and Petroni discloses the elements of the claimed invention as noted but does not disclose: wherein the number of processors execute the program instructions of the named-entity-based clustering algorithm: to extract named entities from news articles using a statistical named entity recognition algorithm; to determine similarities between a news article and each existing cluster; 
to determine a closest cluster based on the similarities; responsive to the similarity between the news article and the closest cluster exceeding a similarity threshold, to assign the news article to the closest cluster.  However, Rubenczyk discloses:
Rubenczyk [0060] At block 203, a cluster is created. The cluster includes the pair that has the highest level of similarity. For example if the pair AX is the pair with the highest level of similarity then a new cluster that is comprised of user A and old cluster X is generated. The highest level of similarity may be the determined by the smallest calculated distance. The process of generating the cluster is explained in greater detail in FIG. 2c. In some embodiments, if there are two or more pairs with higher level of similarity only one pair is chosen.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez and Petroni to obtain above limitation based on the teachings of Rubenczyk for the purpose of choosing a pair with the highest level of similarity.  

Savir col 5, lines 20-35 (66) The collaborative filtering module 122 obtains the topic clusters from the clustering module 120 via the topic cluster ingestion module 130 and processes the received information and the topic clusters in the expert similarity and ranking generator 138 to identify the subject matter expert. For example, this may involve determining the topic clusters that are most closely associated with the received information of the current communication and then identifying particular subject matter experts having experience or other expertise that is most similar to the identified topic cluster or clusters. In such an arrangement, the communication is determined to be associated with one or more topic clusters and then the subject matter expert is identified based on the one or more topic clusters.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez and Petroni to obtain above limitation based on the teachings of Savir for the purpose of obtaining the topic of the clusters. 

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir and further in view of Contractor.
Regarding claim 18, the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir discloses the elements of the claimed invention as noted but does not disclose wherein the number of processors further execute the program instructions of the named-entity-based clustering algorithm: responsive to the similarity between the news article and the closest news 
Contractor claim 2. The method of claim 1, further comprising: in response to determining that a number of the retrieved data elements is greater than a threshold, creating the new cluster having a label of the selected feature.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir to obtain above limitation based on the teachings of Contractor for the purpose of labelling a cluster. 
Note: Singleton interpreted per paragraph 121 of the specification.   

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez, Petroni, Rubenczyk, and Savir and further in view of Clifton and further in view of Stowell.    
Regarding claim 19, the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir discloses the elements of the claimed invention as noted but does not disclose wherein the number of processors execute the program instructions of the deep learning clustering algorithm:
to determine a lower dimensional representation for the representative news for each cluster using a deep contextualized word representation model; 
However, Clifton discloses:
Clifton abstract, Methods and apparatus for subtyping subjects based on phenotypic information are disclosed. In one arrangement, a data receiving unit receives a subject data unit 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir to obtain above limitation based on the teachings of Clifton for the purpose of using a deep learning algorithm to derive a lower dimensional representation of each subject data unit and a clustering algorithm to detect clusters of the resulting lower dimensional representations. 

The combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir discloses the elements of the claimed invention as noted but does not disclose wherein identifying similar clusters further comprises: determining a pairwise cosine similarity between the lower dimensional representation of the representative news of the cluster pair.
However, Stowell discloses:
	Stowell [0020] In another example embodiment, the clustering can be performed using a clustering heuristic by calculating a pairwise cosine distance between two vector representations of the plurality of vector representations that have not yet been clustered, merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value, removing the two vector representations from the plurality of vector representations if the pairwise cosine 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir to obtain above limitation based on the teachings of Stowell for the purpose of merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez, Petroni, Rubenczyk, and Savir and further in view of Stowell and further in view of Gupta.
Regarding claim 20, the combination of Smith, King, Chavez, Petroni, Rubenczyk, and Savir discloses the elements of the claimed invention as noted but does not disclose responsive to the pairwise cosine similarity of the cluster pair being greater than or equal to a pre-specified threshold.  However, Stowell discloses: 
Stowell [0020] In another example embodiment, the clustering can be performed using a clustering heuristic by calculating a pairwise cosine distance between two vector representations of the plurality of vector representations that have not yet been clustered, merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value, removing the two vector representations from the plurality of vector representations if the pairwise cosine distance is below the threshold value, calculating a cluster vector representation for the cluster as 

The combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton and Stowell does not disclose: being greater than or equal to a pre-specified threshold, merging the cluster pair together into a merged cluster.  However, Gupta discloses:
	Gupta claim 17, The computing device as described in claim 14, wherein the operations further include determining additional similarity scores based on comparing facial representations of faces in the group with at least one facial representation of faces in the other group; and wherein the merging is based on at least a threshold percentage of the additional similarity scores being at or above a merging threshold. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni, Rubenczyk, Savir, Clifton and Stowell to obtain above limitation based on the teachings of Gupta for the purpose of merging when at least a threshold percentage of the additional similarity scores is at or above a merging threshold. 

and re-determining the representative news of the merged cluster.
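Savir col 5, lines 20-35 (66) The collaborative filtering module 122 obtains the topic clusters from the clustering module 120 via the topic cluster ingestion module 130 and processes the received information and the topic clusters in the expert similarity and ranking generator 138 to identify the subject matter expert. For example, this may involve determining the topic clusters that are most closely associated with the received information of the current communication and then identifying particular subject matter experts having experience or other expertise that is most similar to the identified topic cluster or clusters. In such an arrangement, the communication is determined to be associated with one or more topic clusters and then the subject matter expert is identified based on the one or more topic clusters.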

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta and further in view of Joshi and further in view of Li.  
Regarding claim 21, the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose wherein clustering the news clusters to form the set of ranked clusters further comprises: ranking news articles within each cluster sequentially by publication date.  However, Joshi discloses: 
Joshi [0057] The ranking component 608 may be configured to rank the set of filtered content items to create a ranked set of filtered content items (e.g., ranked from most relevant to least relevant; ranked based upon popularity in a category; ranked based upon the number of “likes” a content item received on a social network, etc.). The filtered content items may be ranked based upon a ranking metric. The ranking metric may be configured to rank the set of filtered content items based upon at least one of a user interest (e.g., favorite actor, favorite director, favorite genera, etc.), a publication date (e.g., ranking the most recent television content items first), or the global popularity of a content item. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta to obtain above limitation based on the teachings of Joshi for the purpose of ranking filtered content items based on publication date.  

The combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose news ranking score to form the set of ranked clusters.  However, Li discloses:
	Li [0072] If at step 1106, the link-based approach does not result in a cluster correspondence determination, the method proceeds to step 1108, where the posting is processed to determine whether or not it refers to a same event referenced by an existing cluster of postings using a semantic class-based approach by the semantic class-based clustering module 132. At step 1110, if the semantic class-based approach results in a cluster correspondence determination, then the method proceeds to step 1112, where, as discussed above, the existing cluster of postings is modified to include the social media posting in the cluster database 136. If at step 1110, the semantic class-based approach also does not result in a cluster correspondence determination, the method proceeds to step 1114, where a new cluster of postings is created to include the social media posting in the cluster database 136. At step 1116, a novelty score is calculated for the new cluster.   
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta to obtain above limitation based on the teachings of Li for the purpose of creating a novelty score for the new cluster.   

Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta and further in view of Ho and further in view of Chen and further in view of Xu.   
Regarding claim 22, the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose ranking clusters sequentially by update date.  However, Ho discloses: 
Ho [0038] In some embodiments, document database 120 stores supplemental information (e.g., metadata) concerning various documents in the document database. A non-exhaustive set of examples of such information includes document identifier (document ID), author, access control list, document size, timestamps (e.g., timestamps for one or more of creation date, revision history, last updated time, last accessed time, etc.), and document type (e.g., word processor document, spreadsheet, presentation file, etc.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta to obtain above limitation based on the teachings of Ho for the purpose of storing supplemental information (e.g., metadata) concerning various documents in the document database.  

The combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta  discloses the elements of the claimed invention as noted but does not disclose clustering ranking score.  However, Chen discloses:
Chen col 13, lines 30-45, FIG. 4 illustrates a flow diagram of an example process 400 for scoring possible clusters, in accordance with disclosed subject matter. Process 400 may be performed during a clustering process and is not dependent on the clustering process selected. In generating a cluster score, the system may first calculate a coverage score that measures the number of top-rated or popular search items in each cluster (405). In one implementation, top-rated search items may be the items that are ultimately downloaded/purchased after a query. In one implementation, top-rated search items may be the items with a highest relevancy or user-provided ranking. In one implementation, the coverage score may be a percentage of the items in the cluster that are considered top-ranking. A higher score thus indicates high coverage (a better indication of quality). The system generates a respective cluster score for each cluster in the cluster result (e.g., a clustering level).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta to obtain above limitation based on the teachings of Chen for the purpose of ranking such that a higher score thus indicates high coverage (a better indication of quality).

The combination of Smith, King, Chavez, Petroni, Rubenczyk, Savir, Stowell and Gupta discloses the elements of the claimed invention as noted but does not disclose cluster size.  However, Xu discloses:
	Xu claim 4, The system as set forth in claim 2, wherein each cluster has a size, and wherein the one or more processors further perform operations of: ranking the clusters by size;  and for one or more of the largest clusters, issuing an alert related to vehicle component failure data in the one or more largest clusters.

Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir and further in view of Gupta.  
Regarding claim 23, the combination of Smith, King, Chavez, Petroni, Rubenczyk and Savir discloses the elements of the claimed invention as noted but does not disclose wherein the pairs of clusters are not merged if the pairwise cosine similarity of the cluster pair is less than a pre-specified threshold.  However, Gupta discloses:
	Gupta claim 17, The computing device as described in claim 14, wherein the operations further include determining additional similarity scores based on comparing facial representations of faces in the group with at least one facial representation of faces in the other group; and wherein the merging is based on at least a threshold percentage of the additional similarity scores being at or above a merging threshold. 
Note: See claim 7 for motivation statement.

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Smith, King, Chavez and Petroni and further in view of Trevisiol and further in view of Nagaraj and further in view of Geng and further in view of Baughman and further in view of Wang.  
Regarding claim 26, the combination of Smith, King, Chavez and Petroni discloses the elements of the claimed invention as noted but does not disclose prior to clustering the news articles, removing irrelevant news based on a popularity of the news source of the news article, wherein removing irrelevant news based on the popularity of the news source of the news article further comprises:
building a mapping from news sources to domains,
The combination of Smith, King, Chavez and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Trevisiol discloses:
	Trevisiol [0041] In the illustrated example, the two most common source categories are “search” and “social”. The “search” source category includes sites where users are able to submit image search and navigational queries. The “social” source category includes social network websites, such as Facebook, which constitute very popular access points since users are highly interested in photos shared by friends. The fact that many sessions come from the news domain is indicative that an image is often considered as appealing or significant as the actual text of the article. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez and Petroni to obtain above limitation based on the teachings of Trevisiol for the purpose of accessing a popular news domain. 

including grouping news articles by news sources  
The combination of Smith, King, Chavez and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Nagaraj discloses:
	Nagaraj [0003]  Such bundling of different datagram packets (e.g., UDP packets) within conglomerate files for broadcast may be organized according to the type of information or 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez and Petroni to obtain above limitation based on the teachings of Nagaraj for the purpose of grouping news sources.  

extracting domains from uniform resource locators (URLs) of the news articles; 
The combination of Smith, King, Chavez and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Geng discloses:
	Geng abstract A method for detecting a phishing website includes extracting a domain name from a target URL of a web page under investigation
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez and Petroni to obtain above limitation based on the teachings of Geng for the purpose of extracting a domain name from a URL.
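Note: For illustration only, the cited step of extracting a domain name from a URL can be sketched as follows. This is a minimal sketch using the Python standard library and is not the method of Geng or of the claimed invention; the function name extract_domain and the stripping of a leading "www." are assumptions made for the example.

```python
from urllib.parse import urlsplit


def extract_domain(url: str) -> str:
    """Return the lowercased host name of a URL, without port or leading 'www.'.

    Hypothetical helper for illustration; returns an empty string when the
    input has no recognizable host (for example, when the scheme is missing).
    """
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


if __name__ == "__main__":
    print(extract_domain("https://www.example.com:8080/news/story?id=1"))
```

The sketch relies on urlsplit to separate the network location from the path and query, so the port and any path components never reach the returned domain.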

for each news source, sequentially looking up the domains in a database of website popularity by frequency until a match is made; 
The combination of Smith, King, Chavez and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Baughman discloses:
	Baughman [0049] Processing proceeds to operation S275 where, responsive to the determination at operation S270 that the first article data set 304A is a real world article, labelling mod 308 adds a tag to the first article data set to label the article as being a real world article about real world game play of the game of quoits. Alternatively, other kinds of responsive actions may be taken in response to the determination that an article is a real world article (or virtual world article), such as: (i) indexing for search engines; (ii) analyzing the text for research purposes (for example, filtering the focus to a correct search domain for applications built for academia, business, and/or government purposes); (iii) analyzing topic trends for the correct domain (for example, analyzing the popularity of a discussion of a real sports team on a social media platform while excluding the discussion of the virtual sports team analog of the real sports team on the social media platform); (iv) indexing and/or filtering results for specialized paid search engines; (v) providing references to news organizations; (vi) automatically posting results to a given social media platform for a correct domain; (vii) automatically moderating a given social media platform the correct domain; (viii) providing topical items for a blog or forum of discussion; (ix) automatically presenting headlines (on television and on the world wide web) in a ticker format; and/or (x) presenting relevant content to video game players while they are actively playing the video game.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez and Petroni to obtain above limitation based on the teachings of Baughman for the purpose of analyzing the popularity of a sports domain. 


The combination of Smith, King, Chavez and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Wang discloses:
	Wang [0024] Optionally or alternatively, division of a plurality of pieces of news into a plurality of news clusters further includes the following steps: (a) a K-means clustering operation is performed on the K1 news cluster; (b) performing a filtering process on the K1 news cluster, where the filtering process comprises at least one of the following operations: (i) removing the news in each news cluster that has a centroid similarity with the news cluster lower than the third threshold THs2, and (ii) removing the news cluster that has a number of news pieces (M2) fewer than the fourth threshold THm2.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Smith, King, Chavez and Petroni to obtain above limitation based on the teachings of Wang for the purpose of removing a news cluster that is below a threshold.  
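Note: For illustration only, the two filtering operations quoted from Wang [0024] (removing news whose similarity to the cluster centroid is below a threshold, and removing clusters with too few news pieces) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Wang's implementation: news items are assumed to be numeric vectors, cosine similarity is assumed as the similarity measure, both operations are applied in sequence, and the names filter_clusters, min_similarity (standing in for THs2) and min_size (standing in for THm2) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def filter_clusters(clusters, min_similarity, min_size):
    """Drop outlying members, then drop clusters that end up too small.

    clusters: list of clusters, each a list of vectors (assumed representation).
    """
    kept = []
    for vectors in clusters:
        if not vectors:
            continue
        center = centroid(vectors)
        # Operation (i): remove news with centroid similarity below the threshold.
        members = [v for v in vectors if cosine_similarity(v, center) >= min_similarity]
        # Operation (ii): remove clusters with fewer news pieces than the threshold.
        if len(members) >= min_size:
            kept.append(members)
    return kept


if __name__ == "__main__":
    sample = [[[1, 0], [0.9, 0.1], [0, 1]], [[1, 1]]]
    print(filter_clusters(sample, min_similarity=0.8, min_size=2))
```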

Claim 27 is rejected under 35 U.S.C. 103 as being unpatentable over King in view of Chavez and further in view of Kussmaul (US 2020/0401639). 
Regarding claim 27, King discloses:
ingesting news articles from a plurality of news sources;
	King [0001] Most academics and numerous others routinely attempt to discover useful information by reading large quantities of unstructured text. The corpus of text under study may be literature to review, news stories to understand, medical information to decipher, blog posts, 

program code for clustering the news articles according to content of the news articles to form a set of ranked clusters, 
King [0037] Two specific clusterings 304 and 306, each corresponding to one point as indicated by arrows 308 and 310, respectively, in the central space appear to the left and right of the figure. In these clusterings, labels have been added manually for clarification. Clustering 1 (304), creates clusters of "Reagan Republicans" (Reagan and the two Bushes) and all others. Clustering 2 (306) groups the presidents into two clusters organized chronologically. 

wherein the clustering is performed using an artificial intelligence system having a deep learning clustering algorithm that sorts the news articles in chronological order, 
King discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Chavez discloses:
	Chavez [0042] At an operation 504, method 500 can filter, by one or more processors, information received by the pilot to ensure accuracy and timeliness. As an example, the artificial intelligence monitor can eliminate aircraft information that has a timestamp later a predetermined threshold. Alternatively, the artificial intelligence monitor can only provide alerts for values that exceed a certain threshold (such as engine temperature, etc.) or fall below certain thresholds 504 may be performed by one or more physical processors configured to execute a machine-readable instruction component, in accordance with one or more implementations. The method 500 then proceeds to operation 506.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify King to obtain above limitation based on the teachings of Chavez for the purpose of ranking news articles [0014] according to timestamping. 

while also considering historical news clusters formed in a previous news cycle.
King [0037] Two specific clusterings 304 and 306, each corresponding to one point as indicated by arrows 308 and 310, respectively, in the central space appear to the left and right of the figure. In these clusterings, labels have been added manually for clarification. Clustering 1 (304), creates clusters of "Reagan Republicans" (Reagan and the two Bushes) and all others. Clustering 2 (306) groups the presidents into two clusters organized chronologically.  

a non-volatile computer readable storage medium having program code embodied therewith, the program code including: 
King discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Kussmaul discloses:
	Kussmaul [0006] In a further aspect, the invention relates to a computer program product comprising a non-volatile computer-readable storage medium having computer-readable program code embodied therewith for personalizing a search of a search service. The search service comprises a search engine and a search index.

Claims 28 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of King and Chavez and Kussmaul and further in view of Smith and further in view of Petroni.  
Regarding claim 28, the combination of King and Chavez and Kussmaul discloses the elements of the claimed invention as noted but does not disclose wherein the deep learning clustering algorithm comprises program code for grouping similar news articles into clusters using a named-entity-based clustering algorithm; program code for determining a representative news for each cluster; program code for identifying similar clusters based on similarities between the representative news for a cluster pair;
However, Smith discloses:
	Smith [0029] In some embodiments, the fuzzy cohort component 140 can utilize a clustering algorithm to cluster entity stage(s) and/or provenance chain(s) into individual classes based on similarity in order to group similar journeys into a group of similar journeys. For example, the clustering algorithm can be a hierarchical algorithm, a k-means algorithm, a distribution-based algorithm, and/or a density-based algorithm. Similarity can be based upon a user-defined similarity threshold, for example, to group journeys that are 99.00% similar.  
Smith [0030] In some embodiments, the clustering algorithm can utilize weights when clustering similar entity stage(s) and/or provenance chain(s) into particular group(s).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King and Chavez and Kussmaul to obtain above limitation based on the teachings of Smith for the purpose of the fuzzy cohort component 140 utilizing a clustering algorithm to cluster entity stage(s) and/or provenance chain(s) into individual classes based on similarity in order to group similar journeys into a group of similar journeys.  
The combination of King and Chavez and Kussmaul discloses the elements of the claimed invention as noted but does not disclose program code merging the similar clusters.  However, Petroni discloses:
Petroni [0094] If the decision is to merge two similar clusters, continuing onto step 322, the cluster module 128 merges the clusters and stores the merged event detected cluster is stored into cluster data store 143. For example, if social media information is the same as a previously detected event, the social media information is then merged with the previously detected event.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King and Chavez and Kussmaul to obtain above limitation based on the teachings of Petroni for the purpose of merging two similar clusters. 
The combination of King, Chavez, Kussmaul, Smith and Petroni discloses wherein the news articles are clustered iteratively in chronological order within a news cycle.
Smith [0030] In some embodiments, the clustering algorithm can utilize weights when clustering similar entity stage(s) and/or provenance chain(s) into particular group(s). For example, a feature of provenance chain data (e.g., temporal and/or locational) can have an associated weight, with the associated weight based on significance in inferring similarities. That is, features having a higher associated weight are more significant in inferring similarities and features having a lower associated weight are less significant in inferring similarities. A weight of zero is indicative of the associated feature not being utilized for purposes of clustering.

Claim 30 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Kussmaul, Smith and Petroni and further in view of Rubenczyk and further in view of Savir.   
Regarding claim 30, the combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose wherein the named-entity-based clustering algorithm further comprises program code for extracting named entities from news articles using a statistical named entity recognition algorithm; program code for determining similarities between a news article and each existing cluster; program code for determining a closest cluster based on the similarities; responsive to the similarity between the news article and the closest cluster exceeding a similarity threshold, program code for assigning the news article to the closest cluster.  However, Rubenczyk discloses:
	Rubenczyk [0060] At block 203, a cluster is created. The cluster includes the pair that has the highest level of similarity. For example if the pair AX is the pair with the highest level of similarity then a new cluster that is comprised of user A and old cluster X is generated. The highest level of similarity may be the determined by the smallest calculated distance. The process of generating the cluster is explained in greater detail in FIG. 2c. In some embodiments, if there are two or more pairs with higher level of similarity only one pair is chosen.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Rubenczyk for the purpose of choosing a pair with the highest level of similarity.  

 program code for re-determining the representative news for the closest cluster.
The combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose above limitation.  However, Savir discloses:
Savir col 5, lines 20-35 (66) The collaborative filtering module 122 obtains the topic clusters from the clustering module 120 via the topic cluster ingestion module 130 and processes the received information and the topic clusters in the expert similarity and ranking generator 138 to identify the subject matter expert. For example, this may involve determining the topic clusters that are most closely associated with the received information of the current communication and then identifying particular subject matter experts having experience or other expertise that is most similar to the identified topic cluster or clusters. In such an arrangement, the communication is determined to be associated with one or more topic clusters and then the subject matter expert is identified based on the one or more topic clusters.
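It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Savir for the purpose of obtaining the topic of the clusters. 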

Claim 31 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Kussmaul, Smith and Petroni and further in view of Contractor.
Regarding claim 31, the combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose program code for responsive to the similarity between the news article and the closest news cluster not exceeding the similarity threshold, creating a singleton cluster; and program code for assigning the news article to the singleton cluster.  However, Contractor discloses:
	Contractor claim 2. The method of claim 1, further comprising: in response to determining that a number of the retrieved data elements is greater than a threshold, creating the new cluster having a label of the selected feature.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Contractor for the purpose of labelling a cluster. 
Note: Singleton interpreted per paragraph 121 of the specification.   


Claim 32 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Smith, Petroni and Rubenczyk and further in view of Clifton (US 2021/0117867) and further in view of Stowell.
Regarding claim 32, the combination of King, Chavez, Smith, Petroni and Rubenczyk discloses the elements of the claimed invention as noted but does not disclose wherein the deep learning clustering algorithm further comprises the steps of:
program code for determining a lower dimensional representation for the representative news for each cluster using a deep contextualized word representation model
However, Clifton discloses:
Clifton abstract, Methods and apparatus for subtyping subjects based on phenotypic information are disclosed. In one arrangement, a data receiving unit receives a subject data unit for each of a plurality of subjects. Each subject data unit represents a plurality of different phenotypic information items about the subject. A data processing unit uses a deep learning algorithm to derive a lower dimensional representation of each subject data unit and a clustering algorithm to detect clusters  of the resulting lower dimensional representations.  The deep learning algorithm and clustering algorithm are implemented by a single mathematical model in which the derivation of the lower dimensional representations and the detection of the clusters are performed jointly.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni and Rubenczyk to obtain above limitation based on the teachings of Clifton for the purpose of using a deep learning algorithm to derive a lower dimensional representation of each subject data unit and a clustering algorithm to detect clusters of the resulting lower dimensional representations.  


	Stowell [0020] In another example embodiment, the clustering can be performed using a clustering heuristic by calculating a pairwise cosine distance between two vector representations of the plurality of vector representations that have not yet been clustered, merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value, removing the two vector representations from the plurality of vector representations if the pairwise cosine distance is below the threshold value, calculating a cluster vector representation for the cluster as the mean of all vector representations in the cluster, reinserting the cluster vector representation into the plurality of vector representations, and repeating the clustering heuristic for a set number of iterations. The threshold value can be 0.25. The number of iterations can be 3.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Smith, Petroni and Rubenczyk to obtain above limitation based on the teachings of Stowell for the purpose of merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value.
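Note: For illustration only, the clustering heuristic quoted from Stowell [0020] can be sketched as follows. This is a minimal sketch and not Stowell's implementation. It assumes that one "iteration" is a single greedy pass over all pairs in the pool, that vectors are plain lists of numbers, and that a merged cluster's vector is the mean of its original member vectors; the function names are hypothetical. The defaults of 0.25 and 3 are the example values given in the quoted paragraph.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity; treated as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster(vectors, threshold=0.25, iterations=3):
    """Greedy pairwise merging, repeated for a fixed number of passes."""
    # Each pool entry is (representative vector, original member vectors).
    pool = [(v, [v]) for v in vectors]
    for _ in range(iterations):
        used = set()   # indices already merged during this pass
        merged = []    # new cluster entries created during this pass
        for i, j in combinations(range(len(pool)), 2):
            if i in used or j in used:
                continue
            if cosine_distance(pool[i][0], pool[j][0]) < threshold:
                members = pool[i][1] + pool[j][1]
                # Cluster vector is the mean of all vectors in the cluster.
                merged.append((mean_vector(members), members))
                used.update((i, j))
        # Remove merged entries and reinsert the cluster vectors into the pool.
        pool = [p for k, p in enumerate(pool) if k not in used] + merged
    return [members for _, members in pool]


if __name__ == "__main__":
    groups = cluster([[1, 0], [0.95, 0.05], [0, 1], [0.05, 0.95]])
    print(len(groups))
```

Because the quoted paragraph is stated in terms of cosine distance while the claim language is stated in terms of cosine similarity, note that a distance below a threshold t corresponds to a similarity above 1 - t under the definition used in this sketch.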

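Claim 33 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Kussmaul, Smith and Petroni and further in view of Stowell and further in view of Gupta and further in view of Savir.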

Regarding claim 33, the combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose responsive to the pairwise cosine similarity of the cluster pair.  However, Stowell discloses: 
Stowell [0020] In another example embodiment, the clustering can be performed using a clustering heuristic by calculating a pairwise cosine distance between two vector representations of the plurality of vector representations that have not yet been clustered, merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value, removing the two vector representations from the plurality of vector representations if the pairwise cosine distance is below the threshold value, calculating a cluster vector representation for the cluster as the mean of all vector representations in the cluster, reinserting the cluster vector representation into the plurality of vector representations, and repeating the clustering heuristic for a set number of iterations. The threshold value can be 0.25. The number of iterations can be 3.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Stowell for the purpose of merging the two vector representations into a cluster if the pairwise cosine distance is below a threshold value. 
The combination of King, Chavez, Kussmaul, Smith and Petroni does not disclose: being greater than or equal to a pre-specified threshold, merging the cluster pair together into a merged cluster.  However, Gupta discloses:
	Gupta claim 17, The computing device as described in claim 14, wherein the operations further include determining additional similarity scores based on comparing facial representations of faces in the group with at least one facial representation of faces in the other group; and wherein the merging is based on at least a threshold percentage of the additional similarity scores being at or above a merging threshold. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Gupta for the purpose of merging when at least a threshold percentage of the additional similarity scores is at or above a merging threshold. 

and re-determining the representative news of the merged cluster.
Savir col 5, lines 20-35 (66) The collaborative filtering module 122 obtains the topic clusters from the clustering module 120 via the topic cluster ingestion module 130 and processes the received information and the topic clusters in the expert similarity and ranking generator 138 to identify the subject matter expert. For example, this may involve determining the topic clusters that are most closely associated with the received information of the current communication and then identifying particular subject matter experts having experience or other expertise that is most similar to the identified topic cluster or clusters. In such an arrangement, the communication is determined to be associated with one or more topic clusters and then the subject matter expert is identified based on the one or more topic clusters.

Claim 34 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Kussmaul, Smith and Petroni and further in view of Joshi and further in view of Li.
Regarding claim 34, the combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose wherein clustering the news clusters to form the set of ranked clusters further comprises: ranking news articles within each cluster sequentially by publication date.  However, Joshi discloses: 
Joshi [0057] The ranking component 608 may be configured to rank the set of filtered content items to create a ranked set of filtered content items (e.g., ranked from most relevant to least relevant; ranked based upon popularity in a category; ranked based upon the number of “likes” a content item received on a social network, etc.). The filtered content items may be ranked based upon a ranking metric. The ranking metric may be configured to rank the set of filtered content items based upon at least one of a user interest (e.g., favorite actor, favorite director, favorite genera, etc.), a publication date (e.g., ranking the most recent television content items first), or the global popularity of a content item. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Joshi for the purpose of ranking filtered content items based on publication date.  

The combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose news ranking score to form the set of ranked clusters.  However, Li discloses:
	Li [0072] If at step 1106, the link-based approach does not result in a cluster correspondence determination, the method proceeds to step 1108, where the posting is processed to determine whether or not it refers to a same event referenced by an existing cluster of postings using a semantic class-based approach by the semantic class-based clustering module 132. At step 1110, if the semantic class-based approach results in a cluster correspondence determination, then the method proceeds to step 1112, where, as discussed above, the existing cluster of postings is modified to include the social media posting in the cluster database 136. If at step 1110, the semantic class-based approach also does not result in a cluster correspondence determination, the method proceeds to step 1114, where a new cluster of postings is created to include the social media posting in the cluster database 136. At step 1116, a novelty score is calculated for the new cluster.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Li for the purpose of creating a novelty score for the new cluster.   

Claim 35 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Kussmaul, Smith and Petroni and further in view of Ho (US 2018/0196808) and further in view of Chen (US 10,496,691) and further in view of Xu (US 11,244,115).   
Regarding claim 35, the combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose ranking clusters sequentially by update date.  However, Ho discloses: 
Ho [0038] In some embodiments, document database 120 stores supplemental information (e.g., metadata) concerning various documents in the document database. A non-exhaustive set of examples of such information includes document identifier (document ID), author, access control list, document size, timestamps (e.g., timestamps for one or more of creation date, revision history, last updated time, last accessed time, etc.), and document type (e.g., word processor document, spreadsheet, presentation file, etc.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Ho for the purpose of storing supplemental information (e.g., metadata) concerning various documents in the document database.  

The combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose clustering ranking score.  However, Chen discloses:
Chen col 13, lines 30-45, FIG. 4 illustrates a flow diagram of an example process 400 for scoring possible clusters, in accordance with disclosed subject matter. Process 400 may be performed during a clustering process and is not dependent on the clustering process selected. In generating a cluster score, the system may first calculate a coverage score that measures the number of top-rated or popular search items in each cluster (405). In one implementation, top-rated search items may be the items that are ultimately downloaded/purchased after a query. In one implementation, top-rated search items may be the items with a highest relevancy or user-provided ranking. In one implementation, the coverage score may be a percentage of the items in the cluster that are considered top-ranking. A higher score thus indicates high coverage (a better indication of quality). The system generates a respective cluster score for each cluster in the cluster result (e.g., a clustering level).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petroni to obtain above limitation based on the teachings of Chen for the purpose of ranking such that a higher score thus indicates high coverage (a better indication of quality).

The combination of King, Chavez, Kussmaul, Smith and Petroni discloses the elements of the claimed invention as noted but does not disclose cluster size.  However, Xu discloses:  
	Xu claim 4, The system as set forth in claim 2, wherein each cluster has a size, and wherein the one or more processors further perform operations of: ranking the clusters by size;  and for one or more of the largest clusters, issuing an alert related to vehicle component failure data in the one or more largest clusters.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith and Petronic to obtain the above limitation based on the teachings of Xu for the purpose of ranking the clusters by size.

Claim 36 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Kussmaul, Smith and Petronic and further in view of Gupta. 

	Gupta claim 17, The computing device as described in claim 14, wherein the operations further include determining additional similarity scores based on comparing facial representations of faces in the group with at least one facial representation of faces in the other group; and wherein the merging is based on at least a threshold percentage of the additional similarity scores being at or above a merging threshold. 
Note: See claim 7, for motivation statement.  

Claim 39 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of King, Chavez, Kussmaul, Smith, and Petronic and further in view of Trevisiol (US 2016/0299989), Nagaraj (US 2012/0207075), Geng (US 9,276,956), Baughman (US 2021/0034784) and Wang (US 2018/0260484).
Regarding claim 39, the combination of King, Chavez, Kussmaul, Smith, and Petronic discloses the elements of the claimed invention as noted but does not disclose prior to clustering the news articles, removing irrelevant news based on a popularity of the news source of the news article, wherein removing irrelevant news based on the popularity of the news source of the news article further comprises:
Regarding claim 13, the combination of King, Chavez, Kussmaul, Smith, and Petronic discloses the elements of the claimed invention as noted but does not disclose the above limitation.  However, Trevisiol discloses:

	Trevisiol [0041] In the illustrated example, the two most common source categories are “search” and “social”. The “search” source category includes sites where users are able to submit image search and navigational queries. The “social” source category includes social network websites, such as Facebook, which constitute very popular access points since users are highly interested in photos shared by friends. The fact that many sessions come from the news domain is indicative that an image is often considered as appealing or significant as the actual text of the article.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith, and Petronic to obtain the above limitation based on the teachings of Trevisiol for the purpose of accessing a popular news domain.

including grouping news articles by news sources  
The combination of King, Chavez, Kussmaul, Smith, and Petronic discloses the elements of the claimed invention as noted but does not disclose the above limitation.  However, Nagaraj discloses:
	Nagaraj [0003]  Such bundling of different datagram packets (e.g., UDP packets) within conglomerate files for broadcast may be organized according to the type of information or application (e.g., grouping news source UDP packets within a conglomerate news file for broadcast), or according to demographics (e.g., grouping UDP packets for applications of interest to children in a conglomerate file for broadcast), for example. Files and datagram packets for broadcast and/or reception may be advertised in advance by the broadcast network.
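It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith, and Petronic to obtain the above limitation based on the teachings of Nagaraj for the purpose of grouping news sources.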

extracting domains from uniform resource locators (URLs) of the news articles; 
The combination of King, Chavez, Kussmaul, Smith, and Petronic discloses the elements of the claimed invention as noted but does not disclose the above limitation.  However, Geng discloses:
	Geng abstract A method for detecting a phishing website includes extracting a domain name from a target URL of a web page under investigation
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith, and Petronic to obtain the above limitation based on the teachings of Geng for the purpose of extracting a domain name from a URL.

for each news source, sequentially looking up the domains in a database of website popularity by frequency until a match is made; 
The combination of King, Chavez, Kussmaul, Smith, and Petronic discloses the elements of the claimed invention as noted but does not disclose the above limitation.  However, Baughman discloses:
	Baughman [0049] Processing proceeds to operation S275 where, responsive to the determination at operation S270 that the first article data set 304A is a real world article, labelling mod 308 adds a tag to the first article data set to label the article as being a real world 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith, and Petronic to obtain the above limitation based on the teachings of Baughman for the purpose of analyzing the popularity of a sports domain.

and removing news articles published by news sources that rank below a popularity threshold in the database of website popularity.
The combination of King, Chavez, Kussmaul, Smith, and Petronic discloses the elements of the claimed invention as noted but does not disclose the above limitation.  However, Wang discloses:
	Wang [0024] Optionally or alternatively, division of a plurality of pieces of news into a plurality of news clusters further includes the following steps: (a) a K-means clustering operation is performed on the K1 news cluster; (b) performing a filtering process on the K1 news cluster, where the filtering process comprises at least one of the following operations: (i) removing the news in each news cluster that has a centroid similarity with the news cluster lower than the third threshold THs2, and (ii) removing the news cluster that has a number of news pieces (M2) fewer than the fourth threshold THm2.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of King, Chavez, Kussmaul, Smith, and Petronic to obtain the above limitation based on the teachings of Wang for the purpose of removing a news cluster that is below a threshold.
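
For illustration only, the sequence of limitations of claim 39 discussed above (grouping by news source, extracting domains from URLs, sequential lookup by frequency until a match, and removal below a popularity threshold) can be read as the following minimal sketch. It is not taken from the application or from any cited reference. The function names, the `source` and `url` fields, the rank table, the reading of "by frequency" as "most frequent domain first", and the choice to drop sources with no match are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def extract_domain(url):
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def remove_unpopular_sources(articles, popularity_rank, max_rank):
    """Keep only articles whose news source meets the popularity threshold.

    articles: list of dicts with hypothetical 'source' and 'url' keys.
    popularity_rank: hypothetical table, domain -> rank (1 = most popular).
    max_rank: threshold; sources ranked worse than this are removed.
    """
    # Group news articles by news source.
    by_source = defaultdict(list)
    for article in articles:
        by_source[article["source"]].append(article)

    kept = []
    for group in by_source.values():
        # Extract domains from the URLs and order them by frequency.
        counts = Counter(extract_domain(a["url"]) for a in group)
        rank = None
        # Look the domains up one at a time until a match is made.
        for domain, _ in counts.most_common():
            if domain in popularity_rank:
                rank = popularity_rank[domain]
                break
        # Assumption: a source with no match is treated as below threshold.
        if rank is not None and rank <= max_rank:
            kept.extend(group)
    return kept
```

In this reading the popularity database is a plain in-memory mapping; a ranked list of websites held in a real database would be queried at the same point in the loop.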

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETIENNE PIERRE LEROUX whose telephone number is (571)272-4022. The examiner can normally be reached Monday through Friday, 8:00 am to 4:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz can be reached on 571 272 4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/ETIENNE P LEROUX/
Primary Examiner, Art Unit 2161