DETAILED ACTION
The applicant’s request for continued examination regarding application number 16/013,162, filed June 20, 2018 has been entered.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Response to Amendments
The amendment filed February 14, 2022 has been entered. Examiner acknowledges receipt of Amendments to Application 16/013,162, which include: Amendments to the Claims, and Remarks containing applicant’s amendments. 
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner has acknowledged Claims 1, 17, and 20 have been amended. Claims 1-20 remain pending in the application. 

Response to Arguments
Examiner acknowledges receipt of Arguments to Application 16/013,162, which include: Remarks containing Applicant’s arguments. 
Regarding Applicant’s Remarks for Claims 1-4 and 8-20 under 35 U.S.C. 103 as being unpatentable over Kumar et al., U.S. PGPUB 2017/0061286, published 3/2/2017 [hereafter referred as Kumar] in view of Polikar, Robi, Ensemble Based Systems in Decision Making, IEEE Circuits and Systems Magazine, Third Quarter 2006 [hereafter referred as Polikar]; and for Claims 5-7 under 35 U.S.C. 103 as being unpatentable over Kumar in view of Polikar as applied to Claim 3, in further view of Arpat et al., U.S. PGPUB 2015/0074020, published 3/12/2015, incorporating by reference in its entirety: Examiner acknowledges Applicant’s arguments, has considered them, and has found them to be not persuasive.
Regarding Applicant’s Remarks on pp.12-13:
“Claim 1 has been amended to recite: 
obtaining the prediction results of the respective classifiers of the user's comments based on respective subject feature word expressions of the user's comments, the subject feature word expression corresponding to the quality or the author of the article, or to an event or an entity in the article. 
Supports for the amendments may be found throughout the specification. For instance, supports for the amendments may be found in paragraph [0162] of the specification. 
Kumar discloses a system and method for generating a recommendation system based on supervised learning, which includes selecting a subset of features and a subset of rows in a master dataset, and building a first model based on a first dataset and a supervised learning method, wherein the first dataset is restricted to the subset of features and the subset of rows in the master dataset. Kumar further discloses generating a prediction of a user response of the first user to a set of candidate items based on the first model, and generating a recommendation of a first candidate item based on the prediction.
Kumar mentions a data collection module collects the topic, author, and comments of an article for generating a prediction of a user response. However, Kumar does not disclose or suggest the feature of "obtaining the prediction results of the respective classifiers of the user's comments based on respective subject feature word expressions of the user's comments," which "correspond[s] to the quality or the author of the article, or to an event or an entity in the article" as recited in claim 1 as currently amended. According to the scheme of claim 1, since the subject feature word expression corresponds to the quality or the author of the article, or to an event or an entity in the article, the obtained prediction results of the respective classifiers of the user's comments are distinguishable between a user's comment corresponding to the quality or the author of the article and that corresponding to an event or an entity in the article.”
Examiner has considered this argument, and finds the argument to be not persuasive. Examiner notes that the Applicant’s arguments are directed to the newly-added limitation that is appended to the last claim limitation in the independent claims (“obtaining the prediction results of the respective classifiers of the user’s comments based on respective subject feature word expressions of the user’s comments, the subject feature word expression corresponding to the quality or the author of the article, or to an event or an entity in the article.”). Under its broadest reasonable interpretation, the ‘or’ defined in this claim limitation is used for linking a list of alternatives, where there are now four alternatives presented with respect to the subject feature word expression: “the quality … of the article”; “the author … of the article”; “an event … in the article”; “an entity in the article”, where a listing of alternatives requires only that one of the four alternatives be present for the claimed invention to function. Examiner also notes that p.26 2nd paragraph of the applicant’s specification (applicant’s publication U.S. PGPUB 2018/0365574 paragraph [0162]) is the only location in the specification where the terms “an event” and “an entity” in an article are broadly recited, without any further details that limit what is considered an event or entity in an article, thus resulting in an interpretation where any type of action, event, or activity present in text form can correspond to “an event”, and where any type of item, place, location, or identifier present in text form can correspond to “an entity”.
Examiner also notes that Applicant has acknowledged that the Kumar reference teaches a data collection module that collects the topic, author, and comments of an article for generating a prediction of a user response, which was indicated in the Non-Final Office Action mailed November 30, 2021, where Kumar teaches using a text analytics module to featurize the textual data associated with items and/or users, including information pertaining to an author or creator of an item, where the information of the author or creator is provided in a review (Kumar Figure 2, elements 220, 222; [0061]-[0062]; and [0067]-[0068]). Kumar [0067]-[0068] also further teaches the data collection module obtaining additional textual information, including the popularity of items created by the creator or author: “… data collection module 220 obtains author or creator information associated with an item from a server or service and generates item data … the information about the creator could include the popularity of items created or posted by the creator (e.g., in terms of one or more of views, likes, purchases, and/or reviews provided on the server or service or a third party server or service), genres of other items from the same creator, and/or other information pertaining to an author or creator of an item …”, where the identification of items (or genres of other items) created or posted by an author or creator, and the corresponding popularity actions associated with those items created or posted by an author or creator are considered forms of “an entity” and “an event” in an article, respectively. Hence, given the above evidence, the Applicant’s arguments are not persuasive, and thus the prior art rejection is maintained.
Regarding Applicant’s Remarks on p.13:
“Polikar discloses an automatic decision-making application that combines a personal opinion, a second opinion and even a third opinion to arrive at the most sensible final decision. Polikar discusses the combination of classifiers in an ensemble-based algorithm that is conducive to multiple classifiers, classifier fusion, classifier selection, classifier diversity, incremental learning and data fusion. Polikar discusses two key components of ensemble systems: an algorithm to generate the individual classifiers of the ensemble and a method for combining the outputs of these classifiers. Polikar also describes classifier fusion and classifier selection in the decision- making algorithm. However, Polikar fails to disclose or suggest the subject feature word expression "corresponding to the quality or the author of the article, or to an event or an entity in the article" as recited in claim 1. Therefore, Polikar cannot obtain prediction results of the respective classifiers of the user's comments which are distinguishable between a user's comment corresponding to the quality or the author of the article and that corresponding to an event or an entity in the article.
In view of the foregoing, the cited references, either considered individually or in combination, do not teach or suggest the technical features and solution of claim 1 as currently amended. Therefore, claim 1 is in condition for allowance.”
Examiner has considered this argument, and finds the argument to be not persuasive.
As indicated in the Non-Final Office Action mailed November 30, 2021, the Kumar reference is used to teach the earlier limitation “obtaining the prediction results of the respective classifiers of the user's comments based on respective subject feature word expressions of the user's comments, the subject feature word expression corresponding to the quality or the author of the article …”, as well as the newly appended limitation “… or to an event or an entity in the article”, as shown in the response to Applicant’s earlier argument. As indicated in the Non-Final Office Action mailed November 30, 2021, the Polikar reference is used to teach the remaining limitations (“processing … each of the plurality of classifiers to obtain a prediction result of the classifier, the prediction result comprising respective results of giving up vote …” and “predicting … according to the prediction results of the respective classifiers and a predetermined weight of the respective classifiers …”), both of which are directed to combining multiple classifier models to produce a combined prediction. A motivation to combine both the Kumar and Polikar references is identified in the Polikar reference, where using boosting algorithms such as AdaBoost.M1 improves the prediction performance of a system by taking a diverse set of multiple classifier models and using them to cross-train each other to produce a stronger combined prediction (Polikar p.40 col.1 1st-4th paragraphs). Hence, given the above evidence, the Applicant’s arguments are not persuasive, and thus the prior art rejection is maintained.

Information Disclosure Statement
The non-patent literature document in the following information disclosure statement was not considered due to the following reason:
IDS 10/13/2021: Search Report of Chinese Application no. 2017104695427 dated May 25, 2021, 3 pages (no English translation). While it appears that this search report was filed on May 25, 2021 (as this is the date shown on page 3), the majority of the search report is written in Chinese (including the search summary and table headings), thus making it difficult to consider its relevance to the applicant’s disclosure. If applicant wishes to have this search report further considered, applicant is advised to provide a translation of the search summary and table headings for this document.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1-4 and 8-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kumar et al., U.S. PGPUB 2017/0061286, published 3/2/2017 [hereafter referred as Kumar] in view of Polikar, Robi, Ensemble Based Systems in Decision Making, IEEE Circuits and Systems Magazine, Third Quarter 2006 [hereafter referred as Polikar].
Regarding amended Claim 1, 
Kumar teaches
(Currently Amended) A computer-implemented method for recognizing a low-quality article based on artificial intelligence (Examiner’s note: Under its broadest reasonable interpretation, an article’s quality is interpreted to be a measurement of the article’s inherent appeal, which can be measured in various ways, such as its popularity, its interestingness, or its relevance to a reader; hence a “low-quality” article can be interpreted as an article having low popularity (e.g., unpopular), having non-interesting content, or containing non-useful information. Kumar teaches a recommendation system that supports various applications, including news-related applications (corresponding to “news-recommending system”). Kumar also teaches in Kumar [0069] a data collection module within a recommendation server collecting items and their respective item content data including an item topic, with the popularity of an item measured based on user feedback such as view count, number of likes and dislikes, and with an item’s popularity contributing towards a recommendation rating for an item (Kumar [0022]: “A system and method for generating a recommendation system using supervised learning is described.”; [0002]: “Recommendation systems are applied in a variety of applications … to recommend movies, music, restaurants, books, news, and various other products for user consumption.”; and [0108]: “… popularity is measured in terms of the number of overall positive interactions such as purchases or likes, or the current rate of positive interactions.”).),
wherein the method comprises: 
obtaining a user feedback behavior feature of a to-be-recognized article in a news-recommending system (Kumar Figure 1, elements 102, 108, 116; Kumar Figure 2, elements 102, 220, 212: examiner’s note: Kumar teaches a recommendation server for running a news-related application (corresponding to “news-recommending system”) interacting with an online service and items from an item server (an item being a news article, with item data being a feature from the news article) (corresponding to “a to-be-recognized article”) to create supervised learning models that recognize an item’s quality and perform recommendations. Kumar also teaches a data collection module (Kumar [0066], [0069]) within the recommendation server (running a news-related application) obtaining user data attributes and user feedback (corresponding to “a user feedback behavior feature of a to-be-recognized article”), including comments, user shares, likes, dislikes, favorites, and actions. Kumar further teaches corresponding method steps (Kumar Figure 6, element 606, Kumar [0120]-[0121]) for using a data collection module to collect user-item interaction data and featurizing this data, thus corresponding to “obtaining a user feedback behavior feature of a to-be-recognized article” (Kumar [0030]: “… the recommendation server 102 provides services to a data analysis customer by receiving and processing information from the plurality of resources or devices 108, 110, and 114 to create predictive models and, in some instances, generate recommendations based on those models. 
… the recommendation server 102 provides the predictive model to the item server 108 for use in generating item recommendations for users subscribed to the online service 116 hosted by the item server 108.”; [0066]: “…the data collection module 220 collects user data attributes by virtue of users interacting with an application or browser accessing the item server 108 on a client device 114, filling out surveys, publicly known information about the user, etc. For example, the data collection module 220 groups user data as and/or include … (3) … number of positive interactions, number of negative interactions, last five interactions, engagement rate by time of day, user's active applications, number of visits in the last month, week, or day, average interaction time over a time period, etc., (4) user feedback, such as comments, shares, likes, dislikes, favorites, actions, etc., and so forth.”; and [0120]-[0121]: “At block 604, the data collection module 220 collects item data for one or more items, which may occur in the same or similar way to or along with the collection of user data discussed above. The data collection module 220 and/or the data preparation module 226 may augment or featurize the item data to describe items or similarity between items, as described elsewhere herein … At block 606, the data collection module 220 collects user-item interaction data for one or more users and items, which may occur in a similar way to or along with the collection of user data and/or item data discussed above. … the storage device 212 may already contain user data and item data, but the data collection module 220 updates the interaction data to include an interaction of the user with the item (e.g., as received, or, in some instances, as the interaction occurs).”).), 
the user feedback behavior feature being in multiple forms and comprising user’s comments and non-user’s comments which refer to user feedback other than the user’s comments (Kumar Figure 2, element 220: examiner’s note: Kumar teaches a data collection module performing gathering of various forms of user and interaction data, including user’s comments as well as non-user’s comments (user feedback such as likes, dislikes, shares, actions) (Kumar Figures 4, 5 and [0071]-[0075]; [0080]-[0082]; and [0066]-[0068]: “… user feedback, such as comments, shares, likes, dislikes, favorites, actions, etc., and so forth … the data collection module 220 obtains user comments, such as comments on an item, and comment features (e.g., metadata) from a server or service… the data collection module 220 obtains author or creator information associated with an item from a server or service … could include the popularity of items created or posted by the creator (e.g., in terms of one or more of views, likes, purchases, and/or reviews provided…), genres of other items by the same creator, and/or other information pertaining to an author or creator of an item …”).);
according to the user feedback behavior feature of the to-be-recognized article and a predetermined low-quality article recognition model, recognizing whether the to-be-recognized article is a low-quality article, the predetermined low-quality article recognition model comprising a plurality of classifiers, each of which corresponding to one of the user feedback feature (Kumar Figure 2, element 234a: examiner’s note: As indicated earlier, Kumar teaches in Kumar [0069] a data collection module within a recommendation server collecting items and their respective item content data including an item topic, with the popularity of an item measured based on user feedback such as view count, number of likes and dislikes, and with an item’s popularity contributing towards a recommendation rating for an item. In the context of a news recommendation service (Kumar [0002],[0022]), these items correspond to a news article, and a news article’s measured popularity and its corresponding recommendation rating are valid measurements for establishing an item’s “quality”, allowing for recognizing whether the to-be-recognized article is a low-quality article or a non-low-quality article. 
Kumar further teaches a supervised learning module creating multiple supervised learning models that are associated with subsets of user and interaction data (Kumar [0104]: “… the supervised learning module 234a creates multiple models for each supervised learning method and/or on different subsets of original or overall dataset (e.g., different subsets of user data, subsets of item data or subsets of interactions data).”), where the user and interaction data includes the user feedback behavior information (Kumar [0066]-[0068]; Figures 4, 5 and [0071]-[0075]; and [0080]-[0082]), and each supervised learning model can be a different type of supervised learning model, including a linear model or a gradient boosted model for generating predictions, with each of these multiple supervised models generating predictions corresponding to “a classifier”. This supervised model generation is taught in Kumar Figure 11, [0131]-[0133], where a model generation module is triggered to select the supervised learning methods to train and build these supervised learning models (Kumar [0090]-[0092]), which are further used to perform predictions of a user response (Kumar [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”, where user responses including likes or dislikes are valid measurements of popularity and hence of an item’s “quality”), thus corresponding to “according to the user feedback behavior feature of the to-be-recognized article and a predetermined low-quality article recognition model, recognizing whether the to-be-recognized article is a low-quality article, …”. 
Kumar further teaches combining the multiple supervised learning models to generate a final prediction using weighted averaging, where these examples of combining multiple supervised learning models (and their respective predictions) into a larger combined model encompassing the individual supervised learning models corresponds to “… the predetermined low-quality article recognition model comprising a plurality of classifiers, each of which corresponding to one of the user feedback feature” (Kumar [0104]-[0105]: “ … the supervised learning module 234a creates multiple models for each supervised learning method and/or on different subsets of original or overall dataset … multiple models can be created and their results can be combined using … weighted averaging, … the supervised learning module 234a optimizes a quantity of gradient boosted models to be combined by, for example, generating different numbers of datasets from the original dataset as described above, combining the models created for each dataset, and comparing the accuracies obtained for the different numbers of models … the supervised learning module 234a selects and trains multiple models on each sample dataset or subset dataset and then combines the models by a simple averaging approach, which would allow each model to be an expert on a different subset dataset that is restricted in the overall dataset or master dataset … For instance, the supervised learning module 234a first creates a support vector machine, a gradient boosted model, and a linear model, and then creates a final model that takes the predictions of each these models as inputs together with the original inputs and the final model predicts the outputs.”).);
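For illustration only, and not drawn from Kumar itself: the weighted-averaging combination of several models' predictions discussed above can be sketched in Python as follows. The function names, the convention that each score is a 0-to-1 estimate that an item is low quality, and the 0.5 threshold are all hypothetical assumptions chosen for this sketch.

```python
from typing import Sequence


def weighted_average_prediction(scores: Sequence[float],
                                weights: Sequence[float]) -> float:
    """Combine per-classifier scores into one score by weighted averaging.

    Assumption: each score is a value between 0 and 1 produced by one
    classifier, and each weight is a non-negative number for that classifier.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean: sum(score * weight) / sum(weight)
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


def is_low_quality(scores: Sequence[float],
                   weights: Sequence[float],
                   threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: flag the item when the combined score
    reaches the (assumed) threshold."""
    return weighted_average_prediction(scores, weights) >= threshold
```

With scores of 1.0 and 0.0 and weights of 3.0 and 1.0, the combined score is 0.75, so the more heavily weighted classifier dominates the final decision.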
wherein said recognizing whether the to-be-recognized article is a low-quality article comprising:
processing the corresponding user feedback behavior feature using each of the plurality of classifiers (Examiner’s note: As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes generating subsets of features from a master dataset that includes user data, item data, and user-item interaction data (Kumar Figure 11, steps 1102 and 1104, where the user data, item data, and user-item interaction data are taught in Kumar Figures 3, 4, and 5 and in Kumar [0066]-[0068], [0071]-[0075], [0080]-[0082], thus corresponding to “user feedback behavior feature”) and using this subset of input feature data as input into a model generation module to select supervised learning methods from a supervised learning module (Kumar Figure 2, elements 232, 234a) to train and build supervised learning models based on this subset of input feature data (Kumar [0090]-[0092] and Kumar Figure 11, steps 1106 and 1108, where each of the generated supervised learning models corresponds to “a classifier” as taught in Kumar [0105]). Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where these predictions are provided to a recommendation server to generate a recommendation of a candidate item based on the prediction, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). 
Kumar further teaches that additional supervised learning models can be created from these subsets of data (Kumar Figure 11, steps 1120-1122, and steps 1106-1122 repeatedly) until the supervised learning module determines that no more feature subsets of data can be created, thus halting the process of model building. Hence this process of forming feature subsets of datasets and using them to train and build supervised learning models corresponds to “processing the corresponding user feedback behavior feature using each of the plurality of classifiers”.), …
predicting whether the to-be-recognized article is the low-quality article, according to the prediction results of the respective classifiers and a … weight of the respective classifiers (Kumar Figure 2, element 234a: examiner’s note: Kumar teaches the supervised learning module selecting and training multiple supervised learning models and combining the models to generate a final prediction using weighted averaging, which combines multiple supervised learning models (and their respective predictions) into a larger combined model encompassing the individual supervised learning models. Thus, the process of combining supervised learning models and their respective individual predictions into an aggregate prediction through weighted averaging corresponds to a method of “predicting whether the to-be-recognized article is the low-quality article, according to the prediction results of the respective classifiers and a … weight of the respective classifiers” (Kumar [0104]-[0105]).); and
obtaining the prediction results of the respective classifiers of the user’s comments based on respective subject feature word expressions of the user’s comments, the subject feature word expression corresponding to the quality or the author of the article (Kumar Figure 2, element 220, 222: examiner’s note: The ‘or’ defined in this claim limitation is interpreted as an exclusive ‘or’, indicating that either a subject feature word expression corresponding to the quality of the article or the author of the article is required to be present for the claimed invention, or corresponding to an event in the article or an entity in the article is required to be present for the claimed invention. Kumar teaches a data collection module containing a text analytics module that performs text analysis and feature extraction on user’s comments using natural language processing and bag-of-words methods (Kumar [0061]-[0062]), such that the text analytics module featurizes the textual data associated with items and/or users, including information pertaining to an author or creator of an item, where the information of the author or creator is provided in a review (Kumar [0067]-[0068]: “The data collection module 220 instructs the text analytics module 222 to generate text features from the description text and title, for example, vector space representation of the description text and title and stores it as item data. … the data collection module 220 obtains user comments, such as comments on an item, and comment features (e.g., metadata) from a server or service. The data collection module 220 generates item data from the comments and comment features. For example, the item data may include the number of comments, vector space representations of text comments (generated by text analytics module 222), sentiment features generated from the text comments using natural language processing, etc. 
… the data collection module 220 obtains author or creator information associated with an item from a server or service and generates item data … the information about the creator could include the popularity of items created or posted by the creator (e.g., in terms of one or more of views, likes, purchases, and/or reviews provided on the server or service or a third party server or service), genres of other items by the same creator, and/or other information pertaining to an author or creator of an item …”, thus corresponding to “… subject feature word expressions of the user’s comments, the subject feature word expression corresponding to … the author of the article”). Kumar further teaches that this information is provided as part of the user data, item data, and user-item interaction data taught in Kumar Figures 3, 4, and 5 and in Kumar [0066]-[0068], [0071]-[0075], [0081]-[0082], and is provided to the flow taught in Kumar Figure 11 to generate a subset of input feature data, each of which trains and builds a plurality of classifiers to produce prediction results and recommendations, thus corresponding to “obtaining the prediction results of the respective classifiers of the user’s comments based on respective subject feature word expressions of the user’s comments, the subject feature word expression corresponding to … the author of the article …”.), or 
to an event or an entity in the article (Examiner’s note: The ‘or’ defined in this claim limitation is interpreted as an exclusive ‘or’, indicating that either a subject feature word expression corresponding to the quality of the article or the author of the article is required to be present for the claimed invention, or corresponding to an event in the article or an entity in the article is required to be present for the claimed invention. Under its broadest reasonable interpretation, the terms “an event” and “an entity” in an article are broadly recited, thus resulting in an interpretation where any type of action, event, or activity identified in text can correspond to “an event”, and where any type of item, place, or location identified in text can correspond to “an entity”. As indicated earlier, Kumar teaches a data collection module containing a text analytics module that performs text analysis and feature extraction on user’s comments using natural language processing and bag-of-words methods, such that the text analytics module featurizes the textual data associated with items and/or users, including item information (and genres of other items) pertaining to an author or creator, and the corresponding popularity actions associated with the item pertaining to an author or creator, where these item listings/genres of other items and popularity actions represent an entity and an event in the article, respectively (Kumar [0061]-[0062]; [0067]-[0068]). 
Kumar further teaches that this information is provided as part of the user data, item data, and user-item interaction data taught in Kumar Figures 3, 4, and 5 and in [0066]-[0068], [0071]-[0075], [0081]-[0082], and is provided to the flow taught in Kumar Figure 11 to generate a subset of input feature data, each of which trains and builds a plurality of classifiers to produce prediction results and recommendations, thus corresponding to “obtaining the prediction results of the respective classifiers of the user’s comments based on respective subject feature word expressions of the user’s comments, the subject feature word expression corresponding … to an event or an entity in the article.”).
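For illustration only, and not drawn from Kumar itself: a bag-of-words featurization of user comments, of the general kind referred to above, can be sketched in Python as follows. The tokenizer rule, the function names, and the sample comments are hypothetical assumptions made for this sketch and do not reflect any particular module in the cited reference.

```python
import re
from collections import Counter
from typing import List, Sequence, Tuple


def tokenize(text: str) -> List[str]:
    """Assumed tokenizer: lowercase the text and keep runs of letters,
    digits, and apostrophes as tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(comments: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Return a sorted vocabulary and one term-count vector per comment."""
    vocabulary = sorted({tok for c in comments for tok in tokenize(c)})
    vectors = []
    for comment in comments:
        counts = Counter(tokenize(comment))
        # One count per vocabulary word, in vocabulary order
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical comments about an article and its author
vocab, vecs = bag_of_words(["Great author", "Bad article, bad author"])
```

Each comment becomes a fixed-length vector of word counts, which is one simple form of vector space representation that a downstream classifier could consume.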
While Kumar teaches combining multiple classifiers through combination methods including weighted averaging, and the storing of the respective learned parameter settings (i.e., weights) in memory, Kumar does not explicitly teach
processing … each of the plurality of classifiers to obtain a prediction result of the classifier, the prediction result comprising respective results of giving up vote …
predicting … according to the prediction results of the respective classifiers and a predetermined weight of the respective classifiers …
Polikar teaches
processing … each of the plurality of classifiers to obtain a prediction result of the classifier, the prediction result comprising respective results of giving up vote (Polikar p.28 Figure 5: examiner’s note: Referring to Polikar Figure 5 (Test: Simple Majority Voting), Polikar teaches simple majority voting, where an ensemble containing classifiers $h_1$, …, $h_t$ each contribute to a vote $v_t$, where a value of 1 indicates that the classifier picks a certain class $\omega_j$, and a value of 0 indicates otherwise. Polikar teaches that these individual votes from each of the plurality of classifiers in the ensemble are summed together based on class $\omega_j$, with the decision outcome based on determining the class that receives the highest total vote, and using that determination to determine the final classification for the ensemble. The representation of a vote from a classifier being 0 corresponds to the respective classifier not contributing to the overall classification, which corresponds to “processing … a classifier to obtain a prediction result of the classifier, the prediction result comprising … respective result of giving up vote”. When combined with the teachings of Kumar, the final classification that is based on the sum of each of these votes generated from a plurality of classifiers based on class corresponds to “processing … each of the plurality of classifiers to obtain a prediction result of the classifier, the prediction result comprising respective results of giving up vote”. Polikar further teaches that this concept of simple majority voting is not necessarily tied to bagging algorithm but can be used in the context of performing majority voting and its extension weighted majority voting to determine the final classification result from the ensemble of a plurality of classifiers (Polikar pp.34-35 Section 4.1 Combining Class Labels and Section 4.1.1 Majority Voting; and Section 4.1.2 Weighted Majority Voting 1st paragraph: “… let us denote the decision of hypothesis $h_t$ on class $\omega_j$ as $d_{t,j}$, such that $d_{t,j}$ is 1, if $h_t$ selects $\omega_j$ and 0, otherwise.”).) …
predicting … according to the prediction results of the respective classifiers and a predetermined weight of the respective classifiers (Polikar p.31, Figure 9: examiner’s note: Referring to Polikar Figure 9, Polikar teaches $h_t$ (1 < k < T), where $h_t$ represents a different classifier with different weights determined by their respective logarithm of training error $1/\beta_t$ (Polikar p.30 col.1 3rd paragraph – p.30 col.2 1st paragraph: “…AdaBoost uses a rather undemocratic voting scheme, called the weighted majority voting. The ideas is an intuitive one: those classifiers that have shown good performance during training are rewarded with higher voting weights than the others. … To avoid potential instability that can be caused by asymptotically large numbers, the logarithm of $1/\beta_t$ is usually used as the voting weight of $h_t$. At the end the class that receives the highest total vote from all classifiers is the ensemble decision.”). Polikar teaches assigning weights $w_t$ for each classifier $h_t$ as part of the Weighted Majority Voting scheme taught in Polikar pp.34-35 Section 4.1.2 Weighted Majority Voting, to determine the final classification result from the ensemble of a plurality of classifiers, where the assigned weights for each respective classifier corresponds to “a predetermined weight of the respective classifiers”. Hence, this process of applying weighted majority voting corresponds to “predicting … according to the prediction results of the respective classifiers and a predetermined weight of the respective classifiers”.) …
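For illustration only, the two voting schemes referenced in the Polikar citations above can be sketched as follows. This is a minimal sketch and not code from Polikar or Kumar: the function names, the class labels "low" and "ok", and the numeric $\beta_t$ values are all hypothetical, and the only elements taken from the cited passages are the 0/1 decision $d_{t,j}$, the summation of votes per class, and the use of the logarithm of $1/\beta_t$ as a voting weight.

```python
import math


def majority_vote(decisions, classes):
    """Simple majority voting over the picks of classifiers h_1..h_T.

    decisions[t] is the class picked by classifier h_t. Each classifier
    contributes d_{t,j} = 1 to the class it picks and 0 to every other class.
    Ties are broken by the order of `classes` (an assumption of this sketch).
    """
    totals = {c: 0 for c in classes}
    for picked in decisions:
        for c in classes:
            totals[c] += 1 if picked == c else 0  # d_{t,j}
    return max(classes, key=lambda c: totals[c]), totals


def log_inverse_beta(beta_t):
    """Voting weight log(1/beta_t); beta_t is assumed to lie in (0, 1)."""
    return math.log(1.0 / beta_t)


def weighted_majority_vote(decisions, weights, classes):
    """Weighted majority voting: each pick counts with its classifier's weight."""
    totals = {c: 0.0 for c in classes}
    for picked, weight in zip(decisions, weights):
        totals[picked] += weight
    return max(classes, key=lambda c: totals[c]), totals


# Hypothetical example: three classifiers, two classes.
classes = ["low", "ok"]
decisions = ["low", "ok", "ok"]
weights = [log_inverse_beta(b) for b in (0.05, 0.5, 0.5)]  # made-up beta values

simple_winner, simple_totals = majority_vote(decisions, classes)
weighted_winner, weighted_totals = weighted_majority_vote(decisions, weights, classes)
```

With these made-up values the unweighted vote favors "ok" (two votes to one), while the weighted vote favors "low", because the single classifier picking "low" carries a larger weight than the other two combined.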
Kumar and Polikar are analogous art as both teach the use of multiple classifier models to perform predictions.
It would have been obvious to a person having ordinary skill in the art before the filing date of the invention to take the weighted averaging combination method of Kumar and enhance it with the AdaBoost.M1 and weighted majority voting ensemble method of Polikar as a way to generate an aggregated prediction from multiple classifier models. The motivation to combine is taught in Polikar, as boosting algorithms such as AdaBoost.M1 improve the prediction performance by taking a diverse set of multiple classifier models that individually perform predictions on certain data sets, using each of their individual predictions to cross-train and contribute towards a stronger combined prediction. Furthermore, (Polikar p.40 col.1, 1st - 4th paragraphs: “The primary thrust of using ensemble systems has been to reduce the risk of choosing a single classifier with a poor performance, or to improve upon the performance of a single classifier by using an intelligently combined ensemble of classifiers. Many additional areas and applications have recently emerged, however, for which the ensemble systems are inherently appropriate. … In certain applications, it is not uncommon for the entire dataset to gradually become available in small batches over a period of time. Furthermore, datasets acquired in subsequent batches may introduce instances of new classes that were not present in previous datasets. In such settings, it is necessary to learn the novel information content in the new data, without forgetting the previously acquired knowledge, and without requiring access to previously seen data. The ability of a classifier to learn under these circumstances is usually referred to as incremental learning. … A practical approach for learning from new data involves discarding the existing classifier, and retraining a new one using all data that have been accumulated thus far. 
This approach, however, results in loss of all previously acquired information, a phenomenon known as catastrophic forgetting (or interfering) [74]. … Ensemble systems have been successfully used to address this problem. The underlying idea is to generate additional ensembles of classifiers with each subsequent database that becomes available, and combine their outputs using one of the combination methods discussed above.”).
Regarding original Claim 2, 
Kumar in view of Polikar teaches
(Original) The method according to claim 1, 
wherein the method further comprises:
obtaining a feature of the to-be-recognized article in the news-recommending system (This claim limitation is similar in scope to a corresponding claim limitation in Claim 1 (where a feature of the to-be-recognized article in the news-recommending system is functionally equivalent to a user feedback behavior feature), and hence is rejected under similar rationale.); 
correspondingly, the step of, according to the user feedback behavior feature of the to-be-recognized article and a predetermined low-quality article recognition model, recognizing whether the to-be-recognized article is a low-quality article specifically further comprises: 
recognizing whether the to-be-recognized article is a low-quality article, according to the user feedback behavior feature of the to-be-recognized article and the predetermined low-quality article recognition model (This claim limitation is similar in scope to a corresponding claim limitation in Claim 1, and hence is rejected under similar rationale.), and
in combination with the feature of the to-be-recognized article (Kumar Figure 2, elements 226, 212: examiner’s note: Kumar teaches a data preparation module within the data collection module formulating a training data table (for training and evaluating a supervised learning model) consisting of rows of stored user data, item data (an item being a news article, with item data being a feature from the news article), and user-item interaction data collected from the storage device and from the data collection module, where the columns represent features related to the user, item, and user-item interaction (where a user-item interaction corresponds to one of “the feature of the to-be-recognized article”), such that the use of this training data table, which contains user, item, and user-item interaction information, corresponds to “in combination with the feature of the to-be-recognized article” (Kumar [0080]-[0081]: “… the data preparation module 226 obtains user data, item data, and interaction data from storage device 212 and combines the user data, item data, and interaction data into rows of a dataset that will be used for training a supervised learning model. … the data preparation module 226 creates a table in which to organize the user, item, and interaction data and stores the table in the storage device 212.”; and [0082]: “Row 1: [UserID], [User age], [User income level], [User interests], [Average dollar amount spent by similar users], [Favorite item categories of similar users], ... 
[**] [ItemID], [Item category], [Item tags], [Item view count], [Item number of likes], [Item current rate of views], [Item description feature vector], [List of 5 items most similar to current item in terms of content], [List of 5 items most similar to current item in terms of genre], [List of 5 items most similar to current item in terms of category], [List of 5 items most similar to current item in terms of ratings], [Average age of users having interacted with the item], ... [User response to current item (e.g., like, dislike, view, skip, ignore, total interaction time, purchase, no purchase, rating, money spent, profit result from purchase)] …”).).  
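For illustration only, the combination of user data, item data, and user-item interaction data into a single training row, as described in the Kumar [0080]-[0082] citation above, might be sketched as follows. This is a hedged sketch under stated assumptions: the function name `build_training_row`, the prefixing scheme, and all field names and values are hypothetical and are not taken from Kumar.

```python
def build_training_row(user, item, interaction):
    """Concatenate user, item, and user-item interaction features into one row.

    Each feature name is prefixed with its source so that columns from the
    three sources cannot collide (a design choice of this sketch).
    """
    row = {}
    for prefix, record in (("user", user), ("item", item), ("interaction", interaction)):
        for name, value in record.items():
            row[f"{prefix}_{name}"] = value
    return row


# Hypothetical records standing in for stored user, item, and interaction data.
user = {"id": "u1", "age": 34}
item = {"id": "a9", "category": "news", "view_count": 1200, "likes": 15}
interaction = {"response": "dislike", "viewing_time_s": 4.5}

row = build_training_row(user, item, interaction)
```

Each resulting row carries the item-side features alongside the user response, which is the sense in which a model trained on such rows uses the feedback behavior "in combination with" a feature of the article.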
Regarding original Claim 3, 
Kumar in view of Polikar teaches
(Original) The method according to claim 2, 
wherein before the step of, according to the user feedback behavior feature of the to-be-recognized article and the predetermined low-quality article recognition model, recognizing whether the to-be-recognized article is a low-quality article, the method further comprises: 
collecting user feedback behavior features corresponding to respective training articles in several training articles whose known class is a low-quality article or a non-low-quality article, as training data to obtain several training data (Kumar Figure 1, element 102, 108, 110, 114; Figure 2, elements 110, 220: Kumar teaches a data collection module collecting user-item interactions for multiple items (an item being a news article) (corresponding to “collecting user feedback behavior features corresponding to respective training articles”) (Kumar [0073]: “… the data collection module 220 obtains actions performed by one or more users on items from a server or service. … the item server 108, the data collector 110, or the client device 114, or a component thereof, records user interactions with items, such as actions including likes, dislikes, purchases, skips, views, length of views, etc. … the data collection module 220 obtains actions performed by the one or more users on items which were recommendations suggested to the users by the server or service. … the data collection module 220 obtains whether the user action was to skip, or view, or like, or dislike, or purchase the recommended items.”). Kumar further teaches that the data collection module collects data related to items (an item being a news article) belonging to a category or tag (corresponding to “known class”, Kumar [0082]), with tags chosen by the user of the service or experts/creators (Kumar [0068]: “… the data collection module 220 obtains item tag or category information on items from the server or service and determines a genre, class or category of the item as item data. … the tag can be chosen by the users of the service or by experts. … the data collection module 220 obtains author or creator information associated with an item from a server or service and generates item data. …”). 
In the context of analyzing news articles, the users or experts/creators can tag these articles as popular or interesting, which are valid measurements for establishing the “quality” of a news article. All of this data collected by the data collection module is provided to the data preparation module to form training data in the form of tables (Kumar [0080]-[0082]), hence corresponding to a process for collecting “… training data to obtain several training data”.); 
training the low-quality article recognition model according to the several training data (Examiner’s note: Under its broadest reasonable interpretation, “the low-quality article recognition model” is interpreted to be the predetermined low-quality article recognition model recited earlier in the claim. As indicated earlier, Kumar teaches a flow that includes generating subsets of features from a master dataset provided by the training data table that includes user data, item data, and user-item interaction data (Kumar Figure 11, steps 1102 and 1104, where the training data table containing user data, item data, and user-item interaction data is taught in Kumar Figures 3, 4, and 5 and in [0066]-[0068], [0071]-[0075], [0080]-[0082]) and using this subset of input feature data as input into a model generation module to select supervised learning methods from a supervised learning module (Kumar Figure 2, elements 232, 234a) to train and build multiple supervised learning models based on this subset of input feature data (Kumar Figure 11, steps 1106 and 1108 and [0090]-[0091]: “The model generation module 232 may include computer logic executable by the processor 202 to create models based on the data collected by the data collection module 220 and data prepared by the data preparation module 226. … As illustrated, the model generation module 232 may include a supervised learning module 234a … The supervised learning module 234a selects supervised learning methods and trains models based on user, item, and interaction data collected by the recommendation server 102.”). 
Each generated supervised learning model corresponds to “a classifier” as taught in Kumar [0105] based on a subset of input feature data (Kumar Figure 11, steps 1102 and 1104), and that these supervised learning models can be created from these subsets of data (Kumar Figure 11, steps 1120-1122, and steps 1106-1122 repeatedly) until the supervised learning module determines that no more feature subsets of data can be created, thus halting the process of model training and model building. Hence this process of forming feature subsets of datasets and using them to train and build supervised learning models corresponds to “training the low-quality article recognition model according to the several training data” (Kumar [0104]: “… the supervised learning module 234a creates multiple models for each supervised learning method and/or on different subsets of original or overall dataset (e.g. different subsets of user data, subsets of item data or subsets of interaction data).”; and Figure 11, [0131]-[0133]).).  
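For illustration only, the idea of building one model per feature subset, as described in the Kumar Figure 11 citations above, might be sketched as follows. This is a minimal sketch under stated assumptions: a one-feature threshold rule (a "decision stump") is swapped in for whatever supervised learning method would actually be selected, and the function names, feature names, and data are hypothetical rather than taken from Kumar.

```python
def train_stump(rows, labels, feature):
    """Fit a one-feature threshold rule: predict 1 when value >= threshold.

    Tries every observed value as a threshold and keeps the first one with
    the fewest training errors.
    """
    best_threshold, best_errors = None, None
    for threshold in sorted({r[feature] for r in rows}):
        errors = sum((r[feature] >= threshold) != bool(y) for r, y in zip(rows, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return {"feature": feature, "threshold": best_threshold, "errors": best_errors}


def train_on_subsets(rows, labels, feature_subsets):
    """Build one model per feature subset, keeping the best stump in each."""
    models = []
    for subset in feature_subsets:
        candidates = [train_stump(rows, labels, f) for f in subset]
        models.append(min(candidates, key=lambda m: m["errors"]))
    return models


# Hypothetical training data: label 1 = low-quality article, 0 = otherwise.
rows = [
    {"dislikes": 9, "views": 100},
    {"dislikes": 8, "views": 50},
    {"dislikes": 1, "views": 80},
    {"dislikes": 0, "views": 120},
]
labels = [1, 1, 0, 0]

models = train_on_subsets(rows, labels, [["dislikes"], ["views"]])
```

In this made-up data the model built on the "dislikes" subset separates the two classes, while the model built on the "views" subset does not, which is why an ensemble would weight them differently.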
Regarding original Claim 4, 
Kumar in view of Polikar teaches
(Original) The method according to claim 3, 
wherein each training data further comprises 
a feature of a corresponding training article (Examiner’s note: Kumar teaches a training data table, including item data columns, with features such as item category, item view count, number of likes, item description feature vector, item current rate of views (Kumar [0082], corresponding to “a feature of a corresponding training article”).).
Regarding original Claim 8, 
Kumar in view of Polikar teaches
(Original) The method according to claim 4, wherein if the user feedback behavior feature of the to-be-recognized article includes clicking and opening times and times of clicking dislikes, and the feature of the to-be-recognized article includes displaying times, and the low-quality article recognition model includes a second classifier model (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation in a method claim recites a contingent clause that effectively renders the subsequent claim limitation and claim body to not be performed because the condition precedent (“wherein if the user feedback behavior feature of the to-be-recognized article includes clicking and opening times and times of clicking dislikes, and the feature of the to-be-recognized article includes displaying times, and the low-quality article recognition model includes a second classifier model”) is not required to be met, and the claimed invention can be practiced without the condition occurring. See MPEP 2111.04(II). Applicant is advised to amend the claim to positively cite the condition as being fulfilled, since no patentable weight is given for the subsequent claim language following a contingent clause that does not require the condition to be fulfilled for practicing the claimed invention. However, for purposes of examination, this “if” clause will be treated as if the condition were fulfilled, thus allowing the subsequent claim limitation and claim body for further examination. Kumar teaches a data collection module collecting user-item interaction data (corresponding to “user feedback behavior feature of the to-be-recognized article”), based on clickstreams, including the start time of interaction (corresponding to “includes clicking and opening times”) (Kumar Figure 5 and [0075]: “As shown in … FIG. 
5, the data collection module 220 aggregates an interaction data list, for example, that represents any action a user can potentially take with an item, which may be obtained, for example from a user's purchase history, user device, clickstream, internet cookies, view history, etc., as described elsewhere herein. For example, the data collection module 220 collects as user-item interaction data and/or includes likes, dislikes, number of watches, viewing time, money spent, copying text, rotating of mobile device, rating, tweets, start time of interaction, end time of interaction, pause time, share, re-share, etc.”). Kumar further teaches the same data collection module collecting item data, including total viewing time or duration, with this information corresponding to “the feature of the to-be-recognized article includes displaying times” (Kumar Figure 4 and [0071]: “As shown in … FIG. 4, the data collection module 220 aggregates item data attributes by virtue of users interacting with a plurality of items, from textual analysis, from preprogrammed item data, or from other methods described herein or known in the art. For example, the data collection module 220 groups item data as and/or include … (3) total viewing time or duration, ratio of total viewing time and total potential viewing time, average time the item has been on application, etc., ...”). As indicated earlier in the claim limitation “training the low-quality article recognition model according to the several training data” recited in Claim 3, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier”, and is trained based on a subset of input feature data derived from a master dataset generated by the data preparation module and the data collection module (using data as described in Kumar Figures 3, 4, and 5 and in [0066]-[0068], [0071]-[0075], [0080]-[0082]). 
Hence, one of these multiple supervised learning models corresponds to “a second classifier model”, and the process of forming feature subsets of datasets and using them to train and build supervised learning models, where one of these subsets containing the above recited user feedback feature data corresponds to “the user feedback behavior feature of the to-be-recognized article includes clicking and opening times and times of clicking dislikes, and the feature of the to-be-recognized article includes displaying times, and the low-quality article recognition model includes a second classifier model …”.), 
the step of recognizing whether the to-be-recognized article is a low-quality article, according to the user feedback behavior feature and the predetermined low-quality article recognition model, and in combination with the feature of the to-be-recognized article, specifically comprises: 
inputting the clicking and opening times, the times of clicking dislike and the displaying times of the to-be-recognized article into the pre-trained second classifier model, so that the second classifier model predicts whether the to-be-recognized article is the low-quality article (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to evaluating a pre-trained classifier model using test data to perform the predictions. The first part of the claim limitation: “inputting the clicking and opening times, the times of clicking dislike and the displaying times of the to-be-recognized article into the pre-trained second classifier model” is similar in scope to the corresponding claim limitation recited earlier as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale. As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). 
In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the test set portion of the subset of data to evaluate and perform predictions of a user response on a pre-trained classifier model corresponds to “inputting …”).
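For illustration only, dividing a dataset into training, validation, and test sets, as referenced in the Kumar [0100] citation above, might be sketched as follows. This is a hedged sketch: the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions of the example and are not taken from Kumar.

```python
import random


def split_dataset(rows, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and cut them into train/validation/test parts.

    Whatever remains after the training and validation cuts becomes the test
    set, which is held back to evaluate an already-trained model.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]
    return train, validation, test


# Hypothetical dataset of ten row identifiers.
train, validation, test = split_dataset(range(10))
```

The three parts are disjoint and together cover every row, so predictions made on the test part are made on data the model did not see during training.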
Regarding original Claim 9, 
Kumar in view of Polikar teaches
(Original) The method according to claim 8, 
wherein the training the low-quality article recognition model according to the several training data specifically comprises:
obtaining the clicking and opening times and the times of clicking dislikes of respective training articles, from user feedback behavior features of respective training articles of said several training data (This claim limitation is similar in scope to the corresponding claim limitation in Claim 8 as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale.);
obtaining the displaying times of respective training articles from features of respective training articles of said several training data (This claim limitation is similar in scope to the corresponding claim limitation in Claim 8 as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale.);
training the second classifier model by using the clicking and opening times, the times of clicking dislike, the displaying times and known classes of respective training articles (Examiner’s note: As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the training set portion of the subset of data to perform training of a classifier model corresponds to “training a second classifier model by using the clicking and opening times, the times of clicking dislike, the displaying times and known classes of respective training articles”.).
Regarding original Claim 10, 
Kumar in view of Polikar teaches
(Original) The method according to claim 4, wherein if the user feedback behavior feature of the to-be-recognized article includes a reading progress and a reading duration, the feature of the to-be-recognized article includes a length of the to-be-recognized article and the number of included pictures, and the low-quality article recognition model includes a third classifier model (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation in a method claim recites a contingent clause that effectively renders the subsequent claim limitation and claim body to not be performed because the condition precedent (“wherein if the user feedback behavior feature of the to-be-recognized article includes a reading progress and a reading duration, and the feature of the to-be-recognized article includes a length of the to-be-recognized article and the number of included pictures, and the low-quality article recognition model includes a third classifier model”) is not required to be met, and the claimed invention can be practiced without the condition occurring. See MPEP 2111.04(II). Applicant is advised to amend the claim to positively cite the condition as being fulfilled, since no patentable weight is given for the subsequent claim language following a contingent clause that does not require the condition to be fulfilled for practicing the claimed invention. However, for purposes of examination, this “if” clause will be treated as if the condition were fulfilled, thus allowing the subsequent claim limitation and claim body for further examination. Kumar teaches a data collection module collecting user-item interaction data (corresponding to “user feedback behavior feature of the to-be-recognized article”), including start time of interaction, end time of interaction, and pause time (corresponding to “includes a reading progress”) and viewing time (corresponding to “a reading duration”) (Kumar Figure 5 and [0075]). 
Kumar further teaches the same data collection module collecting item data, including total viewing time or duration and associated item metadata (Under its broadest reasonable interpretation, “metadata” defined  (Kumar Figure 4 and [0071]). As indicated earlier in the claim limitation “training the low-quality article recognition model according to the several training data” recited in Claim 3, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier”, and is trained based on a subset of input feature data derived from a master dataset generated by the data preparation module and the data collection module (using data as described in Kumar Figures 3, 4, and 5 and in [0066]-[0068], [0071]-[0075], [0080]-[0082]). Hence, one of these multiple supervised learning models corresponds to “a third classifier model”, and the process of forming feature subsets of datasets and using them to train and build supervised learning models, where one of these subsets containing the above recited user feedback feature data corresponds to “the user feedback behavior feature of the to-be-recognized article includes a reading progress and a reading duration, the feature of the to-be recognized article includes a length of the to-be-recognized article and the number of included pictures, and the low-quality article recognition model includes a third classifier model …”.), 
the step of recognizing whether the to-be-recognized article is a low-quality article, according to the user feedback behavior feature and the predetermined low-quality article recognition model, and in combination with the feature of the to-be-recognized article, specifically comprises:
inputting the reading progress and the reading duration of the to-be-recognized article, the length of the to-be-recognized article and the number of included pictures, into the pre-trained third classifier model, so that the third classifier model predicts whether the to-be-recognized article is the low-quality article (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to evaluating a pre-trained classifier model using test data to perform the predictions. The first part of the claim limitation: “inputting the reading progress and the reading duration of the to-be-recognized article, the length of the to-be-recognized article and the number of included pictures …” is similar in scope to the corresponding claim limitation recited earlier as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale. As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). 
Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the test set portion of the subset of data to evaluate and perform predictions of a user response on a pre-trained classifier model corresponds to “inputting … into the pre-trained third classifier model, so that the third classifier model predicts whether the to-be-recognized article is the low-quality article”.).
Regarding original Claim 11, 
Kumar in view of Polikar teaches
(Original) The method according to claim 10, 
wherein the training the low-quality article recognition model according to the several training data specifically comprises: 
obtaining the reading progress and reading duration of respective training articles, from user feedback behavior features of respective training articles of said several training data (This claim limitation is similar in scope to the corresponding claim limitation in Claim 10 as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale.); 
obtaining the length of the respective training articles and the number of included pictures, from features of respective training articles of said several training data (This claim limitation is similar in scope to the corresponding claim limitation in Claim 10 as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale.);
training the third classifier model by using the reading progress, reading duration, length and the number of included pictures of the respective training articles, and known classes of the respective training articles (Examiner’s note: As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). 
Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the training set portion of the subset of data to perform training of a classifier model corresponds to “training a third classifier model by using the reading progress, the reading duration, length and the number of included pictures of the respective training articles, and known classes of respective training articles”.).  
Regarding original Claim 12, 
Kumar in view of Polikar teaches
(Original) The method according to claim 4, wherein if the user feedback behavior feature of the to-be-recognized article includes times of storing in favorites and sharing times, the feature of the to-be-recognized article includes times of displaying the to-be-recognized article, and the low-quality article recognition model includes a fourth classifier model (Under broadest reasonable interpretation, this claim limitation in a method claim recites a contingent clause that effectively renders the subsequent claim limitation and claim body to not be performed because the condition precedent (“wherein if the user feedback behavior feature of the to-be-recognized article includes times of storing in favorites and sharing times, the feature of the to-be-recognized article includes times of displaying the to-be-recognized article, and the low-quality article recognition model includes a fourth classifier model”) is not required to be met, and the claimed invention can be practiced without the condition occurring. See MPEP 2111.04(II). Applicant is advised to amend the claim to positively cite the condition as being fulfilled, since no patentable weight is given for the subsequent claim language following a contingent clause that does not require the condition to be fulfilled for practicing the claimed invention. However, for purposes of examination, this “if” clause will be treated as if the condition were fulfilled, thus allowing the subsequent claim limitation and claim body for further examination. Kumar teaches a data collection module collecting user feedback on items, including shares, likes, dislikes, favorites, actions, with this information corresponding to “the user feedback behavior feature of the to-be-recognized article includes times of storing in favorites and sharing times” (Kumar [0066]: “As shown in the example graphical representation 300 of FIG. 
3, the data collection module 220 collects user data attributes by virtue of users interacting with an application or browser accessing the item server 108 on a client device 114, filling out surveys, publicly known information about the user, etc. For example, the data collection module 220 groups user data as and/or include … (4) user feedback, such as comments, shares, likes, dislikes, favorites, actions, etc., and so forth.”). Kumar further teaches the same data collection module collecting item data, including average time the item has been on the application (Kumar Figure 4), with this information corresponding to “the feature of the to-be-recognized article includes times of displaying the to-be-recognized article” (Kumar [0071]: “As shown in the example graphical representation 400 of FIG. 4, the data collection module 220 aggregates item data attributes by virtue of users interacting with a plurality of items, from textual analysis, from preprogrammed item data, or from other methods described herein or known in the art. For example, the data collection module 220 groups item data as and/or include … (3) total viewing time or duration, ratio of total viewing time and total potential viewing time, average time the item has been on application, etc., …”). As indicated earlier in the claim limitation “training the low-quality article recognition model according to the several training data” recited in Claim 3, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier”, and is trained based on a subset of input feature data derived from a master dataset generated by the data preparation module and the data collection module (using data as described in Kumar Figures 3, 4, and 5 and in [0066]-[0068], [0071]-[0075], [0080]-[0082]). 
Hence, one of these multiple supervised learning models corresponds to “a fourth classifier model”, and the process of forming feature subsets of datasets and using them to train and build supervised learning models, where one of these subsets containing the above recited user feedback feature data corresponds to “the user feedback behavior feature of the to-be-recognized article includes times of storing in favorites and sharing times, the feature of the to-be-recognized article includes times of displaying the to-be-recognized article, and the low-quality article recognition model includes a fourth classifier model …”.), 
the step of recognizing whether the to-be-recognized article is a low-quality article, according to the user feedback behavior feature and the predetermined low-quality article recognition model, and in combination with the feature of the to-be-recognized article, specifically comprises: 
inputting the times of storing in favorites, the sharing times and the displaying times of the to-be-recognized article, into the pre-trained fourth classifier model, so that the fourth classifier model predicts whether the to-be-recognized article is the low-quality article (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to evaluating a pre-trained classifier model using test data to perform the predictions. The first part of the claim limitation: “inputting the times of storing in favorites, the sharing times and the displaying times of the to-be-recognized article …” is similar in scope to the corresponding claim limitation recited earlier as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale. As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100], [0117]). 
Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the test set portion of the subset of data to evaluate and perform predictions of a user response on a pre-trained classifier model corresponds to “inputting … into the pre-trained fourth classifier model so that the fourth classifier model predicts whether the to-be-recognized article is the low-quality article”.).
Regarding original Claim 13, 
Kumar in view of Polikar teaches
(Original) The method according to claim 12, wherein the training the low-quality article recognition model according to the several training data specifically comprises: 
obtaining the times of storing in favorites and the sharing times of the respective training articles, from user feedback behavior features of respective training articles of said several training data (This claim limitation is similar in scope to the corresponding claim limitation in Claim 12 as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale.); 
obtaining the displaying times of the respective training articles, from features of respective training articles of said several training data (This claim limitation is similar in scope to the corresponding claim limitation in Claim 12 as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale.); 
training the fourth classifier model by using the times of storing in favorites, the sharing times and the displaying times of the respective training articles, and known classes of the respective training articles (Examiner’s note: As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the training set portion of the subset of data to perform training of a classifier model corresponds to “training a fourth classifier model by using the times of storing in favorites, the sharing times and the displaying times of the respective training articles, and known classes of respective training articles”.).
Regarding original Claim 14, 
Kumar in view of Polikar teaches
(Original) The method according to claim 4, wherein if the low-quality article recognition model includes at least two pre-trained classifier models (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation in a method claim recites a contingent clause that effectively renders the subsequent claim limitation and claim body to not be performed because the condition precedent (“wherein if the low-quality article recognition model includes at least two pre-trained classifier models”) is not required to be met, and the claimed invention can be practiced without the condition occurring. See MPEP 2111.04(II). Applicant is advised to amend the claim to positively recite the condition as being fulfilled, since no patentable weight is given for the subsequent claim language following a contingent clause that does not require the condition to be fulfilled for practicing the claimed invention. However, for purposes of examination, this “if” clause will be treated as if the condition were fulfilled, thus allowing the subsequent claim limitation and claim body for further examination. As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models (Kumar [0090]-[0092] and Figure 11, steps 1106 and 1108, where each generated supervised learning model corresponds to “a classifier” as taught in Kumar [0105]) based on a subset of input feature data (Kumar Figure 11, steps 1102 and 1104). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). Thus, the process of training and building multiple classifier models using subsets of data and using the training set portion of the subset of data corresponds to building and “training at least two classifier models”. 
Kumar further teaches the supervised learning module combining the models to generate a final prediction using weighted averaging, which is an example of combining multiple supervised learning models (and their respective predictions) into a larger combined model encompassing the individual supervised learning models. Thus, the process of combining supervised learning models and their respective individual predictions into an aggregate prediction through weighted averaging corresponds to a method in which “the low-quality article recognition model includes at least two pre-trained classifier models” (Kumar [0104]-[0105]).), 
the step of recognizing whether the to-be-recognized article is a low-quality article, according to the user feedback behavior feature and the predetermined low-quality article recognition model, and in combination with the feature of the to-be-recognized article, specifically comprises: 
according to the user feedback behavior feature of the to-be-recognized article, or the user feedback behavior feature of the to-be-recognized article and the feature of the to-be-recognized article (This claim element is similar in scope to a corresponding claim element in Claims 1 and 2, and hence is rejected under similar rationale.), and 
in combination with the pre-trained classifier models, obtaining the classifier models' prediction results about whether the to-be-recognized article is the low-quality article (Polikar p.28 Figure 5: examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to combining the prediction results obtained from two pre-trained classifier models in the context of ensemble model learning. Referring to Polikar Figure 5 (Test: Simple Majority Voting), Polikar teaches simple majority voting, where an ensemble containing classifiers $h_1$, …, $h_t$ each contribute to a vote $v_t$, where a value of 1 indicates that the classifier picks a certain class $\omega_j$, and a value of 0 indicates that the classifier does not pick $\omega_j$, with the decision outcome based on determining the class that receives the highest total vote, and using that determination to determine the final classification for the ensemble. The representation of a vote from a classifier being 0 corresponds to the respective classifier not contributing to the overall classification, which corresponds to “processing … a classifier to obtain a prediction result of the classifier, the prediction result comprising … respective result of giving up vote”. When combined with the teachings of Kumar, the final classification that is based on the sum of each of these votes generated from a plurality of classifiers based on class corresponds to “in combination with the pre-trained classifier models, obtaining the classifier models’ prediction result about whether the to-be-recognized article is the low-quality article”. Polikar further teaches that this concept of simple majority voting is not necessarily tied to the bagging algorithm but can be used in the context of performing majority voting and its extension weighted majority voting to determine the final classification result from the ensemble of a plurality of classifiers (Polikar pp.34-35 Section 4.1 Combining Class Labels and Polikar Section 4.1.1 Majority Voting; and Polikar Section 4.1.2 Weighted Majority Voting 1st paragraph: “… let us denote the decision of hypothesis $h_t$ on class $\omega_j$ as $d_{t,j}$, such that $d_{t,j}$ is 1, if $h_t$ selects $\omega_j$ and 0, otherwise.”).); 
predicting whether the to-be-recognized article is the low-quality article, according to the classifier models' prediction results about whether the to-be-recognized article is the low-quality article, and predetermined weights of respective classifier models (Polikar p.31, Figure 9: examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to combining the prediction results obtained from two pre-trained classifier models in the context of ensemble model learning. Referring to Polikar Figure 9, Polikar teaches $h_t$ (1 < k < T), where $h_t$ represents a different classifier with different weights determined by their respective logarithm of training error $1/\beta_t$ (Polikar p.30 col.1 3rd paragraph – p.30 col.2 1st paragraph: “…AdaBoost uses a rather undemocratic voting scheme, called the weighted majority voting. The idea is an intuitive one: those classifiers that have shown good performance during training are rewarded with higher voting weights than the others. … To avoid potential instability that can be caused by asymptotically large numbers, the logarithm of $1/\beta_t$ is usually used as the voting weight of $h_t$. At the end the class that receives the highest total vote from all classifiers is the ensemble decision.”). Polikar teaches that the predetermined weight for each classifier can be further used in the Weighted Majority Voting scheme taught in Polikar pp.34-35 Section 4.1.2 Weighted Majority Voting, to determine the final classification result from the ensemble of a plurality of classifiers. Hence, when combined with the teachings of Kumar, this process of applying weighted majority voting corresponds to “predicting whether the to-be-recognized article is the low-quality article, according to the classifier models' prediction results about whether the to-be-recognized article is the low-quality article, and predetermined weights of respective classifier models”.).
Regarding original Claim 15, 
Kumar in view of Polikar teaches
(Original) The method according to claim 14, 
wherein before predicting whether the to-be-recognized article is the low-quality article, according to the classifier models' prediction results about whether the to-be-recognized article is the low-quality article, and predetermined weights of respective classifier models, the method further comprises: 
receiving weights of respective classifier models set by the user (Kumar Figure 1, elements 102, 112: examiner’s note: Kumar teaches storing learned parameter settings (i.e., weights for the models), with the initial values for these weights provided by a user, either through configuration or via hard-coding them as initialization data in the model, thus corresponding to “receiving weights of respective classifier models set by the user” (Kumar Figure 1, elements 102, 112; and [0040]: “The data store 112 is coupled to the data collector 110 and comprises a non-volatile memory device or similar permanent storage device and media. … in some implementations, provides access to the recommendation server 102 to obtain the data collected by the data store 112 (e.g. training data, response variables, tuning data, test data, user data, experiments and their results, learned parameter settings, system logs, etc.).”).).
Regarding original Claim 16, 
Kumar in view of Polikar teaches
(Original) The method according to claim 14, wherein the training the low-quality article recognition model according to the several training data specifically comprises: 
upon performing the first round of training, according to a sampling probability of respective training data, sampling from a training data set D comprised of collected several training data to obtain a training data subset D', D' being a subset of D (Examiner’s note: Polikar teaches applying the AdaBoost.M1 boosting algorithm to multiple classifier models to produce a prediction hypothesis $h_t$, where training data subsets $S_t$ for each classifier model are extracted from a set of training instances $x_i$, i = 1, …, N (corresponding to “sampling from a training data set D comprised of collected several training data to obtain a training data subset D’, D’ being a subset of D”), based on a weight distribution $D_t(i)$ (corresponding to “according to a sampling probability of respective training data”) (Polikar p.30 col.1, 2nd paragraph: “The pseudocode of the algorithm is provided in Figure 8. Several interesting features of the algorithm are worth noting. The algorithm maintains a weight distribution $D_t(i)$ on training instances $x_i$, i = 1, . . . , N, from which training data subsets $S_t$ are chosen for each consecutive classifier (hypothesis) $h_t$.”). Polikar further teaches that Step 1 of the algorithm, performed during the first and subsequent iteration rounds of training for t = 1, 2, …, T, selects a training data subset, which is then used to train the classifier models and obtain each classifier model’s prediction hypothesis $h_t$ (corresponding to “upon performing the first round of training, …, sampling from a training data set D”) (Polikar p.30 col.2, Figure 8 Algorithm AdaBoost.M1: “1. Select a training data subset $S_t$, drawn from the distribution $D_t$.”).); 
an initial sampling probability of the respective training data upon the first round of training being the same (Examiner’s note: Polikar teaches applying the AdaBoost.M1 boosting algorithm to multiple classifier models to produce a prediction hypothesis $h_t$, where the weight distribution $D_t(i)$ is initialized to be uniform (corresponding to “an initial sampling probability of the respective training data upon the first round of training being the same”) (Polikar p.30 col.1, 2nd paragraph: “The pseudocode of the algorithm is provided in Figure 8. … The algorithm maintains a weight distribution $D_t(i)$ on training instances $x_i$, i = 1, ..., N, from which training data subsets $S_t$ are chosen for each consecutive classifier (hypothesis) $h_t$. The distribution is initialized to be uniform, so that all instances have equal likelihood to be selected into the first training dataset.”).); 
using respective training data in the training data subset D' to train a plurality of pre-selected classifier models respectively (Examiner’s note: Polikar teaches applying the AdaBoost.M1 boosting algorithm to multiple classifier models to produce a prediction hypothesis $h_t$ (corresponding to “train a plurality of pre-selected classifier models”), where the weight distribution $D_t(i)$ is initialized to be uniform (Polikar p.30 col.1, 2nd paragraph: “The pseudocode of the algorithm is provided in Figure 8. … The algorithm maintains a weight distribution $D_t(i)$ on training instances $x_i$, i = 1, ..., N, from which training data subsets $S_t$ are chosen for each consecutive classifier (hypothesis) $h_t$. The distribution is initialized to be uniform, so that all instances have equal likelihood to be selected into the first training dataset.”). Polikar further teaches performing Step 2 in the algorithm during the first and subsequent iteration rounds of training, where training data subsets $S_t$ are used to train each classifier model to obtain a classifier model’s prediction hypothesis $h_t$ (corresponding to “using respective training data in the training data subset D’ to train a plurality of pre-selected classifier models respectively”) (Polikar p.30 col.2, Figure 8 Algorithm AdaBoost.M1; Figure 9: “2. Train WeakLearn with $S_t$, receive hypothesis $h_t$.”).); 
according to results of training the plurality of pre-selected classifier models, calculating a training error of the respective classifier models upon the first round of training (Examiner’s note: Polikar teaches applying the AdaBoost.M1 boosting algorithm to multiple classifier models to produce a prediction hypothesis $h_t$, where a training error $\varepsilon_t$ is calculated during the first and subsequent iteration rounds of training (corresponding to “calculating a training error”) (Polikar p.30 col.1, 2nd paragraph: “The pseudocode of the algorithm is provided in Figure 8. … The training error $\varepsilon_t$ of classifier $h_t$ is also weighted by this distribution, such that $\varepsilon_t$ is the sum of distribution weights of the instances misclassified by $h_t$ (Equation 12).”). Polikar further teaches Step 3 in the algorithm, showing the calculation of the training error based on the classifier model’s prediction hypothesis $h_t$ during the first and subsequent iteration rounds of training (corresponding to “according to results of training the plurality of pre-selected classifier models, calculating a training error of the respective classifier models upon the first round of training”) (Polikar p.30 col.2, Figure 8 Algorithm AdaBoost.M1; Figure 9: “3. Calculate the error of $h_t$: $\varepsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} D_t(i)$.”).); 
according to the training error of each of said classifier models, obtaining a classifier model with a minimum training error as the first round of classifier model selected by this round of training (Examiner’s note: Polikar teaches a boosting algorithm applied to three classifiers, with each one having a calculated training error; based on the training error, select one of the classifier models with a minimum training error (i.e., the one classifier that was trained with data sets that were properly classified by the model) (corresponding to “according to the training error of each of said classifier models, obtaining a classifier model with a minimum training error as the first round of classifier model selected by this round of training”) (Polikar p.29 col.1-col.2, 1st paragraph; p.29 Figure 7, Training Steps 1-5: “In essence, boosting creates three weak classifiers: the first classifier C1 is trained with a random subset of the available training data. The training data subset for the second classifier C2 is chosen as the most informative subset, given C1. That is, C2 is trained on a training data only half of which is correctly classified by C1, and the other half is misclassified. The third classifier C3 is trained with instances on which C1 and C2 disagree. The three classifiers are combined through a three-way majority vote. The algorithm is shown in detail in Figure 7. Schapire has shown that the error of this three-classifier ensemble is bounded above, and it is less than the error of the best classifier in the ensemble, provided that each classifier has an error rate that is less than 0.5. For a two-class problem, an error rate of 0.5 is the least we can expect from a classifier, as an error of 0.5 amounts to random guessing. Hence, a stronger classifier is generated from three weaker classifiers. A strong classifier in the strict PAC learning sense can then be created by recursive applications of boosting.”).); 
according to the training error of the first round of classifier model, setting a weight of the first round of classifier model (Examiner’s note: Polikar teaches in a boosting algorithm, weights for classifier models are determined via trainable or non-trainable combination rules, with non-trainable combination rules generating weights using equations built into the boosting algorithm (Polikar p.33, col.2 2nd and 3rd paragraphs: “The second key component of any ensemble system is the strategy employed in combining classifiers. Combination rules are often grouped as (i) trainable vs. non-trainable combination rules, or (ii) combination rules that apply to class labels vs. to class-specific continuous outputs. In trainable combination rules, the parameters of the combiner, usually called weights, are determined through a separate training algorithm. … Conversely, there is no separate training involved in non-trainable rules beyond that used for generating the ensembles. Discussed below, weighted majority voting is an example of such non-trainable rules, since the parameters become immediately available as the classifiers are generated.”). Polikar further teaches a weighted majority voting non-trainable combination rule, using Equation 13 (from Step 4) of the AdaBoost.M1 algorithm to assign weight parameters for the classifier models, with the weight based on the training error, performed during the first and subsequent iteration rounds of training (corresponding to “according to the training error of the first round of classifier model, setting a weight of the first round of classifier model”) (Polikar p.34 col.2, 2nd paragraph; Figure 8 Algorithm AdaBoost.M1, Equation 13: “As indicated in Equations 13 and 15, AdaBoost follows the latter approach: AdaBoost assigns a voting weight of $\log(1/\beta_t)$ to $h_t$, where $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$, and $\varepsilon_t$ is the weighted training error of hypothesis $h_t$.”). 
Polikar further teaches Step 4 showing Equation 13 assigning the weighted training error for each classifier model’s prediction hypothesis $h_t$ (Polikar p.30 col.2, Figure 8 Algorithm AdaBoost.M1: “Step 4. Set $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$”).);
according to training results of the first round of classifier model for the respective training data in the training data subset, updating a sampling probability of respective training data in the training data subset, so that the sampling probability of training data with a wrong prediction result upon this round of training increases, whereas the sampling probability of the training data with a correct prediction result reduces (Examiner’s note: Polikar teaches the AdaBoost.M1 boosting algorithm updating the weight distribution $D_t$ during the first and subsequent iteration rounds of training (corresponding to “updating a sampling probability”) for multiple classifier models such that those training data instances that were misclassified have their weight distribution increased (corresponding to “so that the sampling probability of training data with a wrong prediction result upon this round of training increases”) and those training data instances that were correctly classified have their weight distribution decreased (corresponding to “whereas the sampling probability of the training data with a correct prediction result reduces”), with Equation 14 performed during the first and subsequent iteration rounds of training (Polikar p.30 col.1, 2nd paragraph: “Equation 14 describes the distribution update rule: the distribution weights of those instances that are correctly classified by the current hypothesis are reduced by a factor of $\beta_t$, whereas the weights of the misclassified instances are unchanged. When the updated weights are renormalized, so that $D_{t+1}$ is a proper distribution, the weights of the misclassified instances are effectively increased. Hence, iteration by iteration, AdaBoost focuses on increasingly difficult instances. Note that AdaBoost raises the weights of instances misclassified by $h_t$ so that they add up to 1/2, and lowers the weights of correctly classified instances, so that they too add up to 1/2.”). Polikar further teaches Step 5 in the algorithm, showing the update of the sampling probability performed during the first and subsequent iteration rounds of training (corresponding to “according to training results of the first round of classifier model for the respective training data in the training data subset”) (Polikar p.30 col.2, Figure 8 Algorithm AdaBoost.M1: “5. Update distribution $D_t$: $D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} \beta_t & \text{if } h_t(x_i) = y_i \\ 1 & \text{otherwise} \end{cases}$”).);
repeatedly performing the above steps, and performing the second to Nth round of training, to respectively obtain the second round of classifier model, ... the Nth round of classifier model, and weights of respective rounds of classifier models (Polikar p.30 col.2, Figure 8 Algorithm AdaBoost.M1: examiner’s note: Polikar teaches Steps 1-5 in the algorithm, performed for t = 1, 2, …, T iterations (corresponding to “repeatedly performing the above steps, and performing the second to Nth round of training, to respectively obtain the second round of classifier model, … the Nth round of classifier model”). Polikar further teaches Step 4 showing Equation 13 assigning the weighted training error for each classifier model’s prediction hypothesis $h_t$, performed during the first and subsequent iteration rounds (corresponding to “and weights of respective rounds of classifier models”) (Polikar p.30 col.2, Figure 8 Algorithm AdaBoost.M1: “4. Set $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$”).).
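For illustration only, Steps 1-5 of the AdaBoost.M1 algorithm as quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and not the applicant’s, Kumar’s, or Polikar’s implementation: the weak learner `train_stump` (a one-dimensional threshold classifier), the helper `update_distribution`, the stop condition when the weighted error exceeds 1/2, the clamp that keeps the error above zero, and the toy data are all hypothetical choices introduced here.

```python
import math
import random


def train_stump(sample):
    """Hypothetical weak learner: a 1-D threshold stump fit to (x, y) pairs."""
    best_err, best_h = None, None
    for thr in sorted({x for x, _ in sample}):
        for pol in (1, -1):
            def h(x, thr=thr, pol=pol):
                return 1 if pol * (x - thr) >= 0 else 0
            err = sum(h(x) != y for x, y in sample)
            if best_err is None or err < best_err:
                best_err, best_h = err, h
    return best_h


def update_distribution(dist, correct, beta):
    """Step 5: scale correctly classified instances by beta, then renormalize."""
    scaled = [d * (beta if ok else 1.0) for d, ok in zip(dist, correct)]
    z = sum(scaled)  # normalization constant Z_t
    return [s / z for s in scaled]


def adaboost_m1(data, rounds, rng):
    """Return ([(h_t, log(1/beta_t)), ...], final distribution)."""
    n = len(data)
    dist = [1.0 / n] * n  # D_1 is uniform over the training instances
    ensemble = []
    for _ in range(rounds):
        # Step 1: draw the training subset S_t from the distribution D_t.
        idx = rng.choices(range(n), weights=dist, k=n)
        # Step 2: train the weak learner on S_t to obtain hypothesis h_t.
        h = train_stump([data[i] for i in idx])
        # Step 3: weighted error = sum of D_t(i) over misclassified instances.
        correct = [h(x) == y for x, y in data]
        eps = sum(d for d, ok in zip(dist, correct) if not ok)
        if eps > 0.5:  # assumed stop condition for this sketch
            break
        eps = max(eps, 1e-10)  # assumed clamp to avoid division by zero
        # Step 4: normalized error beta_t; the voting weight is log(1/beta_t).
        beta = eps / (1.0 - eps)
        ensemble.append((h, math.log(1.0 / beta)))
        # Step 5: update the distribution for the next round.
        dist = update_distribution(dist, correct, beta)
    return ensemble, dist


# Assumed toy data: (feature, label) pairs.
toy = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
models, final_dist = adaboost_m1(toy, rounds=3, rng=random.Random(0))
```

With a uniform distribution over four instances and one misclassified instance, `update_distribution` raises the misclassified instance’s weight to 1/2, which matches the behavior described in the quoted passage.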
Regarding amended Claim 17, 

Claim 17 recites a computer device, wherein the computer device implements a method for recognizing a low-quality article based on artificial intelligence, wherein the method comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 1, and hence is rejected under similar rationale and motivations provided by Kumar and Polikar as indicated in Claim 1. In addition, Kumar teaches a computer implementing the recommendation server containing one or more processors (Kumar Figure 2, element 202; and [0050]: “The processor 202 comprises an arithmetic logic unit, a microprocessor, a general purpose controller, … to execute software instructions … Although only a single processor is shown in FIG. 2, multiple processors may be included.”), a memory for storing one or more programs (Kumar Figure 2, elements 104, 204; and [0051]-[0052]: “In some implementations, the memory 204 may store instructions and/or data that may be executed by the processor 202. … as depicted in FIG. 2, the memory 204 may store the recommendation unit 104, and its respective components, depending on the configuration. … The memory 204 may be coupled to the bus 220 for communication with the processor 202 and the other components of recommendation server 102. … The instructions stored by the memory 204 and/or data may comprise code for performing any and/or all of the techniques described herein.”), as well as the one or more programs stored in a computer readable storage medium (Kumar [0026], [0056], [0134]).
Regarding original Claim 18, 

Claim 18 recites the computer device according to claim 17, wherein the method further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 2, and hence is rejected under similar rationale provided by Kumar in view of Polikar as indicated in Claim 2, in view of the rejections applied to Claim 17.
Regarding original Claim 19, 

Claim 19 recites the computer device according to claim 18, wherein the method further comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claim 3, and hence is rejected under similar rationale provided by Kumar in view of Polikar as indicated in Claim 3, in view of the rejections applied to Claim 17.
Regarding amended Claim 20, 

Claim 20 recites a computer readable medium on which a computer program is stored, wherein the program, when executed by the processor, implements a method for recognizing a low-quality article based on artificial intelligence, wherein the method comprises claim limitations that are similar in scope to the corresponding claim limitations recited in Claims 1 and 17, and hence is rejected under similar rationale and motivations provided by Kumar and Polikar as indicated in Claims 1 and 17. In addition, Kumar teaches a computer implementing the recommendation server containing one or more processors (Kumar Figure 2, element 202; and [0050]: “The processor 202 comprises an arithmetic logic unit, a microprocessor, a general purpose controller, … to execute software instructions … Although only a single processor is shown in FIG. 2, multiple processors may be included.”), a memory for storing one or more programs (Kumar Figure 2, elements 104, 204; and [0051]-[0052]: “In some implementations, the memory 204 may store instructions and/or data that may be executed by the processor 202. … as depicted in FIG. 2, the memory 204 may store the recommendation unit 104, and its respective components, depending on the configuration. … The memory 204 may be coupled to the bus 220 for communication with the processor 202 and the other components of recommendation server 102. … The instructions stored by the memory 204 and/or data may comprise code for performing any and/or all of the techniques described herein.”), as well as the one or more programs stored in a computer storage medium (Kumar [0026], [0056], [0134]).
Claims 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Kumar et al., U.S. PGPUB 2017/0061286, published 3/2/2017 [hereafter referred to as Kumar] in view of Polikar, Robi, Ensemble Based Systems in Decision Making, IEEE Circuits and Systems Magazine, Third Quarter 2006 [hereafter referred to as Polikar] as applied to Claim 3, in further view of Arpat et al., U.S. PGPUB 2015/0074020, published 3/12/2015, which incorporates by reference in its entirety Rajaram, Giridhar, U.S. PGPUB 2012/0331063 (U.S. application 13/167,701), published 12/27/2012 [with the combination hereafter referred to as Arpat-Rajaram].
Regarding previously presented Claim 5, 
Kumar in view of Polikar teaches
(Previously Presented) The method according to claim 3, wherein if the user feedback behavior feature of the to-be-recognized article comprises the user's comments, and the low-quality article recognition model comprises a first classifier model (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation in a method claim recites a contingent clause, so that the subsequent claim limitation and claim body are not required to be performed, because the condition precedent (“wherein if the user feedback behavior feature of the to-be-recognized article comprises the user’s comments, and the low-quality article recognition model comprises a first classifier model”) is not required to be met, and the claimed invention can be practiced without the condition occurring. See MPEP 2111.04(II). Applicant is advised to amend the claim to positively recite the condition as being fulfilled, since no patentable weight is given to claim language following a contingent clause that does not require the condition to be fulfilled for practicing the claimed invention. However, for purposes of examination, this “if” clause will be treated as if the condition were fulfilled, thus allowing the subsequent claim limitation and claim body to be further examined. Kumar teaches a data collection module collecting user feedback on items (an item being a news article), including user comments (corresponding to “user feedback behavior feature of the to-be-recognized article comprises the user’s comments”) (Kumar [0066]: “As shown in the example graphical representation 300 of FIG. 3, the data collection module 220 collects user data attributes by virtue of users interacting with an application or browser accessing the item server 108 on a client device 114, filling out surveys, publicly known information about the user, etc. 
For example, the data collection module 220 groups user data as and/or include … (4) user feedback, such as comments, shares, likes, dislikes, favorites, actions, etc., and so forth.”). As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models (Kumar [0090]-[0092] and Figure 11, steps 1106 and 1108, where each generated supervised learning model corresponds to “a classifier” as taught in Kumar [0105]) based on a subset of input feature data (Kumar Figure 11, steps 1102 and 1104), where the subset of input feature data includes the above described information and is derived from a master dataset generated by the data preparation module and the data collection module, as taught in Kumar Figures 3, 4, and 5 and in [0066]-[0068], [0071]-[0075], [0080]-[0082]. Kumar further teaches that additional supervised learning models can be created from these subsets of data (Kumar Figure 11, steps 1120-1122, and steps 1106-1122 repeatedly) until the supervised learning module determines that no more feature subsets of data can be created, thus halting the process of model building. Within the iterative steps, Kumar further teaches a recommendation module along with the supervised learning model (Kumar [0108], corresponding to “the low-quality article recognition model comprises a first classifier model”) to perform a prediction (Kumar Figure 11, steps 1114 and 1116), by analyzing the popularity-based modeling module’s selection of “popular” candidate items, as well as their corresponding user’s comments. 
Hence this process of forming feature subsets of datasets and using them to train and build supervised learning models to perform predictions corresponds to “the user's comments, and the low-quality article recognition model comprises a first classifier model …” (Kumar [0104]: “… the supervised learning module 234a creates multiple models for each supervised learning method and/or on different subsets of original or overall dataset (e.g. different subsets of user data, subsets of item data or subsets of interaction data).”; and [0131]-[0133]).), 
the step of recognizing whether the to-be-recognized article is a low-quality article according to the user feedback behavior feature and the predetermined low-quality article recognition model specifically comprises: 
according to the user's comments on the to-be-recognized article and a pre-trained primary low-quality article recognition model, performing a primary prediction about whether the to-be-recognized article is the low-quality article, performing a primary prediction about whether the to-be-recognized article is the low-quality article (Kumar Figure 2, elements 226, 212: examiner’s note: Kumar teaches a data preparation module within the data collection module formulating a training data table (for training and evaluating a supervised learning model) consisting of rows of stored user data, item data (an item being a news article, with item data being a feature from the news article), and user-item interaction data (corresponding to “user’s comments on the to-be-recognized article”) (Kumar [0081]: “In some implementations, the data preparation module 226 obtains user data, item data, and interaction data from storage device 212 and combines the user data, item data, and interaction data into rows of a dataset that will be used for training a supervised learning model. In some implementations, the data preparation module 226 creates a table in which to organize the user, item, and interaction data and stores the table in the storage device 212. A schematic example of the rows of a dataset generated by the data preparation module 226 are included in the following paragraph and include a selection of possible columns which may be used in building a model.”). 
Kumar further teaches a popularity-based modeling module (Kumar Figure 2, elements 220, 230, 232, corresponding to “a pre-trained primary low-quality article recognition model”) performing initial baseline recommendations based on global popularity on the items and corresponding user-item interaction data within the training data set (corresponding to “a primary prediction result”), filtering and selecting candidate “popular” items as input into the supervised learning module and recommendation module (corresponding to “according to user’s comments on the to-be-recognized article…performing a primary prediction about whether the to-be-recognized article is the low-quality article, …”) (Kumar Figure 2, elements 220, 230, 232; Kumar [0089]: “The popularity-based modeling module 230 includes computer logic executable by the processor 202 to augment a model created by the model generation module 232 with a popularity-based naive model. In some implementations, the popularity-based naive model encodes the simple logic of recommending the most popular items (i.e., global popularity) among all the users aggregated in the dataset. In some implementations, the popularity-based naive model recommends items that have gained popularity within a group of similar users and/or items selected for a specific business objective. The model from the popularity based modeling module 230 forms a non-personalized model that makes baseline recommendations, which may be used as a fall-back by the recommendation module 236 described herein when the sophisticated supervised learning model does not make predictions of enough confidence to suggest as recommendations to the user. Another use of this simple model is to select candidate items to consider for each user, in the supervised learning approach as described herein.”).), … 
… to obtain a primary prediction result (Examiner’s note: The identified claim language (“… to obtain a primary prediction result …”) recites an intended result of performing a primary prediction (which is to obtain a prediction result), and hence this recited claim language does not carry patentable weight during examination of this claim.); 
performing word-segmenting processing for the user's comments on the to-be-recognized article (Kumar Figure 2, elements 220, 222: examiner’s note: Kumar teaches a data collection module containing a text analytics module that performs text analysis and feature extraction on user’s comments using natural language processing and bag-of-words methods (Kumar [0061]-[0062]), such that the text analytics module featurizes the textual data associated with items and/or users including user’s comments, thus corresponding to “performing word-segmenting processing for the user’s comments on the to-be-recognized article” (Kumar [0067]-[0068]: “The data collection module 220 instructs the text analytics module 222 to generate text features from the description text and title, for example, vector space representation of the description text and title and stores it as item data. … the data collection module 220 obtains user comments, such as comments on an item, and comment features (e.g., metadata) from a server or service. The data collection module 220 generates item data from the comments and comment features. For example, the item data may include the number of comments, vector space representations of text comments (generated by text analytics module 222), sentiment features generated from the text comments using natural language processing, etc. … the data collection module 220 obtains author or creator information associated with an item from a server or service and generates item data … the information about the creator could include the popularity of items created or posted by the creator (e.g., in terms of one or more of views, likes purchases, and/or review provided on the server or service or a third party server or service)…”).); …
… inputting the primary prediction result, the subject word feature expression and the commentary content feature word expression into pre-trained first classifier model, so that the first classifier model predicts whether the to-be-recognized article is the low-quality article (Examiner’s note: Under its broadest reasonable interpretation, this claim limitation is directed to evaluating a pre-trained classifier model using test data to perform the predictions. According to the claim language identified earlier in Claim 4, a commentary content feature word expression is also derived from performing word-segmenting processing for the user’s comments (“detecting situations that segmented words obtained from the word segmenting processing hit commentary content feature words … to obtain a commentary content feature word expression of the user’s comment on the to-be-recognized article.”). It would have been obvious for a person having ordinary skill in the art to perform the same word-segmenting processing taught in Kumar [0061]-[0062] and [0067]-[0068] to perform extraction of commentary content feature words into a commentary content feature word expression from a user’s comments. Hence, the first part of the claim limitation: “inputting the primary prediction result, the subject word feature expression and the commentary content feature word expression …” is similar in scope to the corresponding claim limitations recited earlier as part of the ‘wherein’ contingent clause (when positively recited) and the claim limitation (“according to the user's comments on the to-be-recognized article and a pre-trained primary low-quality article recognition model, performing a primary prediction about whether the to-be-recognized article is the low-quality article, performing a primary prediction about whether the to-be-recognized article is the low-quality article”), and hence is rejected under similar rationale. 
As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the test set portion of the subset of data to evaluate and perform predictions of a user response on a pre-trained classifier model corresponds to “inputting … into pre-trained first classifier model, so that the first classifier model predicts whether the to-be-recognized article is the low-quality article”.).
While Kumar in view of Polikar teaches word-segmenting processing to extract subject feature word expressions and commentary content feature word expressions (Kumar [0061]-[0062] and [0067]-[0068]), Kumar in view of Polikar does not explicitly teach
… detecting situations that segmented words obtained from the word segmenting processing hit subject feature words in a pre-collected subject feature word repository, to obtain a subject feature word expression of the user's comments on the to-be-recognized article; the subject feature words each being a commenting subject which is pre-collected and used to comment on the low-quality article; … 
… detecting situations that segmented words obtained from the word segmenting processing hit commentary content feature words in a pre-collected commentary content feature word dictionary, to obtain a commentary content feature word expression of the user's comments on the to-be-recognized article, the commentary content feature words each being a word which is pre-collected and used to comment on the low-quality article; …
Arpat-Rajaram (as indicated in Arpat [0035] as incorporating Rajaram by reference in its entirety) teaches
… detecting situations that segmented words obtained from the word segmenting processing hit subject feature words in a pre-collected subject feature word repository, to obtain a subject feature word expression of the user's comments on the to-be-recognized article; the subject feature words each being a commenting subject which is pre-collected and used to comment on the low-quality article (Rajaram Figure 1, elements 105, 110, 140: examiner’s note: Rajaram teaches that user communications include commenting on objects associated with another user (Rajaram [0024], corresponding to “user’s comments on the to-be-recognized article”), where the topic from the user’s comments is interpreted to include object-related comments (subject) as well as commentary-related comments (opinions), and as such, these object-related comments correspond to the “subject feature words each being a commenting subject” (Rajaram [0027]: “Further, a first user may comment on the profile page of a second user, or may comment on objects associated with a second user, such as content items uploaded by the second user. The topic for a term in any communication within the social networking system may be determined”). Rajaram further teaches that user communication is parsed by an anchor term module (performing “word segmenting processing”) to determine anchor terms (corresponding to “segmented words obtained from the word segmenting processing”), where the anchor term module uses a dictionary storage module (corresponding to “a pre-collected subject feature word repository”) to find associations for the anchor terms. 
When combined with the teachings of Kumar in view of Polikar, the process of using this anchor term module to determine anchor terms based on a dictionary storage module to find associations for the anchor terms corresponds to “detecting situations that segmented words … hit subject feature words, … to obtain a subject feature word expression of the user’s comments on the to-be-recognized article; the subject feature words each being a commenting subject which is pre-collected and used to comment on the low-quality article” (Rajaram [0028]-[0029]: “In the embodiment of FIG. 1, a social networking system user 100 creates a communication 105 within the context of the social networking system. The communication 105 is received by the anchor term module 110, which parses the communication 105 to identify an anchor term. An anchor term is a word or other alpha-numeric group of characters in the communication 105, the meaning of which the process of the embodiment of FIG. 1 determines. … The anchor term module 110 may be coupled to a dictionary storage module 140 which contains a dictionary including interconnected nodes representing candidate topics for an anchor term. The nodes of the dictionary may be connected based on relatedness between nodes, as discussed above. In one embodiment, the anchor term module 110 identifies an anchor term in a received communication 105 by identifying a term in the communication 105 with one or more associated nodes in a dictionary stored in dictionary storage module 140. For example, if the communication 105 contains the text “Go Sharks”, the anchor term module 110 may query the dictionary to identify nodes containing the term “sharks”. In this example, the dictionary may respond to the query identifying the following nodes: Shark (animal), San Jose Sharks (hockey team), Jumping the Shark, and Loan Shark. 
The anchor term module 110 may identify an anchor term prior to querying the dictionary, or may identify an anchor term in response to receiving query feedback from the dictionary. In either embodiment, the anchor term module 110 may output identified dictionary nodes received from dictionary storage module 140 as candidate nodes 115. As used herein, “candidate nodes” represent potential meanings for an identified anchor term.”).); …
… detecting situations that segmented words obtained from the word segmenting processing hit commentary content feature words in a pre-collected commentary content feature word dictionary, to obtain a commentary content feature word expression of the user's comments on the to-be-recognized article, the commentary content feature words each being a word which is pre-collected and used to comment on the low-quality article (Examiner’s note: According to the claim language, a commentary content feature word expression is also derived from performing word-segmenting processing for the user’s comments. The claim does not give any clear distinction between the content of a subject feature word expression versus a commentary content feature word expression (as they both are derived from a user’s comment), other than the fact that a commentary content may indicate a user’s sentiment. The combination reference Arpat-Rajaram indicates that the topic extraction engine/anchor term module taught in Arpat [0035] is not only relevant for topic extraction (corresponding to subject feature word expressions) but also for sentiment analysis (Arpat [0036]-[0037], [0049]), thus enabling the topic extraction engine/anchor term module to also extract commentary content feature word expressions. It would have been obvious for a person having ordinary skill in the art to perform the same processing as described for the above earlier claim limitation (“…detecting situations that segmented words obtained from the word segmenting processing hit subject feature words in a pre-collected subject feature word repository…”) to satisfy this same claim limitation for commentary content feature words. Hence, this claim limitation is similar in scope to the corresponding claim limitation recited earlier (“…detecting situations that segmented words obtained from the word segmenting processing hit subject feature words in a pre-collected subject feature word repository…”), and hence is rejected under similar rationale.); …
Kumar in view of Polikar and Arpat-Rajaram are analogous art as both teach word segmenting processing of users’ comments for use in a model to perform some prediction of the “quality” of the content (e.g., through measuring a user’s sentiment, or through measuring the “popularity” of the content).
It would have been obvious to a person having ordinary skill in the art before the filing date of the invention to take the text analytics module taught in Kumar in view of Polikar and enhance it with the pre-collected dictionary functionality from the topic extraction engine/anchor term module taught in Arpat-Rajaram as a way to perform word-segmenting processing of users’ comments on a news article. The motivation to combine is taught in Arpat-Rajaram, as users’ comments are usually in plain text and cannot be directly applied to a model without some form of pre-processing. Using a separate word segmentation module to preprocess users’ comments not only simplifies the processing logic of the predictive model, but also allows users’ comments to be filtered and narrowed down into concise words with specific meanings that can be used as features in a predictive model, thus improving the reliability of the predictive model (Rajaram [0006]: “Communications by social networking system users are often plain text and are not manually associated by the users with established subjects. This limits the ability of the social networking system to correlate communications with particular subjects, and limits the functionality of displaying these correlations to users in conjunction with the communications. Further, words may have many meanings, and automated topic recognition may result in the meaning of ambiguous words being determined incorrectly. Thus, there is a need for a solution that determines the underlying topic of communications words, enhancing the richness of information connectivity with the social networking system, and providing a more enjoyable and useful experience to social networking system users.”).
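For illustration only, the “detecting situations that segmented words … hit” limitations discussed above can be sketched as a lookup of segmented words against two pre-collected word lists, yielding one binary hit vector per list. This is a minimal sketch under stated assumptions and does not reproduce the implementation of Kumar, Arpat, Rajaram, or the application. The word lists `SUBJECT_WORDS` and `COMMENTARY_WORDS`, the regular-expression tokenizer standing in for word-segmenting processing, and the binary-vector form of the “feature word expression” are all hypothetical.

```python
import re

# Hypothetical pre-collected repositories; the real contents are not given in the record.
SUBJECT_WORDS = ["title", "author", "editor", "article"]
COMMENTARY_WORDS = ["clickbait", "fake", "misleading", "spam"]


def segment(comment):
    """Stand-in for word-segmenting processing: lowercase and split into word tokens."""
    return re.findall(r"[a-z']+", comment.lower())


def hit_expression(tokens, vocabulary):
    """Return one 0/1 entry per vocabulary word, 1 if any segmented word hits it."""
    token_set = set(tokens)
    return [1 if word in token_set else 0 for word in vocabulary]


def comment_features(comment):
    """Build the subject and commentary content feature word expressions for one comment."""
    tokens = segment(comment)
    subject_expr = hit_expression(tokens, SUBJECT_WORDS)
    commentary_expr = hit_expression(tokens, COMMENTARY_WORDS)
    return subject_expr, commentary_expr
```

Under these assumptions, the comment "The title is pure clickbait, and the author knows it." hits "title" and "author" in the subject list and "clickbait" in the commentary list, so the two expressions are [1, 1, 0, 0] and [1, 0, 0, 0].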
Regarding original Claim 6, 
Kumar in view of Polikar, in further view of Arpat-Rajaram as applied to Claim 5 teaches
(Original) The method according to claim 5, 
wherein the training the low-quality article recognition model according to the several training data specifically comprises:
obtaining users' comments on respective training articles, from user feedback behavior features of training articles of said several training data (This claim limitation is similar in scope to the corresponding claim limitation in Claim 5 as part of the ‘wherein’ contingent clause (when positively recited), and hence is rejected under similar rationale.); 
regarding users' comments on respective training articles, inputting corresponding users' comments into a pre-trained primary low-quality article recognition model, so that the primary low-quality article recognition model outputs a primary prediction result of whether a corresponding training article is the low-quality article (This claim limitation is similar in scope to the corresponding claim limitation in Claim 5 (“according to the user's comments on the to-be-recognized article and a pre-trained primary low-quality article recognition model, performing a primary prediction about whether the to-be-recognized article is the low-quality article, …”), and hence is rejected under similar rationale.); 
regarding the users' comments on each training article, obtaining a subject feature word expression corresponding to the users' comments on the corresponding training article, according to the subject feature word repository (When combined with the teachings of Kumar in view of Polikar in terms of training subsets of data and performing word-segmenting processing to extract subject feature word expressions, this claim limitation is similar in scope to the corresponding claim limitation in Claim 5 to obtain a subject feature word expression of the users’ comments, and hence is rejected under similar rationale.); 
regarding the user's comment on each training article, obtaining a commentary content feature word expression corresponding to the users' comments on the corresponding training article, according to the commentary content feature word dictionary (When combined with the teachings of Kumar in view of Polikar in terms of training subsets of data and performing word-segmenting processing to extract commentary content feature word expressions, this claim limitation is similar in scope to the corresponding claim limitation in Claim 5 to obtain a commentary content feature word expression of the users’ comments, and hence is rejected under similar rationale.); 
training the first classifier model by using the primary prediction results, subject feature word expressions and the commentary content feature word expressions corresponding to the users' comments on respective training articles, and known classes of respective training articles (Examiner’s note: As indicated earlier, Kumar teaches a flow (Kumar Figure 11, [0131]-[0133]) that includes training and building supervised learning models, where each generated supervised learning model corresponds to “a classifier” and is trained based on a subset of input feature data, where the subset of input feature data includes the above described information and a list of the known classes and is derived from a master dataset generated by the data preparation module and the data collection module. Kumar further teaches that the generated supervised learning model is used to perform predictions of a user response, where a user response includes likes or dislikes, and these likes and dislikes represent valid measurements of popularity (Kumar Figure 11, step 1114; and [0082]: “User response .. (e.g., like, dislike, view, skip, ignore, …) …”). In the context of a news recommendation service, the prediction based on these user responses will reflect the “quality” of an item, where a predicted “like” user response will represent a higher quality of the item versus a predicted “dislike” user response (which will represent a lower quality of the item). Kumar further teaches dividing the received dataset for building the model into training, test, and validation sets, as well as storing trained models in a data store (Kumar [0100]; and [0117]). 
Thus, the process of training and building this classifier model using the subset of data containing the above described information and using the training set portion of the subset of data to perform training of a classifier model (including the primary prediction results from a pre-trained primary article low-quality recognition model) corresponds to “training the first classifier model by using the primary prediction results, the subject feature word expressions and the commentary content feature word expressions corresponding to the users’ comments on respective training articles, and known classes of respective training articles”.).  
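For illustration only, the training step recited in Claim 6 (a first classifier model trained on the primary prediction results, the two feature word expressions, and the known classes) can be sketched as below. This is a minimal sketch under stated assumptions, not the method of Kumar or of the application: a simple perceptron stands in for the unspecified first classifier model, the primary prediction result is assumed to be a single score supplied by some pre-trained primary model, and every function name and data value is hypothetical.

```python
def build_features(primary_score, subject_expr, commentary_expr):
    """Concatenate the primary prediction result with both feature word expressions."""
    return [primary_score] + list(subject_expr) + list(commentary_expr)


def score(weights, bias, x):
    """Linear score of one feature row."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def train_first_classifier(rows, labels, epochs=50):
    """Perceptron training; label 1 means low-quality article, 0 means not."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = 1 if score(weights, bias, x) > 0 else 0
            err = y - pred                    # +1, 0, or -1
            if err:
                weights = [w + err * v for w, v in zip(weights, x)]
                bias += err
    return weights, bias


def predict_low_quality(model, x):
    """Apply the trained first classifier to one feature row."""
    weights, bias = model
    return 1 if score(weights, bias, x) > 0 else 0


# Hypothetical training data: (primary score, subject hits, commentary hits) and known class.
TRAIN_ROWS = [
    build_features(0.9, [1, 0], [1, 0]),
    build_features(0.8, [0, 1], [0, 1]),
    build_features(0.2, [1, 0], [0, 0]),
    build_features(0.1, [0, 0], [0, 0]),
]
TRAIN_LABELS = [1, 1, 0, 0]
```

Under these assumptions, `train_first_classifier(TRAIN_ROWS, TRAIN_LABELS)` returns a weight vector and bias, and `predict_low_quality` then reproduces the known class of each training row, since the example data are linearly separable.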
Regarding original Claim 7, 
Kumar in view of Polikar, in further view of Arpat-Rajaram as applied to Claim 6 teaches
(Original) The method according to claim 6, 
wherein before the step of, regarding users' comments on respective training articles, inputting corresponding users' comments into a pre-trained primary low-quality article recognition model, so that the primary low-quality article recognition model outputs a primary prediction result of whether a corresponding training article is the low-quality article, the method further comprises:
using users' comments corresponding to respective training articles and known classes of respective training articles, to train the primary low-quality article recognition model (Examiner’s note: Kumar teaches a data collection module (Figure 2, element 220) collecting user feedback on items, including users’ comments (corresponding to “using users’ comments corresponding to respective training articles”) (Kumar [0066]: “As shown in the example graphical representation 300 of FIG. 3, the data collection module 220 collects user data attributes by virtue of users interacting with an application or browser accessing the item server 108 on a client device 114, filling out surveys, publicly known information about the user, etc. For example, the data collection module 220 groups user data as and/or include … (4) user feedback, such as comments, shares, likes, dislikes, favorites, actions, etc., and so forth.”). Kumar Figure 2, elements 226, 212 further teaches a data preparation module within the data collection module formulating a training data table (for training and evaluating a supervised learning model) consisting of rows of stored user data, item data (an item being a news article, with item data being a feature from the news article), and user-item interaction data (Kumar [0081]: “ … the data preparation module 226 obtains user data, item data, and interaction data from storage device 212 and combines the user data, item data, and interaction data into rows of a dataset that will be used for training a supervised learning model. … the data preparation module 226 creates a table in which to organize the user, item, and interaction data and stores the table in the storage device 212.”). 
Kumar further teaches that the data collection module collects data related to items (an item being a news article) belonging to a category or tag (corresponding to “known class”, Kumar [0082]), with tags chosen by the user of the service or experts/creators (Kumar [0068]), resulting in this training data table corresponding to “users’ comments corresponding to respective training articles and known classes of respective training articles”. Kumar further teaches a popularity-based modeling module (Kumar Figure 2, elements 220, 230, 232, corresponding to “the primary low-quality article recognition model”) performing initial baseline recommendations based on global popularity on the items and corresponding user-item interaction data within the training data set (corresponding to “a primary prediction result”), filtering and selecting candidate “popular” items as input into the supervised learning module and recommendation module (Kumar [0108]) within the recommendation system that uses a popularity based modelling module, where the recommendation module is trainable off-line using the same methods applied in the supervised learning module to generate classifier models (thus corresponding to “using users’ comments … to train the primary low-quality article recognition model”) (Kumar Figure 2, elements 220, 230, 232; Kumar [0089]: “The popularity-based modeling module 230 includes computer logic executable by the processor 202 to augment a model created by the model generation module 232 with a popularity-based naive model. … the popularity-based naive model encodes the simple logic of recommending the most popular items (i.e., global popularity) among all the users aggregated in the dataset. … the popularity-based naive model recommends items that have gained popularity within a group of similar users and/or items selected for a specific business objective. 
The model from the popularity based modeling module 230 forms a non-personalized model that makes baseline recommendations, which may be used as a fall-back by the recommendation module 236 described herein when the sophisticated supervised learning model does not make predictions of enough confidence to suggest as recommendations to the user. Another use of this simple model is to select candidate items to consider for each user, in the supervised learning approach as described herein.”).); 
the using users' comments corresponding to respective training articles and known classes of respective training articles, to train the primary low-quality article recognition model specifically comprises: 
inputting users' comments corresponding to respective training articles in turn into the primary low-quality article recognition model, so that the primary low-quality article recognition model predicts a predicted class of a corresponding training article (Examiner’s note: Kumar teaches a data preparation module within the data collection module (Kumar Figure 2, elements 226, 212) formulating a training data table (for training and evaluating a supervised learning model) consisting of rows of stored user data, item data (an item being a news article, with item data being a feature from the news article), and user-item interaction data (corresponding to “users’ comments corresponding to respective training articles”) (Kumar [0081]: “In some implementations, the data preparation module 226 obtains user data, item data, and interaction data from storage device 212 and combines the user data, item data, and interaction data into rows of a dataset that will be used for training a supervised learning model. In some implementations, the data preparation module 226 creates a table in which to organize the user, item, and interaction data and stores the table in the storage device 212. A schematic example of the rows of a dataset generated by the data preparation module 226 are included in the following paragraph and include a selection of possible columns which may be used in building a model.”). 
Kumar further teaches a popularity-based modeling module (Kumar Figure 2, elements 220, 230, 232, corresponding to “the primary low-quality article recognition model”) performing initial baseline recommendations based on global popularity on the items and corresponding user-item interaction data within the training data set, filtering and selecting candidate “popular” items as input into the supervised learning module and recommendation module (corresponding to “inputting users' comments corresponding to respective training articles in turn into the primary low-quality article recognition model, so that the primary low-quality article recognition model predicts a predicted class of a corresponding training article”) (Kumar [0089]: “The popularity-based modeling module 230 includes computer logic executable by the processor 202 to augment a model created by the model generation module 232 with a popularity-based naive model. In some implementations, the popularity-based naive model encodes the simple logic of recommending the most popular items (i.e., global popularity) among all the users aggregated in the dataset. In some implementations, the popularity-based naive model recommends items that have gained popularity within a group of similar users and/or items selected for a specific business objective. The model from the popularity based modeling module 230 forms a non-personalized model that makes baseline recommendations, which may be used as a fall-back by the recommendation module 236 described herein when the sophisticated supervised learning model does not make predictions of enough confidence to suggest as recommendations to the user. Another use of this simple model is to select candidate items to consider for each user, in the supervised learning approach as described herein.”).); 
judging whether the predicted class of the training article is consistent with the known class (Kumar Figure 2, element 234a: examiner’s note: Kumar teaches a supervised learning module within the model generation module creating multiple supervised learning models based on sets/subsets of user data, item data, or interaction data (Kumar [0104]-[0105]: “… the supervised learning module 234a creates multiple models for each supervised learning method and/or on different subsets of original or overall dataset (e.g. different subsets of user data, subsets of item data or subsets of interaction data).”). Kumar further teaches the same supervised learning module within the model generation module (in a recommendation system) splitting a dataset or relevant subsets into test, training, and validation sets for training as well as evaluation of a model, where the supervised learning module evaluates and tunes each generated model associated with a user feedback behavior feature to perform classification such as predicting like/dislike, which is a valid measurement of popularity and hence a prediction of the “quality” of an item. The tuning of a machine-learning model involves using the training data set to perform sufficient training and optimizing the model’s parameters to maximize a desired aspect of performance (corresponding to “judging whether the predicted class of the training article is consistent with the known class”), with the training completed once it reaches the desired performance (Kumar [0100]; and [0098]: “… the supervised learning module 234a tunes a model of the chosen type by optimizing its parameters to maximize a desired aspect of performance. … if the supervised learning model is predicting a numerical measure of user-item interaction such as … the user rating of items, … Similarly, in the case of predicting like/dislike, or buy/not buy-type binary user-item interactions, one can use the AUC (area under the ROC curve), or other related measures as a measure of performance.”).); 
in case of inconsistency, adjusting parameters of the primary low-quality article recognition model so that the predicted class of the training article as predicted by the primary low-quality article recognition model tends to be consistent with the known class (Kumar Figure 2, element 234a: examiner’s note: As indicated earlier, Kumar teaches a supervised learning module evaluating and tuning each generated model associated with a user feedback behavior feature to perform classification such as predicting like/dislike, which is a valid measurement of popularity and hence a prediction of the “quality” of an item. The tuning of a machine-learning model involves using the training data set to perform sufficient training and optimizing the model’s parameters to maximize a desired aspect of performance (Kumar [0104]-[0105]; and [0098]), corresponding to “in case of inconsistency, adjusting parameters of the primary low-quality article recognition model so that the predicted class of the training article as predicted by the primary low-quality article recognition model tends to be consistent with the known class”.); 
according to the above steps, repeatedly using users' comments on respective training articles to train the primary low-quality article recognition model until the primary low-quality article recognition model converges (Kumar Figure 2, element 234a: examiner’s note: As indicated earlier, Kumar teaches a supervised learning module evaluating and tuning each generated model associated with a user feedback behavior feature to perform classification such as predicting like/dislike, which is a valid measurement of popularity and hence a prediction of the “quality” of an item. The tuning involves taking the training data set and optimizing the model’s parameters to maximize a desired aspect of performance (Kumar [0104]-[0105]; and [0098]), corresponding to “according to the above steps, repeatedly using users’ comments on respective training articles to train the primary low-quality article recognition model”, with the training completed once it reaches the desired performance (corresponding to “until the primary low-quality article recognition model converges”).); 
determining parameters of the primary low-quality article recognition model (Kumar Figure 2, element 234a: examiner’s note: As indicated earlier, Kumar teaches a supervised learning module evaluating and tuning each generated model associated with a user feedback behavior feature to perform classification such as predicting like/dislike, which is a valid measurement of popularity and hence a prediction of the “quality” of an item; the tuning of a machine-learning model involves using the training data set to perform sufficient training and optimizing the model’s parameters to maximize a desired aspect of performance (Kumar [0104]-[0105]; and [0098]), corresponding to “determining parameters of the primary low-quality article recognition model”, with the training completed once it reaches the desired performance (corresponding to “thereby determining the primary low-quality article recognition model”).) and 
thereby determining the primary low-quality article recognition model (Examiner’s note: The identified claim language (“… thereby determining …”) recites an intended result of determining parameters of the primary low-quality article recognition model, and hence this recited claim language does not carry patentable weight during examination of this claim.).
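
For illustration only, the training sequence recited in Claim 7 (input a comment, predict a class, judge consistency with the known class, adjust parameters in case of inconsistency, and repeat until convergence) can be sketched as a simple perceptron over bag-of-words features. This sketch is not taken from Kumar, Polikar, or the application; the perceptron update rule, the function names (featurize, predict, train), the label convention (1 = low-quality, 0 = not low-quality), and the sample comments are all hypothetical assumptions chosen only to make the loop concrete.

```python
from collections import Counter

def featurize(comment):
    # Hypothetical feature extraction: lower-cased bag-of-words counts.
    return Counter(comment.lower().split())

def predict(weights, bias, features):
    # Predicted class: 1 (low-quality) if the linear score is positive, else 0.
    score = bias + sum(weights.get(w, 0.0) * c for w, c in features.items())
    return 1 if score > 0 else 0

def train(samples, max_epochs=100, lr=1.0):
    # samples: list of (comment text, known class) pairs.
    weights, bias = {}, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for comment, known in samples:      # input comments in turn
            feats = featurize(comment)
            pred = predict(weights, bias, feats)
            if pred != known:               # judge consistency with known class
                mistakes += 1
                step = lr * (known - pred)  # adjust parameters toward known class
                for w, c in feats.items():
                    weights[w] = weights.get(w, 0.0) + step * c
                bias += step
        if mistakes == 0:                   # treated here as convergence
            break
    return weights, bias                    # the determined parameters

# Hypothetical training data (assumed for illustration only).
samples = [
    ("clickbait misleading title", 1),
    ("spam misleading junk", 1),
    ("clear helpful analysis", 0),
    ("helpful detailed report", 0),
]
weights, bias = train(samples)
```

In this sketch, convergence is taken to mean a full pass over the training comments with no inconsistent prediction; other stopping criteria (for example, a threshold on a performance measure such as AUC) would fit the same loop.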

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Liu et al., A Boosting Algorithm for Item Recommendation with Implicit Feedback, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015), pp.1792-1798, where Liu teaches an Adaptive Boosting Personalized Ranking (AdaBPR) framework for item recommendation with users’ implicit feedback, where the implicit feedback includes users’ preferences and user-item interaction (p.1792 Abstract and Section 1. Introduction). 
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332. The examiner can normally be reached Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.

/WILLIAM WAI YIN KWAN/Examiner, Art Unit 2121

/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121