DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to communications: Amendment filed on 6/13/2022.
Claims 1-2, 4, 6-7, 9-10, 12, 14-18, 20, and 22-24 are pending. Claims 1, 9, and 17 are independent. Claims 3, 5, 8, 11, 13, 19, 21, and 25-32 have been canceled.
The previous rejection of claims 1-2, 4, 6-7, 9-10, 12, 14-18, 20, and 22-24 under 35 USC § 103 has been withdrawn in view of the amendment.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4, 6-7, 9-10, 12, 14-18, 20, and 22-24 are rejected under 35 U.S.C. 103 as being unpatentable over Chhichhia et al. (US2015/0324459) in view of Forman (US2004/0059697) and Lin et al. (US2012/0284212).

In regards to claim 1, Chhichhia et al. substantially discloses a method for training a topic classifier, comprising: 
obtaining a training sample and a test sample, wherein the training sample is obtained by manually labeling after a corresponding topic model having been trained based on text data (Chhichhia et al. para[0080], obtains test sample; para[0084], obtains labeled training sample); 
extracting features of the training sample and of the test sample respectively using a preset algorithm, computing optimal model parameters of a logistic regression model by an iterative algorithm based on the features of the training sample, to train and get a logistic regression model containing the optimal model parameters (Chhichhia et al. para[0096]-[0097], extracts features to train model and get logistic regression model). 
Chhichhia et al. does not explicitly disclose drawing a ROC curve of receiver operating characteristic based on the features of the test sample and the logistic regression model containing the optimal model parameters, and evaluating the logistic regression model containing the optimal model parameters based on the area AUC under the ROC curve, to train and get a first topic classifier.
However, Forman substantially discloses drawing a ROC curve of receiver operating characteristic based on the features of the test sample and the logistic regression model containing the optimal model parameters, and evaluating the logistic regression model containing the optimal model parameters based on the area AUC under the ROC curve, to train and get a first topic classifier (Forman fig. 1B, para[0062]-[0063], draws an ROC curve to identify the best features for training a classifier).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the classification system of Chhichhia et al. with the feature selection method of Forman in order to improve the accuracy of classification (Forman para[022]).
Chhichhia et al. does not explicitly disclose wherein the step of extracting features of the training sample and of the test sample respectively using a preset algorithm, computing optimal model parameters of a logistic regression model by an iterative algorithm based on the features of the training sample, to train and get a logistic regression model containing the optimal model parameters comprises: 
extracting the features of the training sample and of the test sample respectively using a preset algorithm, and correspondingly establishing a first hash table and a second hash table; 
substituting the first hash table into the logistic regression model, and calculating the optimal model parameters of the logistic regression model using the iterative algorithm, to train and get the logistic regression model containing the optimal model parameters.
However Lin et al. substantially discloses wherein the step of extracting features of the training sample and of the test sample respectively using a preset algorithm, computing optimal model parameters of a logistic regression model by an iterative algorithm based on the features of the training sample, to train and get a logistic regression model containing the optimal model parameters comprises: 
extracting the features of the training sample and of the test sample respectively using a preset algorithm, and correspondingly establishing a first hash table and a second hash table (Lin et al. para[0043]-[0044], extracts selected features of both training samples and testing samples); 
substituting the first hash table into the logistic regression model, and calculating the optimal model parameters of the logistic regression model using the iterative algorithm, to train and get the logistic regression model containing the optimal model parameters (Lin et al. para[0039], [0106], iteratively trains the logistic regression model).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the classification method of Chhichhia et al. with the feature selection method of Forman and the accuracy assessment method of Lin et al. in order to test and improve the accuracy of each iteration of the predictive model (Lin et al. para[0007]).
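
For illustration only, the hash-table and iterative-training limitations recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or from the cited references: the function names, the toy texts, the whitespace tokenizer, and the use of stochastic gradient descent as the "iterative algorithm" are all hypothetical choices.

```python
import math
from collections import Counter

def extract_features(text):
    """Hypothetical 'preset algorithm': a hash table (dict) of token counts."""
    return dict(Counter(text.lower().split()))

def predict(weights, bias, features):
    """Logistic regression probability for one hash table of features."""
    z = bias + sum(weights.get(tok, 0.0) * val for tok, val in features.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well behaved
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(tables, labels, lr=0.5, epochs=200):
    """Iteratively update the model parameters (assumed: plain SGD)."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, y in zip(tables, labels):
            error = predict(weights, bias, features) - y
            bias -= lr * error
            for tok, val in features.items():
                weights[tok] = weights.get(tok, 0.0) - lr * error * val
    return weights, bias

# Toy labeled training sample (1 = target topic, 0 = other); invented data.
train_texts = ["stock market rises", "team wins match",
               "market shares fall", "player scores goal"]
train_labels = [1, 0, 1, 0]

# "First hash table": the features of the training sample.
first_table = [extract_features(t) for t in train_texts]
weights, bias = train_logistic_regression(first_table, train_labels)

# "Second hash table": the features of the test sample.
second_table = [extract_features(t) for t in ["market falls", "team scores"]]
probabilities = [predict(weights, bias, f) for f in second_table]
```

In this sketch the first hash table is used only for fitting and the second only for scoring, which mirrors the separation between training and test samples recited in the claim.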

In regards to claim 2, Chhichhia et al. as modified by Forman and Lin et al. substantially discloses the method of claim 1, wherein the step of obtaining a training sample and a test sample, wherein the training sample is obtained by manually labeling after a corresponding topic model having been trained based on text data comprises: 
collecting the text data, and preprocessing the text data to obtain a corresponding first keyword set (Chhichhia et al. para[0073]); 
computing a distribution of the text data on a preset number of topics using a preset topic model based on the first keyword set and the preset number of topics, and clustering the text data based on the distribution of the text data on the topics, to train and get the corresponding topic models of the text data (Chhichhia et al. para[0075]); and 
selecting from among the text data the training samples that correspond to a target topic classifier based on the manual labeling results on the text data based on the topic models, and using the text data other than the training samples as the test sample (Chhichhia et al. para[0076]).  

In regards to claim 4, Chhichhia et al. as modified by Forman and Lin et al. substantially discloses the method of claim 3, wherein the step of drawing a ROC curve of receiver operating characteristic based on the features of the test sample and the logistic regression model containing the optimal model parameters, and evaluating the logistic regression model containing the optimal model parameters based on the area AUC under the ROC curve, to train and get a first topic classifier comprises: 
substituting the second hash table into the logistic regression model containing the optimal model parameters to obtain true positive TP, true negative TN, false negative FN, and false positive FP (Forman para[0008], [0064]-[0065]); 
drawing the ROC curve based on TP, TN, FN and FP (Forman para[0064]-[0065]); 
calculating the area AUC under the ROC curve, and evaluating the logistic regression model containing the optimal model parameters based on the AUC value (Forman para[0064]-[0065]); 
when the AUC value is less than or equal to a preset AUC threshold, determining that the logistic regression model containing the optimal model parameters does not meet the requirement, and returning to the following operation: computing optimal model parameters of the logistic regression model using the iterative algorithm so as to train and get the logistic regression model containing the optimal model parameters (Forman para[0069]); 
otherwise when the AUC value is greater than the preset AUC threshold, determining that the logistic regression model containing the optimal model parameters meets the requirement, and trains to get the first topic classifier (Forman para[0069]).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the classification system of Chhichhia et al. with the feature selection method of Forman in order to improve the accuracy of classification (Forman para[022]).
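
For illustration only, the ROC/AUC evaluation recited in claim 4 can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of Forman or of the application: the function names, the example scores, and the trapezoidal integration are hypothetical choices, and the test sample is assumed to contain at least one positive and one negative example.

```python
def roc_points(scores, labels):
    """Sweep the decision threshold and return (FPR, TPR) points.

    TP and FP are counted directly at each threshold; FN and TN are
    implied by the class totals (FN = pos - TP, TN = neg - FP).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def meets_requirement(auc_value, auc_threshold):
    """Model is accepted only when AUC is strictly greater than the threshold."""
    return auc_value > auc_threshold

# Invented classifier outputs for a six-item test sample.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc_value = auc(roc_points(scores, labels))  # 8/9 for this toy data
accepted = meets_requirement(auc_value, 0.8)
```

When `meets_requirement` returns False, the claimed method returns to the parameter-computation step; that retraining loop is omitted here for brevity.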

In regards to claim 6, Chhichhia et al. as modified by Forman and Lin et al. substantially discloses the method of claim 4, further comprising: 
substituting the second hash table into the first topic classifier to obtain a probability that the test sample belongs to a corresponding topic (Forman para[0063]); 
adjusting the preset AUC threshold, and calculating a precision rate p and a recall rate r based on TP, FP, and FN (Forman para[0063]); 
when the p is less than or equal to a preset p threshold, or the r is less than or equal to a preset r threshold, returning to the following operation: adjusting the preset AUC threshold until the p is greater than the preset p threshold, and the r is greater than the preset r threshold, and training to get the second topic classifier (Forman para[0069]); 
classifying the text data using the second topic classifier (Forman para[0072]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the classification system of Chhichhia et al. with the feature selection method of Forman in order to improve the accuracy of classification (Forman para[022]).
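
For illustration only, the precision rate p and recall rate r recited in claim 6 follow the standard definitions p = TP / (TP + FP) and r = TP / (TP + FN). The sketch below assumes hypothetical function names and invented counts; it is not taken from the application or from the cited references.

```python
def precision_recall(tp, fp, fn):
    """Return (p, r) from confusion-matrix counts; 0.0 when undefined."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    return p, r

def rates_acceptable(p, r, p_threshold, r_threshold):
    """Both rates must be strictly greater than their preset thresholds."""
    return p > p_threshold and r > r_threshold

# Invented counts: 8 true positives, 2 false positives, 4 false negatives.
p, r = precision_recall(tp=8, fp=2, fn=4)
ok = rates_acceptable(p, r, p_threshold=0.7, r_threshold=0.7)
```

With these counts the precision exceeds the assumed 0.7 threshold but the recall does not, so under the claimed method the threshold would be adjusted and the rates recomputed.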

In regards to claim 7, Chhichhia et al. as modified by Forman and Lin et al. substantially discloses the method of claim 2, wherein the step of collecting the text data, and preprocessing the text data to obtain a corresponding first keyword set comprises: 
collecting the text data, and segmenting the text data (Chhichhia et al. para[0062]); 
removing stop words in the text data after the segmentation based on a preset stop word list, to obtain a second keyword set (Chhichhia et al. para[0096]); 
calculating a term frequency-inverse document frequency TF-IDF value of each keyword in the second keyword set, and removing the keyword whose TF-IDF value is lower than a preset threshold of TF-IDF, to obtain the corresponding first keyword set (Chhichhia et al. para[0096]).  
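
For illustration only, the TF-IDF filtering step recited in claim 7 can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of Chhichhia et al. or of the application: the function name, the toy documents, the natural-log IDF variant, and the rule that a keyword is kept if its score reaches the threshold in at least one document are all hypothetical choices.

```python
import math

def tf_idf_filter(docs, threshold):
    """Keep keywords whose TF-IDF value reaches the threshold.

    docs: list of token lists (already segmented, stop words removed).
    TF is the in-document relative frequency; IDF is log(N / df).
    """
    n_docs = len(docs)
    df = {}
    for doc in docs:
        for tok in set(doc):
            df[tok] = df.get(tok, 0) + 1

    kept = set()
    for doc in docs:
        if not doc:
            continue  # an empty document contributes no keywords
        for tok in set(doc):
            tf = doc.count(tok) / len(doc)
            idf = math.log(n_docs / df[tok])
            if tf * idf >= threshold:
                kept.add(tok)
    return kept

# Invented "second keyword set", one token list per document.
second_keyword_set = [
    ["market", "stock", "market"],
    ["market", "team", "goal"],
    ["market", "goal", "player"],
]
first_keyword_set = tf_idf_filter(second_keyword_set, threshold=0.2)
```

Here "market" appears in every document, so its IDF is zero and it is removed, while "goal" falls below the assumed threshold of 0.2 because it appears in two of the three documents.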


Claims 9-10, 12, and 14-15 recite substantially similar limitations to claims 1-2, 4 and 6-7. Thus claims 9-10, 12, and 14-15 are rejected along the same rationale as claims 1-2, 4 and 6-7.

In regards to claim 16, Chhichhia et al. as modified by Forman and Lin et al. substantially discloses the device of claim 15, wherein following operations are further performed when the topic classifier training program executed by the processor: 
calculating the term frequency TF and the inverse document frequency IDF of each keyword in the second keyword set (Chhichhia et al. para[0096]); 
calculating the term frequency-inverse document frequency TF-IDF value of each keyword in the second keyword set, and removing the keyword whose TF-IDF value is lower than the preset threshold of TF-IDF, to obtain the corresponding first keyword set (Chhichhia et al. para[0096]).  

Claims 17-18, 20, and 22-23 recite substantially similar limitations to claims 1-2, 4 and 6-7. Thus claims 17-18, 20, and 22-23 are rejected along the same rationale as claims 1-2, 4 and 6-7.

Claim 24 recites substantially similar limitations to claim 16. Thus claim 24 is rejected along the same rationale as claim 16.

Response to Arguments
Applicant’s arguments with respect to claims 1-2, 4, 6-7, 9-10, 12, 14-18, 20, and 22-24 have been considered but are moot because the arguments do not apply to the current rejection.



Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS HASTY whose telephone number is (571)270-7775. The examiner can normally be reached Monday-Friday 8:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong, can be reached at (571)272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/N.H/Examiner, Art Unit 2178
/STEPHEN S HONG/Supervisory Patent Examiner, Art Unit 2178