Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Reasons for Allowance
Claims 1-21 are allowed.
The following is an examiner’s statement of reasons for allowance: 
Regarding claims 1, 8, and 15, the primary reason for allowance is that the prior art fails to teach sampling the next cluster center based on the sampling probability and iterating the process of determining the sampling probability and the process of sampling the next cluster center until k cluster centers are sampled to form a set of cluster centers C = {C_i}_{i=1}^{k}, wherein each of the k cluster centers corresponds to each of the k clusters; generating weightage for each of the k cluster centers by counting a number of data points belonging to each of the k cluster centers; determining sensitivity scores of the data points belonging to each of the k cluster centers based on the weightage for each of the k cluster centers; labeling, based on the determined sensitivity scores, a data point having a sensitivity score greater than a threshold value as an outlier of the digitized text corpus and removing the outlier from the digitized text corpus; and providing a first parameter of the digitized text corpus by analyzing the removed outlier or a 
Claims 2-7, 9-14, and 16-21 are considered allowable based on their respective dependence on allowed claims 1, 8, and 15.
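For illustration only, the sequence of steps recited for claims 1, 8, and 15 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the record does not give the formula for the sampling probability or for the sensitivity score, so the sketch assumes distance-squared (k-means++-style) sampling and a commonly used coreset-style score (a point's share of the total clustering cost plus the reciprocal of its cluster's weightage). It also assumes each document of the digitized text corpus has already been converted to a numeric vector. All function names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sample_centers(points, k, rng):
    """Sample k centers one at a time (assumed: D^2 sampling probability)."""
    centers = [rng.choice(points)]  # first center drawn uniformly
    while len(centers) < k:
        # Squared distance from every point to its nearest chosen center.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        total = sum(d2)
        if total == 0:  # every point coincides with a center
            centers.append(rng.choice(points))
            continue
        # Sampling probability proportional to squared distance.
        probs = [d / total for d in d2]
        centers.append(rng.choices(points, weights=probs, k=1)[0])
    return centers


def assign(points, centers):
    """Index of the nearest center for each point."""
    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]


def sensitivity_scores(points, centers, labels):
    """Weightage = point count per center; score uses it (assumed formula)."""
    weights = [labels.count(j) for j in range(len(centers))]
    total_cost = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    scores = []
    for p, l in zip(points, labels):
        cost_share = sq_dist(p, centers[l]) / total_cost if total_cost > 0 else 0.0
        scores.append(cost_share + 1.0 / weights[l])
    return weights, scores


def remove_outliers(points, k, threshold, seed=0):
    """Label points whose score exceeds the threshold as outliers and remove them."""
    rng = random.Random(seed)
    centers = sample_centers(points, k, rng)
    labels = assign(points, centers)
    weights, scores = sensitivity_scores(points, centers, labels)
    outliers = [p for p, s in zip(points, scores) if s > threshold]
    kept = [p for p, s in zip(points, scores) if s <= threshold]
    return kept, outliers, weights
```

In this sketch, the returned `outliers` list corresponds to the removed outliers and `kept` to the data points without the outliers, either of which could then be analyzed further. The threshold value is a free parameter chosen by the caller.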
Love et al. (U.S. Patent No. 9,836,183) disclose a process including: obtaining a clustered graph, the clustered graph having three or more clusters, each cluster having a plurality of nodes of the graph, the nodes being connected in pairs by one or more respective edges; determining visual attributes of cluster icons based on amounts of nodes in clusters corresponding to the respective cluster icons; determining positions of the cluster icons in a graphical visualization of the clustered graph; obtaining, for each cluster, a respective subset of nodes in the respective cluster; determining visual attributes of node icons based on attributes of corresponding nodes in the subsets of nodes, each node icon representing one of the nodes in the respective subset of nodes; determining positions of the node icons in the graphical visualization based on the positions of the corresponding cluster icons of clusters having the nodes corresponding to the respective node icons; and causing the graphical visualization to be displayed. [...] C = {C_i}_{i=1}^{k}, wherein each of the k cluster centers corresponds to each of the k clusters; generating weightage for each of the k cluster centers by counting a number of data points belonging to each of the k cluster centers; determining sensitivity scores of the data points belonging to each of the k cluster centers based on the weightage for each of the k cluster centers; labeling, based on the determined sensitivity scores, a data point having a sensitivity score greater than a threshold value as an outlier of the digitized text 
Wei et al. (CN 104462802 A) discloses a large-scale outlier data analysis method comprising the following steps: (1) outlier data mining: selecting the outlier data from massive data; (2) outlier data clustering: the aim of this step is to cluster the outlier data screened out in step (1), assigning the outlier data to different clusters such that outlier data within a cluster are similar and outlier data in different clusters differ greatly; (3) rare cluster selection: selecting, from the outlier data clusters, the clusters of very small scale, whose data lie at the periphery of the feature space during the clustering process and deviate significantly from all other data, i.e., global outlier data; the selected clusters are those whose sample number is less than a threshold Tl, and the outlier data in all such clusters form a data set marked as CI; (4) group characteristic analysis of the outlier data: the characteristic analysis of the screened outlier data clusters aims at using visualization means to assist the analysis of the outlier data in the clusters, obtaining the common group characteristics so as to analyze the reason the abnormal features are generated while selecting the hidden cluster in very [...] C = {C_i}_{i=1}^{k}, wherein each of the k cluster centers corresponds to each of the k clusters; generating weightage for each of the k cluster centers by counting a number of data points belonging to each of the k cluster centers; determining sensitivity scores of the data points belonging to each of the k cluster centers based on the weightage for each of the k cluster centers; labeling, based on the determined sensitivity scores, a data point having a sensitivity score greater than a threshold value as an outlier of the digitized text corpus and removing the outlier from the digitized text corpus; and providing a first parameter of the digitized text corpus by analyzing the removed outlier or a second parameter of the digitized text corpus by analyzing data points without the outlier as now recited in claims 1, 8, and 15 of the present invention.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should 
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN H LE, whose telephone number is (571) 272-2275. The examiner can normally be reached Monday-Friday from 7:00 am to 3:30 pm Eastern Time.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John E. Breene, can be reached at (571) 272-4107. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to 

/JOHN H LE/
Primary Examiner, Art Unit 2862