DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1, 4 to 10, and 13 to 29 are objected to because of the following informalities:
Independent claims 1, 26, and 28 are amended to set forth “weighting using tf.idf”, where “tf.idf” is an abbreviation that should be written out in full in its initial occurrence.  Applicants’ claim 29 includes this unabbreviated limitation of “term frequency-inverse document frequency (tf.idf)”, but the independent claims do not include it.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4 to 6, 9, 13, 15 to 19, and 21 to 29 are rejected under 35 U.S.C. 103 as being unpatentable over Revesz et al. (U.S. Patent Publication 2015/0154289) in view of Jaiswal (U.S. Patent Publication 2012/0303558).
Revesz et al. discloses a method, system, and computer-program product for categorizing and moderating user-generated content in an online environment, comprising:
“a) providing a set of data contents as training data, the data contents being labeled as acceptable or unacceptable contents” – a reference corpus or reference database refers to a collection of textual examples that, for a particular category of content, are classified (“labeled”) as either positive or negative examples of that particular category; a reference database may contain a collection of positive or negative textual examples for a single category or for a plurality of textual categories (¶[0036]); a reference corpus may be compiled for each category to contain verified positive and negative content examples in each category; verification may be performed by a human reviewer (¶[0092]: Figure 4: Step 404); a machine learning algorithm may be trained and tested using examples in a reference corpus (¶[0112]); here, textual examples (“a set of data content”) are used as “training data” for training a machine learning algorithm, where positive examples are “unacceptable content” and negative examples are “acceptable content” for each category as verified by a human reviewer;
“b) the moderator tool receiving said training data” – a machine learning algorithm may be trained and tested using examples in a reference corpus (¶[0112]); different folds are identified in the reference corpus for use in training and testing the machine learning system; training examples present in the selected fold are used to train the machine learning system (¶[0118] - ¶[0119]: Figure 7: Steps 704 and 706); here, a machine learning system that is trained for moderating user-generated content is “the moderator tool”;

“d) the moderator tool executing a second algorithm in the feature space for choosing the features in a moderation model to be created and defining [a weighting of] data features, the choosing and defining based on the data contents labelled as the acceptable contents and the unacceptable contents” – text of each positive and negative example in the reference corpus is parsed to generate a sequence of n-grams 
“training parameters of a machine learning model [based on the weighted data features] in order to create the moderation model” – training examples are used to train the machine learning system; the machine learning system accepts as training input a set of training vectors generated from the training examples; the machine learning system is trained on the training vectors (¶[0119] - ¶[0121]: Figure 7: Steps 708 to 710);
“e) the moderator tool receiving new data content to be moderated” – embodiments may receive textual content for categorizing web page content generated 
“f) the moderator tool executing the first algorithm on the new data content for identifying the data features in the new data content to be moderated in accordance with the moderation model created” – embodiments may process the selected content to generate a vector that may be used by a trained machine learning system to determine whether the selected content is a positive example of a category by parsing the content to generate a sequence of one or more n-grams based on the selected content, looking up in a features table the unique identifier for each n-gram generated based on the selected content, and generating a combination of unique identifiers that can be used as a vector (¶[0177] - ¶[0180]: Figure 15: Steps 1502 to 1512); here, a vector is generated from selected content (“the new data content”) during use of the trained machine learning system, where this vector represents “the data features” in the content of a features table;
“g) producing a moderated result for the new data content by indicating whether the new data content is acceptable or unacceptable in accordance with the moderation model created” – embodiments automatically determine a probability value indicating that the user-generated content is either a positive example or a negative example of one or more unsuitable categories; if the user-generated content is determined to be a positive example of any of the unsuitable categories (“where the new data content is acceptable or unacceptable”) to a predefined degree of certainty, the content may be automatically excluded from publication in the online environment (“producing a moderated result for the new data content”) (Abstract).
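For illustration only, the n-gram parsing, features-table lookup, and vector generation described above for Revesz et al. can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the reference: the function names `ngrams`, `build_features_table`, and `to_vector`, the whitespace tokenization, the example texts, and the stop-word list are all hypothetical.

```python
from collections import Counter

def ngrams(text, max_n=3):
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

def build_features_table(examples, stop_words=frozenset()):
    """Map each corpus n-gram to a unique integer ID, most frequent first."""
    counts = Counter(g for text in examples for g in ngrams(text))
    # Discard entries whose n-gram is exactly a stop word (assumed policy).
    kept = [g for g, _ in counts.most_common() if g not in stop_words]
    return {gram: idx for idx, gram in enumerate(kept)}

def to_vector(text, table):
    """Look up each n-gram of new content; n-grams not in the table are skipped."""
    return sorted({table[g] for g in ngrams(text) if g in table})

# Hypothetical reference corpus and new content.
corpus = ["you are an idiot", "thank you for the post"]
table = build_features_table(corpus, stop_words={"the", "an"})
vector = to_vector("you idiot", table)
```

In this sketch the resulting `vector` is the set of unique identifiers for the n-grams of the new content that appear in the features table, which is the kind of input a trained classifier would then score.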
Revesz et al. additionally discloses “one or more client devices with means for sending training data and data contents to the moderator tool to be moderated” and “an interface for interaction between the one or more client devices and the moderator tool” – a user may interact with computing device 1600 through a display that may use one or more user interfaces 1620 (“an interface for interaction between the one or more client devices and the moderator tool”) associated with embodiments (¶[0190]: Figure 16); a network environment 1700 may include one or more servers 1702 and 1704 coupled to one or more clients 1706 and 1708; clients 1706 and 1708 may train and test a machine learning system (“one or more client devices for sending training data and data contents to the moderator tool to be moderated”), and submit the trained machine learning system to servers 1702 and 1704 for using the trained machine learning system to moderate user-generated content; alternatively, servers 1702 and 1704 may train and test a machine learning system, and submit the trained machine learning system to clients 1706 and 1708 (¶[0196] - ¶[0197]: Figure 17).
Concerning independent claims 1, 26, and 28, Revesz et al. discloses using features in a feature table to train a machine learning algorithm, but omits the limitations of “wherein the moderator tool differentiates between the acceptable contents and the unacceptable contents by defining a boundary in the feature space that separates the labeled acceptable contents and unacceptable contents, and wherein the moderator tool, in executing the first algorithm and the second algorithm, performs at least one of language detection, determining sums, determining means, or determining distribution parameters, weighting using tf.idf, weighting using entropy, or normalized document Revesz et al. discloses “choosing the features to be used in a moderation model to be created . . . and choosing . . . based on the data contents labelled as the acceptable and the unacceptable contents” so as to provide “training parameters of a machine learning model based on the . . . data features in order to create the moderation model”.  Still, “defining a boundary in the feature space” is a known characteristic of support vector machines (SVMs).  Here, Revesz et al. discloses training a machine learning algorithm with a reference corpus that is compiled using positive and negative content examples for each category.  (¶[0090] - ¶[0092]: Figure 4: Step 404)  Positive and negative examples in a reference corpus are parsed during pre-processing to generate a sequence of n-grams in a features table, where n-grams are sorted by decreasing frequency, and stop words with high frequency are discarded, but certain textual features that are predictive of categories are retained.  (¶[0100] - ¶[0106]: Figure 5)  Revesz et al., then, teaches “choosing the features to be used in a moderation model to be created . . . based on the data contents labelled as the acceptable contents and the unacceptable contents” because some features are retained from positive and negative training examples (“data contents labelled”).  Additionally, Revesz et al. discloses that training includes updating a distribution of weights to indicate an importance of certain examples during training.  (¶[0115])  However, Revesz et al. does not clearly disclose that a weighting is provided to data features during training, only that a weighting is applied to examples during training.
Jaiswal teaches generating machine learning classifiers for detecting specific categories of sensitive information.  (Abstract)  Machine learning classifiers are generated by obtaining training data for each specific category of sensitive information, e.g., training data sets 122(1)-(n), each of which includes a plurality of positive and a plurality of negative examples of a specific category of sensitive information to be protected.  (¶[0034]: Figure 1)  Here, Jaiswal teaches (1) extracting a feature set from the training data set that includes statistically significant features of the positive examples within the training data set and statistically significant features of the negative examples within the training data set, and then (2) using the feature set to build a machine learning-based classifier model that is capable of indicating whether or not new items of data contain information that falls within the specific category of sensitive information associated with the training data set.  Examples of features include a word, e.g., ‘proprietary’, a pair of words, e.g., ‘stock market’, and a phrase, e.g., ‘please do not distribute’.  Specifically, Jaiswal teaches that features may be extracted from a training data set in a variety of ways.  A weight may be associated with each extracted feature in order to indicate the relative importance of that feature relative to other features.  Training module 106 may (1) determine the frequency of occurrence of various features, e.g., words, within both the positive and negative examples within the training data set, (2) rank these positive features and negative features based on the frequency of occurrence, and (3) select the highest ranked features for inclusion within a feature set.  The weight associated with each feature may be the frequency of occurrence of the specific feature.  Training module 106 may filter out commonly used words during this process, including ‘the’, ‘it’, ‘and’, etc.  
Training module may use a term frequency-inverse document frequency (TF-IDF) algorithm to select and/or weight features within the feature set (“wherein the moderator tool, in executing the first algorithm and the second algorithm, performs at least one of . . . weighting using tf.idf”), or may use a feature-extraction and/or feature-weighting algorithm of segment-set term frequency – inverse document frequency (STF-IDF).  After training module 106 has generated a feature set for a particular training data set, training module 106 may generate a machine learning-based classification model based on the feature set.  (¶[0053] - ¶[0056]: Figure 3)  A machine learning-based classifier includes a map of support vectors that represent boundary features, and these boundary features may be selected from and/or represent the highest ranked features in a feature set (“wherein the moderator tool differentiates the acceptable and unacceptable contents by defining a boundary in the feature space that separates the labeled acceptable contents and the unacceptable contents”).  (¶[0057])  Jaiswal, then, teaches weighting text items based on term frequency – inverse document frequency and defining a boundary in a feature space that represents acceptable and unacceptable contents.  An objective is to more accurately detect and protect sensitive data using machine-learning techniques to identify sensitive data that is similar to but not exactly the same as known examples of sensitive data.  (¶[0003])  It would have been obvious to one having ordinary skill in the art to determine weighting of data features based on term frequency – inverse document frequency to produce a boundary in a feature space as taught by Jaiswal to moderate user-generated content between acceptable and unacceptable contents in Revesz et al. for a purpose of detecting data using machine-learning that is similar to but not exactly the same as known examples.
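For reference, term frequency-inverse document frequency weighting of the kind named in Jaiswal can be sketched as follows. Jaiswal does not specify a particular formula, so this sketch assumes one common variant, weight = tf × log(N / df), where tf is the relative frequency of a term in a document, N is the number of documents, and df is the number of documents containing the term. The function name `tf_idf` and the example documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: (c / len(tokens)) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return weights

# Hypothetical training texts (assumed non-empty).
docs = ["please do not distribute", "please read the post", "do not reply"]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in fewer documents (here, ‘distribute’) receives a higher weight than a term of equal in-document frequency that occurs in more documents (here, ‘please’), which is the sense in which the weight indicates relative importance of a feature.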
Revesz et al. discloses that features can be generated by n-grams, which are a sub-sequence of consecutive textual items from a particular textual sequence (“wherein the features consists of one or more of . . . text items, . . . , n-grams”).  (¶[0037] - ¶[0038])  Additionally, these n-grams can be unigrams, which are individual “words”, and implicitly, words at least comprise “characters” and “character strings”, where n-grams are “words combinations” and “phrases”.
Concerning claim 5, Revesz et al. discloses that user-generated content comprises textual items, which may include unigrams, bigrams, and trigrams (“different types of data including at least one of text or metadata”).  (¶[0037])  
Concerning claim 6, Revesz et al. discloses a data structure (“a data format”) that includes table entries including one or more columns, and suitability information.  (¶[0074] - ¶[0076]: Figure 2)  Here, Figure 2 includes “separate fields” defined by the columns of the data structure representing different categories (‘abusive’, ‘racist’, ‘sexist’) (“for different types of data content”), where each category is a ‘label’. 
Concerning claim 9, Revesz et al. discloses that features table 650 has a column 654 for unique IDs associated with n-grams, and a column 656 for n-gram frequencies.  (¶[0111]: Figure 6B); here, a feature frequency is a “feature distribution”; one or more stop words may be discarded from the sorted features table; stop words are commonly used terms and tend to appear at the top of the sorted features table due to their relatively high frequencies; stop words may be discarded after n-grams are generated, or may be discarded before n-grams are generated (¶[0105]: Figure 5: Step 510); n-gram entries in features table 650 are sorted by decreasing n-gram frequency, and entries with stop words as n-grams are discarded from features table 650 (¶[0111]: 
Concerning claim 13, Revesz et al. discloses that a reference corpus can be enriched with more examples to improve the accuracy of the trained system (“sending additional training data to be used for updating the moderation model”) (¶[0099]: Figure 4: Step 404).
Concerning claim 15, Revesz et al. discloses moderating user-generated content in an online environment (Abstract); moderation of user-generated content is performed before publication of the content on a web page (¶[0029]); a machine learning system may be used on new user-generated content for a blog (¶[0097]); computing device 1600 may include a network interface 1612 for the Internet (“an interface that communicates”) (¶[0191]: Figure 16); clients 1706 and 1708 may communicate with servers 1702 and 1704 over the Internet (“that communicates with client devices and using the service”) (¶[0193]: Figure 17); generally, content moderation in an online environment for web pages is “a web service”.
Concerning claims 16 to 19, Revesz et al. discloses that a network environment 1700 may include one or more servers 1702 and 1704 coupled to one or more clients 1706 and 1708; clients 1706 and 1708 may train and test a machine learning system, and submit the trained machine learning system to servers 1702 and 1704 for using the trained machine learning system to moderate user-generated content; alternatively, servers 1702 and 1704 may train and test a machine learning system, and submit the trained machine learning system to clients 1706 and 1708 (¶[0196] - ¶[0197]: Figure 
Concerning claim 21, Revesz et al. discloses moderation of user-generated content that can include blogs and textual content of a web page (“wherein the data contents to be moderated are user-generated content including at least one of: blogs”).  (¶[0070] and ¶[0097])  
Concerning claim 22, Revesz et al. discloses that moderation of user-generated content is performed before publication of the content on a web page (“earlier published by the client”).  (¶[0029])  A reference corpus may be compiled for each category to contain verified positive and negative content examples, where the verification may be performed by a human reviewer (“wherein training data is based on human-generated moderated data”).  (¶[0092]: Figure 4: Step 404)  The user-generated content that is moderated can include blogs and textual content of a web page.  (¶[0070] and ¶[0097])  Implicitly, a blog is “published by a client” at a time that is, conventionally, “earlier”, but, logically, publication of a blog must be either “earlier” or “not earlier”.
Concerning claim 23, Revesz et al. discloses that different folds are identified in the reference data for use in training and testing the machine learning system (¶[0118]: Figure 7: Step 704); the trained machine learning system is tested on test vectors (¶[0123]: Figure 7: Step 714); for each parameter value, an accuracy is determined of 
Concerning claim 24, Revesz et al. discloses automatically determining a probability value indicating that the user-generated content is either a positive example or a negative example of one or more unsuitable categories to a predefined degree of certainty (Abstract; ¶[0031]); a probability value is used as an indication that the content is an example of an unsuitable category, where a higher likelihood value is that the content is unsuitable for publication (¶[0050]); embodiments aggregate the unsuitability information for the comments of the user to generate an indication of how suitable or unsuitable the user’s comments are in each category (¶[0081]: Figure 3: Step 304); clients may moderate user-generated web page content (¶[0194] - ¶[0195]: Figure 17); here, a probability value or likelihood of certainty that content is suitable or unsuitable is equivalent to “a confidence value revealing how certain the moderator tool is about the moderation result”, where this moderation result is “sent to the client device”.


Concerning claim 25, Revesz et al. discloses that different threshold values can be set for different unsuitable categories; threshold values may be lower for a more inflammatory category of ‘racist’, and the threshold values may be higher for a more general category of ‘abusive’ (¶[0054]); here, if a threshold value is set higher or lower for a category, then this is equivalent to “applying a strictness value on the moderation request”; that is, a lower threshold value would have a greater ‘strictness’ as compared to a higher threshold value. 
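The relationship between a per-category threshold and ‘strictness’ can be sketched as follows. This is a minimal sketch under stated assumptions: the threshold values, the probability values, and the function name `moderate` are hypothetical and are not taken from Revesz et al.

```python
# Hypothetical per-category thresholds; the values are illustrative only.
# A lower threshold flags content at a lower probability, i.e., is stricter.
THRESHOLDS = {"racist": 0.5, "abusive": 0.8}

def moderate(probabilities, thresholds=THRESHOLDS):
    """Return (acceptable, flagged_categories) for one piece of content."""
    flagged = [c for c, p in probabilities.items() if p >= thresholds[c]]
    return (not flagged, flagged)

# The same probability of 0.6 is flagged for 'racist' but not for 'abusive'.
result = moderate({"racist": 0.6, "abusive": 0.6})
```

Here the same probability value leads to exclusion under the lower threshold but not under the higher one, which is the sense in which a lower threshold corresponds to greater strictness.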
Concerning claim 27, Revesz et al. discloses clients 1706 and 1708 may train and test a machine learning system, and submit the trained machine learning system to servers 1702 and 1704 for using the trained machine learning system.  (¶[0194]: Figure 17)  That is, Revesz et al. uses clients that provide their own training data so that a machine learning system (“a moderation model”) “is specific for a given type of training data to serve each client device individually by using a specific moderation model.” 
Concerning claim 29, Jaiswal teaches that training module 106 may use a term frequency-inverse document frequency (TF-IDF) algorithm to select and/or weight features within the feature set (“wherein weighting text items includes weighting text items based on at least one of term frequency-inverse document frequency (tf.idf) . . .”), or feature extraction and/or feature weighting algorithms of segment-set term frequency-inverse segment-set frequency (STF-ISSF), or segment-set term frequency-inverse document frequency (STF-IDF).  (¶[0056])

Claims 7 to 8 are rejected under 35 U.S.C. 103 as being unpatentable over Revesz et al. (U.S. Patent Publication 2015/0154289) in view of Jaiswal (U.S. Patent Publication 2012/0303558) as applied to claim 1 above, and further in view of Davi et al. (U.S. Patent Publication 2011/0078242).
Concerning claim 7, Revesz et al. does not expressly disclose using metadata for moderation.  Still, Figure 2 illustrates a data table that includes various data that could be construed as ‘metadata’, e.g., user ID, warned, and flagged.  Anyway, Davi et al. teaches a similar system of automatic moderation of media content, where moderation metadata 64 describes the corresponding moderation action executed by the originating online provider, including an originating moderator field 64a and an action field 64b.  Additionally, media content metadata 62 can include a unique identifier, specification of media type, a title assigned to the media content, and information about the registered user.  (¶[0037] - ¶[0040])  An objective is to provide automatic moderation that has an advantage of being scalable when a moderator can be overwhelmed by an amount of uploaded content.  (¶[0002])  It would have been obvious to one having ordinary skill in the art that a data table of Revesz et al. includes metadata as taught by Davi et al. for a purpose of providing automatic moderation that is scalable and does not produce data that overwhelms a moderator.
Concerning claim 8, Revesz et al. discloses that a features table is updated for new examples in a reference corpus (“processing content data for defining additional features”).  (¶[0102]: Figure 5: Step 504)

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Revesz et al. (U.S. Patent Publication 2015/0154289) in view of Jaiswal (U.S. Patent Publication 2012/0303558) as applied to claim 1 above, and further in view of Mylonakis et al. (U.S. Patent Publication 2014/0200878).
Concerning claim 10, Revesz et al. discloses dividing training data into a training set and test set.  (¶[0118] - ¶[0119]: Figure 7: Steps 704 to 706)  However, Revesz et al. does not expressly disclose “a development set” for “defining some parameters for the model”.  Still, Mylonakis et al. discloses model adaptation using training examples, where features of a model 112 are optimized on a development set 66 of text to optimize, e.g., maximize, a scoring metric.  (¶[0054])  Feature weights are tuned on a development set for a mixture of usage and particular styles and genres that match a test domain.  (¶[0078])  The aim is to optimize the scores over all training samples in a development corpus until an optimal combination of weights is found.  (¶[0083])  It would have been obvious to one having ordinary skill in the art to use a development set for defining parameters for a model as taught by Mylonakis et al. to moderate user-generated content of Revesz et al. for a purpose of optimizing a model to match a mixture of styles and genres of a test set. 
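The role of a development set, as distinct from a training set and a test set, can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of either reference: the split fractions, the function names `split` and `pick_threshold`, the toy (score, label) examples, and the choice of a decision threshold as the parameter being tuned are all hypothetical.

```python
import random

def split(examples, dev_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split into train, dev, and test sets."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    n_test = int(len(shuffled) * test_fraction)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test

def pick_threshold(dev, candidates):
    """Choose the candidate threshold with the highest accuracy on the dev set."""
    def accuracy(th):
        return sum((score >= th) == label for score, label in dev) / len(dev)
    return max(candidates, key=accuracy)

# Toy (score, label) pairs: the label is True when the score is at least 0.5.
examples = [(i / 10, i >= 5) for i in range(10)]
train, dev, test = split(examples)
best = pick_threshold(dev, [0.2, 0.5, 0.8])
```

The point of the sketch is only that a parameter is selected on data held out from training, while the test set remains untouched for the final accuracy measurement.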

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Revesz et al. (U.S. Patent Publication 2015/0154289) in view of Jaiswal (U.S. Patent Publication 2012/0303558) as applied to claim 1 above, and further in view of Miura et al. (U.S. Patent Publication 2015/0254228).
Revesz et al. does not expressly disclose a moderation model that performs “language-specific processing”.  However, Revesz et al. would implicitly produce a moderation model that is specific to a given language, e.g., English.  Anyway, Miura et al. teaches using machine learning to estimate a topic of a document, where an information processing unit handles English as a first language and Japanese as a second language.  (¶[0021])  A controller executes a multilingual document classifying program, which obtains text from first-language field A and second-language field A, and obtains first-language-and-second-language word-sense information.  (¶[0024] - ¶[0027])  An objective is to execute processing for classifying multilingual documents.  (¶[0006])  It would have been obvious to one having ordinary skill in the art to perform language-specific processing of text as taught by Miura et al. to moderate user-generated content of Revesz et al. for a purpose of classifying multilingual documents. 

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Revesz et al. (U.S. Patent Publication 2015/0154289) in view of Jaiswal (U.S. Patent Publication 2012/0303558) as applied to claim 1 above, and further in view of Srinivasan et al. (U.S. Patent Publication 2015/0317562).
Revesz et al. does not expressly disclose that a client moderation tool has an application programming interface (API), but this is a common software component that could be construed as inherent.  That is, a web interface may be construed as an API, and a web interface appears to be disclosed by Revesz et al.  Anyway, Srinivasan et al. teaches automatic moderation of online content, where a web page graphical user interface 310 can generate new content.  (¶[0023]: Figure 3)  This web page graphical Srinivasan et al. to moderate user-generated content of Revesz et al. for a purpose of helping human moderators to process large volumes of information.

Response to Arguments
Applicants’ arguments filed 16 November 2021 have been considered but are moot in view of new grounds of rejection as necessitated by amendment.
Applicants’ amendments overcome the objections to the claims.  However, a new claim objection is applied to independent claims 1, 26, and 28.
Applicants amend independent claims 1, 26, and 28 to set forth new limitations of “wherein the moderator tool differentiates the acceptable contents and the unacceptable contents by defining a boundary in the feature space that separates the labeled acceptable contents and unacceptable contents” and “wherein the moderator tool, in executing the first algorithm and the second algorithm, performs at least one of language detection, determining sums, determining means, or determining distribution parameters, weighting using tf.idf, weighting using entropy, or normalizing document vectors”.  Applicants do not include any significant arguments in the after-final response traversing the prior rejection filed on 16 November 2021 of these independent claims as being obvious under 35 U.S.C. 103 over Revesz et al. (U.S. Patent Publication 2015/0154289) in view of Tetreault et al. (U.S. Patent Publication 2017/0257329).  
Generally, new grounds of rejection are set forth as directed to independent claims 1, 26, and 28 being obvious under 35 U.S.C. §103 over Revesz et al. (U.S. Patent Publication 2015/0154289) in view of Jaiswal (U.S. Patent Publication 2012/0303558).  These new grounds of rejection are necessitated by amendment.  Applicants’ arguments filed in the after-final response of 26 October 2021 are to some degree moot given these new grounds of rejection.  The rejection no longer relies upon Tetreault et al. (U.S. Patent Publication 2017/0257329) or Lim et al. (U.S. Patent Publication 2011/0078187).  Instead, Jaiswal is maintained to teach the limitations of these references and the new limitations.
Specifically, Jaiswal is maintained to better teach the prior limitations directed to “choosing features to be used in a moderation model to be created and defining a weighting of data features”.  Here, Jaiswal, at ¶[0056], clearly states that training module 106 may use a term frequency-inverse document frequency (TF-IDF) to select and/or weight features within a feature set.  Moreover, Jaiswal is maintained to teach the limitation of “wherein the moderator tool differentiates between the acceptable and the unacceptable content by defining a boundary in the feature space that separates the labeled acceptable contents and unacceptable contents”.  Those skilled in the art know that the concept of defining a boundary in a feature space is commonly used with support vector machines (SVMs).  Jaiswal, at ¶[0057], states that a machine learning-based classifier may include a map of support vectors that represent boundary features, Jaiswal, at ¶[0056], at least teaches an alternative of “weighting using tf-idf” because training module 106 may use a term frequency-inverse document frequency (TF-IDF) to select and/or weight features within a feature set.
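The general notion of a boundary in a feature space that separates labeled examples can be sketched as follows. To keep the example self-contained, the sketch trains a perceptron, which is a simpler linear classifier and is not a support vector machine; it finds some separating boundary rather than a maximum-margin boundary, and it is not taken from either reference. The function names, the two-dimensional feature vectors, and the labels (+1 for unacceptable, -1 for acceptable) are hypothetical.

```python
def train_perceptron(vectors, labels, epochs=20):
    """Learn weights w and bias b so that sign(w.x + b) matches labels in {-1, +1}."""
    w = [0.0] * len(vectors[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified or on the boundary: nudge toward y
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

def side(x, w, b):
    """Report which side of the boundary w.x + b = 0 the vector x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical weighted feature vectors for four labeled training examples.
X = [[2.0, 0.0], [1.5, 0.5], [0.0, 2.0], [0.5, 1.5]]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
```

After training on this linearly separable toy data, every training vector falls on the side of the line w.x + b = 0 that matches its label, and new content would be classified by which side its vector falls on.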
Similarly, Revesz et al. can be construed to broadly disclose a variety of the claimed alternatives in the limitation of “wherein the moderator tool, in executing the first algorithm and the second algorithm, performs at least one of language detection, determining sums, determining means, or determining distribution parameters, weighting using tf.idf, weighting using entropy, or normalizing document vectors”.  Specifically, Revesz et al. discloses an alternative of “determining sums” and “determining means” because a weighted sum of hypotheses is described at ¶[0173], and average probability values are described at ¶[0064], ¶[0072], ¶[0082], and ¶[0084], where an average is maintained to be equivalent to “a mean”.  Additionally, Revesz et al. broadly discloses that these first and second algorithms provide for “determining distribution parameters” because a distribution of weights over examples during training is described at ¶[0115], ¶[0160], and ¶[0165].  Applicants’ limitations of this group of alternatives as currently set forth encompass quite a broad variety of conceivable characteristics of machine learning in the prior art.

The examiner tends to agree that there is a difference between weighting of training examples in Revesz et al. and weighting of features in Jaiswal as argued by Applicants.  However, it is maintained that a training example could be a feature depending upon how the training examples are defined.  That is, conventional ‘features’ can comprise individual words and word phrases.  Training examples might generally be considered to be longer bodies of text as compared to individual words and word phrases, so that a body of text may have to be parsed into individual words and word phrases to obtain something equivalent to features.  See ¶[0100] - ¶[0102] and ¶[0134] - ¶[0136] of Revesz et al.  Still, if training examples comprise individual words or word phrases after parsing, then weighting of training examples could be weighting of features.  That is, this depends upon how the training examples are constructed.  Long pieces of acceptable and unacceptable content to be moderated could represent Jaiswal.
Applicants’ Declaration filed on 18 October 2021 sets forth twenty-five paragraphs.  Here, ¶1 to ¶9 are factual, and are not contested by the examiner.  Then, ¶10 to ¶13 are directed against Revesz et al.  The examiner does not actually disagree with these points directed against Revesz et al., as stop words and training examples are not necessarily ‘features’, as would be understood by one having ordinary skill in the art.  Still, it is maintained that one skilled in the art could draw some analogies between stop words, training examples, and ‘features’.  Applicants’ ¶14 to ¶22 are directed against Tetreault et al.  The examiner does not agree with the points made against Tetreault et al., but these points are moot because the rejection no longer relies upon Tetreault et al.  Here, Tetreault et al. is maintained to clearly teach weighting of features for training, e.g., at ¶[0069] to ¶[0070], and ¶[0079] - ¶[0083], where features are word phrases, e.g., a word phrase of ‘so happy’ has a high weight of 0.7 for a sentiment and a word phrase of ‘freaking pigs’ has a high weight of 0.9 for abusive inflammatory language in Figure 8.  Applicants, at ¶17, appear to attempt to mischaracterize Tetreault et al. as only using machine learning to generate weights for the features, and as not using the weighted features for training a model by machine learning.  The examiner maintains that this is not accurate, as machine learning encompasses the process of weighting the features and the process of training the model with the weighted features.  Tetreault et al.’s term ‘linter’ conventionally is a tool in computer programming to flag errors and i.e., cleaning up ‘lint’ on a garment with a lint roller.  https://en.wikipedia.org/wiki/Lint_(software)  Tetreault et al. is using the term ‘linter’ in an analogous way to provide feedback about errors or conditions in messages.
Applicants’ ¶23 to ¶25 address Jaiswal.  Here, Applicants argue that Jaiswal only uses weights to identify a feature set by keeping the best weights and discarding other features, but does not use the weights in training the machine learning model.  Applicants state that Jaiswal only uses the term ‘weight’ four times, and that a ‘weight’ is not mentioned as being involved in training at ¶[0057] of Jaiswal.  These arguments are not persuasive as directed against Jaiswal.  Firstly, it is maintained that four occurrences of a description of weighting features at ¶[0055] to ¶[0056] are sufficient to teach the limitation of “defining a weighting of features”.  Secondly, even if features that may have low weightings might be discarded by Jaiswal, “defining a weighting of data features” is still taught by this reference.  That is, even if some features might be discarded in some embodiments of Jaiswal, some features remain and not all the features are discarded.  Thirdly, Jaiswal describes the weighted features as comprising the feature sets.  These feature sets are described as being used as a particular training data set to generate a machine learning-based classification model at ¶[0057].  Indeed, Jaiswal, at ¶[0055] to ¶[0056], describes these features as extracted from a training data set for training module 106.  Applicants, then, mischaracterize Jaiswal by contending that weighted features are not used in training.
Applicants’ amendments necessitate these new grounds of rejection.  This Office Action is NON-FINAL.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Gartung et al., Sharma et al., Mani et al., Shanahan et al., Wang et al., Zomet et al., and Caseiro et al. disclose related prior art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached on Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on (571) 272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 





/MARTIN LERNER/Primary Examiner
Art Unit 2657     
February 14, 2022