Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
2.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/08/2021 has been entered.
Response to Amendment
3.	In response to the office action mailed on 10/28/2020, applicant filed an amendment on 11/30/2020, amending claims 1, 3-5, 8, 11-13, 15, and 16.  Claims 6, 14, and 19 are cancelled.  Claims 21-24 are newly added.  The pending claims are 1-5, 7-13, 15-18, and 20-24. 
	
Response to Arguments
4.	Applicant’s arguments with respect to the pending claims have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
	Applicant argues that the prior art of record does not teach determining a similarity between words in the text document and one or more words in a domain ontology associated with the domain, presenting a similarity score for each word in the text document, and generating the vector with a size equal to a quantity of the words in the text document, where each value in the vector represents the similarity score for the domain.
Dumais teaches determining a similarity between words in the text document and one or more words in a domain ontology associated with the domain (a similarity of each word (or feature) is determined, col. 12, line 59 – col. 13, line 53), and presenting a similarity score for each word in the text document (col. 13, lines 1-21, wherein the probability score of each word to belong to a category is determined).
As to generating the vector with a size equal to a quantity of the words in the text document, where each value in the vector represents the similarity score for the domain, Dumais teaches at col. 22, lines 65-67, that a probability that the textual information object belongs to a particular class is output; at col. 9, line 66 – col. 10, line 7, that classified textual information objects are stored and used by the classifier to classify future input, i.e., the stored training examples are used by support vector machines (or "SVMs") to classify textual information; and at col. 24, line 41 – col. 25, line 64, that a textual information object is processed by a feature extraction process, reduced by a feature reduction process, and applied to a classification process to generate a probability score that the textual information object belongs to a particular class.  Dumais in view of Johns does not explicitly disclose generating the vector with a size equal to a quantity of the words in the text document.  However, the newly introduced reference, Kneller (US 20200082810), teaches a text similarity measure, wherein a text is characterized by a vector where the value in each dimension corresponds to the number of times the term appears in the text (cosine similarity), see paragraph [0073].  Therefore, it would have been obvious at the time the application was filed to use Kneller’s measure of similarity with the system of Dumais in view of Johns, in order to generate the vector with a size equal to a quantity of the words in the text document.  This would provide a useful measure of how similar two sets of text are likely to be in terms of their subject matter, and would aid in categorizing documents.
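For illustration only, a term-count vector and cosine similarity measure of the general kind referenced at Kneller, paragraph [0073], can be sketched as follows. The function names, the whitespace tokenization, and the sample texts are assumptions made for this sketch and are not taken from any reference of record.

```python
import math
from collections import Counter

def term_count_vector(text, vocabulary):
    """Return one count per vocabulary term (hypothetical helper)."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

# Sample texts are invented for this sketch.
doc_a = "stock market close nasdaq stock"
doc_b = "nasdaq stock close"
vocabulary = sorted(set(doc_a.split()) | set(doc_b.split()))
vec_a = term_count_vector(doc_a, vocabulary)
vec_b = term_count_vector(doc_b, vocabulary)
similarity = cosine_similarity(vec_a, vec_b)
```

In this sketch each vector has one dimension per vocabulary term, so texts that share most of their terms yield a similarity close to 1.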


Claim Rejections - 35 USC § 103
5.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-13, 15-18, and 20-24 are rejected under 35 U.S.C. 103 as being unpatentable over Dumais (US 6192360) in view of Johns (US 20140236941), and further in view of Kneller (US 20200082810).
As per claim 1, Dumais teaches a device comprising: one or more memories; and one or more processors, communicatively coupled to the one or more memories (Fig. 11A and col. 24, lines 21-59), to: 
receive a corpus of text documents (that are unstructured, as in claim 8) (col. 5, lines 33-52, receiving a corpus of millions of words or textual information objects (i.e., documents, col. 1, lines 24-26)); 
utilize feature extraction on a text document, of the corpus of text documents, to generate features from the text document, the features including: binary features, numeric features, and (or, as in claim 8) categorical features (col. 1, lines 57-60; col. 11, lines 31-32, determining whether textual information objects include a certain word, attribute, or syntax (expression) as features.  Col. 12, line 40, see § 4.2.1.2.2 Category Dependent Feature Reduction.  Col. 1, lines 60-62; col. 1, § 1.2.2.1 RULE BASED CLASSIFICATION, lines 9-11, wherein, if the textual content has the word "close", the phrase "nasdaq" and a number, then it is classified as "stock market" text.  See also col. 2, lines 49-54, recognizing numerical digits and numbers);
perform feature engineering on the numeric features to generate converted features (col. 5, lines 1-20 and col. 7, lines 44-47; col. 9, lines 6-9, converting textual information objects to feature vectors, and reducing the feature vector to a reduced feature vector having fewer elements.  Textual information objects comprise scanned numbers and numerical values (col. 1, lines 19-20 and 60-62));
perform feature encoding on the text document, based on the converted features, to represent the text document as a vector with a similarity score for a domain (col. 10, lines 33-40, binarizing the reduced feature vectors.  See also Fig. 2 and corresponding col. 24, line 41 – col. 25, line 64, wherein a textual information object is processed by a feature extraction process, reduced by a feature reduction process, and applied to a classification process to generate a probability score that the textual information object belongs to a particular class); 
provide the vector with the similarity score for the domain, as training data, to a machine learning model to generate a trained machine learning model (col. 22, lines 65-67, a probability that the textual information object belongs to a particular class is output; and col. 9, line 66 – col. 10, line 7, classified textual information objects are stored and used by the classifier to classify future input, i.e., the stored training examples are used by support vector machines (or "SVMs") to classify textual information); and 
(col. 3, lines 15-20, once training is complete, the neural network can be used to classify unknown inputs in accordance with the weights and biases determined during training.  See also, col. 13, lines 1-22 and col. 14, line 1-32).
Dumais teaches defining and extracting domain-specific features, wherein each of the features may define a word, phrase or letter grouping (col. 11, lines 51-55).  
However, Dumais does not explicitly disclose generating similarity scores for each word in the text document based on a pre-defined set of words.
Johns, in the same field of endeavor, teaches generating similarity scores for each word in the text document based on a pre-defined set of words ([0036], [0117]).  Therefore, it would have been obvious at the time the application was filed to use Johns’ feature of generating similarity scores for each word in the text document based on a pre-defined set of words with the text classifier of Dumais, in order to provide high performance classifiers by improving the process of indicating a degree of relevance of contents.
Further, Dumais teaches wherein the one or more processors, when performing the feature encoding, are to:
determine a similarity between words in the text document and one or more words in a domain ontology associated with the domain (a similarity of each word (or feature) is determined, col. 12, line 59 – col. 13, line 53),
present a similarity score for each word in the text document (col. 13, lines 1-21, wherein the probability score of each word to belong to a category is determined).
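As to generating the vector with a size equal to a quantity of the words in the text document, where each value in the vector represents the similarity score for the domain, Dumais teaches at col. 22, lines 65-67, that a probability that the textual information object belongs to a particular class is output; at col. 9, line 66 – col. 10, line 7, that classified textual information objects are stored and used by the classifier to classify future input; and at col. 24, line 41 – col. 25, line 64, that a textual information object is processed by a feature extraction process, reduced by a feature reduction process, and applied to a classification process to generate a probability score that the textual information object belongs to a particular class.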
Dumais in view of Johns does not explicitly disclose generating the vector with a size equal to a quantity of the words in the text document.
Kneller in the same field of endeavor teaches a text similarity measure, wherein a text is characterized by a vector where the value in each dimension corresponds to the number of times the term appears in the text (Cosine similarity), see paragraph [0073].  
Therefore, it would have been obvious at the time the application was filed to use Kneller’s measure of similarity with the system of Dumais in view of Johns, in order to generate the vector with a size equal to a quantity of the words in the text document.  This would provide a useful measure of how similar two sets of text are likely to be in terms of their subject matter and categorizing documents.
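For illustration only, the claimed encoding, in which the vector has one entry per word of the text document and each entry is a similarity score against a set of domain words, can be sketched as follows. The use of Python's difflib.SequenceMatcher as the similarity measure, the function names, and the sample ontology are assumptions made for this sketch; none of the references of record is relied upon for this particular measure.

```python
from difflib import SequenceMatcher

def word_similarity(word, ontology_terms):
    """Best character-level match ratio between a word and any ontology term."""
    return max(
        (SequenceMatcher(None, word, term).ratio() for term in ontology_terms),
        default=0.0,  # an empty ontology yields a score of zero
    )

def encode_document(text, ontology_terms):
    """Return one similarity score per word, so len(vector) == word count."""
    words = text.lower().split()
    return [word_similarity(word, ontology_terms) for word in words]

# Hypothetical domain ontology and document, invented for this sketch.
ontology = ["stock", "nasdaq", "market"]
document = "nasdaq stocks close higher"
vector = encode_document(document, ontology)
```

Under these assumptions the vector length always equals the number of words in the document, and a word found verbatim in the ontology scores 1.0.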
As per claim 2, Dumais teaches wherein the features include one or more of: a word feature, a sentence feature, a document feature, a list feature, a float feature, a string feature, an integer feature, a Boolean feature, or a map feature (col. 5, lines 37-52, words are considered as features.  See also, col. 11, lines 51-55, wherein features may define a word, phrase or letter grouping).
As per claim 3, Dumais teaches wherein the one or more processors are further to perform the feature engineering on the binary features by converting the binary features into a quantity of true instances that are included in the converted features (yes/no features, col. 4, lines 17-40).
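For illustration only, converting binary (yes/no) features into a quantity of true instances, as recited in claim 3, can be sketched as follows. The feature names and the function name are hypothetical and are not taken from the application or from any reference of record.

```python
def count_true_instances(binary_features):
    """Collapse a mapping of yes/no features into the number that are true."""
    return sum(1 for value in binary_features.values() if value)

# Hypothetical yes/no features for one text document.
binary_features = {
    "contains_close": True,
    "contains_nasdaq": True,
    "contains_number": False,
}
true_count = count_true_instances(binary_features)
```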
As per claim 4, Dumais teaches wherein the one or more processors, when performing the feature engineering on the numeric features, are further to convert the numeric features into fixed sized vectors that are included in the converted features (col. 22, line 56, and col. 25, lines 50-56, generating a reduced fixed size feature vector while maintaining the features important for classification).
As per claim 5, Dumais teaches wherein the one or more processors are further to perform the feature engineering on the categorical features by one or more of: 
converting the categorical features into n-gram sequences that are included in the converted features (col. 11, lines 48-55, wherein converted features define a word, phrase or letter grouping) or converting the categorical features into primary forms that are included in the converted features.
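For illustration only, converting a sequence of tokens into n-gram sequences, as recited in claim 5, can be sketched as follows. The function name and the sample tokens are assumptions made for this sketch.

```python
def to_ngrams(tokens, n):
    """Return every run of n consecutive tokens, in order, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Sample tokens are invented for this sketch.
tokens = "stock market close".split()
bigrams = to_ngrams(tokens, 2)
```

A token list shorter than n yields no n-grams under this sketch.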
As per claim 7, Dumais teaches wherein the one or more processors are further to: determine vectors with similarity scores for the domain for text documents in the corpus of text documents, other than the text document; and provide the vectors with the similarity scores for the domain, as the training data, to the machine learning model to generate the trained machine learning model (col. 9, line 66 – col. 10, line 7, classified textual information objects are stored and provided to the learning machine as learning examples (col. 5, lines 33-52; col. 6, lines 18-25), the stored training examples being used by support vector machines (or "SVMs") to classify textual information, col. 9, line 66 – col. 10, line 7).
As per claim 21, Dumais teaches performing feature engineering on the binary features by utilizing a log transformation and scaling technique to encode the binary features to generate the converted features (col. 5, lines 1-20 and col. 7, lines 44-47; col. 9, lines 6-9, converting textual information objects to feature vectors, and reducing the feature vector to a reduced feature vector having fewer elements.  Textual information objects comprise scanned numbers and numerical values (col. 1, lines 19-20 and 60-62).  For utilizing a log transformation, see col. 12, line 59 – col. 13, line 21, wherein a log transformation is used to measure which features belong to which class).
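For illustration only, one common form of a log transformation followed by scaling can be sketched as follows. The choice of log(1 + x) and min-max scaling to the range [0, 1], the function name, and the sample counts are assumptions made for this sketch; neither the claim nor Dumais is asserted to specify this particular formulation.

```python
import math

def log_transform_and_scale(values):
    """Apply log(1 + x) to each value, then min-max scale into [0, 1]."""
    if not values:
        return []
    logged = [math.log1p(value) for value in values]
    lowest, highest = min(logged), max(logged)
    if highest == lowest:
        return [0.0 for _ in logged]  # constant input carries no spread
    return [(x - lowest) / (highest - lowest) for x in logged]

# Hypothetical per-document counts of true binary features.
counts = [0, 1, 9, 99]
scaled = log_transform_and_scale(counts)
```

The log step compresses large counts before scaling, so a count of 9 lands at the midpoint between 0 and 99 in this example.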
As per claims 8, 11-13, and 22, Dumais teaches a computer readable medium (Fig. 11A and col. 24, line 23).  The remaining steps are rejected under the same rationale as applied to the corresponding steps of rejected claims 1, 3-5, and 21.
As per claim 9, Dumais teaches provide the vectors with the similarity scores for the domain, as testing data, to the trained machine learning model; and determine an accuracy of the trained machine learning model based on providing the vectors with the similarity scores for the domain, as the testing data, to the trained machine learning model (col. 9, line 66 – col. 10, line 7, classified textual information objects are stored and provided to the learning machine as learning examples (col. 5, lines 33-52; col. 6, lines 18-25), the stored training examples being used by support vector machines (or "SVMs") to classify textual information, col. 9, line 66 – col. 10, line 7.  Also, col. 3, lines 7-14, wherein the system is tested, for accuracy, on a set of validation data).
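For illustration only, determining an accuracy of a trained model on testing data can be sketched as the fraction of test items whose predicted class matches the known class. The function name and the sample labels are assumptions made for this sketch.

```python
def accuracy(predicted, actual):
    """Fraction of test items whose predicted class equals the known class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Hypothetical classifier outputs and ground-truth labels for four test documents.
predicted = ["finance", "sports", "finance", "finance"]
actual = ["finance", "sports", "sports", "finance"]
score = accuracy(predicted, actual)
```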
As per claim 10, Dumais teaches wherein the machine learning model includes one or more of: a classification model, a support vector machine model, a linear regression model, a logistic regression model, a naive Bayes model, a linear discriminant analysis model, a decision support vector machine, col. 4, line 43).
As per claims 15, 16, 18, and 23, method claims 15, 16, 18, and 23 and apparatus claims 1, 3-5, 7, and 21 are related as method and apparatus of using same, with each claimed element's function corresponding to the claimed method step.  Accordingly, claims 15, 16, 18, and 23 are similarly rejected under the same rationale as applied above with respect to apparatus claims 1, 3-5, 7, and 21. 
As per claim 20, Dumais teaches wherein the feature extraction technique, the feature engineering technique, and the feature encoding technique process the text document to represent the text document in a format understood by the machine learning model (stored training data is in a format understood by the machine learning model in order to be used, col. 5, lines 33-37).
As per claim 24, Dumais teaches providing the vectors with the similarity scores for the domain, as testing data, to the trained machine learning model; and determining an accuracy of the trained machine learning model based on providing the vectors with the similarity scores for the domain, as the testing data, to the trained machine learning model (see Fig. 2 and corresponding col. 24, line 41 – col. 25, line 64, wherein a textual information object is processed by a feature extraction process, reduced by a feature reduction process, and applied to a classification process to generate a probability score that the textual information object belongs to a particular class; col. 3, lines 7-20, wherein the trained machine learning model is tested; and Fig. 10 for determining probability in category).

Conclusion
6.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached on 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ABDELALI SERROU/Primary Examiner, Art Unit 2659