Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
				Response to Applicant’s Arguments
In response to Applicant’s arguments that “However, the embedding vector generated in block 10 or the document frequency value is not used as an input of the classifier. In contrast, Bespalov discloses that the document embedding vector generated in block 30 is used as input to a classifier.” and “Paragraphs [0018] to [0020] and Fig. 2 of Bespalov disclose that the elements of the document embedding vector inputted into the classifier are not the document frequency value of the word. But claim 1 recites "acquiring a n-dimensional characteristic vector of a text document of the text documents, wherein n is a number of words in the dictionary, and each of n elements of the n-dimensional characteristic vector is a number of occurrence of a different given word of the text document; and training an initial text classifier using the n-dimensional characteristic vector as an input and using tag information corresponding to the text document as an output, to obtain the first tag generation mode"”, the Examiner responds as follows.
The relevant limitation recites (1) “acquiring a n-dimensional characteristic vector of a text document of the text documents”, (2) “wherein n is a number of words in the dictionary”, (3) “each of n elements of the n-dimensional characteristic vector is a number of an occurrence of a different given word of the text document”, and (4) “training an initial text classifier using the n-dimensional characteristic vector as an input…”.
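For illustration only, elements (1) through (3) of the recited limitation can be sketched as a word-count vector whose length equals the dictionary size. The dictionary contents, the sample text, and the function name `count_vector` are hypothetical and do not appear in the claims or in Bespalov; this is a minimal sketch of one reading of the claim language, not a definitive construction.

```python
from collections import Counter


def count_vector(document_tokens, dictionary):
    """Return an n-dimensional vector, where n = len(dictionary).

    Element j is the number of occurrences in the document of the
    j-th dictionary word (hypothetical helper, for illustration).
    """
    counts = Counter(document_tokens)
    # Counter returns 0 for dictionary words absent from the document.
    return [counts[word] for word in dictionary]


# Hypothetical dictionary and document.
dictionary = ["good", "bad", "movie", "plot"]
tokens = "good movie good plot".split()

vector = count_vector(tokens, dictionary)
print(vector)  # [2, 0, 1, 1]
```

Under this sketch, a vector such as `vector` would be the input of element (4), with the document's tag information as the training output.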
Bespalov teaches using a document embedding vector as input to a classifier (¶23) in a training process to minimize the classifier’s loss function (¶24). Here, the document ¶21) and the document embedding vector is generated as follows:
(a) generating an i-dimensional (“n-dimensional”) word embedding vector: given one or more strings of text of a document of interest provided for sentiment classification (¶16), a word embedding process is performed on the one or more strings of text of the document of interest by identifying each word by its index i in dictionary D, where the words in D were sorted by their document frequency in a training corpus (¶16). An embedding vector of dimension i may be assigned to each word (the i-th word) of the text as e_i = [e_i1, e_i2, …, e_im]^T (¶16). Here, index i (i.e., n) is a number of words in dictionary D (i.e., (2)), where each of the i elements of the word embedding vector is a document frequency (i.e., a number of occurrence of a different given word, i.e., (3)), since i is sorted by the document frequency of the word in the training corpus (the higher the i of the word, the more frequent the word is in the training corpus).
(b) generating phrase or n-gram embedding vectors by concatenating the i-dimensional word embedding vectors: the embedding vectors generated are used in a phrase or n-gram embedding process to generate n-gram vectors through concatenation of word embedding vectors in phrases / sentences: the concatenation vector [media_image1.png] is the concatenation of the word embeddings of the words in the i-th phrase: [media_image2.png] [media_image3.png] (¶18 and ¶19).
(c) generating the document embedding vector using the n-gram embedding vectors, i.e., concatenated i-dimensional word embedding vectors: document embedding vector may comprise a length b, and a k-th element that may be the mean value of the k-th ¶20). The document embedding vector so generated is used as input to a classifier (¶23).
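For illustration only, steps (a) through (c) as characterized above can be sketched as follows. The embedding values, the window size `n`, the helper names, and the reading of the k-th element as a plain element-wise mean over the n-gram vectors are all assumptions made for this sketch; none of them is taken from Bespalov, and anything beyond lookup, concatenation, and averaging is omitted.

```python
def embed_words(tokens, dictionary_index, embedding_table):
    """Step (a): look up each word's embedding by its dictionary index."""
    return [embedding_table[dictionary_index[token]] for token in tokens]


def ngram_vectors(word_vectors, n):
    """Step (b): concatenate the embeddings of n consecutive words."""
    phrases = []
    for start in range(len(word_vectors) - n + 1):
        concatenated = []
        for vec in word_vectors[start:start + n]:
            concatenated.extend(vec)
        phrases.append(concatenated)
    return phrases


def document_vector(phrase_vectors):
    """Step (c): element k is the mean of element k over all n-gram vectors
    (assumed reading; the source text is incomplete at this point)."""
    count = len(phrase_vectors)
    length = len(phrase_vectors[0])
    return [sum(p[k] for p in phrase_vectors) / count for k in range(length)]


# Hypothetical dictionary indices and 2-dimensional embeddings.
dictionary_index = {"good": 0, "movie": 1, "plot": 2}
embedding_table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

words = embed_words(["good", "movie", "plot"], dictionary_index, embedding_table)
phrases = ngram_vectors(words, n=2)
doc_vec = document_vector(phrases)
print(doc_vec)  # [0.5, 0.5, 0.5, 1.0]
```

In this sketch, `doc_vec` stands in for the document embedding vector that is supplied to the classifier (¶23).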
In particular, at ¶23, the document embedding process may be defined as:

[media_image4.png]

In summary, since the document embedding vector is generated by concatenating i-dimensional (i.e., n-dimensional) word embedding vectors (where words in the document of interest are assigned an embedding vector of dimension i according to each word’s index i in the dictionary D, with i indicating the word’s document frequency in the training corpus) and taking the average values of the concatenated i-dimensional word embedding vectors, the document embedding vector meets the claimed limitation “n-dimensional characteristic vector of a text document”.
/RICHARD Z ZHU/ Primary Examiner, Art Unit 2675
03/19/2021