DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-14 are rejected under 35 U.S.C. 103 as being unpatentable over Minglan, “Joint RNN Model for Argument Component Boundary Detection”, in view of Yuan, “Task-Specific Word Identification from Short Texts Using Convolutional Neural Network”.


Regarding claim 1 Minglan teaches a method of automatically generating a terminology definition knowledge base (KB) from text media (identifying word sequences that constitute argument components to predict whether sentences are argumentative or not, see abstract), the method comprising: 
receiving a word sequence to use in constructing the terminology definition KB (extracting arguments from natural language texts, see section 1 introduction); 
processing the word sequence based on the dense vector representations of the words and the label using a Conditional Random Field (CRF) definition extraction model to identify boundaries of the terminology definition in the word sequence (bidirectional neural network with a conditional random field layer above it, see section 1 introduction, joint model for sentence classification and sequence labeling in boundary detection, section 3.2-3.5 and figure 3).
However, Minglan does not teach mapping each word in the word sequence to a real value dense vector using dense vector representations; processing the word sequence based on the dense vector representations of the words using a Convolutional Neural Network (CNN) definition identification model to identify whether the word sequence includes a terminology definition and to label the word sequence with a label indicating whether a terminology definition exists within the word sequence; and adding the terminology definition to the terminology definition KB.
In the same field of endeavor Yuan teaches task-specific word identification, which chooses the words that best describe a short text (definition, see abstract). The foundation for applying deep neural networks to natural language processing is word embeddings, which map each word to a dense vector. Word embeddings are trained on a large text corpus, avoid the use of hand-designed features, and capture the hidden semantic and grammatical features of words, thus improving the performance of many natural language processing tasks, see section 2.2. Task-specific words are identified based on the convolutional neural network, see section 3. Task-specific words are identified by using a CNN model to learn the score vector based on the text; the words in the text having the highest values are highlighted as task-specific words, and the words with the corresponding higher values in the score vector are the key information for the CNN model to predict the label of a sentence, see sections 3.2-3.3. Yuan also highlights important sentences of a document, see section 4.
It would have been obvious to one of ordinary skill in the art to combine the Minglan invention with the teachings of Yuan for the benefit of improving the performance of the natural language processing task, see Yuan section 2.2.
Regarding claim 2 Minglan teaches the method of claim 1, wherein the dense vector representations are generated by a word representation training component that receives a text collection as input and uses a skip-gram recursive neural network (RNN) to process the text collection to generate the dense vector representations (RNN is a neural architecture designed for dealing with sequential data. RNN takes as input a vector $X = [x_t]_1^T$ and returns a feature vector sequence $\vec{h} = [h_t]_1^T$ at every time step, see section 3.1).
Regarding claim 3 Minglan teaches the method of claim 1, wherein the label is a binary label indicating whether the existence of a terminology definition within the word sequence is true or false (The network of joint model is trained to find the parameters that minimize the cross-entropy of the predicted and true argumentative status for sentence and the negative log probability of the sentence’s labels jointly, see section 3.5).
Regarding claim 4 Yuan teaches the method of claim 3, wherein the CNN definition identification model is generated by a CNN training component, the CNN training component using the dense vector representation and a plurality of training word sequences to train the CNN definition identification model to automatically identify whether a word sequence includes a terminology definition and to assign an appropriate binary label to the word sequence (The fundamental of applying the deep neural networks for natural language processing is word embeddings which map each word to a dense vector. These word embeddings are trained in an unsupervised way on a large text corpus, see section 2.2; Our approach first trains the CNN model and transfers the well-learned representation of sentence to identify the task-specific words and phrases. We first introduce how to construct score vectors to identify the task-specific words based on the convolutional neural network. Then, we extend our approach to identify the task-specific phrases, see section 3).
Regarding claim 5 Yuan teaches the method of claim 4, wherein the label assigned to the respective training word sequences is assigned by human annotators (method is trained on labeled text corpus, see section 2.1).
Regarding claim 6 Minglan in view of Yuan teaches the method of claim 1, wherein the CRF definition extraction model is generated by a CRF training component (the Bi-LSTM-CRF method, which augments a CRF layer after the output of Bi-LSTM, so as to explicitly model the dependencies between the output labels, see Minglan section 3.3), the CRF training component using the dense vector representations and the binary labels assigned by the CNN definition identification model (To identify the task-specific words, we adopt a CNN model to learn the score vector $s_c$ based on the text $X = [x_1, x_2, \ldots, x_n]$ and its label $c$, see Yuan section 3.3) to train the CRF definition extraction model to automatically detect the boundaries of the terminology definition with the word sequence (For sequence labeling in boundary detection, we reuse the pre-computed hidden states $h$ of the Bi-LSTM. At each time-step, we combine each hidden state $h_t$ with the relative location feature $s$ and the sentence’s predicted argumentative status $p$ created by the above mentioned classification operation: $h'_t = [h_t; s; p]$. Then the Hs will be the scores matrix P described in Sec. 3.3 which will be given to the CRF layer, see Minglan section 3.5).
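For illustration only, the feature combination quoted from Minglan section 3.5 (each hidden state joined with the relative location feature and the predicted argumentative status before being passed to the CRF layer) can be sketched as below. This is a minimal sketch and is not code from either reference or from the application: the function name `augment_hidden_states`, the use of plain Python lists, the treatment of the location feature and the status as single scalars, and the example values are all assumptions.

```python
from typing import List


def augment_hidden_states(hidden: List[List[float]],
                          location: float,
                          status: float) -> List[List[float]]:
    """Append the sentence-level features to every per-word hidden state.

    hidden   -- one feature vector per time step (assumed Bi-LSTM output)
    location -- relative location feature of the sentence (assumed scalar)
    status   -- predicted sentence-level status (assumed scalar)
    """
    # Each time step receives the same two sentence-level values.
    return [h_t + [location, status] for h_t in hidden]


# Hypothetical values: three time steps with 2-dimensional hidden states.
hidden = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
augmented = augment_hidden_states(hidden, location=0.25, status=1.0)
```

Under these assumptions every word position carries the sentence-level label alongside its own features, which is the sense in which the sequence labeler uses the label assigned at the sentence level.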
Regarding claim 7 Minglan teaches the method of claim 1, wherein the CRF definition extraction model is configured to tag words in the word sequence that are part of the terminology definition in the word sequence (The output of C is a matrix of scores, denoted by P. P is of size $T \times k$, where k is the number of distinct tags, and $P_{ij}$ corresponds to the score of the jth tag of ith word in a sentence, see section 3.3).
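For illustration only, tagging words from a score matrix P with one row per word and one column per tag can be sketched as below. This is a minimal sketch, not the implementation of Minglan or of the application: the tag-to-tag transition matrix `A`, the use of Viterbi decoding, the function name `viterbi_decode`, the two-tag scheme (0 = outside, 1 = part of the definition), and all numeric values are assumptions.

```python
from typing import List


def viterbi_decode(P: List[List[float]], A: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag sequence.

    P[t][j] -- score of tag j for word t (T rows, k columns)
    A[i][j] -- assumed score of moving from tag i to tag j
    """
    T, k = len(P), len(P[0])
    score = list(P[0])          # best score of any path ending in each tag
    backptr = []                # backptr[t-1][j] = best previous tag for j
    for t in range(1, T):
        new_score, ptr = [], []
        for j in range(k):
            best_i = max(range(k), key=lambda i: score[i] + A[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + A[best_i][j] + P[t][j])
        score = new_score
        backptr.append(ptr)
    # Trace the best path backwards from the best final tag.
    best = max(range(k), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(backptr):
        best = ptr[best]
        path.append(best)
    path.reverse()
    return path


# Hypothetical four-word sequence; tag 1 marks words inside the definition.
P = [[2.0, 0.5], [0.4, 1.5], [0.3, 1.8], [1.9, 0.2]]
no_transitions = [[0.0, 0.0], [0.0, 0.0]]
tags = viterbi_decode(P, no_transitions)

# With a penalty on switching tags, the decoder prefers a consistent span.
sticky = [[0.0, -5.0], [-5.0, 0.0]]
tags_sticky = viterbi_decode([[1.0, 0.9], [0.0, 0.5]], sticky)
```

The words whose decoded tag is 1 form the span that would be treated as the boundaries of the tagged component in this sketch.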
Regarding claim 8 Minglan teaches a system for automatically generating a terminology definition knowledge base (KB) from text media (identifying word sequences that constitute argument components to predict whether sentences are argumentative or not, see abstract), the system comprising: 
a definition extraction component configured to process word sequences using dense vector representations (extracting arguments from natural language texts, see section 1 introduction), 
and wherein the CRF definition extraction model processes the word sequences based on the dense vector representations of the words and the label assigned by the CNN definition identification model to identify boundaries of the terminology definition in the word sequence (bidirectional neural network with a conditional random field layer above it, see section 1 introduction, joint model for sentence classification and sequence labeling in boundary detection, section 3.2-3.5 and figure 3).
However, Minglan does not teach a CNN definition identification model and a CRF definition extraction model to extract terminology definitions found in the word sequences and to add the extracted terminology definitions to the terminology definition KB, wherein the dense vector representations are used to map the words in the word sequences to real value vectors, wherein the CNN definition identification model processes the word sequences based on the dense vector representations to identify whether a respective word sequence includes a terminology definition and to label the word sequence with a label indicating whether a terminology definition exists within the word sequence.
In the same field of endeavor Yuan teaches task-specific word identification, which chooses the words that best describe a short text (definition, see abstract). The foundation for applying deep neural networks to natural language processing is word embeddings, which map each word to a dense vector. Word embeddings are trained on a large text corpus, avoid the use of hand-designed features, and capture the hidden semantic and grammatical features of words, thus improving the performance of many natural language processing tasks, see section 2.2. Task-specific words are identified based on the convolutional neural network, see section 3. Task-specific words are identified by using a CNN model to learn the score vector based on the text; the words in the text having the highest values are highlighted as task-specific words, and the words with the corresponding higher values in the score vector are the key information for the CNN model to predict the label of a sentence, see sections 3.2-3.3. Yuan also highlights important sentences of a document, see section 4.
It would have been obvious to one of ordinary skill in the art to combine the Minglan invention with the teachings of Yuan for the benefit of improving the performance of the natural language processing task, see Yuan section 2.2.

Regarding claim 9 Minglan teaches the system of claim 8, wherein the dense vector representations are generated by a word representation training component that receives a text collection as input and uses a skip-gram recursive neural network (RNN) to process the text collection to generate the dense vector representations (RNN is a neural architecture designed for dealing with sequential data. RNN takes as input a vector $X = [x_t]_1^T$ and returns a feature vector sequence $\vec{h} = [h_t]_1^T$ at every time step, see section 3.1).
Regarding claim 10 Minglan teaches the system of claim 8, wherein the label is a binary label indicating whether the existence of a terminology definition within the word sequence is true or false (The network of joint model is trained to find the parameters that minimize the cross-entropy of the predicted and true argumentative status for sentence and the negative log probability of the sentence’s labels jointly, see section 3.5).
Regarding claim 11 Yuan teaches the system of claim 10, wherein the CNN definition identification model is generated by a CNN training component, the CNN training component using the dense vector representation and a plurality of training word sequences to train the CNN definition identification model to automatically identify whether a word sequence includes a terminology definition and to assign an appropriate binary label to the word sequence (The fundamental of applying the deep neural networks for natural language processing is word embeddings which map each word to a dense vector. These word embeddings are trained in an unsupervised way on a large text corpus, see section 2.2; Our approach first trains the CNN model and transfers the well-learned representation of sentence to identify the task-specific words and phrases. We first introduce how to construct score vectors to identify the task-specific words based on the convolutional neural network. Then, we extend our approach to identify the task-specific phrases, see section 3).
Regarding claim 12 Yuan teaches the system of claim 11, wherein the label assigned to the respective training word sequences is assigned by human annotators (method is trained on labeled text corpus, see section 2.1).
Regarding claim 13 Minglan in view of Yuan teaches the system of claim 8, wherein the CRF definition extraction model is generated by a CRF training component (the Bi-LSTM-CRF method, which augments a CRF layer after the output of Bi-LSTM, so as to explicitly model the dependencies between the output labels, see Minglan section 3.3), the CRF training component using the dense vector representations and the binary labels assigned by the CNN definition identification model (To identify the task-specific words, we adopt a CNN model to learn the score vector $s_c$ based on the text $X = [x_1, x_2, \ldots, x_n]$ and its label $c$, see Yuan section 3.3) to train the CRF definition extraction model to automatically detect the boundaries of the terminology definition with the word sequence (For sequence labeling in boundary detection, we reuse the pre-computed hidden states $h$ of the Bi-LSTM. At each time-step, we combine each hidden state $h_t$ with the relative location feature $s$ and the sentence’s predicted argumentative status $p$ created by the above mentioned classification operation: $h'_t = [h_t; s; p]$. Then the Hs will be the scores matrix P described in Sec. 3.3 which will be given to the CRF layer, see Minglan section 3.5).
Regarding claim 14 Minglan teaches the system of claim 8, wherein the CRF definition extraction model is configured to tag words in the word sequence that are part of the terminology definition in the word sequence (The output of C is a matrix of scores, denoted by P. P is of size $T \times k$, where k is the number of distinct tags, and $P_{ij}$ corresponds to the score of the jth tag of ith word in a sentence, see section 3.3).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The pertinent prior art is listed on form PTO-892.
Gao ‘329 teaches a model which encodes a natural language sentence into a dense vector representation and uses pre-trained convolutional neural networks to encode words into vector representations, see abstract.
Hashimoto WO ‘730 teaches a so-called "joint many-task neural network model" that performs increasingly complex NLP tasks at successive layers. Unlike traditional NLP pipeline systems, the joint many-task neural network model is trained end-to-end for POS tagging, chunking, and dependency parsing. It can further be trained end-to-end on semantic relatedness, textual entailment, and other higher level tasks. In a single end-to-end implementation, the model obtains state-of-the-art results on chunking, dependency parsing, semantic relatedness and textual entailment, see par. [0035].

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Ortiz-Sanchez whose telephone number is (571) 270-3711. The examiner can normally be reached Monday-Friday, 9AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached on 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL ORTIZ-SANCHEZ/
Primary Examiner, Art Unit 2656