DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. 16/117,043, filed on 08/30/2018.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08/30/2018 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
Claim Rejections - 35 USC § 112
Claims 3 and 5 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 3 recites the limitation "the neural network". There is insufficient antecedent basis for this limitation in the claim. A recurrent neural network is recited in claim 1 but not a neural network. For examining purposes, Examiner interprets the limitation as “the recurrent neural network”.
Claim 5 recites “the third distributed representation” in lines 8 and 16. There is insufficient antecedent basis for this limitation in the claim. Claim 5 line 7 recites “a third representation”. For examining purposes, Examiner interprets line 7 as “a third distributed representation”.
Claim 5 recites “the first distributed representation” in lines 9 and 15. It is unclear if this corresponds to “a first distributed representation” in claim 1 or claim 5. For examining purposes, the limitation is being interpreted as “a [the] first distributed representation”. 
Claim 5 recites “the second distributed representation” in lines 10, 13, and 15. It is unclear whether this limitation refers to “a second distributed representation” in claim 1 or claim 5. For examining purposes, the limitation is being interpreted as “a [the] second distributed representation”.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 4-7, and 9-11 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
CLAIM 1
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
(1) generating an input vector 
	Generating an input vector is a mental process of contemplating an input vector, which can reasonably be performed in one’s mind using pencil and paper, but for the recitation of the processor. Accordingly, the claim recites an abstract idea.
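Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional elements: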
a processor
obtained by loading a distributed representation of each of words or phrases included in subject data into a common dimension and a dimension corresponding to a data class representing a role in the subject data 
executing machine learning that uses the input vectors and that relates to features of the words or phrases included in the subject data
The processor is mere instructions to apply the exception using generic computer components under MPEP 2106.05(f). The loading is mere data gathering, an insignificant extra-solution activity under MPEP 2106.05(g). The executing machine learning is not a meaningful limitation under MPEP 2106.05(e). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 4 incorporates the rejection of claim 1.
Step 1: The claim is a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: Claim 4 incorporates the generating limitation of claim 1.  Additionally, claim 4 limits the generating limitation as follows:
(1) using transformation parameters corresponding respectively to surface layer, word class and unique representation that are common features between the words or phrases, 

These limitations are mathematical calculations. Accordingly, the claim recites an abstract idea. Whereas claim 2 focuses on the machine learning process, and claim 3 focuses on training the model, claim 4 merely focuses on generating data with mathematical computations. It is not focused on improving the machine learning process.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. It does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 5 incorporates the rejection of claim 4.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: Claim 5 incorporates the generating limitation of claim 4.  Additionally, claim 5 limits the generating limitation as follows:
(1) when the word or phrase corresponds to an entity whose relationship is to be learned, setting, among a first distributed representation of the common dimension, a second distributed representation of a data class corresponding to the entity, and a third representation of others excluding the entity, the third distributed representation at 0 and 
(2) generating the input vector obtained by connecting the first distributed representation, the second distributed representation and the third distributed representation and, 

(4) generating the input vector obtained by connecting the first distributed representation, the second distributed representation and the third distributed representation.
The setting limitations (1) and (3) are mental processes of determining, which can practically be performed in one’s mind with the aid of pencil and paper. The generating limitations (2) and (4) are mental processes of contemplating, which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. It does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above in Step 2A Prong 2. The claim is not patent eligible.
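
For illustration only, the setting and connecting operations quoted in limitations (1) and (2) of claim 5 can be sketched as a few lines of list arithmetic. This is a minimal sketch and not the applicant's implementation: the function name `connect_for_entity`, the vector sizes, and all numeric values are hypothetical, and only the entity case is shown.

```python
def connect_for_entity(first, second, third_size):
    """Sketch of limitations (1) and (2) as quoted in the action.

    first      -- first distributed representation (common dimension)
    second     -- second distributed representation (data class of the entity)
    third_size -- length of the third distributed representation (hypothetical)
    """
    # Limitation (1): the third distributed representation is set at 0.
    third = [0.0] * third_size
    # Limitation (2): the input vector is obtained by connecting
    # (here: concatenating) the three distributed representations.
    return first + second + third


# Hypothetical values, chosen only to make the layout visible.
first = [0.5, 0.6, 0.7, 0.8]
second = [1.0, 0.0]
x = connect_for_entity(first, second, third_size=2)
print(x)
```

Concatenation is used here as one plausible reading of "connecting"; the claim language itself does not fix the operation.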

CLAIM 6
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: Claim 6 incorporates the generating limitation of claim 1. Additionally, the claim recites the following limitations:
(1) acquiring a result of determination from the input vector 
(2) …learn the learning model and a dimension corresponding to the data class.
	These limitations are mental processes of determining. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional elements: 
using a learned model
inputting an input vector obtained by loading a distributed representation of each of words or phrases included in subject data into a common dimension and a dimension corresponding to a data class representing a role in the subject data and by executing learning that relates to features of the words or phrases included in the subject data
using a processor; 
loading a distributed representation of each of words or phrases included in determination subject data into a common dimension corresponding to the input vector…
Using a learned model is not a meaningful limitation under MPEP 2106.05(e) because the learned model is not used in a meaningful way, as opposed to claims 2 and 3 where the model is used in a meaningful way. The remainder of the claim limitations focus on inputting data and outputting a result rather than the model. The inputting and loading limitations are mere data gathering, an insignificant extra-solution activity under MPEP 2106.05(g). The processor is mere instructions to apply the exception using generic computer components under MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for reasons set forth in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 7
Step 1: The claim recites a product, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:

(1) outputting a value representing a relationship between specified data classes
	This limitation is a mental process of determining a value, which can be reasonably performed in one’s mind or with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional elements: 
A non-transitory computer-readable recording medium 
a program 
a computer
a learned model 
inputting an input vector 
loading a distributed representation of each of words or phrases contained in determination subject data into a common dimension and a dimension corresponding to a data class representing a role in the determination subject data; and
The recording medium, program, and computer amount to no more than generally linking the use of the judicial exception to a field of use (machine learning) under MPEP 2106.05(h). The learned model is not a meaningful limitation under MPEP 2106.05(e) because the learned model is not used in a meaningful way for the reasons set forth above in Claim 1 Step 2A Prong 2. The inputting an input vector and loading a distributed representation are mere data gathering, an insignificant extra-solution activity under MPEP 2106.05(g). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 9
Step 1: Claim 9 is being interpreted as a product by a process. A product is one of the four categories of eligible subject matter. The “obtained by” language in line 1 is interpreted as describing how the vector was generated rather than describing the vector itself.
Step 2A Prong 1: The claim recites the following limitations:
(1) generating an input vector 
Generating an input vector is a mental process of contemplating an input vector, which can reasonably be performed in one’s mind using pencil and paper, but for the recitation of the processor. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional elements: 
a processor
The processor is mere instructions to apply the exception using generic computer components under MPEP 2106.05(f). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The rest of the limitations do not have any patentable weight because there is no clear nexus between them and the input vector. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 10
Step 1: The claim recites a product, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
(1) generating an input vector 
Generating an input vector is a mental process of contemplating an input vector, which can reasonably be performed in one’s mind using pencil and paper, but for the recitation of the processor. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional elements: 
A non-transitory computer-readable recording medium 
program 
computer
loading a distributed representation of each of words or phrases included in subject data into a common dimension and a dimension corresponding to a data class representing a role in the subject data; and 
executing machine learning that uses the input vectors and that relates to features of the words or phrases included in the subject data.

The recording medium, program, and computer amount to no more than generally linking the use of the judicial exception to a field of use (machine learning) under MPEP 2106.05(h). The loading is mere data gathering, an insignificant extra-solution activity under MPEP 2106.05(g). The executing machine learning is not a meaningful limitation under MPEP 2106.05(e). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 11
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
(1) generate an input vector.
Generating an input vector is a mental process of contemplating an input vector, which can reasonably be performed in one’s mind using pencil and paper, but for the recitation of the processor. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional elements: 
A learning device comprising: 
a processor configured to: 
loading a distributed representation of each of words or phrases included in subject data into a common dimension and a dimension corresponding to a data class representing a role in the subject data; and 
execute machine learning that uses the input vectors and that relates to features of the words or phrases included in the subject data.
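The learning device amounts to no more than generally linking the judicial exception to the field of use (machine learning) under MPEP 2106.05(h). The processor is mere instructions to apply the exception using generic computer components under MPEP 2106.05(f). The loading is mere data gathering, an insignificant extra-solution activity under MPEP 2106.05(g). The executing machine learning is not a meaningful limitation under MPEP 2106.05(e). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.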
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above in Step 2A Prong 2. The claim is not patent eligible.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-4 and 6-11 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by “Joint Event Extraction via Recurrent Neural Networks” to Nguyen et al., hereinafter “Nguyen”. Figure 1 from Nguyen p. 303 is shown below.

[Image: Figure 1 of Nguyen, p. 303]

Regarding claim 1, Nguyen teaches: A learning method comprising: generating an input vector (Interpreted as any vector $x_n$ in the input sequence $X = x_1, x_2, \ldots, x_n$ (p. 302)) obtained by loading a distributed representation of each of words or phrases included in subject data into a common dimension (Distributed representation is interpreted as word embedding. Common dimension is interpreted as any of the first 4 dimensions of vector $x_n$, which is loaded with word embedding) and a dimension corresponding to a data class representing a role in the subject data, (entity type embeddings (a data class) are loaded into any of the next 2 dimensions and dependency tree relations (a data class) into any of the last 2 dimensions) using a processor (Experiments on p. 305 indicate using a processor); and 
executing machine learning (bidirectional RNNs in Fig. 1) that uses the input vectors and that relates to features of the words or phrases included in the subject data, using the processor. (Nguyen p. 302 § 3.1.1 teaches that the input vectors consist of word embedding, entity type embedding, and dependency tree relations. P. 305 teaches these features are optimized: “During training, besides the weight matrices, we also optimize the word and entity type embedding tables”)
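
The layout the Examiner reads onto the claimed input vector (four word-embedding dimensions, two entity-type dimensions, two dependency-relation dimensions) can be sketched as follows. This is only an illustrative sketch of that reading: the table contents, the relation names, and the helper `build_input_vector` are hypothetical and are not taken from Nguyen or from the application.

```python
# Hypothetical lookup tables; every value below is made up for illustration.
WORD_EMBEDDINGS = {
    "a": [0.1, 0.2, 0.3, 0.4],
    "man": [0.5, 0.6, 0.7, 0.8],
    "died": [0.9, 1.0, 1.1, 1.2],
}
ENTITY_TYPE_EMBEDDINGS = {
    "PER": [1.0, 0.0],  # hypothetical "person" entity type
    "O": [0.0, 0.0],    # hypothetical "no entity" type
}
# One binary slot per dependency relation type (two hypothetical types).
DEPENDENCY_RELATIONS = ["nsubj", "det"]


def build_input_vector(word, entity_type, relations):
    """Concatenate the three parts into one 8-dimensional vector."""
    word_part = WORD_EMBEDDINGS[word]                # dimensions 1-4
    entity_part = ENTITY_TYPE_EMBEDDINGS[entity_type]  # dimensions 5-6
    # Dimensions 7-8: 1 only if an edge of that relation touches the word.
    dependency_part = [1.0 if rel in relations else 0.0
                       for rel in DEPENDENCY_RELATIONS]
    return word_part + entity_part + dependency_part


x2 = build_input_vector("man", "PER", {"nsubj", "det"})
print(x2)
```

The first four entries play the role of the "common dimension" and the remaining entries the "dimension corresponding to a data class" under the interpretation stated above.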

Regarding claim 2, Nguyen teaches: The learning method according to claim 1, wherein the generating includes sequentially generating the input vectors of the words or phrases that appear in the subject data according to an order in which the words or the phrases appear (word embedding “a” is in x1, “man” in x2, etc) and sequentially inputting the input vectors into a recurrent neural network; and (According to p. 302 section RNN,  x1 is input into the RNN at iteration 1, x2 at iteration 2, etc.)
the executing includes, using each of state vectors that are output values from the recurrent neural network to which each of the input vectors is input, executing the machine learning relating to the features of the words or phrases included in the subject data. (This is shown by Fig. 1 and is a feature of a bidirectional RNN, specifically an LSTM, as taught at the top of p. 303).
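
The sequential-input reading relied on for claim 2 (x1 at iteration 1, x2 at iteration 2, and one state vector output per input) can be illustrated with a deliberately simplified recurrence. This sketch assumes a plain one-directional tanh cell with hypothetical weights; it is not the bidirectional network of Nguyen Fig. 1, and the names `rnn_step` and `run_rnn` are illustrative only.

```python
import math


def rnn_step(x, h_prev, w_in, w_rec):
    """One recurrence step: h[i] = tanh(sum_j w_in[i][j]*x[j] + sum_k w_rec[i][k]*h_prev[k])."""
    h = []
    for i in range(len(h_prev)):
        total = sum(w_in[i][j] * x[j] for j in range(len(x)))
        total += sum(w_rec[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h


def run_rnn(inputs, w_in, w_rec, hidden_size):
    """Feed the input vectors in order and collect one state vector per input."""
    h = [0.0] * hidden_size  # initial state
    states = []
    for x in inputs:  # x1 at iteration 1, x2 at iteration 2, ...
        h = rnn_step(x, h, w_in, w_rec)
        states.append(h)
    return states


# Hypothetical 2-dimensional input vectors in word order, and made-up weights.
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_in = [[0.5, -0.5], [0.25, 0.75]]
w_rec = [[0.1, 0.0], [0.0, 0.1]]
states = run_rnn(inputs, w_in, w_rec, hidden_size=2)
reversed_states = run_rnn(inputs[::-1], w_in, w_rec, hidden_size=2)
```

Because each state depends on the previous one, feeding the same vectors in a different order yields different state vectors, which is why the order of appearance matters in this reading.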

Regarding claim 3, Nguyen teaches: The learning method according to claim 1, wherein the generating includes generating a connected input vector using the input vector that is generated for each of the words or phrases included in the subject data and inputting the connected input vector into the neural network (Under the broadest reasonable interpretation, a connected input vector is interpreted as the input sequence $X = x_1, x_2, \ldots, x_n$ (p. 302). It is connected by the nature of sequentially inputting each input vector $x_n$ into the RNN as shown in Fig. 1), and the executing includes executing machine learning relating to the features of the words or phrases contained in the subject data. (P. 305: “During training, besides the weight matrices, we also optimize the word and entity type embedding tables”)

Regarding claim 4, Nguyen teaches: The learning method according to claim 1, wherein the generating includes, using transformation parameters corresponding respectively to surface layer (word embedding lookup table - Fig. 1 and p. 302, § Sentence Encoding), word class (entity type lookup table) and unique representation (ruleset for the dependency tree relations on p. 302: “The value at each dimension of this vector is set to 1 only if there exists one edge of the corresponding relation connected to $w_i$ in the dependency tree of W”) that are common features between the words or phrases, generating a distributed representation corresponding to the common dimension (word embedding spanning the upper 4 dimensions of $x_n$) and a distributed representation corresponding to the data class from the words or phrases (the entity type embedding spanning the fifth and sixth dimensions) to generate the input vector obtained by connecting the distributed representations (connecting the distributed representations is interpreted as concatenating the word embedding, entity type embedding, and dependency tree relations into a vector $x_n$).

Regarding claim 6, Nguyen teaches: A method of using a result of learning comprising: using a learned model (the trained model is used during testing – testing performance in Table 3) obtained by inputting an input vector (Interpreted as any vector $x_n$ in the input sequence $X = x_1, x_2, \ldots, x_n$ (p. 302)) obtained by loading a distributed representation of each of words or phrases included in subject data into a common dimension (Distributed representation is interpreted as word embedding. Common dimension is interpreted as any of the first 4 dimensions of vector $x_n$, which is loaded with word embedding) and a dimension corresponding to a data class representing a role in the subject data (entity type embeddings (a data class) are loaded into any of the next 2 dimensions and dependency tree relations (a data class) into any of the last 2 dimensions) and by executing learning (bidirectional RNNs in Fig. 1) that relates to features of the words or phrases included in the subject data (Nguyen p. 302 § 3.1.1 teaches that the input vectors consist of word embedding, entity type embedding, and dependency tree relations. P. 305 teaches these features are optimized: “During training, besides the weight matrices, we also optimize the word and entity type embedding tables”), using a processor (Experiments on p. 305 indicate using a processor); and
acquiring a result of determination (taught by Table 3, testing performance) from the input vector obtained by loading a distributed representation of each of words or phrases included in determination subject data (interpreted as testing subject data, as opposed to training subject data) into a common dimension corresponding to the input vector used to learn the learning model and a dimension corresponding to the data class, using the processor. (These limitations are interpreted as using Fig. 1 for testing the model after training in Nguyen, Experiments section)

Regarding claim 7, Nguyen teaches: A non-transitory computer-readable recording medium having stored therein a program that causes a computer to execute as a learned model comprising: (The limitation is implied by the Experiments section)
inputting an input vector (Interpreted as any vector $x_n$ in the input sequence $X = x_1, x_2, \ldots, x_n$ (p. 302)) obtained by loading a distributed representation of each of words or phrases contained in determination subject data into a common dimension (determination subject data is interpreted as testing subject data, using the model in Fig. 1) and a dimension corresponding to a data class representing a role in the determination subject data; and (entity type embeddings (a data class) are loaded into any of the next 2 dimensions and dependency tree relations (a data class) into any of the last 2 dimensions)
outputting a value representing a relationship between specified data classes. (prediction outputs, Fig. 1)

Regarding claim 8, Nguyen teaches: A non-transitory computer-readable recording medium having stored therein a data structure that includes an input vector (Interpreted as any vector x_n in the input sequence X = x_1, x_2, …, x_n (p. 302)) obtained by loading a distributed representation of each of words or phrases contained in subject data into a common dimension (Distributed representation is interpreted as word embedding. Common dimension is interpreted as any of the first 4 dimensions of vector x_n, which is loaded with word embedding) and a dimension corresponding to a data class representing a role in the subject data and a relationship label value representing a relationship between specified data classes (entity type embeddings (a data class) are loaded into any of the next 2 dimensions and dependency tree relations (a data class) into any of the last 2 dimensions) and that is used by a learning device to learn a relationship between the input vector and the relationship label value (A relationship is learned between the prediction outputs in Fig. 1 and the input vector).

Regarding claim 9, Nguyen teaches: A generating method comprising: generating an input vector (Interpreted as any vector x_n in the input sequence X = x_1, x_2, …, x_n (p. 302)) obtained by loading a distributed representation of each of words or phrases included in subject data into a common dimension (Distributed representation is interpreted as word embedding. Common dimension is interpreted as any of the first 4 dimensions of vector x_n, which is loaded with word embedding) and a dimension corresponding to a data class representing a role in the subject data (entity type embeddings (a data class) are loaded into any of the next 2 dimensions and dependency tree relations (a data class) into any of the last 2 dimensions) using a processor (Experiments on p. 305 indicate using a processor); and
generating data in which the input vector and a relationship label value representing a relationship between specified data classes are associated with each other, using the processor. (A relationship is learned between the prediction outputs in Fig. 1 and input vector).
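
For illustration only, the mapping applied to claim 9 above can be sketched as follows. The sketch assumes the layout the Examiner reads onto vector x_n (four dimensions of word embedding, two of entity type embedding, two of dependency tree relations). The function names, sample values, and the label value are hypothetical and are taken from neither the claims nor Nguyen.

```python
# Illustrative sketch only. The 4 + 2 + 2 layout follows the Examiner's reading
# of x_n; every name and number below is hypothetical.
WORD_DIMS, ENTITY_DIMS, DEP_DIMS = 4, 2, 2


def build_input_vector(word_emb, entity_emb, dep_feats):
    """Concatenate word, entity type, and dependency parts into one x_n."""
    parts = ((word_emb, WORD_DIMS), (entity_emb, ENTITY_DIMS), (dep_feats, DEP_DIMS))
    for values, expected in parts:
        if len(values) != expected:
            raise ValueError(f"expected {expected} values, got {len(values)}")
    return list(word_emb) + list(entity_emb) + list(dep_feats)


def associate(input_vectors, relationship_label):
    """Pair an input sequence with a relationship label value."""
    return {"inputs": input_vectors, "label": relationship_label}


# One word: made-up embedding values, entity type, and dependency flags.
x_n = build_input_vector([0.1, -0.3, 0.7, 0.2], [1.0, 0.0], [0, 1])
example = associate([x_n], relationship_label=1)
```

The first four positions of `x_n` play the role of the common dimension, and the remaining four play the role of the dimensions corresponding to a data class.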

Regarding claim 10, Nguyen teaches: A non-transitory computer-readable recording medium having stored therein a program that causes a computer to execute a process comprising: generating an input vector (Interpreted as any vector x_n in the input sequence X = x_1, x_2, …, x_n (p. 302)) obtained by loading a distributed representation of each of words or phrases included in subject data into a common dimension (Distributed representation is interpreted as word embedding. Common dimension is interpreted as any of the first 4 dimensions of vector x_n, which is loaded with word embedding) and a dimension corresponding to a data class representing a role in the subject data (entity type embedding (a data class) is loaded into any of the next 2 dimensions and dependency tree relations (a data class) into any of the last 2 dimensions); and
executing machine learning (bidirectional RNNs in Fig. 1) that uses the input vectors and that relates to features of the words or phrases included in the subject data. (Nguyen p. 302 § 3.1.1 teaches that the input vectors consist of word embedding, entity type embedding, and dependency tree relations. P. 305 teaches these features are optimized: “During training, besides the weight matrices, we also optimize the word and entity type embedding tables”)
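
For illustration only, a bidirectional recurrent pass over a sequence of input vectors can be sketched as follows. A plain tanh recurrence is assumed as a stand-in; it is not asserted to be the recurrent unit used in Nguyen's Fig. 1. All names, sizes, and weight values are hypothetical.

```python
import math


def run_rnn(xs, w_in, w_rec):
    """Run a plain tanh recurrence over xs and return every hidden state."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in xs:
        hidden = [
            math.tanh(
                sum(w * v for w, v in zip(w_in[j], x))
                + sum(w * h for w, h in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def run_bidirectional(xs, fwd, bwd):
    """Concatenate, per position, the forward and backward hidden states."""
    forward = run_rnn(xs, *fwd)
    backward = run_rnn(xs[::-1], *bwd)[::-1]  # re-align to the input order
    return [f + b for f, b in zip(forward, backward)]


# Three 8-dimensional input vectors and a hidden size of 2 (all values made up).
xs = [[0.1] * 8, [0.2] * 8, [-0.1] * 8]
weights = ([[0.05] * 8, [-0.05] * 8], [[0.1, 0.0], [0.0, 0.1]])
outputs = run_bidirectional(xs, weights, weights)
```

Each element of `outputs` combines context from both directions for one word, which is the property the rejection relies on when pointing to the bidirectional RNNs.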

Regarding claim 11, Nguyen teaches: A learning device comprising: a processor configured to (Experiments on p. 305 indicate using a computer processor): generate an input vector (Interpreted as any vector x_n in the input sequence X = x_1, x_2, …, x_n (p. 302)) obtained by loading a distributed representation of each of words or phrases included in subject data into a common dimension (Distributed representation is interpreted as word embedding. Common dimension is interpreted as any of the first 4 dimensions of vector x_n, which is loaded with word embedding) and a dimension corresponding to a data class representing a role in the subject data (entity type embeddings (a data class) are loaded into any of the next 2 dimensions and dependency tree relations (a data class) into any of the last 2 dimensions) using a processor (Experiments on p. 305 indicate using a processor); and
execute machine learning (bidirectional RNNs in Fig. 1) that uses the input vectors and that relates to features of the words or phrases included in the subject data. (Nguyen p. 302 § 3.1.1 teaches that the input vectors consist of word embedding, entity type embedding, and dependency tree relations.)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Nguyen in view of “Joint Event Extraction via Structured Prediction with Global Features” to Li et al., hereinafter “Li”.

Regarding claim 5, Nguyen teaches: The learning method according to claim 4, 
wherein the generating includes, among a first distributed representation of the common dimension (word embedding dimensions in x_n), a second distributed representation of a data class corresponding to the entity, and a third representation of others excluding the entity (second and third distributed representations are interpreted as the dependency tree relations dimensions in x_n. This excludes the entity, i.e., the word embedding),
and generating the input vector obtained by connecting the first distributed representation, the second distributed representation and the third distributed representation and, (all distributed representations are concatenated in input vector x_n)
Nguyen teaches a vector of binary elements representing the dependency features that are shown to be helpful in previous research (Li et al., 2013). However, Nguyen does not explicitly teach:
when the word or phrase corresponds to an entity whose relationship is to be learned, setting the third distributed representation at 0 and… when the word or phrase does not correspond to the entity, setting the second distributed representation at 0 
Li teaches: when the word or phrase corresponds to an entity whose relationship is to be learned, setting the third distributed representation at 0 and… when the word or phrase does not correspond to the entity, setting the second distributed representation at 0 (Li p. 77 teaches that the local features that Nguyen uses as dependency features include a local feature function for argument labeling including…

[Image media_image2.png: local feature function for argument labeling (Li p. 77)]

…versions, of which one is:

[Image media_image3.png: one version of the local feature function (Li p. 77)]

Argument k → entity, where k is an index that spans all the arguments/entities and e_k is the argument candidate
Trigger word i → word or phrase
Li language: “when the trigger corresponds to the argument”

When the trigger/word i corresponds to some argument/entity k, Li sets the q_1 that corresponds to that i and k to 1 (claim 5: a second representation) and the q_1 that corresponds to that i and some other k to 0 (claim 5: a third representation at 0).
When the trigger/word i does NOT correspond to a particular argument/entity k, Li sets the q_1 that corresponds to that i and k to 0 (claim 5: a second representation at 0).
(Note that the first, second, and third distributed representations in claim 5 are interpreted as being distinct from the first and second distributed representations of claims 1 and 4.)
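
For illustration only, the zero-setting limitation of claim 5 discussed above can be sketched as follows. The sketch follows the wording of the claim rather than any code in Li or Nguyen; the helper name, vector sizes, and sample values are hypothetical.

```python
def connect_representations(first, second, third, is_entity):
    """Connect three representations, zeroing one according to entity status.

    If the word or phrase is an entity whose relationship is to be learned,
    the third representation is set to 0; otherwise the second is set to 0.
    """
    if is_entity:
        third = [0.0] * len(third)
    else:
        second = [0.0] * len(second)
    return list(first) + list(second) + list(third)


# Made-up values: 4 common dimensions, then 2 + 2 class-related dimensions.
entity_vec = connect_representations([0.1, 0.2, 0.3, 0.4], [1.0, 1.0], [1.0, 1.0], True)
other_vec = connect_representations([0.1, 0.2, 0.3, 0.4], [1.0, 1.0], [1.0, 1.0], False)
```

Under these assumptions, `entity_vec` ends in two zeros and `other_vec` has zeros in positions five and six, while the first four positions are unchanged in both cases.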

It would have been obvious to use features of Li, specifically the q_1 feature, as dependency features in the binary dependency-feature vector of Nguyen. A motivation is that these dependency features are shown to be helpful (Nguyen, p. 302).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: “Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network” to Wang et al.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher Jablon whose telephone number is (571)270-7648.  The examiner can normally be reached on Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ASHER H. JABLON/Examiner, Art Unit 2122
/ERIC NILSSON/Primary Examiner, Art Unit 2122