DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Introduction
This office action is in response to communications filed on 08/31/2020. Claims 7-26 are pending and have been examined.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/28/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections
Claim 7 is objected to because of the following informalities:
There are two periods at the end of the claim.
Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 7, 9-12, 14, 16-19, 21 and 23-25 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
The Independent claim(s) 7, 14, 21 recite(s), “comprising: receiving an input text, wherein the input text includes a plurality of phrases;”, “extracting, based on a dependency analysis, a triplet, wherein the triplet comprises a first phrase, a second phrase, and a dependency label, and wherein the dependency label defines a first dependency relationship between the first phrase and the second phrase;”, “generating, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model”, “and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label;”, “generating, by the encoder, a first pair, wherein the first pair is for training a phrase generation model, and wherein the first pair comprises the first vector and the third vector;”, “generating, by the encoder a second pair, wherein the second pair is for training the phrase generation model, wherein the second pair comprises the second vector and a fourth vector, and wherein the fourth vector is based on a reverse label of the first dependency relationship;”, “and generating, based on the set of vectors for training, the relationship estimation model;”, “using the first pair and the second pair, training the phrase generation model;”, “receiving a third phrase and a connection expression;”, “determining, based on the third phrase and the connection expression using the trained phrase generation model, a fourth phrase, wherein the fourth phrase relates to the third phrase based on the connection expression;”, ACTIVE. 124906014.013U.S. Patent Application Serial No. 
Filed herewithPreliminary Amendment dated August 31, 2020“determining, based on the trained relationship estimation model, a relation score, wherein the relation score defines a degree of a second dependency relationship between the received third phrase and the determined fourth phrase;”, “and providing the fourth phrase and the determined relation score in response to the received third phrase and the connection expression”. 
Claim 14 also recites the limitations of “a processor; and a memory storing computer-executable instructions that when executed by the processor cause the system to”, and Claim 21 recites “A computer-readable non-transitory recording medium storing computer-executable instructions that when executed by a processor cause a computer system to”. Claim 7 recites “A computer-implemented method for processing phrases, the method comprising”.
The limitations “comprising: receiving an input text, wherein the input text includes a plurality of phrases;”, “extracting, based on a dependency analysis, a triplet, wherein the triplet comprises a first phrase, a second phrase, and a dependency label, and wherein the dependency label defines a first dependency relationship between the first phrase and the second phrase;”, “generating, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model”, “and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label;”, “generating, by the encoder, a first pair, wherein the first pair is for training a phrase generation model, and wherein the first pair comprises the first vector and the third vector;”, “generating, by the encoder a second pair, wherein the second pair is for training the phrase generation model, wherein the second pair comprises the second vector and a fourth vector, and wherein the fourth vector is based on a reverse label of the first dependency relationship;”, “and generating, based on the set of vectors for training, the relationship estimation model;”, “using the first pair and the second pair, training the phrase generation model;”, “receiving a third phrase and a connection expression;”, “determining, based on the third phrase and the connection expression using the trained phrase generation model, a fourth phrase, wherein the fourth phrase relates to the third phrase based on the connection expression;”, “determining, based on the trained relationship estimation model, a relation score, wherein the relation score defines a degree of a second dependency relationship between the received third phrase and the determined fourth phrase;”, “and providing the fourth phrase and the determined relation score in response to the received third phrase and the connection expression”, as drafted, cover a mental process, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application. Claim 7 recites “A computer-implemented method for processing phrases, the method comprising”. Claim 14 recites “a processor; and a memory storing computer-executable instructions that when executed by the processor cause the system to”. Claim 21 recites “A computer-readable non-transitory recording medium storing computer-executable instructions that when executed by a processor cause a computer system to”. These limitations are directed to using a computer to perform the method and do not impose any meaningful limits on practicing the abstract idea.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The addition of the generic computer components recited above with regard to Claims 7, 14 and 21 does not amount to more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Claims 7, 14 and 21 do not recite any additional limitations. The claims, as drafted, are not patent eligible.

Claims 9, 16 and 23 recite the additional limitations of “wherein the phrase generation model includes a decoder, wherein the decoder generates a phrase having a relationship represented by the dependency label of the triplet.” These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application, as Claims 9, 16 and 23 comprise no additional limitations.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception, as Claims 9, 16 and 23 do not recite any additional limitations. The claims as drafted, are not patent eligible.

Claims 10, 17 and 24 recite the additional limitations of “concurrently performing, for training: generating, based on the set of vectors, the relationship estimation model; and generating, based the first pair and the second pair, the phrase generation model” (Claims 17 and 24 replace “set of vectors” with “triple”). These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application, as Claims 10, 17 and 24 comprise no additional limitations.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception, as Claims 10, 17 and 24 do not recite any additional limitations. The claims as drafted, are not patent eligible.

Claims 11 and 18 recite the additional limitations of “wherein the generating the relationship estimation model and the generating the phrase generation model relates to minimizing a loss function, wherein the loss function is based on a combination of a first loss function for the relationship estimation model and a second loss function for the phrase generation model.” These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application, as Claims 11 and 18 comprise no additional limitations.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception, as Claims 11 and 18 do not recite any additional limitations. The claims as drafted, are not patent eligible.

Claims 12, 19 and 25 recite the additional limitations of “generating a negative example of the triplet for training; and generating, based on the negative example of the triplet for training, the relationship estimation model”. These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application, as Claims 12, 19 and 25 comprise no additional limitations.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception, as Claims 12, 19 and 25 do not recite any additional limitations. The claims as drafted, are not patent eligible.

Claims 8, 15, 22, 13, 20 and 26 were not rejected under 35 USC 101 for the following reasons:
Claims 8, 15 and 22 recite “an attention-based encode-decoder model” and “training” the specific neural network model, which makes the computer implementation necessary and means the limitations cannot be performed mentally or by hand with pen and paper. Claims 13, 20 and 26 recite a “common Neural Network” or “shared Neural Network” (Claim 26). The simple addition of a shared Neural Network in a Generator and Discriminator would not be enough to establish patentable subject matter if it were found to be explicitly taught in the prior art as conventional in the art at the effective time of filing. However, no prior art was found that explicitly taught this was conventional and well known in the art at the effective time of filing.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 7, 10 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (KBGAN: Adversarial Learning for Knowledge Graph Embeddings), hereinafter Cai, and further in view of Li et al. (Commonsense Knowledge Base Completion), hereinafter Li.

Regarding Claim 7:

A computer-implemented method for processing phrases, the method comprising: receiving an input text, wherein the input text includes a plurality of phrases(Pg 7, 4.1.1 Datasets, Ln 1-2, We use three common knowledge base completion datasets. Pg 4, Fig 1, phrases such as New Orleans, Barack Obama); 
extracting, based on a dependency analysis, a triplet, wherein the triplet comprises a first phrase, a second phrase, and a dependency label, and wherein the dependency label defines a first dependency relationship between the first phrase and the second phrase(Pg 4, Fig 1, LocatedIn(NewOrleans, Louisiana). Louisiana is a single word, but this second part can be more than one word depending on what phrase is used there; for example, Fig 1 also shows the possibility of Barack Obama. Because either object can contain more than one word, they are considered a phrase, and this will not be clarified repeatedly in following citations); 
generating, by the encoder, a first pair, wherein the first pair is for training a phrase generation model, and wherein the first pair comprises the first vector and the third vector(Pg 4, Fig 1, LocatedIn(NewOrleans,?) is input into the generator. Pg 4, 3.1 Weakness of Uniform Negative Sampling, Para 2, Ln 5-6, First, we remove the tail entity, leaving LocatedIn(NewOrleans,?). Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. The first vector and the third vector being “vectors” is addressed with the combination of Li below); 
generating, by the encoder a second pair, wherein the second pair is for training the phrase generation model, wherein the second pair comprises the second vector and a fourth vector, and wherein the fourth vector is based on a reverse label of the first dependency relationship(Pg 7, 4.1.1 Datasets, Ln 1-3, We use three common knowledge base completion datasets for our experiment: FB15k-237, WN18 and WN18RR. Ln 9-10, WN18RR is a subset of WN18 … which removes reversing relations. Therefore WN18 has examples with reversed relations. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation is generated to be put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). The second vector and the fourth vector being “vectors” is addressed with the combination of Li below); 
and generating, based on the set of vectors for training, the relationship estimation model(Pg 4, Fig 1, shows LocatedIn(NewOrleans, Louisiana) being input into the discriminator. Pg 5, Para 2, Ln 7-10, the objective of the discriminator is to minimize the marginal loss between the positive triple and the generated negative triple. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. The set of vectors being in “vector” form is addressed below with the combination of Li); 
using the first pair and the second pair, training the phrase generation model(Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). Dataset WN18 contains examples with inverse relations as shown in previous citation to Pg 7, 4.1.1 Datasets, Ln 1-3 & 9-10); 
receiving a third phrase and a connection expression(Pg 7, 4.1.3 Implementation Details, Para 2, Ln 1-3, In the adversarial training stage…... Pg 7, 4.1.1, Ln 1-2, We use three common knowledge base completion datasets for our experiment, Pg 4, Fig 1, LocatedIn(NewOrleans,?). This is a new set of phrase and relation for the adversarial part of training, opposed to the previous which were used for pre training the generator. Data sets show there is more than the single previous example. Figure shows that the generator receives a phrase and relation); 
determining, based on the third phrase and the connection expression using the trained phrase generation model, a fourth phrase, wherein the fourth phrase relates to the third phrase based on the connection expression(Pg 4, Fig 1, LocatedIn(NewOrleans,Florida) is shown to be output from the generator from the LocatedIn(NewOrleans,?) input. Which means Florida is the determined fourth phrase); 
determining, based on the trained relationship estimation model, a relation score, wherein the relation score defines a degree of a second dependency relationship between the received third phrase and the determined fourth phrase(Pg 4, Fig 1, Description, Ln 2-5, The discriminator receives the generated negative triple as well as the ground truth triple and calculates their scores….. D minimizes the marginal loss between positive and negative triples by gradient descent. Pg 3, 3.1 Types of Training Objectives, Ln 7-9, The estimated likelihood of a triple to be true depends only on its score given by the score function); 
and providing the fourth phrase and the determined relation score in response to the received third phrase and the connection expression(Pg 4, Fig 1, shows fourth phrase Florida being provided from the generator based off of the LocatedIn(NewOrleans,?) input. Fig 1 also shows score being provided by the discriminator based off of the LocatedIn(NewOrleans, Florida) input).
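As an illustration only, the two training pairs recited in the limitations above can be sketched in plain Python. The names `Triplet`, `reverse_label`, and `make_training_pairs`, the `_reverse` suffix, and the choice of the opposite phrase as the training target are all assumptions made for this sketch; they are taken from neither the claims nor Cai, and the sketch operates on strings rather than vectors.

```python
from typing import List, NamedTuple, Tuple


class Triplet(NamedTuple):
    first_phrase: str
    second_phrase: str
    label: str


def reverse_label(label: str) -> str:
    # Hypothetical naming convention for the reverse of a relation label.
    return label + "_reverse"


def make_training_pairs(t: Triplet) -> List[Tuple[Tuple[str, str], str]]:
    # Each item is ((input phrase, label), target phrase).
    # Assumption: the phrase left out of the pair is the generation target.
    first_pair = (t.first_phrase, t.label)
    second_pair = (t.second_phrase, reverse_label(t.label))
    return [(first_pair, t.second_phrase), (second_pair, t.first_phrase)]


# Example based on the triple shown in Cai, Fig 1.
pairs = make_training_pairs(Triplet("NewOrleans", "Louisiana", "LocatedIn"))
```

Here the first pair corresponds to the LocatedIn(NewOrleans,?) query cited from Cai, Fig 1, and the second pair is the same query posed in the reverse direction.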
Cai does not explicitly teach generating, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model, and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label;
generating ….first vector…third vector; generating …second vector…fourth vector; and ….set of vectors (these are in reference to the limitations above where it is mentioned that Cai does not explicitly state that the inputs to the models are in vector form).
In the same field of knowledge bases, Li teaches generating, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model, and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label; generating ….first vector…third vector; generating…second vector…fourth vector; and ….set of vectors. (Pg 1447, 3.2 Deep Neural Network Models, Para 2, Ln 4-8, For the DNN AVG model, we obtain the term vectors v1 and v2. We then concatenate v1, v2, and a relation vector vR to form the input of the DNN, denoted vin. The vectors are shown to be for training, as they are used for training in the other limitations of Claim 7).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify Cai, with the creation of the term vectors and relation vectors of Li, as Neural Networks require a vectorized input(Pg 1447, 3.2 Deep Neural Network Models, Para 2, Ln 7, to form the input of the DNN).
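A minimal sketch of the vector construction described in the cited passage of Li, assuming that each term vector is the average of its word embeddings (as the "AVG" in the model name suggests) and using toy two-dimensional embeddings. The function names and the embedding values are hypothetical and are not taken from Li.

```python
from typing import Dict, List


def average_vectors(words: List[str], emb: Dict[str, List[float]]) -> List[float]:
    # Component-wise mean of the word embeddings of one term.
    dim = len(emb[words[0]])
    return [sum(emb[w][i] for w in words) / len(words) for i in range(dim)]


def build_dnn_input(term1: List[str], term2: List[str], relation: str,
                    word_emb: Dict[str, List[float]],
                    rel_emb: Dict[str, List[float]]) -> List[float]:
    v1 = average_vectors(term1, word_emb)   # first term vector
    v2 = average_vectors(term2, word_emb)   # second term vector
    v_r = rel_emb[relation]                 # relation vector
    return v1 + v2 + v_r                    # list concatenation, i.e. v_in


# Toy embeddings, invented for illustration.
word_emb = {"new": [1.0, 0.0], "orleans": [0.0, 1.0], "louisiana": [2.0, 2.0]}
rel_emb = {"LocatedIn": [0.5, -0.5]}
v_in = build_dnn_input(["new", "orleans"], ["louisiana"], "LocatedIn",
                       word_emb, rel_emb)
```

With these toy values `v_in` has six components: two for each term vector and two for the relation vector.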

Regarding Claim 10:
The combination of Cai and Li teaches the computer-implemented method of claim 7, and Cai teaches the method further comprising: concurrently performing, for training: generating, based on the set of vectors, the relationship estimation model; and generating, based the first pair and the second pair, the phrase generation model(Pg 5, Para 2, Ln 10-12, In an adversarial training setting, the generator and the discriminator are alternatively trained towards their respective objectives. Pg 4, Fig 1, shows generator and estimator model inputs).

Regarding Claim 12:
The combination of Cai and Li teaches the computer-implemented method of claim 7, and Cai teaches the method further comprising: generating a negative example of the triplet for training; and generating, based on the negative example of the triplet for training, the relationship estimation model(Pg 1, abstract, Ln 14-18, we use one knowledge graph embedding model as a negative sample generator to assist the training of our desired model, which acts as the discriminator. Pg 4, Fig 1, shows it is a negative example of the triplet being output from the generator).
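The negative-example generation cited from Cai (a probability distribution over candidate negative triples, from which one triple is sampled) can be sketched roughly as follows. This is a simplified illustration: `sample_negative`, `toy_score`, the candidate list, and the score values are hypothetical, and the scoring function stands in for whatever embedding model serves as the generator.

```python
import math
import random
from typing import Callable, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negative(head: str, relation: str, true_tail: str,
                    candidates: List[str],
                    score_fn: Callable[[str, str, str], float],
                    rng: random.Random) -> Tuple[str, str, str]:
    # Remove the true tail, score the remaining candidates, sample one.
    pool = [c for c in candidates if c != true_tail]
    probs = softmax([score_fn(head, relation, c) for c in pool])
    tail = rng.choices(pool, weights=probs, k=1)[0]
    return (head, relation, tail)


def toy_score(head: str, relation: str, tail: str) -> float:
    # Invented scores; a higher score makes a candidate more likely.
    return {"Florida": 2.0, "BarackObama": -1.0}.get(tail, 0.0)


negative = sample_negative("NewOrleans", "LocatedIn", "Louisiana",
                           ["Louisiana", "Florida", "BarackObama"],
                           toy_score, random.Random(0))
```

Under this sketch a plausible but false tail such as Florida is sampled more often than an implausible one, which matches the example in Cai, Fig 1.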

Claims 8 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai and Li as applied to claim 7 above, and further in view of Vaswani et al. (Attention Is All You Need), hereinafter Vaswani.

Regarding Claim 8:
The combination of Cai and Li teaches the computer-implemented method of claim 7, and Cai teaches wherein the phrase generation model is based on a model(Pg 4, 3.3 Generative Adversarial…, Para 2, Ln 4-5, We use softmax probabilistic models as the generator),
and the method further comprising: training, based on the second pair for training, the phrase generation model(Pg 7, 4.1.1 Datasets, Ln 1-3, We use three common knowledge base completion datasets for our experiment: FB15k-237, WN18 and WN18RR. Ln 9-10, WN18RR is a subset of WN18 … which removes reversing relations. Therefore WN18 has examples with reversed relations. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation is generated to be put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). The second vector and the fourth vector being “vectors” is addressed with the combination of Li in Claim 7);
and updating, based at least on a loss function, one or more parameters of the phrase generation model(Cai, Pg 6, Para 6, Ln 5-7, one can pre-train the generator by minimizing the loss function defined in Equation (1)).
The combination of Cai and Li does not specifically teach an attention-based encode-decoder model:
updating, based at least on a loss function relating to encoder-decoder.
In the same field of Natural Language Processing, Vaswani teaches an attention-based encode-decoder model(Pg 2, 3 Model Architecture, Ln 6-7, The Transformer follows this overall architecture using stacked self-attention and point-wise, fully connected layers for both the encoder and decoder):
updating, based at least on a loss function relating to encoder-decoder(Cai, Pg 6, Para 6, Ln 5-7, one can pre-train the generator by minimizing the loss function defined in Equation (1). If the model were switched for an attention-based encode-decoder model, the loss function would relate to the encoder-decoder).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai and Li, with the attention-based encoder-decoder structure of Vaswani, as it can be trained quickly(Pg 10, Conclusion, Para 2, Ln 1-2).

Regarding Claim 9:
The combination of Cai and Li teaches the computer-implemented method of claim 7, and Cai teaches wherein the phrase generation model generates a phrase having a relationship represented by the dependency label of the triplet(Pg 4, Fig 1, shows Florida being generated to match with New Orleans and the relation LocatedIn. Fig 1 description states, Ln 1-2, The generator (G) calculates a probability distribution over a set of candidate negative triples, then sample one triples from the distribution as the output).
The combination of Cai and Li does not teach the phrase generation model includes a decoder, wherein the decoder generates a phrase.
In the same field of Natural Language Processing, Vaswani teaches the phrase generation model includes a decoder, wherein the decoder generates a phrase(Pg 2, 3 Model Architecture, Ln 6-7, The Transformer follows this overall architecture using stacked self-attention and point-wise, fully connected layers for both the encoder and decoder. Ln 3-4, Given z, the decoder then generates an output sequence).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai and Li, with the attention-based encoder-decoder structure of Vaswani, as it can be trained quickly(Pg 10, Conclusion, Para 2, Ln 1-2).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai and Li as applied to claim 7 above, and further in view of Bazrafkan et al. (US 20180211164 A1), hereinafter Bazrafkan.

Regarding Claim 11:
The combination of Cai and Li teaches the computer-implemented method of claim 7, and Cai teaches wherein the generating the relationship estimation model and the generating the phrase generation model relates to minimizing a loss function(Pg 5, Para 3, Ln 6-8, The objective of the discriminator can be formulated as minimizing the following marginal loss function), 
The combination of Cai and Li does not explicitly teach wherein the loss function is based on a combination of a first loss function for the relationship estimation model and a second loss function for the phrase generation model.
In the same field of Generative Adversarial Networks, Bazrafkan teaches wherein the loss function is based on a combination of a first loss function for the relationship estimation model and a second loss function for the phrase generation model(Para [0046], Ln 2, generative network. Para [0049], Ln 1, The loss function for network A (L.sub.A). Para [0050], Ln 2-3, fed into network B as inputs. Para [0052], Ln 1-2, In this example, the loss function L.sub.B of network B. Para [0053], Ln 1-3, The total loss of the whole model is αL.sub.A+βL.sub.B, which is a linear combination of the loss functions of the two networks).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai and Li, with the two combined loss functions for the generator and discriminator of Bazrafkan, because it allows both models to be trained together, improving them(Para [0037], all lines).
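The two loss terms discussed for Claim 11 can be illustrated with a short sketch. The hinge form of the margin loss below is a common formulation assumed here for illustration, with lower scores meaning more plausible triples; the exact equation in Cai is not reproduced in this action. The linear combination follows the αL.sub.A+βL.sub.B expression cited from Bazrafkan. Function names, default weights, and numeric values are hypothetical.

```python
def margin_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    # Assumed hinge form: penalize the model unless the positive triple
    # scores lower than the negative triple by at least `margin`.
    return max(0.0, pos_score - neg_score + margin)


def total_loss(loss_a: float, loss_b: float,
               alpha: float = 1.0, beta: float = 1.0) -> float:
    # Linear combination of the two networks' losses.
    return alpha * loss_a + beta * loss_b


# Invented numbers: one discriminator loss and one generator loss.
l_d = margin_loss(pos_score=1.0, neg_score=0.5, margin=1.0)
combined = total_loss(l_d, 2.0, alpha=0.5, beta=0.25)
```

Minimizing `combined` updates both models against a single objective, which is the point of the proposed combination with Bazrafkan.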

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai and Li as applied to claim 7 above, and further in view of Roblek et al. (US 20180190249 A1), hereinafter Roblek.

Regarding Claim 13:
The combination of Cai and Li teaches the computer-implemented method of claim 7, but does not explicitly teach wherein the relationship estimation model and the phrase generation model use a common neural network.
In the same field of Generative Adversarial Networks, Roblek teaches wherein the relationship estimation model and the phrase generation model use a common neural network(Para [0041], Ln 3-7, The machine-learned audio generation model can have been trained to receive ….or receive …. embeddings from the deep neural network. Para [0050], Ln 1-3, the audio generation model can include one or more machine-learned generative adversarial networks. Since the Generative Adversarial Network uses the embeddings from the deep neural network, both generator and discriminator use a shared neural network that provides them with input embeddings).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai and Li, with the shared embedding model of Roblek, as it provides representations that are subtler and more general(Para [0038], Ln 3-6).

Claim 14, 17, 19, 21, 24 and is/are rejected under 35 U.S.C. 103 as being unpatentable over Cai, and further in view of Kruengkrai et al (US 20210286948 A1), and further in view of Li.

Regarding Claim 14:
Cai teaches a system for processing phrases, the system comprises: receive an input text, wherein the input text includes a plurality of phrases (Pg 7, 4.1.1 Datasets, Ln 1-2, We use three common knowledge base completion datasets. Pg 4, Fig 1, phrases such as New Orleans, Barack Obama); 
extract, based on a dependency analysis, a triplet, wherein the triplet comprises a first phrase, a second phrase, and a dependency label, and wherein the dependency label defines a first dependency relationship between the first phrase and the second phrase (Pg 4, Fig 1, LocatedIn(NewOrleans, Louisiana). Louisiana is a single word, but this second part can be more than one word depending on what phrase is used there; for example, Fig 1 also shows the possibility of Barack Obama. Because either object can contain more than one word, they are considered a phrase, and this will not be clarified repeatedly in following citations); 
generate, by the encoder, a first pair, wherein the first pair is for training a phrase generation model, and wherein the first pair comprises the first vector and the third vector (Pg 4, Fig 1, LocatedIn(NewOrleans,?) is input into the generator. Pg 4, 3.1 Weakness of Uniform Negative Sampling, Para 2, Ln 5-6, First, we remove the tail entity, leaving LocatedIn(NewOrleans,?). Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. The first vector and the third vector being “vectors” is addressed with the combination of Li below); 
generate, by the encoder a second pair, wherein the second pair is for training the phrase generation model, wherein the second pair comprises the second vector and a fourth vector, and wherein the fourth vector is based on a reverse label of the first dependency relationship (Pg 7, 4.1.1 Datasets, Ln 1-3, We use three common knowledge base completion datasets for our experiment: FB15k-237, WN18 and WN18RR. Ln 9-10, WN18RR is a subset of WN18 … which removes reversing relations. Therefore WN18 has examples with reversed relations. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation is generated to be put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). The second vector and the fourth vector being “vectors” is addressed with the combination of Li below); 
and generate, based on the set of vectors for training, the relationship estimation model (Pg 4, Fig 1, shows LocatedIn(NewOrleans, Louisiana) being input into the discriminator. Pg 5, Para 2, Ln 7-10, the objective of the discriminator is to minimize the marginal loss between the positive triple and the generated negative triple. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. The set of vectors being in “vector” form is addressed below with the combination of Li); 
using the first pair and the second pair, train the phrase generation model (Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). Dataset WN18 contains examples with inverse relations as shown in previous citation to Pg 7, 4.1.1 Datasets, Ln 1-3 & 9-10); 
receive a third phrase and a connection expression (Pg 7, 4.1.3 Implementation Details, Para 2, Ln 1-3, In the adversarial training stage…... Pg 7, 4.1.1, Ln 1-2, We use three common knowledge base completion datasets for our experiment, Pg 4, Fig 1, LocatedIn(NewOrleans,?). This is a new set of phrase and relation for the adversarial part of training, opposed to the previous which were used for pre training the generator. Data sets show there is more than the single previous example. Figure shows that the generator receives a phrase and relation); 
determine, based on the third phrase and the connection expression using the trained phrase generation model, a fourth phrase, wherein the fourth phrase relates to the third phrase based on the connection expression (Pg 4, Fig 1, LocatedIn(NewOrleans,Florida) is shown to be output from the generator from the LocatedIn(NewOrleans,?) input. Which means Florida is the determined fourth phrase); 
determine, based on the trained relationship estimation model, a relation score, wherein the relation score defines a degree of a second dependency relationship between the received third phrase and the determined fourth phrase (Pg 4, Fig 1, Description, Ln 2-5, The discriminator receives the generated negative triple as well as the ground truth triple and calculates their scores….. D minimizes the marginal loss between positive and negative triples by gradient descent. Pg 3, 3.1 Types of Training Objectives, Ln 7-9, The estimated likelihood of a triple to be true depends only on its score given by the score function);
and provide the fourth phrase and the determined relation score in response to the received third phrase and the connection expression (Pg 4, Fig 1, shows fourth phrase Florida being provided from the generator based off of the LocatedIn(NewOrleans,?) input. Fig 1 also shows score being provided by the discriminator based off of the LocatedIn(NewOrleans, Florida) input).
Cai does not explicitly teach a processor; and a memory storing computer-executable instructions that when executed by the processor cause the system to.
In the same field of Natural Language Processing, Kruengkrai teaches a processor; and a memory storing computer-executable instructions that when executed by the processor cause the system to (Para [0110], Ln 1-7, CPU…memory..instructions).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify Cai with the generic computer components of Kruengkrai, as they provide the system an environment to operate in (Para [0108], All lines).
The combination of Cai and Kruengkrai does not explicitly teach generate, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model, and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label; 
generating ….first vector…third vector; generating …second vector…fourth vector; and ….set of vectors (these are in reference to the limitations above where it is mentioned that Cai does not explicitly state that the inputs to the models are in vector form).
In the same field of knowledge bases, Li teaches generate, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model, and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label; generating ….first vector…third vector; generating …second vector…fourth vector; and ….set of vectors (Pg 1447, 3.2 Deep Neural Network Models, Para 2, Ln 4-8, For the DNN AVG model, we obtain the term vectors v1 and v2 …... We then concatenate v1, v2, and a relation vector vR to form the input of the DNN, denoted vin. The vectors are shown to be for training, as they are used for training in the other limitation in Claim 14).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai and Kruengkrai with the creation of the term vectors and relation vectors of Li, as Neural Networks require a vectorized input (Pg 1447, 3.2 Deep Neural Network Models, Para 2, Ln 7, to form the input of the DNN).
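For illustration only, the vector construction cited from Li (term vectors obtained by averaging word vectors, then concatenated with a relation vector to form the DNN input) can be sketched as follows. This is a minimal sketch and not Li's implementation: the function names, the two-dimensional embeddings, and the lookup tables are hypothetical values chosen for the example.

```python
def average(vectors):
    """Element-wise mean of equal-length vectors (one vector per word)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_dnn_input(term1_word_vecs, term2_word_vecs, relation_vec):
    """Concatenate the two averaged term vectors with the relation vector."""
    v1 = average(term1_word_vecs)  # term vector for the first phrase
    v2 = average(term2_word_vecs)  # term vector for the second phrase
    return v1 + v2 + list(relation_vec)  # list concatenation gives v_in


# Hypothetical 2-dimensional embeddings, assumed purely for illustration.
word_vecs = {
    "new": [1.0, 0.0],
    "orleans": [0.0, 1.0],
    "louisiana": [0.5, 0.5],
}
relation_vecs = {"LocatedIn": [0.25, 0.75]}

v_in = build_dnn_input(
    [word_vecs["new"], word_vecs["orleans"]],
    [word_vecs["louisiana"]],
    relation_vecs["LocatedIn"],
)
```

Under these assumed values, a multi-word phrase such as "New Orleans" is reduced to one vector of the same dimension as a single-word phrase, which is why the concatenated input has a fixed length regardless of phrase length.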

Regarding Claim 17:
The combination of Cai, Kruengkrai and Li teaches the system of claim 14, and Cai teaches the computer-executable instructions when executed further causing the system to: concurrently perform, for training: generating, based on the triple, the relationship estimation model; and generating, based on the first pair and the second pair, the phrase generation model (Pg 5, Para 2, Ln 10-12, In an adversarial training setting, the generator and the discriminator are alternatively trained towards their respective objectives. Pg 4, Fig 1, shows generator and estimator model inputs).

Regarding Claim 19:
The combination of Cai, Kruengkrai and Li teaches the system of claim 14, and Cai teaches the computer-executable instructions when executed further causing the system to: generate a negative example of the triple for training; and generate, based on the negative example of the triple for training, the relationship estimation model (Pg 1, abstract, Ln 14-18, we use one knowledge graph embedding model as a negative sample generator to assist the training of our desired model, which acts as the discriminator. Pg 4, Fig 1, shows it is a negative example of the triplet being output from the generator).
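For illustration only, the marginal loss between a positive triple and a generated negative triple, as cited from Cai, can be sketched as a hinge on the score difference. This is a sketch under stated assumptions and not Cai's code: it assumes a distance-style score function in which a lower score means a more plausible triple, and the function name and the margin value of 1.0 are hypothetical.

```python
def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge-style marginal loss for one (positive, negative) triple pair.

    Assumption: lower score = more plausible triple. The loss is zero once
    the negative triple scores worse than the positive one by at least
    `margin`, and grows linearly otherwise.
    """
    return max(0.0, pos_score - neg_score + margin)


# Negative triple already separated from the positive one by the margin.
well_separated = margin_loss(pos_score=0.2, neg_score=2.0)

# Negative triple scores better than the positive one: loss is incurred.
poorly_separated = margin_loss(pos_score=1.5, neg_score=1.0)
```

Under these assumptions, a harder negative example (one whose score is close to or better than the positive triple's score) yields a non-zero loss and therefore a training signal for the discriminator.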

	Regarding Claim 21:
Cai teaches receive an input text, wherein the input text includes a plurality of phrases (Pg 7, 4.1.1 Datasets, Ln 1-2, We use three common knowledge base completion datasets. Pg 4, Fig 1, phrases such as New Orleans, Barack Obama); 
extract, based on a dependency analysis, a triplet, wherein the triplet comprises a first phrase, a second phrase, and a dependency label, and wherein the dependency label defines a first dependency relationship between the first phrase and the second phrase (Pg 4, Fig 1, LocatedIn(NewOrleans, Louisiana). Louisiana is a single word, but this second part can be more than one word depending on what phrase is used there, for example Fig 1 also shows the possibility of Barack Obama. Because either object can contain more than one word, they are considered a phrase, and this will not be clarified repeatedly in following citations); 
generate, by the encoder, a first pair, wherein the first pair is for training a phrase generation model, and wherein the first pair comprises the first vector and the third vector (Pg 4, Fig 1, LocatedIn(NewOrleans,?) is input into the generator. Pg 4, 3.1 Weakness of Uniform Negative Sampling, Para 2, Ln 5-6, First, we remove the tail entity, leaving LocatedIn(NewOrleans,?). Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. The first vector and the third vector being “vectors” is addressed with the combination of Li below); 
generate, by the encoder a second pair, wherein the second pair is for training the phrase generation model, wherein the second pair comprises the second vector and a fourth vector, and wherein the fourth vector is based on a reverse label of the first dependency relationship (Pg 7, 4.1.1 Datasets, Ln 1-3, We use three common knowledge base completion datasets for our experiment: FB15k-237, WN18 and WN18RR. Ln 9-10, WN18RR is a subset of WN18 … which removes reversing relations. Therefore WN18 has examples with reversed relations. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation is generated to be put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). The second vector and the fourth vector being “vectors” is addressed with the combination of Li below); 
and generate, based on the set of vectors for training, the relationship estimation model (Pg 4, Fig 1, shows LocatedIn(NewOrleans, Louisiana) being input into the discriminator. Pg 5, Para 2, Ln 7-10, the objective of the discriminator is to minimize the marginal loss between the positive triple and the generated negative triple. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. The set of vectors being in “vector” form is addressed below with the combination of Li); 
using the first pair and the second pair, train the phrase generation model (Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). Dataset WN18 contains examples with inverse relations as shown in previous citation to Pg 7, 4.1.1 Datasets, Ln 1-3 & 9-10); 
receive a third phrase and a connection expression (Pg 7, 4.1.3 Implementation Details, Para 2, Ln 1-3, In the adversarial training stage…... Pg 7, 4.1.1, Ln 1-2, We use three common knowledge base completion datasets for our experiment, Pg 4, Fig 1, LocatedIn(NewOrleans,?). This is a new set of phrase and relation for the adversarial part of training, opposed to the previous which were used for pre training the generator. Data sets show there is more than the single previous example. Figure shows that the generator receives a phrase and relation); 
determine, based on the third phrase and the connection expression using the trained phrase generation model, a fourth phrase, wherein the fourth phrase relates to the third phrase based on the connection expression (Pg 4, Fig 1, LocatedIn(NewOrleans,Florida) is shown to be output from the generator from the LocatedIn(NewOrleans,?) input. Which means Florida is the determined fourth phrase); 
determine, based on the trained relationship estimation model, a relation score, wherein the relation score defines a degree of a second dependency relationship between the received third phrase and the determined fourth phrase (Pg 4, Fig 1, Description, Ln 2-5, The discriminator receives the generated negative triple as well as the ground truth triple and calculates their scores….. D minimizes the marginal loss between positive and negative triples by gradient descent. Pg 3, 3.1 Types of Training Objectives, Ln 7-9, The estimated likelihood of a triple to be true depends only on its score given by the score function); 
and provide the fourth phrase and the determined relation score in response to the received third phrase and the connection expression (Pg 4, Fig 1, shows fourth phrase Florida being provided from the generator based off of the LocatedIn(NewOrleans,?) input. Fig 1 also shows score being provided by the discriminator based off of the LocatedIn(NewOrleans, Florida) input).
Cai does not teach a computer-readable non-transitory recording medium storing computer-executable instructions that when executed by a processor cause a computer system to.
In the same field of Natural Language Processing, Kruengkrai teaches a computer-readable non-transitory recording medium storing computer-executable instructions that when executed by a processor cause a computer system to (Para [0110], Ln 1-7, CPU…RAM...instructions).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify Cai with the generic computer components of Kruengkrai, as they provide the system an environment to operate in (Para [0108], All lines).
The combination of Cai and Kruengkrai does not teach generate, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model, and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label; 
generating ….first vector…third vector; generating …second vector…fourth vector; and ….set of vectors (these are in reference to the limitations above where it is mentioned that Cai does not explicitly state that the inputs to the models are in vector form).
In the same field of knowledge bases, Li teaches generate, by an encoder, a set of vectors, wherein the set of vectors is for training a relationship estimation model, and wherein the set of vectors comprises: a first vector based on the first phrase, a second vector based on the second phrase, and a third vector based on the dependency label; generating ….first vector…third vector; generating …second vector…fourth vector; and ….set of vectors (Pg 1447, 3.2 Deep Neural Network Models, Para 2, Ln 4-8, For the DNN AVG model, we obtain the term vectors v1 and v2 by averaging word vectors in the respective terms. We then concatenate v1, v2, and a relation vector vR to form the input of the DNN, denoted vin. The vectors are shown to be for training, as they are used for training in the other limitation in Claim 21).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai and Kruengkrai with the creation of the term vectors and relation vectors of Li, as Neural Networks require a vectorized input (Pg 1447, 3.2 Deep Neural Network Models, Para 2, Ln 7, to form the input of the DNN).

Regarding Claim 24:
	Claim 24 contains similar limitations as Claim 17, and is therefore rejected for the same reasons.

Regarding Claim 25:
	Claim 25 contains similar limitations as Claim 19, and is therefore rejected for the same reasons.

Claims 15 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai, Kruengkrai and Li as applied to claim 14 above, and further in view of Vaswani.

Regarding Claim 15:
The combination of Cai, Kruengkrai and Li teaches the system of claim 14, and Cai teaches wherein the phrase generation model is based on a model(Pg 4, 3.3 Generative Adversarial…, Para 2, Ln 4-5, We use softmax probabilistic models as the generator), 
and the computer-executable instructions when executed further causing the system to: train, based on the second pair for training, the phrase generation model(Pg 7, 4.1.1 Datasets, Ln 1-3, We use three common knowledge base completion datasets for our experiment: FB15k-237, WN18 and WN18RR. Ln 9-10, WN18RR is a subset of WN18 … which removes reversing relations. Therefore WN18 has examples with reversed relations. Pg 7, 4.1.3 Implementation Details, Ln 1-2, In the pre-training stage, we train every model to convergence for 1000 epochs. A pair of phrase and relation is generated to be put into the generation model as shown in Pg 4, Fig 1, LocatedIn(NewOrleans,?). The second vector and the fourth vector being “vectors” is addressed with the combination of Li in Claim 7); 
and update, based at least on a loss function relating to encoder-decoder, one or more parameters of the phrase generation model(Cai, Pg 6, Para 6, Ln 5-7, one can pre-train the generator by minimizing the loss function defined in Equation (1)).
The combination of Cai, Kruengkrai and Li does not teach an attention-based encoder-decoder model;
update, based at least on a loss function relating to encoder-decoder.
In the same field of Natural Language Processing, Vaswani teaches an attention-based encoder-decoder model (Pg 2, 3 Model Architecture, Ln 6-7, The Transformer follows this overall architecture using stacked self-attention and point-wise, fully connected layers for both the encoder and decoder);
update, based at least on a loss function relating to encoder-decoder (Cai, Pg 6, Para 6, Ln 5-7, one can pre-train the generator by minimizing the loss function defined in Equation (1). If the model were switched for an attention-based encoder-decoder model, the loss function would relate to the encoder-decoder).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai, Kruengkrai and Li with the attention-based encoder-decoder structure of Vaswani, as it can be trained quickly (Pg 10, Conclusion, Para 2, Ln 1-2).

Regarding Claim 16:
The combination of Cai, Kruengkrai and Li teaches the system of claim 14, and Cai teaches wherein the phrase generation model generates a phrase having a relationship represented by the dependency label of the triple(Pg 4, Fig 1, shows Florida being generated to match with New Orleans and the relation LocatedIn. Fig 1 description states, Ln 1-2, The generator (G) calculates a probability distribution over a set of candidate negative triples, then sample one triples from the distribution as the output).
The combination of Cai, Kruengkrai and Li does not teach the phrase generation model includes a decoder, wherein the decoder generates a phrase.
In the same field of Natural Language Processing, Vaswani teaches the phrase generation model includes a decoder, wherein the decoder generates a phrase (Pg 2, 3 Model Architecture, Ln 6-7, The Transformer follows this overall architecture using stacked self-attention and point-wise, fully connected layers for both the encoder and decoder. Ln 3-4, Given z, the decoder then generates an output sequence).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai, Kruengkrai and Li with the attention-based encoder-decoder structure of Vaswani, as it can be trained quickly (Pg 10, Conclusion, Para 2, Ln 1-2).

Claims 22 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai, Kruengkrai and Li as applied to claim 21 above, and further in view of Vaswani.

Regarding Claim 22:
Claim 22 contains similar limitations as Claim 15 and is therefore rejected for the same reasons.

Regarding Claim 23:
Claim 23 contains similar limitations as Claim 16 and is therefore rejected for the same reasons.

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai, Kruengkrai and Li as applied to claim 14 above, and further in view of Bazrafkan.

Regarding Claim 18:
The combination of Cai, Kruengkrai and Li teaches the system of claim 14, and Cai teaches wherein the generating the relationship estimation model and the generating the phrase generation model relates to minimizing a loss function (Pg 5, Para 3, Ln 6-8, The objective of the discriminator can be formulated as minimizing the following marginal loss function).
The combination of Cai, Kruengkrai and Li does not explicitly teach wherein the loss function is based on a combination of a first loss function for the relationship estimation model and a second loss function for the phrase generation model.
In the same field of Generative Adversarial Networks, Bazrafkan teaches wherein the loss function is based on a combination of a first loss function for the relationship estimation model and a second loss function for the phrase generation model (Para [0046], Ln 2, generative network. Para [0049], Ln 1, The loss function for network A (L_A). Para [0050], Ln 2-3, fed into network B as inputs. Para [0052], Ln 1-2, In this example, the loss function L_B of network B. Para [0053], Ln 1-3, The total loss of the whole model is αL_A+βL_B, which is a linear combination of the loss functions of the two networks).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai, Kruengkrai and Li with the two combined loss functions for the generator and discriminator of Bazrafkan, because it allows both models to be trained together, improving them (Para [0037], all lines).
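For illustration only, the total loss cited from Bazrafkan, αL_A+βL_B, is a weighted sum of the two networks' individual losses. The sketch below assumes scalar loss values; the function name and the example weights are hypothetical and are not taken from Bazrafkan.

```python
def total_loss(loss_a, loss_b, alpha=1.0, beta=1.0):
    """Linear combination alpha * L_A + beta * L_B of two network losses."""
    return alpha * loss_a + beta * loss_b


# Hypothetical loss values and weights, assumed purely for illustration.
combined = total_loss(loss_a=2.0, loss_b=4.0, alpha=0.5, beta=0.25)
```

Because a single scalar is minimized, one optimization step moves the parameters of both networks, and the weights α and β set how much each network's loss contributes.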

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai, Kruengkrai and Li as applied to claim 14 above, and further in view of Roblek.

Regarding Claim 20:
The combination of Cai, Kruengkrai and Li teaches the system of claim 14, but does not explicitly teach wherein the relationship estimation model and the phrase generation model use a common neural network.
In the same field of Generative Adversarial Networks, Roblek teaches wherein the relationship estimation model and the phrase generation model use a common neural network (Para [0041], Ln 3-7, The machine-learned audio generation model can have been trained to receive ….or receive …. embeddings from the deep neural network. Para [0050], Ln 1-3, the audio generation model can include one or more machine-learned generative adversarial networks. Since the Generative Adversarial Network uses the embeddings from the deep neural network, both generator and discriminator use a shared neural network that provides them with input embeddings).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai, Kruengkrai and Li with the shared embedding model of Roblek, as it provides representations that are subtler and more general (Para [0038], Ln 3-6).

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Cai, Kruengkrai and Li as applied to claim 21 above, and further in view of Roblek.

Regarding Claim 26:
The combination of Cai, Kruengkrai and Li teaches the computer-readable non-transitory recording medium of claim 21, but does not teach wherein the relationship estimation model and the phrase generation model share a common neural network.
In the same field of Generative Adversarial Networks, Roblek teaches wherein the relationship estimation model and the phrase generation model share a common neural network (Para [0041], Ln 3-7, The machine-learned audio generation model can have been trained to receive ….or receive …. embeddings from the deep neural network. Para [0050], Ln 1-3, the audio generation model can include one or more machine-learned generative adversarial networks. Since the Generative Adversarial Network uses the embeddings from the deep neural network, both generator and discriminator use a shared neural network that provides them with input embeddings).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Cai, Kruengkrai and Li with the shared embedding model of Roblek, as it provides representations that are subtler and more general (Para [0038], Ln 3-6).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wu et al. (US 20200159997 A1).
Phrases relationships and Generative adversarial networks.
Gao et al. (US 20190114348 A1)
Generative adversarial networks.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER G MARLOW whose telephone number is (571)272-4536. The examiner can normally be reached Monday - Thursday 10:00 am - 8:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ALEXANDER G MARLOW/Assistant Examiner, Art Unit 2658

/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658