DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claim 21 is rejected under 35 U.S.C. 112(a) because the specification, while being enabling for “a dependency tree having sub-trees labeled with ground-truth functional tags … wherein during the training, an error of prediction is used to adjust parameters of the Predictor, the Encoder, and the Scheduler,” does not reasonably provide enablement for “a dependency tree having zero sub-trees labeled with ground-truth functional tags … wherein during the training, an error of prediction is used to adjust parameters of the Predictor, the Encoder, and the Scheduler.” The specification does not enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention commensurate in scope with these claims. Applicant discloses “[t]his created label 418 is compared to a ground truth label 420 for the target 406, and any error is sent to both the predictor 416 and the scheduler 408 so that they may be adjusted to minimize future errors.” See Specification, para. [0052]. Without the ground truth labels, the claimed error of prediction would not be enabled.

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claim 11 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The term “sufficient” in claim 11 is a relative term which renders the claim indefinite. The term “sufficient” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. As such, the limitation “determining that the feature vector is sufficient to invoke the predictor” is indefinite.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 7, 9, 14, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595).
Regarding claim 1, Marcheggiani teaches/suggests: A computer-implemented method, comprising: 
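receiving, […] (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a word representation component that from a word wi in a sentence w build a word representation xi;” §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed.” [The input sentence used for training meets the claimed training data instance; the arguments of the input sentence meet the claimed target instance.]); 
generating, […] (Marcheggiani §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed.”); 
sending the input sequence […] (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a Bidirectional LSTM (BiLSTM) encoder which takes as input the word representation xi and provide a dynamic representation of the word and its context in a sentence.”); 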
mapping, by the encoder, the input sequence to a feature vector (Marcheggiani §2.3 ¶2: “Specifically, when identifying arguments of a given predicate, we add a predicate-specific feature to the representation of each word in the sentence by concatenating a binary flag to the word representation of Section 2.1.”); 
sending the feature vector 
mapping, by the predictor, the feature vector to a class vector to create a label for the target instance (Marcheggiani §2.4.1 ¶1: “The basic role classifier takes the hidden state of the top-layer bidirectional LSTM corresponding to the considered word at position i and uses it to estimate the probability of the role r.”).
Marcheggiani does not teach/suggest a scheduler. Nor does Marcheggiani teach/suggest:
sending the feature vector from the encoder to the scheduler; 
sending the feature vector from the scheduler to a predictor; 
Senba, however, teaches/suggests a scheduler (Senba [0032]: “Then, the control unit 26 sends transmitter fault information to the scheduling unit 27 to indicate the number of transmitting units capable of transmitting;” [0034]: “By referring to the transmitter fault information and feedback information, the scheduling unit 27 determines the codeword length, the modulation mode, the number of streams for each codeword, and the transmitting units to be used to transmit the streams.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the word representation component of Marcheggiani to include the scheduling as taught/suggested by Senba to schedule the BiLSTM representations (the claimed feature vector). As such, Marcheggiani as modified by Senba teaches/suggests:
sending the feature vector from the encoder to the scheduler (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a classifier which takes as an input the BiLSTM representation of the candidate argument and the BiLSTM representation of the predicate to predict the role associated to the candidate argument;” Senba [0032]: “Then, the control unit 26 sends transmitter fault information to the scheduling unit 27 to indicate the number of transmitting units capable of transmitting.”); 
sending the feature vector from the scheduler to a predictor (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a classifier which takes as an input the BiLSTM representation of the candidate argument and the BiLSTM representation of the predicate to predict the role associated to the candidate argument;” Senba [0034]: “By referring to the transmitter fault information and feedback information, the scheduling unit 27 determines the codeword length, the modulation mode, the number of streams for each codeword, and the transmitting units to be used to transmit the streams.”); 

Regarding claim 7, Marcheggiani as modified by Senba teaches/suggests: The computer-implemented method of Claim 1, wherein the feature vector includes one or more features of the target instance within the training data instance (Marcheggiani §2.3 ¶2: “Specifically, when identifying arguments of a given predicate, we add a predicate-specific feature to the representation of each word in the sentence by concatenating a binary flag to the word representation of Section 2.1.”).

Regarding claim 9, Marcheggiani as modified by Senba teaches/suggests: The computer-implemented method of Claim 1, wherein the encoder is selected from a group consisting of a recurrent neural network (RNN), a hidden Markov model, and long-short term memory (Marcheggiani §2.2 ¶¶1-2: “One of the most effective ways to model sequences are recurrent neural networks … Formally, we can define an LSTM as a function LSTMΘ(x1:i) that takes as input the sequence x1:i and returns a hidden state.”).

Regarding claim 14, Marcheggiani as modified by Senba teaches/suggests: The computer-implemented method of Claim 1, wherein the input sequence is generated based on a predetermined strategy (Marcheggiani §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed.”).

Claim 18 recites limitations similar in scope to those of claim 1 and is rejected for the same reason(s). Marcheggiani as modified by Senba further teaches/suggests a computer program product (Marcheggiani Fig. 2: “Predicting an argument and its label with an LSTM encoder.”).

Claim 20 recites limitations similar in scope to those of claim 1 and is rejected for the same reason(s). Marcheggiani as modified by Senba further teaches/suggests a processor; and logic integrated with the processor (Marcheggiani Fig. 2: “Predicting an argument and its label with an LSTM encoder.”).

Claims 2-6 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) as applied to claims 1 and 18 above, and further in view of Chatterjee et al. (US 2020/0312297).
Regarding claim 2, Marcheggiani further discloses a dependency graph (Fig. 1). Marcheggiani is silent regarding: The computer-implemented method of Claim 1, wherein: 
the training data instance includes a sentence represented as a dependency tree, 
the target instance includes a portion of the dependency tree, 
the input sequence includes a restructured training data instance and target instance that are understandable by the encoder, 
the feature vector includes one or more features of the target instance within the training data instance, 
the label includes a predicted nonfunctional label that predicts whether the target instance is a functional tag.
Chatterjee, however, teaches/suggests a dependency tree (Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence.”). Before the effective filing date of the claimed invention, the substitution of one known element (the dependency tree of Chatterjee) for another (the dependency graph of Marcheggiani) would have been obvious to one of ordinary skill in the art because such substitution would have yielded a predictable result, namely, to represent the input sentence. As such, Marcheggiani as modified by Senba and Chatterjee teaches/suggests:
the training data instance includes a sentence represented as a dependency tree (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a word representation component that from a word wi in a sentence w build a word representation xi;” Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence.”), 
the target instance includes a portion of the dependency tree (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a word representation component that from a word wi in a sentence w build a word representation xi;” Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence.”), 
the input sequence includes a restructured training data instance and target instance that are understandable by the encoder (Marcheggiani §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed.”), 
the feature vector includes one or more features of the target instance within the training data instance (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a Bidirectional LSTM (BiLSTM) encoder which takes as input the word representation xi and provide a dynamic representation of the word and its context in a sentence.”), 
the label includes a predicted nonfunctional label that predicts whether the target instance is a functional tag (Marcheggiani §2.4 ¶1: “This can be accomplished by labeling each word in a sentence with a role, including the special ‘NULL’ role to indicate that it is not an argument of the predicate.”).

Marcheggiani is also silent regarding:
the class vector includes a correlation between predetermined features and predetermined labels, 
Chatterjee further teaches/suggests:
the class vector includes a correlation between predetermined features and predetermined labels (Chatterjee [0142]: “Thereafter, at step 404, the annotated factoid tags associated with each word along with the associated word in a plurality of natural language sentences are iteratively inputted to the neural network in order to train the neural network.” [The claimed predetermined features are inherent and/or implicit features of the annotated tags.]), 
Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the input sentence of Marcheggiani as modified by Senba and Chatterjee to be annotated as taught/suggested by Chatterjee for the training.

Regarding claim 3, Marcheggiani as modified by Senba and Chatterjee teaches/suggests: The computer-implemented method of Claim 1, wherein the training data instance includes a sentence represented as a dependency tree (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a word representation component that from a word wi in a sentence w build a word representation xi;” Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence.”). The same rationale to combine as set forth in the rejection of claim 2 above is incorporated herein.

Regarding claim 4, Marcheggiani as modified by Senba and Chatterjee teaches/suggests: The computer-implemented method of Claim 1, wherein the training data instance includes a plurality of identified and labeled dependencies (Chatterjee [0142]: “Thereafter, at step 404, the annotated factoid tags associated with each word along with the associated word in a plurality of natural language sentences are iteratively inputted to the neural network in order to train the neural network.”). The same rationale to combine as set forth in the rejection of claim 2 above is incorporated herein.

Regarding claim 5, Marcheggiani as modified by Senba and Chatterjee teaches/suggests: The computer-implemented method of Claim 1, wherein the target instance includes a portion of a dependency tree (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a word representation component that from a word wi in a sentence w build a word representation xi;” Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence.”). The same rationale to combine as set forth in the rejection of claim 2 above is incorporated herein.

Regarding claim 6, Marcheggiani as modified by Senba and Chatterjee teaches/suggests: The computer-implemented method of Claim 1, wherein the target instance includes a subtree within a dependency tree (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a word representation component that from a word wi in a sentence w build a word representation xi;” Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence.”). The same rationale to combine as set forth in the rejection of claim 2 above is incorporated herein.

Claim 19 recites limitations similar in scope to those of claim 3 and is rejected for the same reason(s).

Claims 8 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) as applied to claim 1 above, and further in view of He et al. (Syntax for Semantic Role Labeling, To Be, Or Not To Be, 2018, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Long Papers), pp. 2061-2071).
Regarding claim 8, Marcheggiani as modified by Senba does not teach/suggest: The computer-implemented method of Claim 1, wherein the scheduler is selected from a group consisting of a logistic regression module, a support vector machine (SVM), and a fully connected neural network. He, however, teaches/suggests a fully connected neural network (He §2.2 ¶2: “At character level, we exploit convolutional neural network (CNN) with bidirectional LSTM (BiLSTM) to learn character embedding.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the scheduler of Marcheggiani as modified by Senba to be a fully connected neural network as taught/suggested by He for machine learning.

Regarding claim 10, Marcheggiani as modified by Senba does not teach/suggest: The computer-implemented method of Claim 1, wherein the predictor is selected from a group consisting of a logistic regression module, a support vector machine (SVM), and a fully connected neural network. He, however, teaches/suggests a fully connected neural network (He §2.3 ¶2: “To get the final predicted semantic roles, we exploit a multi-layer perceptron (MLP) with highway connections on the top of BiLSTM networks ... The MLP network consists of 10 layers with highway connections and we employ ReLU activations for the hidden layers.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the classifier (the claimed predictor) of Marcheggiani as modified by Senba to be a fully connected neural network as taught/suggested by He for machine learning.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) as applied to claim 1 above, and further in view of Friedrichs et al. (US 2013/0139261).
Regarding claim 11, Marcheggiani as modified by Senba does not teach/suggest: The computer-implemented method of Claim 1, wherein the feature vector is sent to the predictor in response to determining that the feature vector is sufficient to invoke the predictor. Friedrichs, however, teaches/suggests determining that the feature vector is sufficient to invoke (Friedrichs [0047]: “Once a sufficient number of feature vectors have been labeled in conjunction with a file, a machine learning classifier can be trained.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the BiLSTM representations (the claimed feature vector) of Marcheggiani as modified by Senba to be sufficient before invoking the classifier (the claimed predictor) as taught/suggested by Friedrichs for the training.

As such, Marcheggiani as modified by Senba and Friedrichs teaches/suggests wherein the feature vector is sent to the predictor in response to determining that the feature vector is sufficient to invoke the predictor (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a classifier which takes as an input the BiLSTM representation of the candidate argument and the BiLSTM representation of the predicate to predict the role associated to the candidate argument;” Friedrichs [0047]: “Once a sufficient number of feature vectors have been labeled in conjunction with a file, a machine learning classifier can be trained.”).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) as applied to claim 1 above, and further in view of Dundar et al. (US 2019/0244060).
Regarding claim 12, Marcheggiani as modified by Senba does not teach/suggest: The computer-implemented method of Claim 1, further comprising: 
comparing the label to a predetermined training label to determine a difference between the label and the predetermined training label; and 
adjusting the predictor and the scheduler, based on the difference.
Dundar, however, teaches/suggests determining a difference (Dundar [0149]: “The training loss unit 645 adjusts parameters (weights) of the recognition neural network model 620 based on differences between the predicted synthetic recognition data and the ground truth recognition data.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the neural network of Marcheggiani as modified by Senba to be adjusted as taught/suggested by Dundar for the training. As such, Marcheggiani as modified by Senba and Dundar teaches/suggests:
comparing the label to a predetermined training label to determine a difference between the label and the predetermined training label (Chatterjee [0142]: “Thereafter, at step 404, the annotated factoid tags associated with each word along with the associated word in a plurality of natural language sentences are iteratively inputted to the neural network in order to train the neural network;” Dundar [0149]: “The training loss unit 645 adjusts parameters (weights) of the recognition neural network model 620 based on differences between the predicted synthetic recognition data and the ground truth recognition data.”); and 
adjusting the predictor and the scheduler, based on the difference (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components;” Senba Fig. 1: scheduling unit 27; Dundar [0149]: “The training loss unit 645 adjusts parameters (weights) of the recognition neural network model 620 based on differences between the predicted synthetic recognition data and the ground truth recognition data.”).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) and Dundar et al. (US 2019/0244060) as applied to claim 12 above, and further in view of Chatterjee et al. (US 2020/0312297).
Regarding claim 13, Marcheggiani as modified by Senba and Dundar does not teach/suggest: The computer-implemented method of Claim 12, further comprising applying a model to the adjusted scheduler. Chatterjee, in view of Dundar, teaches/suggests applying a model to the adjusted scheduler (Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence;” Dundar [0149]: “The training loss unit 645 adjusts parameters (weights) of the recognition neural network model 620 based on differences between the predicted synthetic recognition data and the ground truth recognition data.”). The same rationales to combine as set forth in the rejection of claims 2 and 12 above are incorporated herein.

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) as applied to claim 14 above, and further in view of Hayashi (US 2003/0128863).
Regarding claim 15, Marcheggiani as modified by Senba does not teach/suggest: The computer-implemented method of Claim 14, wherein the predetermined strategy includes a cold start strategy where the scheduler adopts a uniform distribution to generate the input sequence. Hayashi, however, teaches/suggests a cold start strategy where the scheduler adopts a uniform distribution to generate the input sequence (Hayashi [0047]: “The generated first and second pseudo-random number sequences r1 and r2 are output to the pseudo-random number allocating section 203, where a real number sequence conforming to a uniform distribution contained within the range [-1, 1] is used for the first and second pseudo-random number sequences r1 and r2.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the word embeddings of Marcheggiani as modified by Senba to have a uniform distribution as taught/suggested by Hayashi because that would have been well-understood, routine, and conventional to generate such a sequence.

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) as applied to claim 14 above, and further in view of Ahmad et al. (US 2005/0251765).
Regarding claim 16, Marcheggiani as modified by Senba does not teach/suggest: The computer-implemented method of Claim 14, wherein the predetermined strategy includes a warmup strategy, where the scheduler adopts an epsilon-greedy method to generate the input sequence. Ahmad, however, teaches/suggests a warmup strategy, where the scheduler adopts an epsilon-greedy method to generate the input sequence (Ahmad [0021]: “In one example of our invention the input sequence is weighted and this weighting is used to generate an input sequences having a biased probability distribution.”). The claimed epsilon-greedy method is an inherent and/or implicit feature of the biased probability distribution. In addition, such feature would have been well known for the biased probability distribution (Official Notice). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the word embeddings of Marcheggiani as modified by Senba to have a biased probability distribution as taught/suggested by Ahmad because that would have been well-understood, routine, and conventional to generate such a sequence.

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595) as applied to claim 14 above, and further in view of Rolleston Phillips (US 2008/0004940).
Regarding claim 17, Marcheggiani as modified by Senba does not teach/suggest: The computer-implemented method of Claim 14, wherein the predetermined strategy includes a heat convergence strategy, where the scheduler adopts a maximum-likelihood action to generate the input sequence. Rolleston Phillips, in view of Marcheggiani, teaches/suggests a heat convergence strategy, where the scheduler adopts a maximum-likelihood action to generate the input sequence (Marcheggiani §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed;” Rolleston Phillips [0173]: “In this way, the Personalization system can now rank all cards from the portfolio in order of the likelihood that any particular visitor may purchase those cards, based upon their observed preferences to date.”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the word embeddings of Marcheggiani as modified by Senba to be ranked as taught/suggested by Rolleston Phillips to induce a favorable action.

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Marcheggiani et al. (A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling, 2017, arXiv:1701.02593v2) in view of Senba (US 2010/0296595), Chatterjee et al. (US 2020/0312297), Dundar et al. (US 2019/0244060), and Rolleston Phillips (US 2008/0004940).
Regarding claim 21, Marcheggiani as modified by Senba teaches/suggests: A computer-implemented method, comprising: 
receiving training data (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a word representation component that from a word wi in a sentence w build a word representation xi;” §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed.”); 
training a model that includes a Scheduler, an Encoder, and a Predictor (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components;” Senba Fig. 1: scheduling unit 27), wherein: 
(i) the Scheduler is any function that determines tokens to pass to the Encoder, and decides when to invoke the Predictor (Marcheggiani §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed;” Senba Fig. 1: scheduling unit 27); 
(ii) the Encoder is any function that maps a sequence of input data to a feature vector (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a Bidirectional LSTM (BiLSTM) encoder which takes as input the word representation xi and provide a dynamic representation of the word and its context in a sentence;” §2.3 ¶2: “Specifically, when identifying arguments of a given predicate, we add a predicate-specific feature to the representation of each word in the sentence by concatenating a binary flag to the word representation of Section 2.1.”); 
(iii) the Predictor is any function that maps a feature vector to a vector of functional tags (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components ... a classifier which takes as an input the BiLSTM representation of the candidate argument and the BiLSTM representation of the predicate to predict the role associated to the candidate argument.”); and 
applying the trained model to predict functional tags in a sentence (Marcheggiani §2.4.1 ¶1: “The basic role classifier takes the hidden state of the top-layer bidirectional LSTM corresponding to the considered word at position i and uses it to estimate the probability of the role r.”).
The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Marcheggiani as modified by Senba does not teach/suggest:
wherein each training datum includes a dependency tree having zero or more sub-trees labeled with ground-truth functional tags; 
Chatterjee, however, teaches/suggests:
wherein each training datum includes a dependency tree having zero or more sub-trees labeled with ground-truth functional tags (Chatterjee [0032]: “The dependency parse module 210 receives a natural language sentence as an input and generates a dependency parse tree for the given sentence;” [0142]: “Thereafter, at step 404, the annotated factoid tags associated with each word along with the associated word in a plurality of natural language sentences are iteratively inputted to the neural network in order to train the neural network.”); 
The same rationale to combine as set forth in the rejection of claim 2 above is incorporated herein.

Nor does Marcheggiani as modified by Senba teach/suggest:
(iv) wherein during the training, an error of prediction is used to adjust parameters of the Predictor, the Encoder, and the Scheduler; 
Dundar, in view of Marcheggiani and Senba, teaches/suggests:
(iv) wherein during the training, an error of prediction is used to adjust parameters of the Predictor, the Encoder, and the Scheduler (Marcheggiani §2 ¶2: “In order to identify and classify arguments, we propose a model composed of three components;” Senba Fig. 1: scheduling unit 27; Dundar [0149]: “The training loss unit 645 adjusts parameters (weights) of the recognition neural network model 620 based on differences between the predicted synthetic recognition data and the ground truth recognition data.”); 
The same rationale to combine as set forth in the rejection of claim 12 above is incorporated herein.

Nor does Marcheggiani as modified by Senba teach/suggest:
maps a feature vector to an action ranking vector; 
Rolleston Phillips, in view of Marcheggiani, teaches/suggests:
maps a feature vector to an action ranking vector (Marcheggiani §2.1 ¶1: “We represent each word as the concatenation of four vectors ... The randomly initialized embeddings xre, xpos, and xle are fine-tuned during training, while the pre-trained ones are kept fixed;” Rolleston Phillips [0173]: “In this way, the Personalization system can now rank all cards from the portfolio in order of the likelihood that any particular visitor may purchase those cards, based upon their observed preferences to date.”); 
The same rationale to combine as set forth in the rejection of claim 17 above is incorporated herein.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 2008/0221878 – semantic role labeling
US 2009/0210218 – semantic role labeling
US 2015/0198443 – support vector machine
US 2017/0330056 – cold start
US 2017/0337474 – semantic role labeling
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANH-TUAN V NGUYEN whose telephone number is 571-270-7513. The examiner can normally be reached on M-F 9AM-5PM EST.
Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KEE TUNG, can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANH-TUAN V NGUYEN/
Primary Examiner, Art Unit 2611