Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


DETAILED ACTION

This final office action is in response to the application filed 9/14/2019.
Claims 1-20 are pending. Claims 1 and 11 are independent claims.  


Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.







Claims 1, 2, 9, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Xie (20180300314) in view of Chen (20180067923).

Regarding claim 1, Xie teaches a method of using a transformer framework with tree-based attention for hierarchical encoding in natural language processing, comprising: (Fig. 5, 530, 0002, natural language processing and 0004, 0019, discloses encoding a question and the hierarchical relations among constituents to predict an answer to the question) obtaining, at the transformer framework, a pre-parsed constituency tree having a set of terminal nodes and a set of nonterminal nodes corresponding to a natural language sentence; (0018, Fig. 1, discloses a parsing module 140 that, once the passage and questions are presented, parses them into constituent parse trees with internal nodes having more than one word and leaf nodes representing a single word of the sentence, and the encoding module 150 uses the constituent parse trees to learn representations for the constituents in the question and passages using tree LSTM and chain-of-trees LSTM) computing a respective value component of each nonterminal node in the pre-parsed constituency tree by adding hidden states of descendant nodes of the respective nonterminal node along a specific path that includes the respective nonterminal node in a bottom-up manner; (0019, Fig. 5, 510, discloses that the encoding module 150 encodes a question related to the text passage and 0040, discloses encoding the individual constituents from a text passage using a chain-of-trees long short-term memory encoding; Fig. 1, 0028, discloses a tree-guided attention module 160 that may be a chain-of-trees long short-term memory configured to compute hidden states for each sentence using bottom-up tree long short-term memory encodings into a root node and 0020-0028, discloses that each node has two hidden states (h↑, produced by the LSTM in the bottom-up direction), that T denotes the maximum number of children (descendant nodes) an internal node (nonterminal node) can have, that L is the number of children the node has, and feeding the hidden states for each sentence into a root node in a bottom-up manner. The examiner interprets this as aggregating the hidden states of the descendant nodes.)
Xie fails to teach encoding the natural language sentence at a transformer encoder of the transformer framework by: determining a set of paths from a root node of the pre-parsed constituency tree to each of the set of terminal nodes based on a structure of the pre-parsed constituency tree; and computing a final representation of the respective nonterminal node in the pre-parsed constituency tree by applying weighted aggregation of respective value components corresponding to the respective nonterminal nodes over all paths that include the respective nonterminal node; and generating an encoded representation of the natural language sentences based on the computed final representations of all nonterminal nodes.
Chen teaches encoding the natural language sentence at a transformer encoder of the transformer framework by: determining a set of paths from a root node of the pre-parsed constituency tree to each of the set of terminal nodes based on a structure of the pre-parsed constituency tree; (0004, discloses semantic tagging of an input sentence into one or more sub-structures using parsing and an encoding method and Fig. 2 and 0018, discloses that a sub-structure begins with the root node and follows a path down the structure to a leaf node in a tree) and computing a final representation of the respective nonterminal node in the pre-parsed constituency tree by applying weighted aggregation of respective value components corresponding to the respective nonterminal nodes over all paths that include the respective nonterminal node; (0005, discloses computing a weighted sum vector for each sub-structure) and generating an encoded representation of the natural language sentences based on the computed final representations of all nonterminal nodes. (0004, discloses encoding into vectors or a representation of the input sentence)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie to incorporate the teachings of Chen.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding in order to process an input phrase using NLP.
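For illustration only, the path-determination limitation discussed above (a set of paths from the root node to each terminal node) can be sketched as follows. This is a minimal sketch of one possible reading of the claim language; it is not taken from the application, Xie, or Chen. The `Node` class, the function name `root_to_leaf_paths`, and the toy tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical constituency-tree node: nonterminal if it has children."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_terminal(self) -> bool:
        return not self.children


def root_to_leaf_paths(root: Node) -> List[List[str]]:
    """Return one path (list of labels) per terminal node, left to right."""
    paths: List[List[str]] = []

    def walk(node: Node, prefix: List[str]) -> None:
        current = prefix + [node.label]
        if node.is_terminal():
            paths.append(current)  # reached a leaf: the path is complete
            return
        for child in node.children:
            walk(child, current)

    walk(root, [])
    return paths


# Toy tree (assumed for illustration): (S (NP the cat) (VP sat))
tree = Node("S", [
    Node("NP", [Node("the"), Node("cat")]),
    Node("VP", [Node("sat")]),
])
paths = root_to_leaf_paths(tree)
```

In this toy tree the nonterminal "NP" lies on two paths and "S" on all three, which is the situation the "over all paths that include the respective nonterminal node" language addresses.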

Regarding claim 2, Xie and Chen teach the method of claim 1.  Xie teaches further comprising: determining, based on the pre-parsed constituency tree, (0018, discloses a constituency parse tree constructed from a natural language text) a first hidden representation vector corresponding to value components of the set of terminal nodes (0020, discloses a node embedding with a vector representation and 0021, discloses hidden vectors of the inputs) and a second hidden representation vector corresponding to value components of the set of nonterminal nodes; (0020-0021, discloses a nonterminal node’s embedding where a nonterminal node embedding has a vector representation) applying an interpolation function to the first hidden representation vector, (0031, discloses performing an aggregation function of the node embedding 220) the second hidden representation vector and a set of rules indexed by the set of nonterminal nodes; (0021, discloses a plurality of n tokens for the nonterminal node and 0028, discloses an interaction index and 0029, discloses a tree-guided attention mechanism used to learn a question-aware representation for each constituent in the passage) and obtaining a first tensor from the interpolation function, wherein the tensor has rows and columns arranged according to a structure of the pre-parsed constituency tree. (0026, discloses a graph G=(V,E), and 0019, discloses that V represents a set of vertices that correspond to the set of words and E represents a set of ordered pairs of vertices of relationships between the elements)

Regarding claim 9, Xie and Chen teach the method of claim 1.  Xie teaches further comprising integrating encoder self-attention into the transformer framework by: (0004, discloses encoding individual constituents into a hierarchical relation amongst the constituents with a tree-guided attention mechanism) computing, via a tree-based self-attention layer, (0004, discloses a tree-guided attention mechanism) first output representations for the set of terminal nodes and second output representations for the set of nonterminal nodes; (0019, discloses that internal nodes in a parse tree have more than one word and leaf nodes represent one word and 0022, discloses that internal nodes are defined and a leaf node contains an additional input) generating a query-key affinity matrix based on a comparison of the first output representations and the second output representations; (0037, discloses searching for a correct answer for a question) computing first value representations for the set of terminal nodes based on the first output representations; (0022, discloses computing a hidden state) encoding second value representations for the set of nonterminal nodes using hierarchical accumulation based on the first output representations and the second output representations; (0022, discloses encoding nodes in a top-down and bottom-up fashion for a leaf node and internal node) computing final attentions for the set of terminal nodes and the set of nonterminal nodes by taking weighted averages of the encoded second value representations and the first value representations; (0021, discloses using weights and 0029, discloses a tree-guided local normalization) and passing the final attentions through serial computations of a transformer network to generate the final representation of the set of nonterminal nodes. (Fig. 4 and 0036, discloses generating a final chain LSTM)
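For illustration only, the "query-key affinity matrix" and "weighted averages" limitations quoted above can be read against conventional scaled dot-product attention, sketched below. This is a generic sketch under stated assumptions, not the applicant's implementation and not a teaching of Xie. The function names, the scaling by the square root of the key dimension, and the toy inputs are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of affinity scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """For each query: affinity scores against all keys, then a weighted
    average of the value vectors using the softmax-normalized scores."""
    dim = len(keys[0])
    width = len(values[0])
    outputs: Matrix = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)]
        )
    return outputs


# Toy inputs (assumed): a zero query gives equal affinity to both keys,
# so the output is the plain average of the two value vectors.
result = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]])
```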

Regarding claim 11, Xie teaches a system of using a transformer framework with tree-based attention for hierarchical encoding in natural language processing, comprising: a memory containing machine readable medium storing machine executable code; and one or more processors coupled to the memory and configurable to execute the machine executable code to cause the one or more processors to: (Fig. 5, 530, 0002, natural language processing and 0004, 0019, discloses encoding a question and the hierarchical relations among constituents to predict an answer to the question) obtain, at the transformer framework, a pre-parsed constituency tree having a set of terminal nodes and a set of nonterminal nodes corresponding to a natural language sentence; (0018, Fig. 1, discloses a parsing module 140 that, once the passage and questions are presented, parses them into constituent parse trees with internal nodes having more than one word and leaf nodes representing a single word of the sentence, and the encoding module 150 uses the constituent parse trees to learn representations for the constituents in the question and passages using tree LSTM and chain-of-trees LSTM) computing a respective value component of each nonterminal node in the pre-parsed constituency tree by adding hidden states of descendant nodes of the respective nonterminal node along a specific path that includes the respective nonterminal node in a bottom-up manner; (0019, Fig. 5, 510, discloses that the encoding module 150 encodes a question related to the text passage and 0040, discloses encoding the individual constituents from a text passage using a chain-of-trees long short-term memory encoding; Fig. 1, 0028, discloses a tree-guided attention module 160 that may be a chain-of-trees long short-term memory configured to compute hidden states for each sentence using bottom-up tree long short-term memory encodings into a root node and 0020-0028, discloses that each node has two hidden states (h↑, produced by the LSTM in the bottom-up direction), that T denotes the maximum number of children (descendant nodes) an internal node (nonterminal node) can have, that L is the number of children the node has, and feeding the hidden states for each sentence into a root node in a bottom-up fashion. The examiner interprets this as aggregating the hidden states of the descendant nodes.)
Xie fails to teach encode the natural language sentence at a transformer encoder of the transformer framework by: determining a set of paths from a root node of the pre-parsed constituency tree to each of the set of terminal nodes based on a structure of the pre-parsed constituency tree; and computing a final representation of the respective nonterminal node in the pre-parsed constituency tree by applying weighted aggregation of respective value components corresponding to the respective nonterminal nodes over all paths that include the respective nonterminal node; and generating an encoded representation of the natural language sentences based on the computed final representations of all nonterminal nodes.
Chen teaches encode the natural language sentence at a transformer encoder of the transformer framework by: determining a set of paths from a root node of the pre-parsed constituency tree to each of the set of terminal nodes based on a structure of the pre-parsed constituency tree; (0004, discloses semantic tagging of an input sentence into one or more sub-structures using parsing and an encoding method and Fig. 2 and 0018, discloses that a sub-structure begins with the root node and follows a path down the structure to a leaf node in a tree) and computing a final representation of the respective nonterminal node in the pre-parsed constituency tree by applying weighted aggregation of respective value components corresponding to the respective nonterminal nodes over all paths that include the respective nonterminal node; (0005, discloses computing a weighted sum vector for each sub-structure) and generating an encoded representation of the natural language sentences based on the computed final representations of all nonterminal nodes. (0004, discloses encoding into vectors or a representation of the input sentence)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie to incorporate the teachings of Chen.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding in order to process an input phrase using NLP.

Claims 3, 5-7, 10, 12, 13, 15-17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Xie (20180300314) in view of Chen (20180067923) and further in view of Wu (20210026922).

Regarding claim 3, Xie and Chen teach the method of claim 2.
Xie and Chen fail to teach further comprising: computing a second tensor from the first tensor via an upward cumulative-average operation,  wherein each element in the second tensor is computed by dividing a respective nonterminal node representation from the first tensor by a total number of all descendent nodes of the respective nonterminal node in a particular branch, and wherein each row of the second tensor represents a nonterminal node, and the each element of the second tensor represents a vector representation of the nonterminal node reflecting the particular branch.
Wu teaches further comprising: computing a second tensor from the first tensor via an upward cumulative-average operation, (0020, discloses each node embedding learns its vector representation by aggregating information within k in a local node neighborhood) wherein each element in the second tensor is computed by dividing a respective nonterminal node representation from the first tensor by a total number of all descendent nodes of the respective nonterminal node in a particular branch, (0020, discloses aggregating information. The examiner interprets as a function to compute an average) and wherein each row of the second tensor represents a nonterminal node, and the each element of the second tensor represents a vector representation of the nonterminal node reflecting the particular branch. (0021, discloses a tree decoder that uses nonterminal n tokens and a vector representation using an attention mechanism)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding in order to process an input phrase using NLP.
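For illustration only, one way to read an "upward cumulative-average operation" along a single branch is a running mean taken from the leaf upward, where each entry is the accumulated sum divided by the number of nodes accumulated so far in that branch. The sketch below is an assumption made for clarity of the record; it is not drawn from the application or from Wu, and the function name and the toy vectors are hypothetical.

```python
from typing import List

Vector = List[float]


def upward_cumulative_average(branch: List[Vector]) -> List[Vector]:
    """branch[0] is the leaf; branch[-1] is the highest nonterminal on the path.

    Entry k of the result is the mean of branch[0] through branch[k].
    """
    if not branch:
        return []
    running = [0.0] * len(branch[0])
    averaged: List[Vector] = []
    for count, vec in enumerate(branch, start=1):
        running = [r + v for r, v in zip(running, vec)]  # accumulate upward
        averaged.append([r / count for r in running])    # divide by nodes so far
    return averaged


# Toy branch (assumed): a leaf vector followed by two ancestor vectors.
branch = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
averaged = upward_cumulative_average(branch)
```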

Regarding claim 5, Xie and Chen teach the method of claim 2.
Xie and Chen fail to teach further comprising: applying hierarchical embedding to the first tensor before elements of the first tensor is accumulated through an upward cumulative-average operation.
Wu teaches further comprising: applying hierarchical embedding to the first tensor before elements of the first tensor is accumulated through an upward cumulative-average operation. (0030, discloses merging the attentional representation in a recursive and bottom-up way and 0024,0029, discloses a tree guided attention mechanism and embedded normalization and aggregation)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 6, Xie, Chen, and Wu teach the method of claim 5.
Xie and Chen fail to teach wherein the hierarchical embedding includes: constructing a tensor of hierarchical embedding, wherein an entry of the tensor of hierarchical embedding is computed by concatenating a first row vector from a vertical embedding matrix and a second row vector from a horizontal embedding matrix.
Wu teaches wherein the hierarchical embedding includes: (0018-0020, discloses a hierarchical representation) constructing a tensor of hierarchical embedding, (0019, discloses a syntactic graph represented in the form G=(V, E)) wherein an entry of the tensor of hierarchical embedding is computed by concatenating a first row vector from a vertical embedding matrix and a second row vector from a horizontal embedding matrix. (0026-0030, Fig. 4-5, discloses concatenating vectors using a matrix)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.
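For illustration only, the concatenation recited in claim 6 (a row vector from a vertical embedding matrix joined with a row vector from a horizontal embedding matrix) can be sketched as below. How the row indices are chosen for each tensor entry is not stated in this action, so the index grids `v_index` and `h_index`, the function name, and all values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Tensor3 = List[List[List[float]]]


def build_hierarchical_embedding(
    vertical: Matrix,
    horizontal: Matrix,
    v_index: List[List[int]],
    h_index: List[List[int]],
) -> Tensor3:
    """Entry (i, j) is vertical[v_index[i][j]] concatenated with
    horizontal[h_index[i][j]]. The index grids are assumed inputs."""
    tensor: Tensor3 = []
    for v_row, h_row in zip(v_index, h_index):
        tensor.append([vertical[v] + horizontal[h] for v, h in zip(v_row, h_row)])
    return tensor


# Toy embedding matrices and index grids (assumed for illustration).
vertical = [[0.1, 0.2], [0.3, 0.4]]
horizontal = [[1.0, 2.0], [3.0, 4.0]]
v_index = [[0, 1], [1, 0]]
h_index = [[0, 0], [1, 1]]
emb = build_hierarchical_embedding(vertical, horizontal, v_index, h_index)
```

Each entry has length equal to the sum of the two row lengths, which is what distinguishes concatenation from the summation recited in claim 7.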

Regarding claim 7, Xie, Chen, and Wu teach the method of claim 6. 
Xie and Chen fail to teach further comprising: summing the first tensor with the tensor of hierarchical embedding; and applying the upward cumulative-average operation to the summed first tensor with the tensor of hierarchical embedding.
Wu teaches further comprising: summing the first tensor with the tensor of hierarchical embedding; (0018-0020, discloses a tensor and a hierarchical node embedding that aggregates information from a local node neighborhood) and applying the upward cumulative-average operation to the summed first tensor with the tensor of hierarchical embedding.  (0020, discloses each node learns its vector representation by aggregating information from a node local neighborhood)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 10, Xie and Chen teach the method of claim 9. 
Xie and Chen fail to teach further comprising integrating decoder cross-attention into the transformer framework by: computing affinity score matrices based on a target-side query matrix, the first output representations for the set of terminal nodes and the second output representations for the set of nonterminal nodes;  computing first value representations for the set of terminal nodes based on the first output representations;  computing second value representations for the set of nonterminal nodes using hierarchical accumulation based on the first output representations and the second output representations;  and computing an attention output of decoder cross-attention based on the affinity score matrices, the first value representations and the second value representations.
Wu teaches further comprising integrating decoder cross-attention into the transformer framework by: (0021, discloses a tree decoder that uses one or more tree decoder mechanisms) computing affinity score matrices based on a target-side query matrix, the first output representations for the set of terminal nodes and the second output representations for the set of nonterminal nodes; (0019, discloses computing vertices that correspond to words in the natural language text) computing first value representations for the set of terminal nodes based on the first output representations; (0021, discloses nonterminal nodes and terminal nodes) computing second value representations for the set of nonterminal nodes using hierarchical accumulation based on the first output representations and the second output representations; (0020, discloses an inductive node embedding algorithm) and computing an attention output of decoder cross-attention based on the affinity score matrices, the first value representations and the second value representations. (0021, discloses decoding using the special tokens)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 12, Xie and Chen teach the system of claim 11.  Xie teaches further comprising: determining, based on the pre-parsed constituency tree, (0018, discloses a constituency parse tree constructed from a natural language text) a first hidden representation vector corresponding to value components of the set of terminal nodes (0020, discloses a node embedding with a vector representation and 0021, discloses hidden vectors of the inputs) and a second hidden representation vector corresponding to value components of the set of nonterminal nodes; (0020-0021, discloses a nonterminal node’s embedding where a nonterminal node embedding has a vector representation) applying an interpolation function to the first hidden representation vector, (0031, discloses performing an aggregation function of the node embedding 220) the second hidden representation vector and a set of rules indexed by the set of nonterminal nodes; (0021, discloses a plurality of n tokens for the nonterminal node and 0028, discloses an interaction index and 0029, discloses a tree-guided attention mechanism used to learn a question-aware representation for each constituent in the passage) and obtaining a first tensor from the interpolation function, wherein the tensor has rows and columns arranged according to a structure of the pre-parsed constituency tree. (0026, discloses a graph G=(V,E), and 0019, discloses that V represents a set of vertices that correspond to the set of words and E represents a set of ordered pairs of vertices of relationships between the elements)

Regarding claim 13, Xie and Chen teach the system of claim 12.
Xie and Chen fail to teach wherein the machine executable code further causes the one or more processors to: compute a second tensor from the first tensor via an upward cumulative-average operation, wherein each element in the second tensor is computed by dividing a respective nonterminal node representation from the first tensor by a total number of all descendent nodes of the respective nonterminal node in a particular branch, and wherein each row of the second tensor represents a nonterminal node, and the each element of the second tensor represents a vector representation of the nonterminal node reflecting the particular branch.
Wu teaches further comprising: compute a second tensor from the first tensor via an upward cumulative-average operation, (0020, discloses each node embedding learns its vector representation by aggregating information within k in a local node neighborhood) wherein each element in the second tensor is computed by dividing a respective nonterminal node representation from the first tensor by a total number of all descendent nodes of the respective nonterminal node in a particular branch, (0020, discloses aggregating information. The examiner interprets as a function to compute an average) and wherein each row of the second tensor represents a nonterminal node, and the each element of the second tensor represents a vector representation of the nonterminal node reflecting the particular branch. (0021, discloses a tree decoder that uses nonterminal n tokens and a vector representation using an attention mechanism)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 15, Xie and Chen teach the system of claim 12. 
Xie and Chen fail to teach wherein the machine executable code further causes the one or more processors to: apply hierarchical embedding to the first tensor before elements of the first tensor is accumulated through an upward cumulative-average operation.
Wu teaches further comprising: apply hierarchical embedding to the first tensor before elements of the first tensor is accumulated through an upward cumulative-average operation. (0030, discloses merging the attentional representation in a recursive and bottom-up way and 0024,0029, discloses a tree guided attention mechanism and embedded normalization and aggregation)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 16, Xie, Chen, and Wu teach the system of claim 15.
Xie and Chen fail to teach wherein the hierarchical embedding includes: constructing a tensor of hierarchical embedding, wherein an entry of the tensor of hierarchical embedding is computed by concatenating a first row vector from a vertical embedding matrix and a second row vector from a horizontal embedding matrix.
Wu teaches wherein the hierarchical embedding includes: (0018-0020, discloses a hierarchical representation) constructing a tensor of hierarchical embedding, (0019, discloses a syntactic graph represented in the form G=(V, E)) wherein an entry of the tensor of hierarchical embedding is computed by concatenating a first row vector from a vertical embedding matrix and a second row vector from a horizontal embedding matrix. (0026-0030, Fig. 4-5, discloses concatenating vectors using a matrix)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 17, Xie, Chen, and Wu teach the system of claim 16.
Xie and Chen fail to teach further comprising: sum the first tensor with the tensor of hierarchical embedding; and applying the upward cumulative-average operation to the summed first tensor with the tensor of hierarchical embedding.
Wu teaches further comprising: summing the first tensor with the tensor of hierarchical embedding; (0018-0020, discloses a tensor and a hierarchical node embedding that aggregates information from a local node neighborhood) and applying the upward cumulative-average operation to the summed first tensor with the tensor of hierarchical embedding.  (0020, discloses each node learns its vector representation by aggregating information from a node local neighborhood)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 19, Xie and Chen teach the system of claim 11.  Xie teaches wherein the machine executable code further causes the one or more processors to integrate encoder self-attention into the transformer framework by: (0004, discloses encoding individual constituents into a hierarchical relation amongst the constituents with a tree-guided attention mechanism) computing, via a tree-based self-attention layer, (0004, discloses a tree-guided attention mechanism) first output representations for the set of terminal nodes and second output representations for the set of nonterminal nodes; (0019, discloses that internal nodes in a parse tree have more than one word and leaf nodes represent one word and 0022, discloses that internal nodes are defined and a leaf node contains an additional input) generating a query-key affinity matrix based on a comparison of the first output representations and the second output representations; (0037, discloses searching for a correct answer for a question) computing first value representations for the set of terminal nodes based on the first output representations; (0022, discloses computing a hidden state) encoding second value representations for the set of nonterminal nodes using hierarchical accumulation based on the first output representations and the second output representations; (0022, discloses encoding nodes in a top-down and bottom-up fashion for a leaf node and internal node) computing final attentions for the set of terminal nodes and the set of nonterminal nodes by taking weighted averages of the encoded second value representations and the first value representations; (0021, discloses using weights and 0029, discloses a tree-guided local normalization) and passing the final attentions through serial computations of a transformer network to generate the final representation of the set of nonterminal nodes. (Fig. 4 and 0036, discloses generating a final chain LSTM)

Regarding claim 20, Xie and Chen teach the system of claim 19. 
Xie and Chen fail to teach wherein the machine executable code further causes the one or more processors to integrate decoder cross-attention into the transformer framework by: computing affinity score matrices based on a target-side query matrix, the first output representations for the set of terminal nodes and the second output representations for the set of nonterminal nodes; computing first value representations for the set of terminal nodes based on the first output representations; computing second value representations for the set of nonterminal nodes using hierarchical accumulation based on the first output representations and the second output representations; and computing an attention output of decoder cross-attention based on the affinity score matrices, the first value representations and the second value representations.
Wu teaches wherein the machine executable code further causes the one or more processors to integrate decoder cross-attention into the transformer framework by: (0021, discloses a tree decoder that uses one or more tree decoder mechanisms) computing affinity score matrices based on a target-side query matrix, the first output representations for the set of terminal nodes and the second output representations for the set of nonterminal nodes; (0019, discloses computing vertices that correspond to words in the natural language text) computing first value representations for the set of terminal nodes based on the first output representations; (0021, discloses nonterminal nodes and terminal nodes) computing second value representations for the set of nonterminal nodes using hierarchical accumulation based on the first output representations and the second output representations; (0020, discloses an inductive node embedding algorithm) and computing an attention output of decoder cross-attention based on the affinity score matrices, the first value representations and the second value representations. (0021, discloses decoding using the special tokens)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Wu.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Xie (20180300314) in further view of Chen (20180067923) in further view of Wu (20210026922) in further view of Blouw (20190370389).

Regarding claim 4, Xie, Chen, and Wu teach the method of claim 3.
Xie, Chen, and Wu fail to teach wherein the computing, by weighted aggregation, the final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding comprises: applying a weighting vector to the each element of the second tensor; and for a particular nonterminal node from the set of nonterminal nodes, combining, into a single accumulation vector, weighed elements from the second tensor corresponding to vector representations of nonterminal nodes in a subtree rooted at the particular nonterminal node.
Blouw teaches wherein the computing, by weighted aggregation, the final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding comprises: (0028, discloses weight matrices in the encoder and decoder neural network tied to the syntactic structure in the tree structure) applying a weighting vector to the each element of the second tensor; (0028, discloses applying the same set of weights for each occurrence of a given syntactic relation in a given tree-structure) and for a particular nonterminal node from the set of nonterminal nodes, combining, into a single accumulation vector, weighed elements from the second tensor corresponding to vector representations of nonterminal nodes in a subtree rooted at the particular nonterminal node.  (0005, discloses vector representations of sentences)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie, Chen, and Wu to incorporate the teachings of Blouw.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Regarding claim 14, Xie, Chen, and Wu teach the system of claim 13.
 Xie, Chen, and Wu fail to teach wherein the machine executable code further causes the one or more processors to compute, by weighted aggregation, the final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding comprises: applying a weighting vector to the each element of the second tensor; and for a particular nonterminal node from the set of nonterminal nodes, combining, into a single accumulation vector, weighed elements from the second tensor corresponding to vector representations of nonterminal nodes in a subtree rooted at the particular nonterminal node.
Blouw teaches wherein the computing, by weighted aggregation, the final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding comprises: (0028, discloses weight matrices in the encoder and decoder neural network tied to the syntactic structure in the tree structure) applying a weighting vector to the each element of the second tensor; (0028, discloses applying the same set of weights for each occurrence of a given syntactic relation in a given tree-structure) and for a particular nonterminal node from the set of nonterminal nodes, combining, into a single accumulation vector, weighed elements from the second tensor corresponding to vector representations of nonterminal nodes in a subtree rooted at the particular nonterminal node.  (0005, discloses vector representations of sentences)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie, Chen, and Wu to incorporate the teachings of Blouw.  Doing so would allow a final representation of the set of nonterminal nodes in the pre-parsed constituency tree based on the encoding.

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Xie (20180300314) in further view of Chen (20180067923) in further view of Yi (20190155862).

Regarding claim 8, Xie and Chen teach the method of claim 1. 
Xie and Chen fail to teach further comprising: applying a masking function to each node in the pre-parsed constituency tree based on a corresponding affinity value of the respective node, wherein, when a node query is attending to a particular node in the pre-parsed constituency tree, the masking function prevents the node query from accessing node other than descendants of the particular node.  
Yi teaches further comprising: applying a masking function to each node in the pre-parsed constituency tree based on a corresponding affinity value of the respective node, (0020, discloses querying data in a node tree) wherein, when a node query is attending to a particular node in the pre-parsed constituency tree, the masking function prevents the node query from accessing node other than descendants of the particular node. (0038, discloses querying/filtering a tree and obtaining a filtered result and 0225, discloses hiding or masking a part of the data querying result that may be customized by the user)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Yi.  Doing so would allow masking of irrelevant nodes and increasing permissions in the tree and limit the search to only elements in a subtree of the query by only displaying results of the query and displaying descendant nodes of the result of the query.

Regarding claim 18, Xie and Chen teach the system of claim 11. 
Xie and Chen fail to teach further comprising: applying a masking function to each node in the pre-parsed constituency tree based on a corresponding affinity value of the respective node, wherein, when a node query is attending to a particular node in the pre-parsed constituency tree, the masking function prevents the node query from accessing node other than descendants of the particular node.  
Yi teaches further comprising: applying a masking function to each node in the pre-parsed constituency tree based on a corresponding affinity value of the respective node, (0020, discloses querying data in a node tree) wherein, when a node query is attending to a particular node in the pre-parsed constituency tree, the masking function prevents the node query from accessing node other than descendants of the particular node. (0038, discloses querying/filtering a tree and obtaining a filtered result and 0225, discloses hiding or masking a part of the data querying result that may be customized by the user)
It would have been obvious to one of ordinary skill in the art before the effective filing date to have modified the teachings of Xie and Chen to incorporate the teachings of Yi.  Doing so would allow masking of irrelevant nodes and increasing permissions in the tree and limit the search to only elements in a subtree of the query by only displaying results of the query and displaying descendant nodes of the result of the query.


Response to Arguments

Applicant's arguments filed 1/20/2022 with respect to the rejection(s) of claim(s) 1 and 11 under Xie and Blouw have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Xie in further view of Chen.  See detailed rejection above.  See arguments below.

The applicant argues (pp. 12-13) that the prior art (Xie) fails to teach or disclose:  “computing a respective value component of each nonterminal node in the pre-parsed constituency tree by adding hidden states of descendant nodes of the respective nonterminal node along a specific path that includes the respective nonterminal node in a bottom-up manner;” The examiner respectfully disagrees.  Xie discloses (0020) computing two hidden states for each node: one along a path in the up-down direction and one along a path in a bottom-up fashion.  The examiner interprets this as adding a hidden state in an up-down manner.  For this reason, this argument is not persuasive.  See detailed rejection above regarding further amendments.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to STEVEN GOLDEN whose telephone number is (571)272-2128.  The examiner can normally be reached Monday-Friday, 8:00 a.m.-5:00 p.m. EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott Baderman, can be reached at 571-272-3644.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/STEVEN GOLDEN/
Examiner, Art Unit 2144


/SCOTT T BADERMAN/
Supervisory Patent Examiner, Art Unit 2144