DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination - 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. The applicant’s submission for RCE filed on 10 November 2022 has been entered.

Remarks
This action is in response to the applicant’s response filed 17 October 2022, which is in response to the USPTO office action mailed 22 August 2022. Claims 1, 11 and 15 are amended. Claim 3 is cancelled. Claims 1, 2 and 4-15 are currently pending.

Response to Arguments
With respect to the 35 USC §103 rejection of claims 1, 2 and 4-15, the applicant’s arguments have been fully considered but have not been deemed persuasive.
Firstly, the applicant argues “Chang '458 describes how a graph can be input into a transformer, where a node is to be classified and related graph information attached to the node is encoded in the transformer by enumerating the distance to the node-to-be-classified (see, e.g., Chang '458 at paragraphs [0007] - [0013] and [0071] - [0072]). Chang '458, however, does not teach or suggest the decomposition into path queries or resetting of positional counter values.” (response pg. 7). Respectfully, the examiner disagrees.
As an initial note, the examiner acknowledges that the applicant argues Chang’458 does not teach “decomposition into path queries”. However, independent claims 1, 11 and 15 do not explicitly recite decomposition into path queries. Dependent claims 2 and 12 do, however, recite “decomposing the graph query into the set of m x n path queries” (e.g. claim 2 lines 3-4). Chang’693 is relied upon for teaching this limitation, where “[0044] note a DAG-PATH table used to enumerate all possible paths from the root node to each node in the DAG” (Chang’693 [0044]) and “Given a knowledge DAG with nodes (1), (2), (3), (4), (5), (6), connected as shown, the DAG Path table is as shown in FIG. 3A. As each node is added, the DAG Path Table lists all possible paths to each node (indicated as a Leaf node using LeafID). E.g., for Node (6), all possible paths are enumerated: (1) (2) (4) (6) and a second path: (1) (5) (4) (6)” (Chang’693, [0073]). Therefore, the argument that Chang’458 does not teach decomposition into path queries is not persuasive.
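For illustration only, the path enumeration quoted from Chang’693 [0073] can be sketched as follows. The edge list is an assumption: FIG. 3A is not reproduced in this action, so the edges are chosen only so that node (6) has the two quoted paths, and the placement of node (3) is hypothetical. The function name enumerate_paths is likewise hypothetical and does not appear in either reference.

```python
from typing import Dict, List


def enumerate_paths(edges: Dict[int, List[int]], root: int, target: int) -> List[List[int]]:
    """Return every directed path from root to target in a DAG (depth-first)."""
    if root == target:
        return [[root]]
    paths = []
    for child in edges.get(root, []):
        # Extend each path found from the child with the current node in front.
        for tail in enumerate_paths(edges, child, target):
            paths.append([root] + tail)
    return paths


# ASSUMED edges: picked so node 6 is reachable via 1-2-4-6 and 1-5-4-6.
# Node 3 is attached to the root purely as a placeholder.
ASSUMED_EDGES = {1: [2, 3, 5], 2: [4], 5: [4], 4: [6]}

paths_to_6 = enumerate_paths(ASSUMED_EDGES, root=1, target=6)
print(paths_to_6)
```

Under these assumed edges the two paths returned for node (6) match the two paths recited in the quoted passage.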
Next, Chang’458 teaches “a current sequence corresponding to the current node to be analyzed is determined” (Chang’458, [0069]) where “The current node can be used as the root node. The nodes that are sequentially reachable from the edges of the root node are obtained, and the positional encodings of these nodes relative to the root node are determined” (Chang’458, [0071]). Finally, Chang’458 teaches “the positional encoding can include the quantity of edges that the node passes relative to the root node, or can be referred to as an order” (Chang’458, [0076]). The examiner interprets determining positional encodings for sequentially reachable nodes starting from a root, where the encodings include the quantity of edges that each node passes relative to the root, as reading on “positional counter values” as claimed, because each node is reached by following an edge and the positional encoding therefore increases with each edge traversed (i.e. a counter). Furthermore, each current sequence corresponding to a current node being used as a root reads on “reset at a start position” as claimed, because each root node is the first node in its sequence of nodes. Accordingly, this argument is not persuasive.
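As a minimal sketch of the interpretation above, and not a reproduction of any code in Chang’458, a positional encoding equal to the quantity of edges from a root can be computed with a breadth-first traversal. The graph, the node names, the max_hops parameter (standing in for the "predetermined range"), and the function name hop_counts are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List


def hop_counts(edges: Dict[str, List[str]], root: str, max_hops: int) -> Dict[str, int]:
    """Map each node reachable within max_hops edges of root to its edge count from root."""
    positions = {root: 0}  # the root always starts the count at zero
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if positions[node] == max_hops:
            continue  # do not expand beyond the assumed range
        for nxt in edges.get(node, []):
            if nxt not in positions:
                positions[nxt] = positions[node] + 1
                queue.append(nxt)
    return positions


# ASSUMED toy graph; not taken from any reference.
ASSUMED_EDGES = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}

from_a = hop_counts(ASSUMED_EDGES, "a", max_hops=2)
from_b = hop_counts(ASSUMED_EDGES, "b", max_hops=2)
print(from_a)
print(from_b)
```

In this sketch the count restarts at zero whenever a different node is taken as the root, so node "d" is at position 2 in the sequence rooted at "a" and at position 1 in the sequence rooted at "b".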
Lastly, the applicant argues “The presently claimed invention enables to predict a node's missing information in complex graph queries, which are also represented as a graph and encoded in the transformer. However, to encode such complex graph queries, one cannot simply adopt Chang '458's method of enumerating distances of nodes. Importantly, a node with missing information has to be encoded as a special token and this requires a completely different type of positional encoding than provided in Chang '458 in order to answer such queries with a transformer. If complex graph queries were encoded by labeling the distance from the node with missing information as in Chang '458, then the structure of the complex graph query would not be able to be encoded.” (response pg. 7). Respectfully, this argument is not persuasive. 
In response to the applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., a node with missing information is encoded as a special token) are not recited in the rejected claims. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Accordingly, this argument is not persuasive. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2 and 4-15 are rejected under 35 U.S.C. 103 as being unpatentable over Chang et al., US 20070208693 A1 (hereinafter “Chang’693”) in view of Chang et al., US 20210049458 A1 (hereinafter “Chang’458”) in further view of Eisenschlos et al., US 20210165960 A1 (hereinafter “Eisenschlos”).

Claim 1: Chang’693 teaches a processor-implemented method of encoding a query graph into a sequence representation, the query graph having a plurality of nodes including m root nodes and n leaf nodes, wherein m and n are integer values, the method comprising the steps, implemented in a processor, of:
receiving a set of m x n path queries representing a query graph, wherein the query graph is a connected directed acyclic graph (DAG), and wherein each of the m x n path queries begins with a root node and ends with a leaf node (Chang’693, [0041] note encoding path information of the generalized DAG in entries of a path table in the relational database, where the entries of the path table correspond to paths in the generalized DAG from nodes of the generalized DAG to a root node of the generalized DAG, [0047] note Find all paths from any Node to the Root node, [0048] note Find all paths having a specific Node as a leaf, [Fig. 3A], [0073] note the DAG Path Table lists all possible paths to each node);
converting the DAG into a set of sequences corresponding to the set of m x n path queries, wherein each sequence in the set of sequences represents one of the path queries in the set of m x n path queries (Chang’693, [Fig. 4] note 410, [0194] note Path information can be converted at 410 into text strings in accordance with an entry format, [0036] note The interface 190 can be configured to trigger the text indexing engine 160 to generate the lexical index 162 of the encoded path information 122, where the lexical index 162 separately lists tokens of paths in the encoded path information 122); 
encoding the set of sequences as a single sequential query, wherein positions of each node and each edge between nodes within each of the m x n path queries are encoded independently, wherein the encoded positions include a positional order within each of the m x n path queries (Chang’693, [Fig. 4] note 420, [0194] note The path table can be compressed at 420 by referencing a sub-path within an entry using a single token, [0036] note The lexical index 162 can include a B-tree index, and the encoded path information 122 can be a compressed path table in which a sub-path within an entry is referenced using a single token).
Chang’693 does not explicitly teach wherein the query graph includes one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges; wherein the encoded positions include positional counter values that are reset at a start position of each path of the m x n path queries in the single sequential query; and feeding the single sequential query to a transformer encoder.
However, Chang’458 teaches wherein the encoded positions include positional counter values that are reset at a start position of each path of the m x n path queries in the single sequential query (Chang’458, [0069] note a current sequence corresponding to the current node to be analyzed is determined, where the current sequence includes a plurality of nodes within a predetermined range reachable from the current node through edges, and positional encoding of each of the plurality of nodes relative to the current node in the dynamic interaction graph, [0071] note The current node can be used as the root node. The nodes that are sequentially reachable from the edges of the root node are obtained, and the positional encodings of these nodes relative to the root node are determined, [0076] note For each node within the predetermined range that is obtained through traversal, the positional encoding of the node relative to the root node is determined. In an implementation, the positional encoding can include the quantity of edges that the node passes relative to the root node, or can be referred to as an order; i.e. the examiner interprets that positional encodings of nodes relative to a current node used as a root node read on positional counter values that are reset at a start position of each path).
It would have been obvious to one of ordinary skill in the art at the effective filing date of the application to combine the DAG of Chang’693 with the positional encodings relative to a root node of Chang’458 according to known methods (i.e. determining positional encodings relative to a current node used as a root node). Motivation for doing so is that this provides an improved solution to analyze objects more effectively to obtain feature vectors suitable for subsequent analysis (Chang’458, [0004]).
Chang’693 and Chang’458 do not explicitly teach wherein the query graph includes one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges; and feeding the single sequential query to a transformer encoder.
However, Eisenschlos teaches this (Eisenschlos, [Fig. 1], [Fig. 3], [0021] note FIG. 3 is an example system 300 for modifying input text to generate modified text by masking words or tokens and selecting replacement words or tokens, [0022] note Tokenization component 310 receives as input the input text to be modified and generates an input sequence of tokens to represent the input text, [0023] note tokenization component 310 may use byte-pair encodings to represent the input text, [0029] note A masking neural network may include any appropriate neural network, such as one or more of the following:… bidirectional encoder representations from transformers (BERT), [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).
It would have been obvious to one of ordinary skill in the art at the effective filing date of the application to combine the DAG of Chang’693 and Chang’458 with the token masking based on bidirectional encoder representations from transformers of Eisenschlos according to known methods (i.e. masking tokens stored in the DAG). Motivation for doing so is that text may be reviewed to have proper grammar and spelling or to avoid the use of possibly offensive language (Eisenschlos, [0014]).
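For reference, the claim 1 limitations of a "single sequential query" with "positional counter values that are reset at a start position of each path" can be pictured with the short sketch below. It is one possible reading of the claim language only and is not drawn from the application or from any cited reference; the token strings, the "[MASK]" placeholder for a missing node, and the function name encode_paths are hypothetical.

```python
from typing import List, Tuple


def encode_paths(paths: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Concatenate path token sequences and give each token a per-path position."""
    tokens: List[str] = []
    positions: List[int] = []
    for path in paths:
        # enumerate() restarts at 0 for every path, so the counter resets
        # at the start position of each path in the combined sequence.
        for offset, token in enumerate(path):
            tokens.append(token)
            positions.append(offset)
    return tokens, positions


# ASSUMED example: two path queries (node, edge, node) sharing one missing leaf.
example_paths = [["e1", "r1", "[MASK]"], ["e2", "r2", "[MASK]"]]

tokens, positions = encode_paths(example_paths)
print(tokens)
print(positions)
```

In this sketch the token list and the position list have the same length, and the two would be supplied together as input to an encoder.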

Claim 2: Chang’693, Chang’458 and Eisenschlos teach the method of claim 1, wherein the receiving the set of m x n path queries representing the query graph includes: receiving the query graph, and converting the query graph into a sequential format by decomposing the graph query into the set of m x n path queries (Chang’693, [0044] note a DAG-PATH table used to enumerate all possible paths from the root node to each node in the DAG, [0073] note Given a knowledge DAG with nodes (1), (2), (3), (4), (5), (6), connected as shown, the DAG Path table is as shown in FIG. 3A. As each node is added, the DAG Path Table lists all possible paths to each node (indicated as a Leaf node using LeafID). E.g., for Node (6), all possible paths are enumerated: (1) (2) (4) (6) and a second path: (1) (5) (4) (6)).

Claim 4: Chang’693, Chang’458 and Eisenschlos teach the method of claim 1, wherein each of the plurality of nodes represents one of an entity on the query graph, an existentially quantified entity variable or a free entity variable, and wherein each edge represents a relation type (Chang’693, [0033] note machine knowledge (e.g., multiple thousands of related knowledge entities/interrelationships); i.e. the examiner interprets the DAG as storing entities as vertices and relationships as edges connecting the vertices).

Claim 5: Chang’693, Chang’458 and Eisenschlos teach the method of claim 4, further including masking each free variable entity (Eisenschlos, [Fig. 1], [Fig. 3], [0021] note FIG. 3 is an example system 300 for modifying input text to generate modified text by masking words or tokens and selecting replacement words or tokens, [0022] note Tokenization component 310 receives as input the input text to be modified and generates an input sequence of tokens to represent the input text, [0023] note tokenization component 310 may use byte-pair encodings to represent the input text, [0029] note A masking neural network may include any appropriate neural network, such as one or more of the following:… bidirectional encoder representations from transformers (BERT)).

Claim 6: Chang’693, Chang’458 and Eisenschlos teach the method of claim 4, wherein the query graph includes at least two free variable entities (Eisenschlos, [Fig. 1], [Fig. 3], [0021] note FIG. 3 is an example system 300 for modifying input text to generate modified text by masking words or tokens and selecting replacement words or tokens, [0022] note Tokenization component 310 receives as input the input text to be modified and generates an input sequence of tokens to represent the input text, [0023] note tokenization component 310 may use byte-pair encodings to represent the input text, [0029] note A masking neural network may include any appropriate neural network, such as one or more of the following:… bidirectional encoder representations from transformers (BERT)).

Claim 7: Chang’693, Chang’458 and Eisenschlos teach the method of claim 1, wherein the encoding positions of each node and each edge between nodes within each of the m x n path queries independently includes mapping each of the m x n path queries to a sequence of tokens, and encoding positions of each token within each sequence of tokens (Chang’693, [0034] note One or more paths in the graph are each encoded using at least three tokens, where the at least three tokens indicate nodes of the each respective path (e.g., the tokens can be node identifiers or edge identifiers, or a combination thereof)).

Claim 8: Chang’693, Chang’458 and Eisenschlos teach the method of claim 1, wherein the method further includes masking each token representing one of the one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges to produce a masked sequential query (Eisenschlos, [0029] note A masking neural network may include any appropriate neural network, such as one or more of the following:… bidirectional encoder representations from transformers (BERT), [0051] note During training, the training items may be iteratively processed by system 400, [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).

Claim 9: Chang’693, Chang’458 and Eisenschlos teach the method of claim 8, wherein the feeding the single sequential query to the transformer encoder includes feeding the masked sequential query to the transformer encoder, wherein the transformer encoder is trained at the location of each of the masked tokens in the masked sequential query using a categorical cross-entropy loss (Eisenschlos, [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).
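As a sketch of what a categorical cross-entropy loss evaluated only at masked token locations computes, consider the following. It is an illustration of the claim language under stated assumptions and is not code from Eisenschlos; the probability values, the target indices, and the function name masked_cross_entropy are hypothetical.

```python
import math
from typing import List, Sequence


def masked_cross_entropy(
    probs: Sequence[Sequence[float]],
    targets: Sequence[int],
    mask: Sequence[bool],
) -> float:
    """Mean negative log-probability of the target class, over masked positions only."""
    losses: List[float] = [
        -math.log(dist[target])
        for dist, target, is_masked in zip(probs, targets, mask)
        if is_masked
    ]
    if not losses:
        raise ValueError("at least one position must be masked")
    return sum(losses) / len(losses)


# ASSUMED toy values: three positions, two classes, last two positions masked.
toy_probs = [[0.9, 0.1], [0.25, 0.75], [0.5, 0.5]]
toy_targets = [0, 1, 0]
toy_mask = [False, True, True]

loss = masked_cross_entropy(toy_probs, toy_targets, toy_mask)
print(loss)
```

The unmasked first position contributes nothing to the result; only the predicted distributions at the two masked positions enter the average.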

Claim 10: Chang’693, Chang’458 and Eisenschlos teach the method of claim 8, wherein two or more of the masked tokens represent the same node or edge, and wherein the method further includes averaging output probability distributions of the two or more masked tokens (Eisenschlos, [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).
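The claim 10 limitation of "averaging output probability distributions of the two or more masked tokens" can be pictured as an element-wise mean, as in the sketch below. This illustrates the claim language only and is not drawn from Eisenschlos; the distributions and the function name average_distributions are hypothetical.

```python
from typing import List, Sequence


def average_distributions(dists: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized probability distributions."""
    if not dists:
        raise ValueError("need at least one distribution")
    count = len(dists)
    # zip(*dists) groups the probabilities assigned to the same candidate.
    return [sum(column) / count for column in zip(*dists)]


# ASSUMED outputs at two masked positions that stand for the same node.
dist_first_path = [0.5, 0.5]
dist_second_path = [0.25, 0.75]

averaged = average_distributions([dist_first_path, dist_second_path])
print(averaged)
```

Because each input sums to one, the element-wise mean also sums to one and can be treated as a single prediction for the shared node.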

Claim 11: Chang’693 teaches a system for encoding a query graph, the query graph having a plurality of nodes including m root nodes and n leaf nodes, wherein m and n are integer values, the system comprising one or more processors which, alone or in combination, are configured to provide for execution of a method comprising:
receiving a set of m x n path queries representing a query graph, wherein the query graph is a connected directed acyclic graph (DAG), and wherein each of the m x n path queries begins with a root node and ends with a leaf node (Chang’693, [0041] note encoding path information of the generalized DAG in entries of a path table in the relational database, where the entries of the path table correspond to paths in the generalized DAG from nodes of the generalized DAG to a root node of the generalized DAG, [0047] note Find all paths from any Node to the Root node, [0048] note Find all paths having a specific Node as a leaf, [Fig. 3A], [0073] note the DAG Path Table lists all possible paths to each node);
converting the DAG into a set of sequences corresponding to the set of m x n path queries, wherein each sequence in the set of sequences represents one of the path queries in the set of m x n path queries (Chang’693, [Fig. 4] note 410, [0194] note Path information can be converted at 410 into text strings in accordance with an entry format, [0036] note The interface 190 can be configured to trigger the text indexing engine 160 to generate the lexical index 162 of the encoded path information 122, where the lexical index 162 separately lists tokens of paths in the encoded path information 122); and
encoding the set of sequences as a single sequential query, wherein positions of each node and each edge between nodes within each of the m x n path queries are encoded independently, wherein the encoded positions include a positional order within each of the m x n path queries (Chang’693, [Fig. 4] note 420, [0194] note The path table can be compressed at 420 by referencing a sub-path within an entry using a single token, [0036] note The lexical index 162 can include a B-tree index, and the encoded path information 122 can be a compressed path table in which a sub-path within an entry is referenced using a single token).
Chang’693 does not explicitly teach wherein the query graph includes one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges; wherein the encoded positions include positional counter values that are reset at a start position of each path of the m x n path queries in the single sequential query; and feeding the single sequential query to a transformer encoder.
However, Chang’458 teaches wherein the encoded positions include positional counter values that are reset at a start position of each path of the m x n path queries in the single sequential query (Chang’458, [0069] note a current sequence corresponding to the current node to be analyzed is determined, where the current sequence includes a plurality of nodes within a predetermined range reachable from the current node through edges, and positional encoding of each of the plurality of nodes relative to the current node in the dynamic interaction graph, [0071] note The current node can be used as the root node. The nodes that are sequentially reachable from the edges of the root node are obtained, and the positional encodings of these nodes relative to the root node are determined, [0076] note For each node within the predetermined range that is obtained through traversal, the positional encoding of the node relative to the root node is determined. In an implementation, the positional encoding can include the quantity of edges that the node passes relative to the root node, or can be referred to as an order; i.e. the examiner interprets that positional encodings of nodes relative to a current node used as a root node read on positional counter values that are reset at a start position of each path).
It would have been obvious to one of ordinary skill in the art at the effective filing date of the application to combine the DAG of Chang’693 with the positional encodings relative to a root node of Chang’458 according to known methods (i.e. determining positional encodings relative to a current node used as a root node). Motivation for doing so is that this provides an improved solution to analyze objects more effectively to obtain feature vectors suitable for subsequent analysis (Chang’458, [0004]).
Chang’693 and Chang’458 do not explicitly teach wherein the query graph includes one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges; and feeding the single sequential query to a transformer encoder.
However, Eisenschlos teaches this (Eisenschlos, [Fig. 1], [Fig. 3], [0021] note FIG. 3 is an example system 300 for modifying input text to generate modified text by masking words or tokens and selecting replacement words or tokens, [0022] note Tokenization component 310 receives as input the input text to be modified and generates an input sequence of tokens to represent the input text, [0023] note tokenization component 310 may use byte-pair encodings to represent the input text, [0029] note A masking neural network may include any appropriate neural network, such as one or more of the following:… bidirectional encoder representations from transformers (BERT), [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).
It would have been obvious to one of ordinary skill in the art at the effective filing date of the application to combine the DAG of Chang’693 and Chang’458 with the token masking based on bidirectional encoder representations from transformers of Eisenschlos according to known methods (i.e. masking tokens stored in the DAG). Motivation for doing so is that text may be reviewed to have proper grammar and spelling or to avoid the use of possibly offensive language (Eisenschlos, [0014]).

Claim 12: Chang’693, Chang’458 and Eisenschlos teach the system of claim 11, wherein the receiving the set of m x n path queries representing the query graph includes receiving the query graph, and converting the query graph into a sequential format by decomposing the graph query into the set of m x n path queries (Chang’693, [0044] note a DAG-PATH table used to enumerate all possible paths from the root node to each node in the DAG, [0073] note Given a knowledge DAG with nodes (1), (2), (3), (4), (5), (6), connected as shown, the DAG Path table is as shown in FIG. 3A. As each node is added, the DAG Path Table lists all possible paths to each node (indicated as a Leaf node using LeafID). E.g., for Node (6), all possible paths are enumerated: (1) (2) (4) (6) and a second path: (1) (5) (4) (6)).

Claim 13: Chang’693, Chang’458 and Eisenschlos teach the system of claim 11, wherein the method further includes masking each token representing the one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges to produce a masked sequential query (Eisenschlos, [0029] note A masking neural network may include any appropriate neural network, such as one or more of the following:… bidirectional encoder representations from transformers (BERT), [0051] note During training, the training items may be iteratively processed by system 400, [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).

Claim 14: Chang’693, Chang’458 and Eisenschlos teach the system of claim 13, wherein the feeding the single sequential query to the transformer encoder includes feeding the masked sequential query to the transformer encoder, wherein the transformer encoder is trained at the location of each of the masked tokens in the masked sequential query using a categorical cross-entropy loss (Eisenschlos, [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).

Claim 15: Chang’693 teaches a tangible, non-transitory computer-readable medium having instructions thereon which, upon being executed by one or more processors, alone or in combination, provide for execution of a method of encoding a query graph into a sequence representation, the query graph having a plurality of nodes including m root nodes and n leaf nodes, wherein m and n are integer values, the method comprising the steps of:
receiving a set of m x n path queries representing a query graph, wherein the query graph is a connected directed acyclic graph (DAG), and wherein each of the m x n path queries begins with a root node and ends with a leaf node (Chang’693, [0041] note encoding path information of the generalized DAG in entries of a path table in the relational database, where the entries of the path table correspond to paths in the generalized DAG from nodes of the generalized DAG to a root node of the generalized DAG, [0047] note Find all paths from any Node to the Root node, [0048] note Find all paths having a specific Node as a leaf, [Fig. 3A], [0073] note the DAG Path Table lists all possible paths to each node);
converting the DAG into a set of sequences corresponding to the set of m x n path queries, wherein each sequence in the set of sequences represents one of the path queries in the set of m x n path queries (Chang’693, [Fig. 4] note 410, [0194] note Path information can be converted at 410 into text strings in accordance with an entry format, [0036] note The interface 190 can be configured to trigger the text indexing engine 160 to generate the lexical index 162 of the encoded path information 122, where the lexical index 162 separately lists tokens of paths in the encoded path information 122); and
encoding the set of sequences as a single sequential query, wherein positions of each node and each edge between nodes within each of the m x n path queries are encoded independently, wherein the encoded positions include a positional order within each of the m x n path queries (Chang’693, [Fig. 4] note 420, [0194] note The path table can be compressed at 420 by referencing a sub-path within an entry using a single token, [0036] note The lexical index 162 can include a B-tree index, and the encoded path information 122 can be a compressed path table in which a sub-path within an entry is referenced using a single token).
Chang’693 does not explicitly teach wherein the query graph includes one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges; wherein the encoded positions include positional counter values that are reset at a start position of each path of the m x n path queries in the single sequential query; and feeding the single sequential query to a transformer encoder.
However, Chang’458 teaches wherein the encoded positions include positional counter values that are reset at a start position of each path of the m x n path queries in the single sequential query (Chang’458, [0069] note a current sequence corresponding to the current node to be analyzed is determined, where the current sequence includes a plurality of nodes within a predetermined range reachable from the current node through edges, and positional encoding of each of the plurality of nodes relative to the current node in the dynamic interaction graph, [0071] note The current node can be used as the root node. The nodes that are sequentially reachable from the edges of the root node are obtained, and the positional encodings of these nodes relative to the root node are determined, [0076] note For each node within the predetermined range that is obtained through traversal, the positional encoding of the node relative to the root node is determined. In an implementation, the positional encoding can include the quantity of edges that the node passes relative to the root node, or can be referred to as an order; i.e. the examiner interprets that positional encodings of nodes relative to a current node used as a root node read on positional counter values that are reset at a start position of each path).
It would have been obvious to one of ordinary skill in the art at the effective filing date of the application to combine the DAG of Chang’693 with the positional encodings relative to a root node of Chang’458 according to known methods (i.e. determining positional encodings relative to a current node used as a root node). Motivation for doing so is that this provides an improved solution to analyze objects more effectively to obtain feature vectors suitable for subsequent analysis (Chang’458, [0004]).
Chang’693 and Chang’458 do not explicitly teach wherein the query graph includes one or more missing nodes or one or more missing edges or both one or more missing nodes and one or more missing edges; and feeding the single sequential query to a transformer encoder.
However, Eisenschlos teaches this (Eisenschlos, [Fig. 1], [Fig. 3], [0021] note FIG. 3 is an example system 300 for modifying input text to generate modified text by masking words or tokens and selecting replacement words or tokens, [0022] note Tokenization component 310 receives as input the input text to be modified and generates an input sequence of tokens to represent the input text, [0023] note tokenization component 310 may use byte-pair encodings to represent the input text, [0029] note A masking neural network may include any appropriate neural network, such as one or more of the following:… bidirectional encoder representations from transformers (BERT), [0053] note The generation loss function 𝓛G (Θ) corresponds to the cross-entropy loss of the log probability of regenerating or reconstructing the input sequence of tokens from the masked sequence of tokens given the masked sequence of tokens, the known attribute value of the training sample (indicated as y), and the model parameters. The generation loss function may be computed from the training data as a sample-based expected value).
It would have been obvious to one of ordinary skill in the art at the effective filing date of the application to combine the DAG of Chang’693 and Chang’458 with the token masking based on bidirectional encoder representations from transformers of Eisenschlos according to known methods (i.e. masking tokens stored in the DAG). Motivation for doing so is that text may be reviewed to have proper grammar and spelling or to avoid the use of possibly offensive language (Eisenschlos, [0014]).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Giuseppi Giuliani whose telephone number is (571)270-7128. The examiner can normally be reached Monday-Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner, can be reached on (571)270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GIUSEPPI GIULIANI/Primary Examiner, Art Unit 2165