Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on September 3, 2021 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 4, 8, 9, 12, and 16-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Shen (U.S. Patent No. 10387575).
Regarding claim 1, Shen discloses a method for semantic recognition, comprising:
in response to performing semantic analysis on information acquired by a terminal, acquiring a sentence to be processed ([Col 1, Rows 38-40] – One method used in natural language processing is semantic parsing. This extracts the semantic meaning of various words within a sentence);
performing word recognition on the sentence to be processed, to obtain a plurality of words and part-of-speech information corresponding to each of the plurality of words ([Col 1, Rows 64-67] – The semantic graph comprises: a plurality of nodes, each node representing a span of one or more words taken from the initial set of words and having a corresponding shared semantic role within the initial set of words);
determining, with a pre-trained word processing model, a target set update operation corresponding to a set of words to be processed from a plurality of preset set update operations, according to one or more words to be input, part-of-speech information of the words to be input, and a dependency relationship of a first word ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree));
wherein the set of words to be processed is a set of words to be processed currently in the plurality of words; the words to be input include a word to be processed in the set of words to be processed, the first word and a second word; the first word is a word that has been determined to have a dependency relationship with the word to be processed in the plurality of words; the second word includes a preset number of words following the word to be processed in the plurality of words ([Col 3, Rows 25-34] – The spans for each determined combination may be combined to reproduce the syntactic dependency of the equivalent spans within the initial set of words. Having said this, whilst this method would form syntactically correct inferred clauses, alternative methods exist where the combinations do not reproduce exactly the initial dependencies. Instead, the combinations may be formed through the use of a language model that is designed to ensure that the inferred clauses are syntactically correct);
wherein the word processing model is configured to: obtain a word feature vector of the words to be input, a part-of-speech feature vector of the part-of-speech information and a relationship feature vector of the dependency relationship of the first word; calculate, with a preset activation function, a first feature vector according to the word feature vector, the part-of-speech feature vector and the relationship feature vector; calculate a second feature vector according to the word feature vector, the part-of-speech feature vector and the relationship feature vector; and calculate confidence levels of the plurality of preset set update operations according to the first feature vector and the second feature vector ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree). [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data);
performing a dependency relationship determination step cyclically according to the target set update operation, until obtaining a dependency parsing result of the sentence to be processed, wherein the dependency parsing result represents a dependency relationship among the plurality of words ([Col 17, Rows 4-10] – The relative semantic relationships between these groups are determined. That is, the words are parsed to determine the semantic relationship between the words in the input. A semantic graph is formed, with each node representing a group of one or more words having a particular semantic role, and edges representing semantic relationships between nodes);
and performing the semantic recognition on the sentence to be processed according to the dependency parsing result ([Col 17, Rows 4-10] – The relative semantic relationships between these groups are determined. That is, the words are parsed to determine the semantic relationship between the words in the input. A semantic graph is formed, with each node representing a group of one or more words having a particular semantic role, and edges representing semantic relationships between nodes).
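For illustration only, the model limitation recited in claim 1 (three feature vectors, a first feature vector computed with a preset activation function, a second feature vector, and confidence levels over the preset set update operations) can be sketched as follows. This is a minimal sketch and is not drawn from Shen or from the application: the tanh activation, the linear map used for the second feature vector, the softmax normalization, the operation names, and every weight value are assumptions chosen only to make the example runnable.

```python
import math

# Hypothetical names for the preset set update operations (assumption).
OPERATIONS = ["shift", "left_arc", "right_arc"]

def linear(x, weights, bias):
    """Affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(scores):
    """Normalize raw scores into confidence levels that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_levels(word_vec, pos_vec, rel_vec, params):
    # Splice the word, part-of-speech, and relationship feature vectors.
    x = word_vec + pos_vec + rel_vec
    # First feature vector: affine map plus a preset activation (tanh assumed).
    first = [math.tanh(v) for v in linear(x, params["W1"], params["b1"])]
    # Second feature vector: a plain linear map here (assumption).
    second = linear(x, params["W2"], params["b2"])
    # Confidence levels computed from both feature vectors.
    return softmax(linear(first + second, params["Wo"], params["bo"]))

# Toy weights, invented for the example only.
DEMO_PARAMS = {
    "W1": [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.2, 0.1]], "b1": [0.0, 0.1],
    "W2": [[0.3, 0.0, -0.2, 0.1], [0.2, 0.2, 0.0, 0.0]], "b2": [0.0, 0.0],
    "Wo": [[0.5, -0.3, 0.2, 0.1], [0.1, 0.4, -0.2, 0.3], [-0.2, 0.1, 0.3, 0.0]],
    "bo": [0.0, 0.0, 0.0],
}

demo_confidences = confidence_levels([0.5, -0.2], [1.0], [0.0], DEMO_PARAMS)
```

The sketch returns one confidence level per preset operation; how those levels are used to pick a target operation is a separate step.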
Regarding claim 4, Shen discloses the method, wherein the dependency relationship determination step comprises:
updating the set of words to be processed according to the target set update operation, to obtain an updated set of words; and determining a dependency relationship of the word to be processed in the set of words to be processed according to the target set update operation ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree). [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data);
re-determining the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word according to the updated set of words ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree). [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data);
determining, with the word processing model, a set update operation corresponding to the updated set of words from the plurality of preset set update operations, according to the redetermined words to be input, the redetermined part-of-speech information of the words to be input, and the redetermined dependency relationship of the first word ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree). [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data);
taking the updated set of words as a new set of words to be processed, and taking the obtained set update operation corresponding to the updated set of words as a new target set update operation ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree). [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data).
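For illustration only, the cyclic dependency relationship determination step recited in claim 4 resembles a transition-based parsing loop: apply the target set update operation, record any dependency it determines, then select the next operation from the updated set. The sketch below assumes an arc-standard transition system with three operations (shift, left_arc, right_arc) and a caller-supplied `choose_operation` function standing in for the word processing model; these names and the scripted operation sequence are hypothetical and are taken neither from Shen nor from the application.

```python
def parse(words, choose_operation):
    """Run set update operations until a dependency parse is obtained."""
    stack = []                         # set of words currently being processed
    buffer = list(range(len(words)))   # indices of words not yet processed
    arcs = []                          # determined dependencies as (head, dependent)
    while buffer or len(stack) > 1:
        # The chooser sees the updated state on every cycle.
        op = choose_operation(stack, buffer, arcs)
        if op == "shift" and buffer:
            stack.append(buffer.pop(0))
        elif op == "left_arc" and len(stack) >= 2:
            dependent = stack.pop(-2)          # second-from-top depends on top
            arcs.append((stack[-1], dependent))
        elif op == "right_arc" and len(stack) >= 2:
            dependent = stack.pop()            # top depends on the word below it
            arcs.append((stack[-1], dependent))
        else:
            raise ValueError(f"operation {op!r} is not valid in this state")
    return arcs

# Scripted chooser standing in for a trained model (assumption).
script = iter(["shift", "shift", "left_arc", "shift", "right_arc"])
demo_arcs = parse(["she", "reads", "books"], lambda stack, buffer, arcs: next(script))
```

In this toy run, word 1 ("reads") ends up as the head of both word 0 ("she") and word 2 ("books").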
Regarding claim 8, Shen discloses the method, wherein performing the word recognition on the sentence to be processed to obtain the plurality of words and the part-of-speech information corresponding to each of the plurality of words ([Col 20, Rows 20-24] – Segmenting the user utterance into simpler, self-contained structures like inferred clauses can help with the extraction relevant information from each, as well as “mix and match” information extraction and answer matching for each of the clauses), comprises:
performing word segmentation on the sentence to be processed to obtain a plurality of words to be recognized and part-of-speech information of each of the plurality of words to be recognized ([Col 20, Rows 20-24] – Segmenting the user utterance into simpler, self-contained structures like inferred clauses can help with the extraction relevant information from each, as well as “mix and match” information extraction and answer matching for each of the clauses);
matching the plurality of words to be recognized with entity words in a preset word database ([Col 20, Rows 20-24] – Segmenting the user utterance into simpler, self-contained structures like inferred clauses can help with the extraction relevant information from each, as well as “mix and match” information extraction and answer matching for each of the clauses);
performing word fusion on the words to be recognized according to the matched entity words and part-of-speech information of the words to be recognized, to obtain the plurality of words and part-of-speech information corresponding to the each of the words ([Col 20, Rows 20-24] – Segmenting the user utterance into simpler, self-contained structures like inferred clauses can help with the extraction relevant information from each, as well as “mix and match” information extraction and answer matching for each of the clauses).
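For illustration only, the matching and word fusion steps recited in claim 8 can be sketched as a greedy longest-match merge of adjacent segmented words against a preset word database. The function name `fuse_entities`, the `ENTITY` part-of-speech label assigned to fused words, the maximum span length, and the sample data are all assumptions made for the example; the segmentation and tagging step is assumed to have already produced the `(word, part_of_speech)` pairs.

```python
def fuse_entities(tagged, entity_db, max_len=4, fused_pos="ENTITY"):
    """Merge adjacent words that together match an entity in the database."""
    result = []
    i = 0
    while i < len(tagged):
        # Try the longest candidate span first, down to two words.
        for n in range(min(max_len, len(tagged) - i), 1, -1):
            candidate = " ".join(word for word, _ in tagged[i:i + n])
            if candidate in entity_db:
                result.append((candidate, fused_pos))
                i += n
                break
        else:
            # No multi-word entity starts here; keep the word and its tag.
            result.append(tagged[i])
            i += 1
    return result

demo_tagged = [("play", "VERB"), ("yellow", "ADJ"), ("submarine", "NOUN"), ("now", "ADV")]
demo_fused = fuse_entities(demo_tagged, {"yellow submarine"})
```

Here "yellow" and "submarine" are fused into a single word because the pair matches an entry in the database, while the other words keep their original tags.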
Regarding claim 9, Shen discloses an electronic device, comprising:
a processor ([Col 4, Row 54] – processor);
memory storing a computer program executable by the processor ([Col 4, Row 54] – memory);
wherein the processor is configured to:
in response to performing semantic analysis on information acquired by a terminal, acquire a sentence to be processed ([Col 1, Rows 38-40] – One method used in natural language processing is semantic parsing. This extracts the semantic meaning of various words within a sentence);
perform word recognition on the sentence to be processed, to obtain a plurality of words and part-of-speech information corresponding to each of the plurality of words ([Col 1, Rows 64-67] – The semantic graph comprises: a plurality of nodes, each node representing a span of one or more words taken from the initial set of words and having a corresponding shared semantic role within the initial set of words);
determine, with a pre-trained word processing model, a target set update operation corresponding to a set of words to be processed from a plurality of preset set update operations, according to one or more words to be input, part-of-speech information of the words to be input, and a dependency relationship of a first word ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree));
wherein the set of words to be processed is a set of words to be processed currently in the plurality of words; the words to be input include a word to be processed in the set of words to be processed, the first word and a second word; the first word is a word that has been determined to have a dependency relationship with the word to be processed in the plurality of words; the second word includes a preset number of words following the word to be processed in the plurality of words ([Col 3, Rows 25-34] – The spans for each determined combination may be combined to reproduce the syntactic dependency of the equivalent spans within the initial set of words. Having said this, whilst this method would form syntactically correct inferred clauses, alternative methods exist where the combinations do not reproduce exactly the initial dependencies. Instead, the combinations may be formed through the use of a language model that is designed to ensure that the inferred clauses are syntactically correct);
wherein the word processing model is configured to: obtain a word feature vector of the words to be input, a part-of-speech feature vector of the part-of-speech information and a relationship feature vector of the dependency relationship of the first word; calculate, with a preset activation function, a first feature vector according to the word feature vector, the part-of-speech feature vector and the relationship feature vector; calculate a second feature vector according to the word feature vector, the part-of-speech feature vector and the relationship feature vector; and calculate confidence levels of the plurality of preset set update operations according to the first feature vector and the second feature vector ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree). [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data);
perform a dependency relationship determination step cyclically according to the target set update operation, until obtaining a dependency parsing result of the sentence to be processed, wherein the dependency parsing result represents a dependency relationship among the plurality of words ([Col 17, Rows 4-10] – The relative semantic relationships between these groups are determined. That is, the words are parsed to determine the semantic relationship between the words in the input. A semantic graph is formed, with each node representing a group of one or more words having a particular semantic role, and edges representing semantic relationships between nodes);
and perform the semantic recognition on the sentence to be processed according to the dependency parsing result ([Col 17, Rows 4-10] – The relative semantic relationships between these groups are determined. That is, the words are parsed to determine the semantic relationship between the words in the input. A semantic graph is formed, with each node representing a group of one or more words having a particular semantic role, and edges representing semantic relationships between nodes).
Dependent claims 12 and 16 are analogous in scope to claims 4 and 8, and are rejected according to the same reasoning.
Regarding claim 17, Shen discloses the electronic device, further comprising a display screen configured to display a result of the semantic recognition ([Col 6, Rows 29-30] – retrieve and display their medical records (information retrieval)), wherein
the processor is configured to input one or more words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the word processing model to obtain the target set update operation, and determine the dependency parsing result according to the target set update operation (Figure 7 - Processor 501, Semantic parsing module 513; [Col 3, Rows 25-34] – The spans for each determined combination may be combined to reproduce the syntactic dependency of the equivalent spans within the initial set of words. Having said this, whilst this method would form syntactically correct inferred clauses, alternative methods exist where the combinations do not reproduce exactly the initial dependencies. Instead, the combinations may be formed through the use of a language model that is designed to ensure that the inferred clauses are syntactically correct);
the word processing model combines the features corresponding to the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word, without a need of manually constructing the feature templates ([Col 3, Rows 25-27] - The spans for each determined combination may be equivalent spans within the initial set of words. [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data);
the dependency relationship between a word second-place to the word to be processed and the word to be processed is taken into account in the word processing model, to thereby select the target set update operation accurately and improve accuracy of the dependency parsing result and accuracy of the semantic recognition ([Col 7, Rows 5-7] – The present embodiments are primarily concerned with improving input recognition accuracy and computational efficiency).
Regarding claim 18, Shen discloses a non-transitory computer storage medium having stored thereon computer-executable instructions that, when executed by a processor, cause operations of a method to be performed ([Col 28, Rows 12-14] – encoded on computer storage medium for execution by, or to control the operation of, data processing apparatus), the method comprising:
in response to performing semantic analysis on information acquired by a terminal, acquiring a sentence to be processed ([Col 1, Rows 38-40] – One method used in natural language processing is semantic parsing. This extracts the semantic meaning of various words within a sentence);
performing word recognition on the sentence to be processed, to obtain a plurality of words and part-of-speech information corresponding to each of the plurality of words ([Col 1, Rows 64-67] – The semantic graph comprises: a plurality of nodes, each node representing a span of one or more words taken from the initial set of words and having a corresponding shared semantic role within the initial set of words);
determining, with a pre-trained word processing model, a target set update operation corresponding to a set of words to be processed from a plurality of preset set update operations, according to one or more words to be input, part-of-speech information of the words to be input, and a dependency relationship of a first word ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree));
wherein the set of words to be processed is a set of words to be processed currently in the plurality of words; the words to be input include a word to be processed in the set of words to be processed, the first word and a second word; the first word is a word that has been determined to have a dependency relationship with the word to be processed in the plurality of words; the second word includes a preset number of words following the word to be processed in the plurality of words ([Col 3, Rows 25-34] – The spans for each determined combination may be combined to reproduce the syntactic dependency of the equivalent spans within the initial set of words. Having said this, whilst this method would form syntactically correct inferred clauses, alternative methods exist where the combinations do not reproduce exactly the initial dependencies. Instead, the combinations may be formed through the use of a language model that is designed to ensure that the inferred clauses are syntactically correct);
wherein the word processing model is configured to: obtain a word feature vector of the words to be input, a part-of-speech feature vector of the part-of-speech information and a relationship feature vector of the dependency relationship of the first word; calculate, with a preset activation function, a first feature vector according to the word feature vector, the part-of-speech feature vector and the relationship feature vector; calculate a second feature vector according to the word feature vector, the part-of-speech feature vector and the relationship feature vector; and calculate confidence levels of the plurality of preset set update operations according to the first feature vector and the second feature vector ([Col 16, Rows 41-51] – The new semantic roles may be detected by training a machine learning classifier to identify these new roles, e.g., using supervised learning. Accordingly, one or more classifiers may be trained to identify auxiliary verbs, conjunctions, verb chains, and argument modifiers and to cluster the text accordingly and generate the graph according to the nodes and their interrelationship. Alternatively, the extension to the semantic rules may be implemented by adapting semantic tags generated by traditional SRL to extend the semantic graph (e.g., based on additional information provided by a syntactic dependency tree). [Col 18, Rows 20-25] – In order to more effectively determine the meaning behind the input text, the dialogue system may embed the input text to generate a vector representation for the input text. The vector representations can be generated based on machine learning models that have been trained on training data);
performing a dependency relationship determination step cyclically according to the target set update operation, until obtaining a dependency parsing result of the sentence to be processed, wherein the dependency parsing result represents a dependency relationship among the plurality of words ([Col 17, Rows 4-10] – The relative semantic relationships between these groups are determined. That is, the words are parsed to determine the semantic relationship between the words in the input. A semantic graph is formed, with each node representing a group of one or more words having a particular semantic role, and edges representing semantic relationships between nodes);
and performing the semantic recognition on the sentence to be processed according to the dependency parsing result ([Col 17, Rows 4-10] – The relative semantic relationships between these groups are determined. That is, the words are parsed to determine the semantic relationship between the words in the input. A semantic graph is formed, with each node representing a group of one or more words having a particular semantic role, and edges representing semantic relationships between nodes).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 10, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. Patent No. 10387575) in view of Paulik (U.S. Patent No. 9972304).
Regarding claim 2, Shen discloses all of the limitations as in claim 1, above.
However, Shen does not disclose the method, wherein determining, with a pre-trained word processing model, the target set update operation corresponding to the set of words to be processed from the plurality of preset set update operations, according to the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word, comprises:
inputting the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the word processing model, to obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations;
taking a preset set update operation with a highest one of the confidence levels as the target set update operation.
Paulik does teach the method, wherein determining, with a pre-trained word processing model, the target set update operation corresponding to the set of words to be processed from the plurality of preset set update operations, according to the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word, comprises:
inputting the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the word processing model, to obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations ([Col 3, Rows 41-43] – The plurality of accuracy scores are confidence scores indicating the likelihood of the speech recognition result given the respective user speech sample. [Col 34, Rows 33-37] – the front-end speech pre-processor can perform a Fourier transform on the speech input to extract spectral features that characterize the speech input as a sequence of representative multi-dimensional vectors);
taking a preset set update operation with a highest one of the confidence levels as the target set update operation ([Col 38, Rows 29-31] – the domain having the highest confidence value (e.g., based on the relative importance of its various triggered nodes) can be selected).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shen to incorporate the teachings of Paulik in order to implement the method, wherein determining, with a pre-trained word processing model, the target set update operation corresponding to the set of words to be processed from the plurality of preset set update operations, according to the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word, comprises: inputting the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the word processing model, to obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations; taking a preset set update operation with a highest one of the confidence levels as the target set update operation. Doing so allows the privacy of the user to be preserved while allowing the system to determine whether a second personalized speech recognition system should be activated (Paulik, [Col 3, Rows 45-50]).
Dependent claims 10 and 19 are analogous in scope to claim 2, and are rejected according to the same reasoning.
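For illustration only, the selection step recited in claim 2 (taking the preset set update operation with the highest confidence level as the target set update operation) can be sketched as follows. The function name and the operation labels are hypothetical and do not appear in the application or in the cited references.

```python
def select_target_operation(confidences):
    """Return the preset set update operation with the highest confidence.

    `confidences` maps each operation label (hypothetical names) to the
    confidence level produced by the word processing model.
    """
    if not confidences:
        raise ValueError("no confidence levels supplied")
    # max() over the keys, ranked by their associated confidence value
    return max(confidences, key=confidences.get)


# Example: three preset operations with model-assigned confidences
example = {"shift": 0.2, "first_update": 0.7, "second_update": 0.1}
target = select_target_operation(example)
```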
Claims 3, 11, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. Patent No. 10387575) in view of Paulik (U.S. Patent No. 9972304), and further in view of Ravi (U.S. Publication No. 20200042596).
Regarding claim 3, Shen in view of Paulik teaches all of the limitations as in claim 2, above.
Paulik does teach outputting, through the output layer, the confidence levels, each corresponding to a respective one of the plurality of preset set update operations, according to the first feature vector and the second feature vector ([Col 3, Rows 41-43] – The plurality of accuracy scores are confidence scores indicating the likelihood of the speech recognition result given the respective user speech sample. [Col 34, Rows 33-37] – the front-end speech pre-processor can perform a Fourier transform on the speech input to extract spectral features that characterize the speech input as a sequence of representative multi-dimensional vectors).
However, Shen in view of Paulik does not teach the method, wherein the word processing model comprises an input layer, an embedding layer, a hidden layer, a self-attention mechanism layer and an output layer;
wherein inputting the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the word processing model, to obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations, comprises:
inputting, through the input layer, the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the embedding layer;
generating, through the embedding layer, the word feature vector, the part-of-speech feature vector and the relationship feature vector according to the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word, and splicing the word feature vector, the part-of-speech feature vector and the relationship feature vector to obtain a first splicing feature vector;
inputting, through the embedding layer, the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector into both the hidden layer and the self-attention mechanism layer;
determining, through the hidden layer, a first feature vector with the preset activation function according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, and inputting the first feature vector into the output layer;
determining, through the self-attention mechanism layer, a second feature vector according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, and inputting the second feature vector into the output layer.
Ravi does teach the method, wherein the word processing model comprises an input layer, an embedding layer, a hidden layer, a self-attention mechanism layer and an output layer (Figure 1 - Projection Layer 108, Hidden Layer 114, Output Layer 120; [0138] - generative attentional neural network; [0140] – word embedding);
wherein inputting the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the word processing model, to obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations, comprises:
inputting, through the input layer, the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the embedding layer ([0087] - deep learning techniques in natural language processing whose performance depends on embeddings pre-trained on large corpora);
generating, through the embedding layer, the word feature vector, the part-of-speech feature vector and the relationship feature vector according to the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word, and splicing the word feature vector, the part-of-speech feature vector and the relationship feature vector to obtain a first splicing feature vector ([0023] - an input feature vector or sequence (e.g., in the case of recursive neural networks) and Y_i is an output (e.g., an output category for classification tasks or a predicted sequence). Typically, these networks consist of multiple layers of hidden units or neurons with connections between a pair of layers. [0033] - the SGNNs can learn compact projection vectors with locality sensitive hashing (LSH));
inputting, through the embedding layer, the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector into both the hidden layer and the self-attention mechanism layer ([0023] - an input feature vector or sequence (e.g., in the case of recursive neural networks) and Y_i is an output (e.g., an output category for classification tasks or a predicted sequence). Typically, these networks consist of multiple layers of hidden units or neurons with connections between a pair of layers);
determining, through the hidden layer, a first feature vector with the preset activation function according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, and inputting the first feature vector into the output layer ([0078] - A layer of the projection network 102 can serve as the output layer 120 if the output of such layer is included in the projection network output 106. An output layer may be a softmax layer, a projection layer, or any other appropriate neural network layer. The output layer 120 may be configured to receive as input an output generated by a projection layer or a conventional layer);
determining, through the self-attention mechanism layer, a second feature vector according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, and inputting the second feature vector into the output layer ([0078] - A layer of the projection network 102 can serve as the output layer 120 if the output of such layer is included in the projection network output 106. An output layer may be a softmax layer, a projection layer, or any other appropriate neural network layer. The output layer 120 may be configured to receive as input an output generated by a projection layer or a conventional layer).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shen in view of Paulik to incorporate the teachings of Ravi in order to implement the method, wherein the word processing model comprises an input layer, an embedding layer, a hidden layer, a self-attention mechanism layer and an output layer; wherein inputting the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the word processing model, to obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations, comprises: inputting, through the input layer, the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word into the embedding layer; generating, through the embedding layer, the word feature vector, the part-of-speech feature vector and the relationship feature vector according to the words to be input, the part-of-speech information of the words to be input, and the dependency relationship of the first word, and splicing the word feature vector, the part-of-speech feature vector and the relationship feature vector to obtain a first splicing feature vector; inputting, through the embedding layer, the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector into both the hidden layer and the self-attention mechanism layer; determining, through the hidden layer, a first feature vector with the preset activation function according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, and inputting the first feature vector into the output layer; determining, through the self-attention mechanism layer, a second feature vector according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, and inputting the second feature vector into the output layer. Doing so allows each projection layer to use a set of projection functions to project the input into a bit-space, thereby greatly reducing the dimensionality of the input and enabling computation with lower resource usage (Ravi [0022]).
Dependent claims 11 and 20 are analogous in scope to claim 3, and are rejected according to the same reasoning.
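For illustration only, the embedding and splicing step recited in claim 3 can be sketched as a table lookup followed by concatenation. The embedding tables, their dimensions, and every name below are hypothetical; the sketch assumes that "splicing" means end-to-end concatenation and is not taken from the application or from Ravi.

```python
def first_splicing_vector(words, pos_tags, relation, word_emb, pos_emb, rel_emb):
    """Build the word, part-of-speech and relationship feature vectors and
    splice (concatenate) them into a first splicing feature vector.

    The *_emb arguments are hypothetical lookup tables mapping a symbol
    to a short list of floats.
    """
    # Embedding layer: look up each input and flatten into one vector per kind
    word_vec = [v for w in words for v in word_emb[w]]
    pos_vec = [v for t in pos_tags for v in pos_emb[t]]
    rel_vec = list(rel_emb[relation])
    # Splicing assumed to be plain concatenation
    spliced = word_vec + pos_vec + rel_vec
    # All three vectors are returned because the claim routes each of them
    # to both the hidden layer and the self-attention mechanism layer
    return spliced, pos_vec, rel_vec


# Tiny hypothetical tables (2-dimensional words, 1-dimensional tags/relations)
word_emb = {"eat": [0.1, 0.2], "apples": [0.3, 0.4]}
pos_emb = {"VERB": [1.0], "NOUN": [2.0]}
rel_emb = {"obj": [5.0], "none": [0.0]}

spliced, pos_vec, rel_vec = first_splicing_vector(
    ["eat", "apples"], ["VERB", "NOUN"], "obj", word_emb, pos_emb, rel_emb
)
```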
Claims 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. Patent No. 10387575) in view of Nurvitadhi (U.S. Publication No. 20180189638).
Regarding claim 5, Shen discloses all of the limitations as in claim 1, above.
However, Shen does not disclose the method, wherein the target set update operation comprises a shift operation, a first update operation, and a second update operation;
wherein updating the set of words to be processed according to the target set update operation, to obtain the updated set of words; and determining the dependency relationship of the word to be processed from the set of words to be processed according to the target set update operation, comprises:
in response to that the target set update operation is the first update operation, shifting a first-place word in two indicated words to be processed out of the set of words to be processed, and setting a dependency relationship between the two indicated words to be processed as a first dependency relationship, wherein the first dependency relationship indicates that a second-place word in the two indicated words is a subordinate word of the first-place word;
in response to that the target set update operation is the second update operation, shifting the first-place word in the two indicated words to be processed out of the set of words to be processed, and setting the dependency relationship between the two indicated words to be processed as a second dependency relationship, wherein the second dependency relationship indicates that the first-place word is the subordinate word of the second-place word in the two indicated words;
in response to that the target set update operation is the shift operation, taking a specified word of the plurality of words as a new word to be processed in the set of words to be processed.
Nurvitadhi does teach the method, wherein the target set update operation comprises a shift operation, a first update operation, and a second update operation ([0138] - scale and update operation 1102m; [0265] - The execution units 3562 may perform various operations (e.g., shifts…));
wherein updating the set of words to be processed according to the target set update operation, to obtain the updated set of words; and determining the dependency relationship of the word to be processed from the set of words to be processed according to the target set update operation, comprises:
in response to that the target set update operation is the first update operation, shifting a first-place word in two indicated words to be processed out of the set of words to be processed, and setting a dependency relationship between the two indicated words to be processed as a first dependency relationship, wherein the first dependency relationship indicates that a second-place word in the two indicated words is a subordinate word of the first-place word (Figure 8 - FURTHER IDENTIFIES DATA DEPENDENCIES BETWEEN ONES OF THE PLURALITY OF OPERATIONS, WHEREIN THE PLURALITY OF OPERATIONS INCLUDES ONE OR MORE MATRIX OPERATIONS AND ONE OR MORE VECTOR OPERATIONS 805; [0240] - A shifter 3106 extracts the required words 3102 out of the burst 3101 and routes them to the right location in a line buffer 3107 whose size matches the rows in the vector value buffer);
in response to that the target set update operation is the second update operation, shifting the first-place word in the two indicated words to be processed out of the set of words to be processed, and setting the dependency relationship between the two indicated words to be processed as a second dependency relationship, wherein the second dependency relationship indicates that the first-place word is the subordinate word of the second-place word in the two indicated words (Figure 8 - FURTHER IDENTIFIES DATA DEPENDENCIES BETWEEN ONES OF THE PLURALITY OF OPERATIONS, WHEREIN THE PLURALITY OF OPERATIONS INCLUDES ONE OR MORE MATRIX OPERATIONS AND ONE OR MORE VECTOR OPERATIONS 805; [0240] - A shifter 3106 extracts the required words 3102 out of the burst 3101 and routes them to the right location in a line buffer 3107 whose size matches the rows in the vector value buffer);
in response to that the target set update operation is the shift operation, taking a specified word of the plurality of words as a new word to be processed in the set of words to be processed (Figure 8 - FURTHER IDENTIFIES DATA DEPENDENCIES BETWEEN ONES OF THE PLURALITY OF OPERATIONS, WHEREIN THE PLURALITY OF OPERATIONS INCLUDES ONE OR MORE MATRIX OPERATIONS AND ONE OR MORE VECTOR OPERATIONS 805; [0240] - A shifter 3106 extracts the required words 3102 out of the burst 3101 and routes them to the right location in a line buffer 3107 whose size matches the rows in the vector value buffer).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shen to incorporate the teachings of Nurvitadhi in order to implement the method, wherein the target set update operation comprises a shift operation, a first update operation, and a second update operation; wherein updating the set of words to be processed according to the target set update operation, to obtain the updated set of words; and determining the dependency relationship of the word to be processed from the set of words to be processed according to the target set update operation, comprises: in response to that the target set update operation is the first update operation, shifting a first-place word in two indicated words to be processed out of the set of words to be processed, and setting a dependency relationship between the two indicated words to be processed as a first dependency relationship, wherein the first dependency relationship indicates that a second-place word in the two indicated words is a subordinate word of the first-place word; in response to that the target set update operation is the second update operation, shifting the first-place word in the two indicated words to be processed out of the set of words to be processed, and setting the dependency relationship between the two indicated words to be processed as a second dependency relationship, wherein the second dependency relationship indicates that the first-place word is the subordinate word of the second-place word in the two indicated words; in response to that the target set update operation is the shift operation, taking a specified word of the plurality of words as a new word to be processed in the set of words to be processed. Doing so allows the reduction of the number of writes/cycles that the vector value buffer needs to support, thus reducing its size (Nurvitadhi [0240]).
Dependent claim 13 is analogous in scope to claim 5, and is rejected according to the same reasoning.
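For illustration only, the three set update operations recited in claim 5 can be sketched as operations on an ordered set of words to be processed. The sketch follows the claim language as quoted above and rests on stated assumptions: the "two indicated words" are taken to be the last two words in the set, the "specified word" is taken to be the next word from a pending buffer, and all class, field, and constant names are hypothetical.

```python
from dataclasses import dataclass, field

SHIFT, FIRST_UPDATE, SECOND_UPDATE = "shift", "first_update", "second_update"


@dataclass
class ParserState:
    pending: list                                    # words not yet in the set (assumed buffer)
    to_process: list = field(default_factory=list)   # set of words to be processed
    arcs: list = field(default_factory=list)         # (head word, subordinate word) pairs


def apply_operation(state, op):
    """Apply one target set update operation and return the updated state."""
    if op == SHIFT:
        if not state.pending:
            raise ValueError("no word left to shift in")
        # Take the specified (here: next pending) word as a new word to be processed
        state.to_process.append(state.pending.pop(0))
        return state
    if len(state.to_process) < 2:
        raise ValueError("update operations need two indicated words")
    first, second = state.to_process[-2], state.to_process[-1]
    if op == FIRST_UPDATE:
        state.arcs.append((first, second))   # second-place word is subordinate
    elif op == SECOND_UPDATE:
        state.arcs.append((second, first))   # first-place word is subordinate
    else:
        raise ValueError(f"unknown operation: {op}")
    # As quoted above, both update operations shift the first-place word out
    del state.to_process[-2]
    return state


state = ParserState(pending=["I", "eat", "apples"])
for op in (SHIFT, SHIFT, SECOND_UPDATE, SHIFT, FIRST_UPDATE):
    apply_operation(state, op)
```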
Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. Patent No. 10387575) in view of Chen (U.S. Publication No. 20200160836).
Regarding claim 6, Shen discloses all of the limitations as in claim 1, above.
However, Shen does not disclose the method, wherein the preset activation function comprises:

[Equation image: media_image1.png]

where h is the first feature vector, w is a weight matrix corresponding to the first splicing feature vector, w' is a weight matrix corresponding to the part-of-speech feature vector, W1 is a weight matrix corresponding to the relationship feature vector, x” is the first splicing feature vector, x' is the part-of-speech feature vector, x is the relationship feature vector, b is a bias.
Chen does teach the method, wherein the preset activation function comprises:

[Equation image: media_image1.png]
([0084] – [Equation 1])
where h is the first feature vector, w is a weight matrix corresponding to the first splicing feature vector, w' is a weight matrix corresponding to the part-of-speech feature vector, W1 is a weight matrix corresponding to the relationship feature vector, x” is the first splicing feature vector, x' is the part-of-speech feature vector, x is the relationship feature vector, b is a bias ([0026] - a vector indicative of the language or dialect is linearly transformed by the weight matrices of the neural network layer and added to the original hidden activations before a nonlinearity is applied. [0034] - output of an encoder of the speech recognition model is averaged across multiple frames to obtain an utterance-level feature vector).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shen to incorporate the teachings of Chen in order to implement the method, wherein the preset activation function comprises: 
[Equation image: media_image1.png]
where h is the first feature vector, w is a weight matrix corresponding to the first splicing feature vector, w' is a weight matrix corresponding to the part-of-speech feature vector, W1 is a weight matrix corresponding to the relationship feature vector, x” is the first splicing feature vector, x' is the part-of-speech feature vector, x is the relationship feature vector, b is a bias. Doing so allows the system to predict the likelihood of speech belonging to each of the multiple languages (Chen [0034]).
Dependent claim 14 is analogous in scope to claim 6, and is rejected according to the same reasoning.
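For illustration only, the symbol definitions recited in claim 6 suggest a hidden-layer computation that sums three weighted inputs and a bias before applying a nonlinearity. The equation image itself is not reproduced in this text, so the sketch below is an assumption: it uses the weighted-sum form implied by the definitions and takes tanh as a stand-in nonlinearity. All function and parameter names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def hidden_activation(w, x_splice, w_pos, x_pos, w_rel, x_rel, b,
                      nonlinearity=math.tanh):
    """Compute h = g(w x'' + w' x' + W1 x + b) element-wise.

    g is an assumed stand-in (tanh by default); the claimed preset
    activation function is not recoverable from the text.
    """
    parts = [matvec(w, x_splice), matvec(w_pos, x_pos), matvec(w_rel, x_rel)]
    return [nonlinearity(p0 + p1 + p2 + bias)
            for p0, p1, p2, bias in zip(*parts, b)]


# Check the weighted sum with an identity nonlinearity: 2 + 4 + 1 + 1 = 8
h = hidden_activation([[1, 0]], [2, 3], [[1]], [4], [[2]], [0.5], [1],
                      nonlinearity=lambda z: z)
```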
Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Shen (U.S. Patent No. 10387575) in view of Paulik (U.S. Patent No. 9972304), in view of Ravi (U.S. Publication No. 20200042596), and further in view of Han (U.S. Publication No. 20210005182).
Regarding claim 7, Shen in view of Paulik in view of Ravi teaches all of the limitations as in claim 3, above.
However, Shen in view of Paulik in view of Ravi does not teach the method, wherein determining the second feature vector according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, comprises:
splicing the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector to obtain a second splicing feature vector;
determining a target weight with a first formula according to the second splicing feature vector, wherein the first formula comprises:

[Equation image: media_image2.png]

where X is the second splicing feature vector, S is the target weight, f is a softmax function, dx is a dimension of the second splicing feature vector;
determining the second feature vector with a second formula according to the second splicing feature vector and the target weight,

[Equation image: media_image3.png]

where L is the second feature vector.
Han does teach the method, wherein determining the second feature vector according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, comprises:
splicing the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector to obtain a second splicing feature vector ([0013] - The audio signal may be processed by feature computation component 110 to produce a sequence of feature vectors that represent the audio signal. [0084] - The conventional 39-dimensional MFCC features were spliced over 9 frames and LDA was applied to project the spliced features onto a 40-dimensional sub-space);
determining a target weight with a first formula according to the second splicing feature vector, wherein the first formula comprises ([0076] - the factorized TDNN may be used. Singular Value Decomposition (SVD) may be used to factorize a learned weight matrix into two low-rank factors and reduce the model complexity of neural networks):

[Equation image: media_image2.png]
 ([0078] – [Equation 1])
where X is the second splicing feature vector, S is the target weight, f is a softmax function, dx is a dimension of the second splicing feature vector ([0078] – [Equation 1]);
determining the second feature vector with a second formula according to the second splicing feature vector and the target weight ([0013] - The audio signal may be processed by feature computation component 110 to produce a sequence of feature vectors that represent the audio signal. [0076] - the factorized TDNN may be used. Singular Value Decomposition (SVD) may be used to factorize a learned weight matrix into two low-rank factors and reduce the model complexity of neural networks. [0084] - The conventional 39-dimensional MFCC features were spliced over 9 frames and LDA was applied to project the spliced features onto a 40-dimensional sub-space),

[Equation image: media_image4.png]
 ([0078] – [Equation 1])
where L is the second feature vector ([0078] – [Equation 1]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shen in view of Paulik in view of Ravi to incorporate the teachings of Han in order to implement the method, wherein determining the second feature vector according to the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector, comprises: splicing the first splicing feature vector, the part-of-speech feature vector and the relationship feature vector to obtain a second splicing feature vector; determining a target weight with a first formula according to the second splicing feature vector, wherein the first formula comprises:
[Equation image: media_image2.png]
where X is the second splicing feature vector, S is the target weight, f is a softmax function, dx is a dimension of the second splicing feature vector; determining the second feature vector with a second formula according to the second splicing feature vector and the target weight,
[Equation image: media_image4.png]
 where L is the second feature vector. Doing so allows the system to retain model complexity to a reasonable level as well as avoid the loss of modeling power (Han [0079]).
Dependent claim 15 is analogous in scope to claim 7, and is rejected according to the same reasoning.
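For illustration only, the two formulas recited in claim 7 can be sketched under the assumption that they take the standard scaled dot-product self-attention form, S = f(X·Xᵀ / √dx) and L = S·X. The equation images are not reproduced in this text, so this form is inferred from the symbol definitions (X, S, f as softmax, dx, L) and is not confirmed by the application or by Han. The sketch further assumes that X is arranged as a list of rows; the function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Return (S, L) with S = softmax(X X^T / sqrt(d_x)) and L = S X.

    `x` is a list of rows, each of dimension d_x (assumed layout).
    """
    d_x = len(x[0])
    scale = math.sqrt(d_x)
    # Scaled pairwise dot products between rows of X
    scores = [[sum(a * b for a, b in zip(r, c)) / scale for c in x] for r in x]
    s = [softmax(row) for row in scores]
    # Weighted sum of the rows of X using the target weights S
    l = [[sum(w * x[j][k] for j, w in enumerate(s_row)) for k in range(d_x)]
         for s_row in s]
    return s, l


s, l = self_attention([[1.0, 0.0], [0.0, 1.0]])
s_single, l_single = self_attention([[1.0, 2.0]])
```

Each row of S sums to one, so every row of L is a convex combination of the rows of X.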
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Kalukin (U.S. Publication No. 20200242146) teaches an artificial intelligence system for generating conjectures and comprehending text, audio, and visual data using natural language understanding. Velikovich (U.S. Patent No. 11238227) teaches word lattice augmentation for automatic speech recognition. Zhang (U.S. Publication No. 20200311077) teaches multi-layer semantic search.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETHAN DANIEL KIM whose telephone number is (571) 272-1405.  The examiner can normally be reached on Monday - Friday 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ETHAN DANIEL KIM/
Examiner, Art Unit 2658


/RICHEMOND DORVIL/
Supervisory Patent Examiner, Art Unit 2658