DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings are objected to because Figures 2A and 2B contain Chinese characters; an issued patent would be easier to understand if the drawings also included a corresponding English translation of these words.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office Action to avoid abandonment of the application.  Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended.  The figure or figure number of an amended drawing should not be labeled as “amended.”  If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency.  Additional replacement sheets may be necessary to show the renumbering of the remaining figures.  Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d).  If the changes are not accepted by the examiner, Applicants will be notified and informed of any required corrective action in the next Office Action.  The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities:
In ¶[0029], “an application scenarios involved” should be “an application scenario involved”.
In ¶[0030], “only the best” should be “because only the best”.
In ¶[0042], “in response to that that” should be “in response to”.
In ¶[0049], it appears that “conll” should be “CoNLL”, which is an abbreviation for Conference on Computational Natural Language Learning.
In ¶[0063], “and then proceed” should be “and then proceeds”.
In ¶[0063], “and store this dependency” should be “and stores this dependency”.
In ¶[0065], “are annotate” should be “are annotated”.
In ¶[0065], “are sample are” should be “are samples that are”.
In ¶[0096], “also provides” should be “also provide”.
In ¶[0096], “for semantic recognition” should be “for semantic recognition may be performed”.  
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4 to 6, 8, 10 to 13, 15 to 16, and 18 to 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (“A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing”) in view of Buchholz (U.S. Patent Publication 2007/0016398).
Concerning independent claims 1, 8, and 16, Zhou et al. discloses a dependency parser, comprising:
“in response to performing semantic analysis on information acquired by a terminal, acquiring a sentence to be processed; performing word recognition on the sentence to be processed, to obtain a plurality of words and part-of-speech information corresponding to each of the plurality of words” – a greedy neural arc-standard parser extracts atomic features from a parsing state, which consists of words, POS (part of speech) tags, and dependency labels (§2.3.1 Model: Page 1215); a POS-tagger assigns POS automatically (§4.1 Set-up);
“determining, with a pre-trained word processing model, a target set update operation corresponding to a set of words to be processed from a plurality of preset set update operations, according to a word to be processed in the set of words to be processed and part-of-speech information of the word to be processed; wherein the set of words to be processed is a set of words to be processed currently in the plurality of words” – at each step, a parser chooses one of the following actions: SHIFT: move the front word wj from the queue into the stack; LEFT-ARC(l): add an arc with label l between the top two trees on the stack (s1←s0), and remove s1 from the stack; RIGHT-ARC(l): add an arc with label l between the top two trees on the stack (s1→s0), and remove s0 from the stack (§2.1 Arc-standard Parsing: Page 1214); here, “a plurality of preset set update operations” are SHIFT, LEFT-ARC(l), and RIGHT-ARC(l); Compare Specification, ¶[0055], which defines “the preset set update operations” as a shift operation, a first update operation, and a second update operation; “a target set update operation” is an action selected for a current word, i.e., one of SHIFT, LEFT-ARC(l), and RIGHT-ARC(l);  
“in response to that a dependency relationship corresponding to the target set update operation is a first dependency relationship, determining, through each of the plurality of preset set update operations, a respective dependency relationship of the word to be processed and a respective confidence level corresponding to the dependency relationship, and performing, according to the each of the plurality of preset set update operations, a respective update of the set of words to be processed; wherein the first dependency relationship indicates that a second-place word in two of the plurality of words is a subordinate word of a first-place word in the two of the plurality of words” – RIGHT-ARC(l) adds an arc with label l between the top two trees on the stack (s1→s0), and removes s0 from the stack (§2.1 Arc-standard Parsing: Page 1214); here, RIGHT-ARC(l) provides “a first dependency relationship”, where s1 is “a second-place word” that is “a subordinate word of a first-place word”, where s0 is “a first-place word”; “a first dependency relationship” places “a second-place word” s1 so that it is “a subordinate word” of a first-place word s0 according to (s1→s0), by adding an arc between the two words in the dependency tree; a score of an action sequence y is given by score(y) = Σ_{a ∈ y} θ · Φ(a) (Equation (2)), where a is an action and Φ(a) is a feature function for a; a score for an action sequence is the linear sum of the scores for each action (§2.2 Global Learning and Beam Search: Page 1214); a score is equivalent to “a respective confidence level corresponding to the dependency relationship”; that is, θ · Φ(a) represents a ‘confidence level’ for a dependency relationship of an action for a current word;
“in response to that the dependency relationship corresponding to the target set update operation is not the first dependency relationship, determining, through the target set update operation, the dependency relationship of the word to be processed and the confidence level corresponding to the dependency relationship, and updating the set of words to be processed according to the target set update operation” – LEFT-ARC(l) adds an arc with label l between the top two trees on the stack (s1←s0), and removes s1 from the stack (§2.1 Arc-standard Parsing: Page 1214); broadly, LEFT-ARC(l) is “the target set update operation” that is “not the first dependency relationship”; that is, RIGHT-ARC(l) is “the first dependency relationship”, and LEFT-ARC(l) is “not the first dependency relationship”; a score of an action sequence y is given by score(y) = Σ_{a ∈ y} θ · Φ(a) (Equation (2)), where a is an action and Φ(a) is a feature function for a; a score for an action sequence is the linear sum of the scores for each action (§2.2 Global Learning and Beam Search: Page 1214: Equation (2)); a score is equivalent to “the respective confidence level corresponding to the dependency relationship” for an action of LEFT-ARC(l);
“performing, according to the respective updated set of words to be processed, the step of determining, with the pre-trained word processing model, the target set update operation corresponding to the set of words to be processed from the plurality of preset set update operations, according to the word to be processed in the set of words to be processed and the part-of-speech information of the word to be processed, to the step of updating the set of words to be processed according to the target set update operation repeatedly, until obtaining a plurality of dependency parsing results of the sentence to be processed; wherein each of the dependency parsing results represents a respective set of dependency relationships among the plurality of words” – transition-based dependency parsers scan an input sentence from left to right, and perform a sequence of transition actions to predict the parse tree; parsing starts with an empty stack and a queue consisting of the whole input sentence; at each step, a transition action is taken to consume the input and construct the output; the process repeats (“repeatedly”) until the input queue is empty and the stack contains only one dependency tree; for a sentence of size n, parsing stops after performing exactly 2n-1 actions (§2.1 Arc-standard Parsing: Page 1214); here, these operations are repeated for all the words in a sentence;
“taking a dependency parsing result with a highest one of multiple sums of confidence levels, each sum being a sum of a set of confidence levels corresponding to a respective set of dependency relationships among the plurality of words, as an optimal parsing result in the plurality of dependency parsing results, and performing the semantic recognition on the sentence to be processed according to the optimal parsing result” – at each step, a parser deterministically selects a highest-scored one as the next state (§2.1 Arc-standard Parsing: Page 1214); the goal is to find the highest-scored action sequence globally: y = arg max score(y’) for y’ ∈ GEN(x) (Equation (1)), where GEN(x) denotes all possible action sequences on x; a score of an action sequence y is: score(y) = Σ_{a ∈ y} θ · Φ(a), where the score of an action sequence is the linear sum of the scores of each action (§2.2 Global Learning and Beam Search: Page 1214: Equations (1) and (2)); here, arg max score(y’) is “a highest one of multiple sums of confidence levels”, where each sum of confidence levels is represented by score(y) = Σ_{a ∈ y} θ · Φ(a); a maximum sum of scores provides “an optimal parsing result” and generating a parsing tree is “performing semantic recognition on the sentence to be processed”.
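For illustration only, the three transition actions cited above from §2.1 of Zhou et al. can be sketched in Python as follows.  This is a minimal sketch, not the reference's implementation: the class and function names, the example sentence, and the labels are hypothetical, and each arc is recorded only with the arrow notation quoted above (s1←s0 or s1→s0) rather than with any further interpretation.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical parser state: s0 is stack[-1], s1 is stack[-2]."""
    stack: list = field(default_factory=list)
    queue: list = field(default_factory=list)
    arcs: list = field(default_factory=list)


def apply_action(state, action, label=None):
    """Apply one of SHIFT, LEFT-ARC(l), RIGHT-ARC(l) to the state in place."""
    if action == "SHIFT":
        # Move the front word of the queue onto the stack.
        state.stack.append(state.queue.pop(0))
    elif action == "LEFT-ARC":
        # Add an arc (s1 <- s0) with the given label, then remove s1.
        s0, s1 = state.stack[-1], state.stack[-2]
        state.arcs.append((s1, "<-", s0, label))
        del state.stack[-2]
    elif action == "RIGHT-ARC":
        # Add an arc (s1 -> s0) with the given label, then remove s0.
        s0, s1 = state.stack[-1], state.stack[-2]
        state.arcs.append((s1, "->", s0, label))
        del state.stack[-1]
    else:
        raise ValueError(f"unknown action: {action}")


def parse(words, actions):
    """Start with an empty stack and the whole sentence in the queue."""
    state = State(queue=list(words))
    for action, label in actions:
        apply_action(state, action, label)
    return state


# Hypothetical three-word sentence and a hand-picked action sequence.
words = ["she", "reads", "books"]
actions = [
    ("SHIFT", None),
    ("SHIFT", None),
    ("LEFT-ARC", "nsubj"),
    ("SHIFT", None),
    ("RIGHT-ARC", "dobj"),
]
final = parse(words, actions)
```

In this sketch the sequence ends with an empty queue and a single item on the stack after 2n-1 = 5 actions for n = 3 words, which matches the stopping condition described in §2.1.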
Concerning independent claims 1, 8, and 16, Zhou et al. arguably discloses all of the limitations of these independent claims, but whatever might be omitted is taught by Buchholz.  Specifically, Zhou et al. may not clearly disclose performing word recognition to obtain part-of-speech information prior to choosing one of the transition actions, and using a trained neural network to score sequences of transition actions.  Still, Buchholz teaches a similar method of parsing that receives a tokenized and part-of-speech tagged utterance, assigns a role and a head to each token of the utterance, stores the A most likely resulting partial parses, advances to a next successive token, and stores the A most likely resulting partial parses until all n tokens are parsed.  (Abstract)  A natural language sentence of an utterance that has been tokenized and also part-of-speech tagged is input into the parser.  (¶[0033])  For each of the possible roles of token i and each other token j such that a dependency relation from i to j would not create an illegal dependency, a probability model is then consulted to determine whether the relation is possible at all and how probable it is.  (¶[0038])  For each possible extended parse, the parser computes the extended parse’s probability by multiplying the probability of the original parse p(b) with the probability of the extension with role r1 given by the original parse, p(r(i,j)|b).  If the probability of the extended parse is lower than that of the lowest element of the new beam, the extended parse is not inserted into new_beam.  (¶[0078])  The logarithmic probabilities are those actually computed by the parser for the sentence.  (¶[0109])  Buchholz, then, teaches an equivalent procedure for parsing words in a sentence that are tagged with part-of-speech that determines a parse with a maximum probability after assigning dependency relations to words in a sentence.
An objective is to provide a probabilistic parser that can help resolve ambiguities in text and that does not rely upon rules that are time-consuming to construct.  (¶[0017] - ¶[0018])  It would have been obvious to one having ordinary skill in the art to determine a parse with a maximum probability from tokens tagged with parts-of-speech as taught by Buchholz in a probabilistic model of transition-based dependency parsing of Zhou et al. for a purpose of providing a probabilistic parser that can resolve ambiguities in text and that does not rely upon rules that are time-consuming to construct.
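As a rough illustration of the beam extension attributed to Buchholz above, the following Python sketch multiplies the probability of each stored partial parse by the probability of each candidate relation (done here as a sum of logarithms) and keeps only the most likely results.  It is a simplified sketch under stated assumptions: the function names, the toy probability model, the role labels, and the beam width are all hypothetical, and keeping the top entries with `heapq.nlargest` stands in for the insert-if-not-lowest test described in ¶[0078].

```python
import heapq
import math


def extend_beam(beam, propose, width):
    """Extend every partial parse in the beam and keep the `width` best.

    beam:    list of (log_probability, parse) pairs; parse is a tuple of relations
    propose: hypothetical model, parse -> iterable of (relation, probability)
    """
    candidates = []
    for log_p, parse in beam:
        for relation, prob in propose(parse):
            if prob <= 0.0:
                # The model judges this relation impossible, so skip it.
                continue
            # log(p(b) * p(r | b)) = log p(b) + log p(r | b)
            candidates.append((log_p + math.log(prob), parse + (relation,)))
    return heapq.nlargest(width, candidates, key=lambda item: item[0])


def toy_model(parse):
    """Hypothetical probabilities for attaching token1; not taken from the reference."""
    return [
        (("token1", "SUBJ", "token2"), 0.6),
        (("token1", "OBJ", "token2"), 0.3),
        (("token1", "MOD", "token3"), 0.1),
        (("token1", "DET", "token3"), 0.0),
    ]


beam = [(0.0, ())]  # one empty parse with probability 1 (log probability 0)
new_beam = extend_beam(beam, toy_model, width=2)
```

With a width of 2, only the two most probable extensions survive in `new_beam`; the impossible relation and the least probable one are dropped.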

Concerning claims 4 to 5, 11 to 12, and 18, Zhou et al. discloses that LEFT-ARC(l) adds an arc with label l between the top two trees on the stack (s1←s0), and removes s1 from the stack; RIGHT-ARC(l) adds an arc with label l between the top two trees on the stack (s1→s0), and removes s0 from the stack (§2.1 Arc-standard Parsing: Page 1214); here, LEFT-ARC(l) is “the first update operation” that removes s1 from the stack, where removing s1 from the stack is equivalent to “shifting a second-place word in two indicated words to be processed out of the set of words to be processed, to update the set of words to be processed”; that is, s0 is a first-place word, and s1 is “a second-place word”; similarly, RIGHT-ARC(l) adds an arc with label l between the top two trees on the stack (s1→s0), and removes s0 from the stack (§2.1 Arc-standard Parsing: Page 1214); here, RIGHT-ARC(l) is “the second update operation” that removes s0 from the stack, where removing s0 from the stack is equivalent to “shifting a first-place word in the two indicated words to be processed out of the set of words to be processed”; that is, s0 is “a first-place word” that is shifted out of the stack.   
Concerning claims 6, 13, and 19, Zhou et al. discloses that SHIFT: moves the front word wj from the queue into the stack (§2.1 Arc-standard Parsing: Page 1214); that is, SHIFT is an operation that takes a front word wj from the queue, which is “a new word to be processed in the set of words to be processed” after shifting out a word from operations of LEFT-ARC(l) and RIGHT-ARC(l).  
Concerning claim 10, Zhou et al. discloses that the goal is to find the highest-scored action sequence globally: y = arg max score(y’) for y’ ∈ GEN(x) (Equation (1)), where GEN(x) denotes all possible action sequences on x; a score of an action sequence y is: score(y) = Σ_{a ∈ y} θ · Φ(a), where the score of an action sequence is the linear sum of the scores of each action (§2.2 Global Learning and Beam Search: Page 1214: Equations (1) and (2)); here, determining scores of action sequences is “obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations”; finding the highest-scored action sequence globally: y = arg max score(y’) for y’ ∈ GEN(x) (Equation (1)) is “taking a preset update operation with a highest one of the confidence levels as the target set update operation.”
Concerning claim 15, Zhou et al. discloses that the goal is to find the highest-scored action sequence globally: y = arg max score(y’) for y’ ∈ GEN(x) (Equation (1)), where GEN(x) denotes all possible action sequences on x; a score of an action sequence y is: score(y) = Σ_{a ∈ y} θ · Φ(a), where the score of an action sequence is the linear sum of the scores of each action (§2.2 Global Learning and Beam Search: Page 1214: Equations (1) and (2)); here, finding a highest-scoring action sequence globally is equivalent to “select an optimal dependency parsing result for the semantic recognition from the plurality of possible dependency parsing results”; implicitly, this provides “improving the accuracy of the semantic recognition.”
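Equations (1) and (2) of Zhou et al., as cited above, can be illustrated with a small Python sketch.  The weight vector, the feature vectors, and the two candidate action sequences are made-up values chosen only to show the arithmetic; they are assumptions and do not come from the reference, where the scores are produced by a trained model.

```python
def action_score(theta, phi):
    """Score of a single action: the dot product of theta and phi(a)."""
    return sum(t * f for t, f in zip(theta, phi))


def sequence_score(theta, feature_vectors):
    """Equation (2): the score of a sequence is the sum of its action scores."""
    return sum(action_score(theta, phi) for phi in feature_vectors)


# Hypothetical weight vector and feature vectors (one vector per action).
theta = [0.5, -0.25, 1.0]
candidates = {
    "y1": [[1, 0, 0], [0, 1, 0]],   # 0.5 + (-0.25) = 0.25
    "y2": [[0, 0, 1], [1, 1, 0]],   # 1.0 + 0.25 = 1.25
}

# Equation (1): choose the candidate sequence with the highest total score.
best = max(candidates, key=lambda name: sequence_score(theta, candidates[name]))
```

Here `candidates` plays the role of GEN(x) restricted to two sequences, and `best` is the sequence with the larger summed score.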

Claims 2 to 3, 9, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (“A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing”) in view of Buchholz (U.S. Patent Publication 2007/0016398) as applied to claims 1, 8, and 16 above, and further in view of Zhang et al. (“Transition-based Dependency Parsing with Rich Non-local Features”).
Concerning claims 2, 9, and 17, Zhou et al. does not clearly disclose the limitation of “wherein the first word includes a preset number of words following the word to be processed in the plurality of words, and the second word is a word that has been determined to have a dependency relationship with the word to be processed in the plurality of words.”  That is, Zhou et al. does not clearly disclose “a preset number of words following the word to be processed”, but only appears to process two words at any one time.  Still, Zhou et al. discloses a plurality of feature templates in Table 1.  (§2.3.2 Features: Page 1215: Table 1)  Generally, Zhang et al. teaches transition-based dependency parsers with richer feature sets that can improve an accuracy of the parsers.  (Abstract)  A transition-based parsing algorithm performs actions of Shift, which removes the front of the queue and pushes it onto the top of the stack, Reduce, which pops the top item off the stack, LeftArc, which pops the top item off the stack, and adds it as a modifier to the front of the queue, and RightArc, which removes the front of the queue, pushes it onto the stack and adds it as a modifier to the top of the stack.  (§2 The Transition-based Parsing Algorithm: Page 189)  N is a queue of incoming words, and A is the set of dependency arcs that have been built.  Specifically, baseline features include single word features, features from word pairs, and features from three words, where w is a word and p is a part-of-speech tag.  (§3 Feature Templates: Page 189: Table 1)  The head, left/rightmost modifiers of S0 and the leftmost modifier of N0 have been used by most arc-eager transition-based parsers, where S0 is the top of the stack and N0, N1, and N2 are the front items from the queue.  Zhang et al., then, teaches features that include “a preset number of words following the word to be processed”, where the preset number can be from 1 to 3.  
It would have been obvious to one having ordinary skill in the art to provide a preset number of words following the word to be processed as taught by Zhang et al. in a probabilistic model of transition-based dependency parsing of Zhou et al. for a purpose of improving an accuracy of parsers by considering richer feature sets.  
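The kind of lookahead feature discussed above, in which features are drawn from the top of the stack S0 and from the front queue items N0, N1, and N2, can be sketched in Python as follows.  This is an illustrative sketch only: the function name, the feature keys, and the example tokens are hypothetical, and it covers only single-word and part-of-speech features rather than the full template set in Table 1 of Zhang et al.

```python
def lookahead_features(stack, queue, lookahead=3):
    """Collect word (w) and part-of-speech (p) features for S0 and N0..N(k-1).

    stack, queue: lists of (word, pos_tag) tuples; S0 is stack[-1], N0 is queue[0]
    lookahead:    preset number of upcoming queue words to include
    """
    feats = {}
    if stack:
        word, tag = stack[-1]
        feats["S0w"], feats["S0p"] = word, tag
    for k, (word, tag) in enumerate(queue[:lookahead]):
        feats[f"N{k}w"] = word
        feats[f"N{k}p"] = tag
    return feats


# Hypothetical parser configuration for a short tagged sentence.
stack = [("She", "PRP")]
queue = [("reads", "VBZ"), ("long", "JJ"), ("books", "NNS"), (".", ".")]
feats = lookahead_features(stack, queue, lookahead=3)
```

With a preset lookahead of 3, the feature set covers N0 through N2 and ignores the fourth queue item.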

Concerning claim 3, Zhou et al. discloses that the goal is to find the highest-scored action sequence globally: y = arg max score(y’) for y’ ∈ GEN(x) (Equation (1)), where GEN(x) denotes all possible action sequences on x; a score of an action sequence y is: score(y) = Σ_{a ∈ y} θ · Φ(a), where the score of an action sequence is the linear sum of the scores of each action (§2.2 Global Learning and Beam Search: Page 1214: Equations (1) and (2)); here, determining scores of action sequences is “obtain the confidence levels, each corresponding to a respective one of the plurality of preset set update operations”; finding the highest-scored action sequence globally: y = arg max score(y’) for y’ ∈ GEN(x) (Equation (1)) is “taking a preset update operation with a highest one of the confidence levels as the target set update operation.”

Claims 7, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (“A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing”) in view of Buchholz (U.S. Patent Publication 2007/0016398) as applied to claims 1, 8, and 16 above, and further in view of Verma et al. (U.S. Patent Publication 2018/0349377).
Zhou et al. omits the limitation of “matching the plurality of words to be recognized with entity words in a preset word database” and “performing word fusion on the words to be recognized according to the matched entity words and part-of-speech information for the words to be recognized, to obtain the plurality of words and part-of-speech information corresponding to each of the words.”  However, it is known in the prior art to determine multi-word named entities that may be more useful for a meaning of the words than the words individually.  Generally, Verma et al. teaches converting natural language input to structured queries.  (Abstract)  A named entity recognition (NER) component 201 assigns respective probabilities for each word or combination of words indicating a likelihood of each word or combination of words corresponding to the entity of the input query.  (¶[0024]: Figure 2)  Specifically, for an input query, ‘midnight in Paris’, database signals 440 can be used to determine queries for an entity corresponding to a particular combination of words ‘midnight in Paris’ has a higher degree of popularity.  The entity scorer can utilize database signals 440 to determine that this particular entity is more likely to correspond to a movie.  (¶[0043] - ¶[0045]: Figure 4)  Verma et al., then, teaches “performing word segmentation” by “matching the plurality of words to be recognized with entity words in a preset word database”.  Conceptually, this performs “word fusion” on the individual words ‘midnight’, ‘in’, and ‘Paris’ due to the combination of words being probable.  An objective is to improve processing of an input query in a natural language format.  (¶[0015])  It would have been obvious to one having ordinary skill in the art to perform word segmentation and word fusion of a combination of words by matching a plurality of words in a preset word database as taught by Verma et al. in a probabilistic model of transition-based dependency parsing of Zhou et al. for a purpose of improving processing of an input query in a natural language format.
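As a conceptual illustration of matching words against a preset word database and fusing a matched combination into a single unit, consider the following Python sketch.  It is an assumption-laden simplification: the function name, the longest-match strategy, the dictionary contents, and the surrounding example words are hypothetical, and it does not reproduce the probability-based entity scoring that Verma et al. describes.

```python
def fuse_entities(words, entity_db, max_len=4):
    """Greedily fuse the longest multi-word span found in entity_db.

    words:     list of word strings
    entity_db: hypothetical preset database mapping lowercase phrases to a type
    Returns a list of (text, entity_type_or_None) pairs.
    """
    fused = []
    i = 0
    while i < len(words):
        # Try the longest candidate span first, down to two words.
        for span in range(min(max_len, len(words) - i), 1, -1):
            candidate = " ".join(words[i:i + span])
            if candidate.lower() in entity_db:
                fused.append((candidate, entity_db[candidate.lower()]))
                i += span
                break
        else:
            # No multi-word entity starts here; keep the single word as-is.
            fused.append((words[i], None))
            i += 1
    return fused


# Hypothetical database entry based on the 'midnight in Paris' example.
entity_db = {"midnight in paris": "MOVIE"}
words = ["watch", "midnight", "in", "Paris", "tonight"]
result = fuse_entities(words, entity_db)
```

In this sketch the three words ‘midnight’, ‘in’, and ‘Paris’ come out as one fused unit tagged as a movie, while the remaining words are left as individual tokens.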

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Ting, Petrov et al. (‘279), Petrov et al. (‘544), Ylonen, Ng Tari et al., and Mutalikdesai et al. disclose related prior art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571) 272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.  To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.  For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARTIN LERNER/Primary Examiner
Art Unit 2657
October 13, 2022