DETAILED ACTION
Background
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the Amendment filed on March 7, 2022.  This action is made final.
Claims 1, 8, 11, and 18 are amended.  Claims 21-25 are cancelled.  Claims 26-30 are new claims.  Claims 1-3, 7-14, 17-20, and 26-30 are pending for examination.  Claims 1, 8, 11, and 18 are independent claims.

Claim Objections
Regarding Claim 25, the claim is cancelled as specified in Applicant’s Amendment, rendering the previous objection moot.

Claim 30 is objected to because of the following informalities: the phrase “the separate end point detection process is voice interval” should read “the separate end point detection process is a voice interval.”  Appropriate correction is required. 

Claim Rejections - 35 USC § 112
Regarding Claims 21-25, the claims are cancelled as specified in Applicant’s Amendment, rendering the previous rejections moot.
Regarding Claims 1-3, 7-14, and 17-20, Applicant’s Amendment corrects the previous issues related to failure to comply with the written description requirement.  The previous rejections are withdrawn.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention. 

Claims 1-3, 7-14, 17-20, and 26-30 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.  Independent Claims 1, 8, 11, and 18 as amended each recite the limitation “the final command pattern” or analogous variants in the claimed apparatuses and methods respectively.  As the limitation is preceded by multiple instances of “a final command pattern” in each of the claims, antecedent basis for the limitation in the claims is ambiguous.  Dependent Claims 2, 3, 7, 9, 10, 12-14, 17, 19, 20, and 26-30 incorporate the deficiency.
Note that the prior art analysis of relevant claims below is based on a most likely interpretation made in light of the above deficiencies. 

The following is a quotation of 35 U.S.C. 112(a):
(a)  IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1-3, 7-14, 17-20, and 26-30 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement.  The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.  Independent Claims 1, 8, 11, and 18 as amended each recite the limitations “wherein matching with a final command pattern is not performed regardless of other matchable command patterns when the matching result of the text information is MATCH or PARTIAL MATCH and matches a possible matchable command pattern having a NORMAL grade” or analogous variants in the claimed apparatuses and methods respectively.  Contrary to Applicant’s assertions on pages 10 and 11 of the Amendment (pages 2 and 3 of the Remarks), disclosure in the specification indicating that matching of a command pattern with the IMMEDIATE grade may be performed regardless of other command patterns does not provide reasonable support for possession of the limitation that matching with a final command pattern is not performed regardless of other matchable command patterns when the matching result of the text information is MATCH or PARTIAL_MATCH and matches a possible matchable command pattern having a NORMAL grade.  Applicant’s original disclosure does not appear to articulate a particular relationship, such as a mutually exclusive relationship, between IMMEDIATE, NORMAL, and WAIT_END grades, and even if it is assumed that command patterns may only be assigned one of the three grades, no disclosure appears to reasonably support that the grades cannot share aspects or features.
No disclosure appears to reasonably provide support for possession of a relationship between the NORMAL grade and the MATCH or PARTIAL MATCH classifications that requires that “matching with a final command pattern is not performed regardless of other matchable command patterns when the matching result of the text information is MATCH or PARTIAL MATCH and matches a possible matchable command pattern having a NORMAL grade” as claimed.  This can be seen in evaluating the original disclosures associated with the NORMAL grade, which involve multiple statements simply listing the three grades, an ambiguous statement on pages 20 and 21 of the specification stating that “[t]he NORMAL grade may be determined as a recognition result when there is only a NORMAL-grade command pattern in MATCH or PARTIAL_MATCH,” and pseudocode on pages 21 and 22 of the specification that suggests features of handling IMMEDIATE, NORMAL, and WAIT_END patterns but does not support matching with a final command pattern not being performed regardless of other matchable command patterns.  Note also that a negative limitation, such as the limitation now asserted by Applicant, must have a basis in the original disclosure.  An alternative can be explicitly excluded if alternative elements are positively recited, yet mere absence of a positive recitation (such as performing a matching with a final command pattern regardless of other matchable command patterns) is not basis for an exclusion.  See MPEP § 2173.05(i).  Dependent Claims 2, 3, 7, 9, 10, 12-14, 17, 19, 20, and 26-30 incorporate the deficiency.
In the alternative, the claims are rejected under 35 U.S.C. 112(b) as being indefinite as the scope of the limitations “wherein matching with a final command pattern is not performed regardless of other matchable command patterns when the matching result of the text information is MATCH or PARTIAL MATCH and matches a possible matchable command pattern having a NORMAL grade” is not clear from the face of the claims in view of the specification, as one of ordinary skill in the art would not reasonably be put on notice regarding what aspects of matching are excluded so as to satisfy a requirement that matching with a final command pattern is not performed regardless of other matchable command patterns.
Claims 1-3, 7-14, 17-20, and 26-30 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement.  The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.  Independent Claims 1, 8, 11, and 18 as amended each recite the limitations “wherein matching with a final command pattern is performed after input times out when a possible matchable command pattern has a WAIT_END grade” or analogous variants in the claimed apparatuses and methods respectively.  No disclosure appears to reasonably provide support for possession of the noted limitation as no disclosure appears to describe performing matching after a timeout.  The examples of pseudocode provided on pages 20-24 of the specification appear to indicate that matching for the various grades is performed before any return of a timeout.  Although support might be inferred for selecting or processing a final command pattern occurring after input times out when a possible matchable command pattern has a WAIT_END grade, which was perhaps intended by Applicant, Applicant’s disclosure appears to indicate that matching of command patterns occurs prior to or independent of a timeout.  Dependent Claims 2, 3, 7, 9, 10, 12-14, 17, 19, 20, and 26-30 incorporate the deficiency.
Note that the prior art analysis of relevant claims below is based on a most likely interpretation made in light of the above deficiencies. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 8-14, 18-20, and 26-30 are rejected under 35 U.S.C. 103 as being unpatentable over Gelfenbeyn et al., U.S. Patent Application 2016/0259775 A1 (published Sep. 8, 2016) (hereinafter “Gelfenbeyn”) in view of Sung et al., U.S. Patent Application 2018/0039477 A1 (published Feb. 8, 2018) (hereinafter “Sung”) and Aleksic et al., U.S. Patent Application 2017/0069309 A1 (published Mar. 9, 2017) (hereinafter “Aleksic”).
Regarding Claim 1, Gelfenbeyn teaches a Graphical User Interface (GUI) voice control apparatus having a processor to control the operation of the apparatus (see, e.g., Gelfenbeyn, Abstract and paras. 2, 33, and 37, describing systems and methods for context-based natural language processing, describing embodiments in which a computer system utilizes microprocessors, and describing a dialog system engine as a computer-enabled or processor-enabled system for supporting a dialog system interface), comprising:
A context information generator configured by the processor to dynamically reflect GUI status information and DB information in a language model to generate context information (see, e.g., id., para. 14, describing context information as including various forms of conversational and environmental context information including a state of the GUI currently running on a client; paras. 50, 53, 55, 57, and 58 and Fig. 2, describing and illustrating architecture of an exemplary dialog system engine comprising a dialog manager that coordinates activity of components of the engine, one or more rule databases in which dialog system rules are maintained, and one or more context databases which maintain a plurality of context description elements such as lists of terms, keywords, phrases, expressions, context variables, and context parameters associated with one or more dialog system rules [the context description elements and rule associations representing a language model in some form], the context database information supporting a process of determining conversational or environmental context for particular user requests; and paras. 42, 78, 79, 82, 114, and 115 and Figs. 1, 3, and 6-8, describing a platform as comprising a platform interface for creating, maintaining, and managing custom dialog system engines, describing and illustrating methods in which the platform interface is used by developers to generate custom dialog system elements such as system rules specific to a particular application, and describing and illustrating aspects of screenshots of the platform interface showing a process of setting up or creating elements of the dialog system engine such as intents, entities, synonyms, canonical values, and aliases or variable names [such elements representing aspects of a language model of context information]);
A voice recognizer configured by the processor to convert a voice signal into text to update text information (see, e.g., id., paras. 50 and 51 and Fig. 2, describing and illustrating the dialog system engine as comprising an automatic speech recognizer configured to receive and process speech-based user inputs into a sequence of parameter vectors and into a recognized input such as a textual input having one or more words, phrases, or sentences and describing the automatic speech recognizer as including one or more speech recognizers such as a pattern-based speech recognizer, free-dictation recognizer, address book based recognizer, dynamically created recognizer, etc.);
A natural language recognizer configured by the processor to reduce a number of possible command patterns matchable with the text information based on the context information as the text information is updated (see, e.g., id., para. 52 and Fig. 2, describing and illustrating the dialog system engine as comprising a natural language processing module that disassembles and parses the recognized input to produce and analyze utterances and to map recognized input to meaning representations; paras. 50 and 53, describing the dialog system engine as implementing context-based natural language processing and describing the dialog manager performing actions including discourse analysis, knowledge database query, and system action prediction based on discourse context [suggesting reducing possible matchable patterns in some form]; and paras. 92-94 and Fig. 5, describing and illustrating a process flow diagram for a method of context-based natural language processing in which the dialog system engine receives a user request having additional attributes such as contextual information and describing embodiments in which contextual information can be used for pre-filtering user intents [representing reducing possible matchable patterns]), and recognize an intent and entity of the voice signal by matching with a final command pattern (see, e.g., id., paras. 95-98 and Fig. 5, describing and illustrating the method of context-based natural language processing comprising the dialog system engine identifying a type of context associated with the user request, assigning a context label to the user request based on the result of the identification, and selecting or identifying, from a plurality of dialog system rules such as intents, a particular dialog system rule or intent that is associated with the context label and/or user request, and paras. 61-74, 94, and 103, describing entities as referring to various types of objects, describing operation of rules as based on a relationship between a particular action and at least one entity, describing representing an intent as a logical relation between at least one action and at least one entity object, and indicating recognition of a user request in relationship to various identified entities); and
A voice controller configured by the processor to output a control signal according to the recognized intent and entity (see, e.g., id., para. 56 and Fig. 2, describing and illustrating the dialog system engine as comprising an output renderer for transforming output of the dialog manager into a form suitable for providing to the user such as generating an audio message corresponding to the output of the dialog manager; paras. 98 and 99 and Fig. 5, describing and illustrating the method of context-based natural language processing comprising the dialog system engine generating a response to the user request by applying the dialog system rule to at least a part of the user request and delivering the response to the user; and paras. 73-75, indicating various actions performed in relationship to identified intents and entities),
Wherein at least one of command patterns and entities is configured to be generated by processing information received from a content management system (CMS) of a target service (see, e.g., id., para. 49 and Fig. 1, describing and illustrating arrangements in which various third party web resources or services, provided via one or more web servers, provide information of various types to dialog system engines or dialog system interfaces; para. 49, describing embodiments in which the dialog manager contacts one or more task managers that may have knowledge of specific task domains and describing embodiments in which the dialog manager communicates with various computational or storage resources [which may be web resources or services as previously described] such as a rules database, a context database, an electronic address book, disparate knowledge databases, a map database, a points of interest database, news feed services, and many more; and paras. 124 and 125 and Claim 5, describing embodiments in which assignment of context labels involves the dialog system engine performing additional steps including acquiring additional data or information from web resources or web services and describing an example in which a particular entity is not understandable without acquiring additional information [indicating embodiments in which entities may be obtained from a third party service].  Note that any system such as a third party web resource or service that provides structured access to information can be viewed as a content management system.  Note also that the teachings anticipate the alternative language of the claim) or to be updated through a management website (e.g., id., paras. 42, 78, 79, 82, 114, and 115 and Figs. 1, 3, and 6-8, describing and illustrating a platform interface for creating, maintaining, and managing custom dialog system engines, describing and illustrating methods in which the platform interface is used by developers to generate elements such as rules, and describing and illustrating aspects of screenshots of the platform interface showing a process of setting up or creating elements of the dialog system engine such as intents, entities, synonyms, canonical values, and aliases or variable names),
Wherein the natural language recognizer classifies a matching result of the text information into MATCH and NO_MATCH to reduce the number of possible matchable command patterns (see, e.g., id., paras. 92-94 and Fig. 5, describing context-based natural language processing in which the dialog system engine receives a user request having contextual information and describing embodiments in which contextual information can be used for pre-filtering user intents [such pre-filtering representing an arrangement in which user intents match or do not match in relationship to given contextual information]), 
Wherein matching with a final command pattern is not performed regardless of other matchable command patterns when the matching result of the text information is MATCH and matches a possible matchable command pattern having a NORMAL grade (see, e.g., id., paras. 97, 107, and 108, describing embodiments in which matching is performed in relationship to a determination of various contexts and a determination of values of variables within particular contexts, indicating matching not performed regardless of other matchable command patterns at least in the sense of matching performed with regard to other matchable command patterns.  Note that, even taken in view of the specification, the claimed designation NORMAL, without more, represents a placeholder that can be viewed as nonfunctional descriptive material not entitled to patentable weight and that is rendered obvious over any of various distinctions in matching behavior as discussed, such as related to or in contrast to certain contexts or certain embodiments); and 
Wherein a final command pattern is a finally matched command pattern from the possible matchable command patterns (see, e.g., id., paras. 97-99, describing the dialog system engine setting output contexts when intents are matched, generating a response to the user request by applying a dialog system rule to at least a part of the user request, and delivering the response to the user.  Note that operation to identify a command and generate a response as described comprises determining a final command pattern from possible matchable command patterns in some form).
However, although Gelfenbeyn suggests immediate responsiveness of the dialog system engine to user requests (see, e.g., Gelfenbeyn, Abstract and paras. 12, 13, 36, 39, 49, 55, 64, 75, 102, 117, and 123), it does not appear to explicitly teach that the voice recognizer is configured to convert the voice signal into text in real time.
Sung teaches a GUI voice control apparatus (e.g., Sung, Abstract, describing systems, devices, computerized methods, and computer program media for integrating voice-based interaction and control into a native GUI of an executed application), comprising a voice recognizer configured to convert a voice signal into text in real time to update text information (see, e.g., id., paras. 68 and 71, describing embodiments in which a computing system implementing a voice-user interface includes a voice-user interface component library configured to obtain portions of text recognized in real time by a voice-service provider application from spoken utterance, and paras. 75 and 83, describing other embodiments comprising recognition of textual content in real time).
Gelfenbeyn and Sung are analogous art at least because they are from the same field of endeavor as the claimed invention, referencing GUI voice control apparatuses and methods and with teachings directed toward responsive voice control user interfaces.  Before the effective filing date, it would have been obvious to a person of ordinary skill in the art to combine the teachings of Gelfenbeyn and Sung and implement a GUI voice control apparatus in which a voice signal is converted into text in real time to update text information in order to provide a more responsive voice-based user interface (see, e.g., Sung, paras. 4, 7, 12, and 46; and in view of the value of real-time responsiveness in user interfaces well known in the art).  
However, Gelfenbeyn as modified by Sung appears to be silent regarding the natural language recognizer classifying a matching result of the text information into PARTIAL_MATCH, wherein the matchable command patterns have IMMEDIATE, NORMAL, or WAIT_END grades, wherein the PARTIAL MATCH means a state in which a matching is possible as text information is updated, wherein matching with a final command pattern is performed regardless of other matchable command patterns when a possible matchable command pattern has an IMMEDIATE grade, and wherein matching with a final command pattern is performed after input times out when a possible matchable command pattern has a WAIT_END grade.
Aleksic teaches a voice control apparatus (e.g., Aleksic, Abstract, describing systems, apparatus, computerized methods, and computer program media for receiving audio data including an utterance, obtaining context data that indicates one or more expected speech recognition results, determining an expected speech recognition result based on the context data, receiving an intermediate speech recognition result generated by a speech recognition engine, comparing the intermediate speech recognition result to the expected speech recognition result for the audio data based on the context data, and determining whether the intermediate speech recognition result corresponds to the expected speech recognition result for the audio data based on the context data), wherein a natural language recognizer classifies a matching result of text information into PARTIAL_MATCH in addition to MATCH and NO_MATCH (see, e.g., id., para. 38 and Fig. 3, describing and illustrating a diagram of an example process for endpointing of an utterance comprising determining whether there is either a match or no match or a partial match between intermediate speech recognition results and expected speech recognition results), wherein the matchable command patterns have IMMEDIATE, NORMAL, or WAIT_END grades (see, e.g., id., paras. 44-48 and Fig. 3, describing and illustrating the diagram of the example process for endpointing of an utterance comprising determining whether an intermediate speech recognition result corresponds to the expected speech recognition result for the audio data based on context data, describing arrangements in which the system initializes an end of speech condition and provides a final speech recognition result as soon as it is determined that an intermediate speech recognition result matches an expected speech recognition result [which can be viewed as representing an immediate matching grade], describing arrangements in which the system dynamically extends a timeout for an end of speech condition to receive additional audio data in response to determining the intermediate speech recognition result includes a partial match to the expected speech recognition result for the audio data based on the context data [which can be viewed as representing a normal matching grade], and describing arrangements in which the system extends a timeout by a predetermined or default amount of time in response to determining no match between an intermediate speech recognition result and context data [which can be viewed as representing a WAIT_END matching grade].  Note that, even taken in view of the specification, the claimed designations IMMEDIATE, NORMAL, or WAIT_END, without more, represent placeholders that can be viewed as nonfunctional descriptive material not entitled to patentable weight and that are rendered obvious by three different distinctions in matching behavior as discussed), wherein the PARTIAL MATCH means a state in which a matching is possible as text information is updated (see, e.g., id., para. 46, describing the system extending a timeout in response to determination of a partial match and describing an example in which a three word phrase is expected, the intermediate speech recognition result contains only two words, and the timeout is extended to allow additional time for the input of audio of the third word [representing a state in which a matching is still possible if the third word is received], and paras. 27 and 45, describing embodiments in which a result may be initiated as soon as the speech recognizer determines a match of sufficient similarity in relationship to the intermediate speech recognition result), wherein matching with a final command pattern is performed regardless of other matchable command patterns when a possible matchable command pattern has an IMMEDIATE grade (see, e.g., id., paras. 27 and 45, describing arrangements in which the system initializes an end of speech condition and provides a final speech recognition result as soon as it is determined that an intermediate speech recognition result matches an expected speech recognition result [which can be viewed as a matching performed regardless of other command patterns at least in the sense of other command patterns not being matched]), and wherein matching with a final command pattern is performed after input times out when a possible matchable command pattern has a WAIT_END grade (see, e.g., id., paras. 46-48, describing arrangements in which the system dynamically extends a timeout and describing arrangements in which the system extends a timeout by a predetermined or default amount of time in response to determining no match between an intermediate speech recognition result and context data [which can be viewed as representing a WAIT_END matching grade].  Note that a match determined based on a recognition result received after an extension can be viewed as representing matching a final command pattern after input times out at least in the sense that the match occurs after an initial timeout).
Aleksic is analogous art at least because it is from the same field of endeavor as the claimed invention, referencing voice control apparatus and methods and with teachings directed toward context-based voice recognition.  Before the effective filing date, it would have been obvious to a person of ordinary skill in the art to combine the teachings of Gelfenbeyn, Sung, and Aleksic and implement a GUI voice control apparatus in which a natural language recognizer classifies a matching result of text information into PARTIAL_MATCH in addition to MATCH and NO_MATCH to reduce the number of matchable command patterns, in which matchable command patterns have IMMEDIATE, NORMAL, or WAIT_END grades, in which the PARTIAL MATCH means a state in which a matching is possible as text information is updated, in which matching with a final command pattern is performed regardless of other matchable command patterns when a possible matchable command pattern has an IMMEDIATE grade, and in which matching with a final command pattern is performed after input times out when a possible matchable command pattern has a WAIT_END grade in order to improve accuracy and responsiveness of natural language processing in relationship to context data (see, e.g., Aleksic, paras. 3-8, 18-20, and 27; and in view of the value of partial or fuzzy classification known in the art and in view of the value of timeout-based processing well known in the art).
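For illustration only, the three endpointing behaviors attributed to Aleksic in the discussion of Claim 1 (finalizing on a match, extending the timeout on a partial match, and applying a default extension on no match) may be sketched as follows.  The sketch is not drawn from Aleksic, from the other cited references, or from Applicant’s specification; the function names, the word-prefix test used for a partial match, and the timeout values are hypothetical assumptions chosen only to make the distinctions concrete.

```python
from enum import Enum


class MatchResult(Enum):
    MATCH = "MATCH"
    PARTIAL_MATCH = "PARTIAL_MATCH"
    NO_MATCH = "NO_MATCH"


# Hypothetical timeout values (seconds); not taken from any cited reference.
DEFAULT_EXTENSION_S = 0.5
PER_MISSING_WORD_S = 0.3


def classify(intermediate: str, expected: str) -> MatchResult:
    """Compare an intermediate recognition result with an expected result.

    Assumption: a partial match is a non-empty word-level prefix of the
    expected phrase, so a full match remains possible as text is updated.
    """
    got = intermediate.lower().split()
    want = expected.lower().split()
    if got == want:
        return MatchResult.MATCH
    if got and got == want[: len(got)]:
        return MatchResult.PARTIAL_MATCH
    return MatchResult.NO_MATCH


def endpoint_action(intermediate: str, expected: str) -> tuple[str, float]:
    """Return an (action, timeout extension in seconds) pair."""
    result = classify(intermediate, expected)
    if result is MatchResult.MATCH:
        # Full match: end the utterance and finalize at once.
        return ("finalize", 0.0)
    if result is MatchResult.PARTIAL_MATCH:
        # Partial match: extend the timeout in proportion to the missing words.
        missing = len(expected.split()) - len(intermediate.split())
        return ("extend", missing * PER_MISSING_WORD_S)
    # No match: fall back to the default extension.
    return ("extend", DEFAULT_EXTENSION_S)
```

Under these assumptions, a two-word intermediate result measured against an expected three-word phrase is classified as PARTIAL_MATCH and the timeout is extended to allow for the third word, which parallels the example described in para. 46 of Aleksic as characterized above.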
Regarding Claim 2, Gelfenbeyn as modified by Sung and Aleksic teaches the GUI voice control apparatus according to Claim 1, wherein the GUI status information comprises GUI information and a service status (see, e.g., Gelfenbeyn, para. 14, describing context information as including various forms of conversational and environmental context information including mobile or software applications currently running on a client and a state of the GUI currently running on a client; para. 41, describing examples in which a dialog system interface can determine context information such as a current geographical location of the user, a current time, a date, and currently used mobile or software applications; para. 86, describing determining a current dialog context between a given end user and the dialog system engine; para. 100, describing the dialog system engine tracking user requests, dialog system responses, user activities, user location, currently running software or mobile applications, date and time, and other factors in order to identify a particular context for a certain user request; and paras. 6 and 123, describing an example in which a verbal user command “Next” may mean different actions in different mobile applications and describing a context label assigned to an application such as a browser such that a verbal request “Next” results in the dialog system engine generating a response that instructs a currently running browser to open a next webpage, such as by means of a callback URL, to fulfil the user request.  Note that various aspects of a status related to a user in interaction with the dialog system engine can be viewed as contextual information representing a service status consistent with discussion on page 15 of Applicant’s specification).
Regarding Claim 3, Gelfenbeyn as modified by Sung and Aleksic teaches the GUI voice control apparatus according to Claim 1, wherein the DB information comprises information on at least one of predefined command patterns and entities received from a command pattern and entity database (see, e.g., Gelfenbeyn, paras. 57, 80, 81, and 93, describing the one or more rule databases of the dialog system engine as storing dialog system rules [which can be viewed as predefined command patterns] including intents and entities and describing further embodiments in which entities and intents are stored in the one or more rule databases).
Regarding Claim 8, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control apparatus corresponding to the GUI voice control apparatuses of Claims 1 and 3.  The same rationales of rejection provided above are applicable.  Gelfenbeyn as modified by Sung and Aleksic further teaches the GUI voice control apparatus comprising a communicator, by the processor, configured to transmit a voice signal received in real time and the context information to a voice conversion server, transmit the context information to a natural language recognition server, and receive an intent and entity of the voice signal (see, e.g., Gelfenbeyn, paras. 12 and 36, describing embodiments in which the dialog system engine is running on a server and a dialog system interface is running on a client; para. 40, describing embodiments in which dialog system interfaces are on a server so that the dialog systems become a part of a website or web service and in which dialog system engines are implemented on a server such that their functionalities can be accessible to dialog system interfaces over the Internet; para. 43, describing embodiments in which the platform interface is a server-based solution so that it is accessible via the Internet; and para. 50, describing embodiments in which the dialog system engine, and the system for context-based natural language processing implemented as a part of the dialog system engine, are embedded or installed in a server).
Regarding Claim 9, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control apparatus corresponding to the GUI voice control apparatus of Claim 2.  In view of the discussion of Claim 8, the same rationale of rejection provided above is applicable.
Regarding Claim 10, Gelfenbeyn as modified by Sung and Aleksic teaches the GUI voice control apparatus according to Claim 8, wherein the voice conversion server comprises a text converter configured to convert the voice signal to text in real time without the context information to update text information and transmit the updated text information (see, e.g., Gelfenbeyn, para. 51, describing the automatic speech recognizer [ASR] as receiving and processing speech-based user inputs into a sequence of parameter vectors and describing embodiments in which the ASR is in the dialog system interface or the dialog system engine; paras. 12, 36, 40, 43, and 53, describing various server-based arrangements; and para. 13, describing embodiments in which user requests are pre-processed such that the dialog system interface recognizes spoken words and transforms audio user input into a text-based user input by applying a dialog system rule or a statistics-based dialog system responding scheme and describing arrangements in which, if the dialog system engine determines that the user request cannot be understood out of context, the dialog system engine identifies a context stored in a context database that relates to at least a portion of the user request [indicating arrangements in which a user request is pre-processed or understood without use of context information]).
Regarding Claim 11, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 1.  The same rationale of rejection provided above is applicable.
Regarding Claim 12, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 2.  The same rationale of rejection provided above is applicable.
Regarding Claim 13, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 3.  The same rationale of rejection provided above is applicable.
Regarding Claim 14, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 4.  The same rationale of rejection provided above is applicable.
Regarding Claim 18, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claims 8 and 10.  The same rationales of rejection provided above are applicable.
Regarding Claim 19, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 9.  In view of the discussion of Claim 18, the same rationale of rejection provided above is applicable.
Regarding Claim 20, Gelfenbeyn as modified by Sung and Aleksic teaches the GUI voice control method according to Claim 18, wherein, when there is no command pattern matching the text information, the text information is reset and text information updated afterwards in real time is processed (see, e.g., Sung, paras. 92, 95-97, and 107-110 and Fig. 4, describing and illustrating a flowchart of an exemplary process for initiating and performing voice-based interaction with and control of an executed application in which audio data indicative of a user’s interaction with an application is captured, contextual data is accessed and included in contextual query data transmitted to a voice server provider system, and operations are performed consistent with a structured response bundle received from the voice server provider system and describing embodiments in which the user’s utterances fail to include one or more parameters necessary to complete one or more actions, content is presented to the user which prompts the user to provide the information necessary to complete the one or more actions, and the process passes back to the step in which audio data indicative of a user’s interaction with an application is captured to capture additional utterances spoken by the user in response to the presented audio content [representing a reset of text information at least in the sense of restarting process steps of capturing and recognizing user utterances and representing no matching command pattern at least in the sense of no complete matching], and paras. 71, 75, and 83, describing embodiments comprising recognition of textual content in real time.  
One of ordinary skill in the art would have been motivated to implement such an arrangement in which method steps of collecting text information are reset when no command pattern completely matches received text information under the same rationale as provided in the discussion of Claim 1 above and further in order to ensure complete information is received necessary for executing a certain command [see id., para. 109]).
Regarding Claim 26, Gelfenbeyn as modified by Sung and Aleksic teaches the GUI voice control apparatus according to Claim 1, wherein the final command pattern is obtained in real-time before a separate end point detection process (see, e.g., id., paras. 68, 71, 75, and 83, describing embodiments in which voice recognition occurs in real time; and see, e.g., Aleksic, paras. 44-48 and Fig. 3, describing and contrasting arrangements in which the system initializes an end of speech condition and provides a final speech recognition result as soon as it is determined that an intermediate speech recognition result matches an expected speech recognition result [representing obtaining a final command pattern] and arrangements in which a timeout is extended [representing an endpoint detection process in some form].  Under such arrangements as described, immediate recognition of a result before a timeout is extended can be viewed as representing obtaining a final command pattern before a separate end point detection process as claimed.  One of ordinary skill in the art would have been motivated to perform such processing in real time under the same rationale as provided in the discussion of Claim 1 above).
Regarding Claim 27, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control apparatus corresponding to the GUI voice control apparatus of Claim 26.  In view of the discussion of Claim 8, the same rationale of rejection provided above is applicable.
Regarding Claim 28, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 26.  The same rationale of rejection provided above is applicable.
Regarding Claim 29, Gelfenbeyn as modified by Sung and Aleksic teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 26.  In view of the discussion of Claim 18, the same rationale of rejection provided above is applicable.
Regarding Claim 30, Gelfenbeyn as modified by Sung and Aleksic teaches the GUI voice control apparatus according to Claim 26, wherein the separate end point detection process is a voice interval after receiving a voice signal input and a pause period (see, e.g., Aleksic, paras. 25, 29, 32, and 37, describing embodiments in which an end-of-speech [EOS] timeout may be extended to allow for additional audio data to be input in relationship to context and describing examples in which a user provides input followed by a pause and an EOS timeout is extended for the user to speak a remaining portion of a response [representing a voice interval after receiving a voice signal input and a pause period].  One of ordinary skill in the art would have been motivated to implement a separate end point detection process that is a voice interval after receiving a voice signal input and a pause period under the same rationale as provided in the discussion of Claim 1 above).

Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Gelfenbeyn in view of Sung and Aleksic and in further view of Braho et al., U.S. Patent Application 2011/0093269 A1 (published Apr. 21, 2011) (hereinafter “Braho”).
Regarding Claim 7, Gelfenbeyn as modified by Sung and Aleksic teaches the voice control apparatus according to Claim 1 as discussed above and further teaches the apparatus wherein the natural language recognizer, when there is no command pattern matching the text information, addresses the text input by resetting the text information and processes text information updated in real time afterwards (see, e.g., Sung, paras. 92, 95-97, and 107-110 and Fig. 4, describing and illustrating a flowchart of an exemplary process for initiating and performing voice-based interaction with and control of an executed application in which audio data indicative of a user’s interaction with an application is captured, contextual data is accessed and included in contextual query data transmitted to a voice server provider system, and operations are performed consistent with a structured response bundle received from the voice server provider system and describing embodiments in which the user’s utterances fail to include one or more parameters necessary to complete one or more actions, content is presented to the user which prompts the user to provide the information necessary to complete the one or more actions, and the process passes back to the step in which audio data indicative of a user’s interaction with an application is captured to capture additional utterances spoken by the user in response to the presented audio content [representing a reset of text information at least in the sense of restarting process steps of capturing and recognizing user utterances and representing no matching command pattern at least in the sense of no complete matching], and paras. 71, 75, and 83, describing embodiments comprising recognition of textual content in real time.  
One of ordinary skill in the art would have been motivated to implement such an arrangement in which method steps of collecting text information are reset when no command pattern completely matches received text information under the same rationale as provided in the discussion of Claim 1 above and further in order to ensure complete information is received necessary for executing a certain command [see id., para. 109]).
However, Gelfenbeyn as modified by Sung and Aleksic does not appear to explicitly teach ignoring the text input by resetting the text information.
Braho teaches a GUI voice control apparatus (e.g., Braho, Abstract, describing a speech recognition system that receives and analyzes speech input from a user in order to recognize and accept a response from the user), wherein a language recognizer, when there is no command pattern matching text information, ignores text input by resetting the text information (see, e.g., id., para. 7, describing a speech recognizer performing speech recognition by analyzing received speech input, determining how closely input matches models, and rejecting or ignoring the speech input if a confidence factor is below an acceptance threshold and describing embodiments in which a user is required to repeat the speech input [indicating a resetting of recognition information in some form]).
Braho is analogous art at least because it is from the same field of endeavor as the claimed invention, referencing speech recognition methods and with teachings directed toward responsive voice control user interfaces.  Before the effective filing date, it would have been obvious to a person of ordinary skill in the art to combine the teachings of Gelfenbeyn, Sung, Aleksic, and Braho and implement a GUI voice control apparatus in which a language recognizer ignores text input by resetting text information in order to provide more accurate responses (see, e.g., Braho, paras. 18-20; and in view of the value of handling of improper input well known in the art).  
Regarding Claim 17, Gelfenbeyn as modified by Sung and Aleksic and as further modified by Braho teaches a GUI voice control method corresponding to the GUI voice control apparatus of Claim 7.  The same rationale of rejection provided above is applicable.

Response to Arguments
Applicant’s arguments filed March 7, 2022, have been fully considered but are to some extent moot in view of the new grounds of rejection.  To the extent the arguments still apply, they are not persuasive.  Applicant argues on pages 10 and 11 of the Amendment (pages 2 and 3 of the Remarks) that Gelfenbeyn in view of Sung and Aleksic fails to teach or suggest limitations including “wherein matching with a final command pattern is performed regardless of other matchable command patterns when a possible matchable command pattern has an IMMEDIATE grade; wherein matching with a final command pattern is not performed regardless of other matchable command patterns when the matching result of the text information is MATCH or PARTIAL_MATCH and matches a possible matchable command pattern having a NORMAL grade; wherein matching with a final command pattern is performed after input times out when a possible matchable command pattern has a WAIT_END grade; and wherein the final command pattern is a finally matched command pattern from the possible matchable command patterns,” arguing that “[i]n Aleksic, only two options are possible – the speech recognition process is terminated as soon as there is a match, or if there is only a partial or no match, the time period is extended to obtain additional audio data” and so cannot teach the three required IMMEDIATE, NORMAL and WAIT_END grades.  While impacted by the related written description and indefiniteness rejections above, Applicant’s arguments are inconsistent with a broadest reasonable interpretation of the recited grades and related claim language and discount relevant teachings of Aleksic.
The teachings of Aleksic render obvious the limitations at issue under a broadest reasonable interpretation standard.  As noted in the rejections above, Applicant’s original disclosure does not appear to articulate a particular relationship between IMMEDIATE, NORMAL, and WAIT_END grades, and even if it is assumed that command patterns may only be assigned one of the three grades, no disclosure appears to reasonably support that the grades cannot share aspects or features.  A broadest reasonable interpretation of the IMMEDIATE, NORMAL, and WAIT_END grades includes various interpretations of commands that may be interpreted more immediately than other commands (any of which can be viewed as NORMAL versus commands with different handling) and commands that may be interpreted as involving more of a wait period in contrast to other commands.  Contrary to Applicant’s assertions, the noted disclosures of Aleksic related to Figure 3 do not preclude other considerations such as further processing in instances in which there is a match or further processing in instances where there is a partial match or no match, which is taught by other disclosures (see, e.g., Aleksic, paras. 25, 29, 32, 37, and 46-48).  Further, obviousness of other variations that may be implicated by the claim language at issue is viewed over the teachings of the applied references taken together, such as the pre-processing and context considerations of Gelfenbeyn as discussed above.

Conclusion
The following prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure: Kwon, Nam-yeong, U.S. Patent Application 2017/0069317 A1 (published Mar. 9, 2017), teaching a voice recognition processor configured to determine recognition of utterances in relationship to classifications such as whether a voice command is a normal recognition utterance.
Note that pinpoint citations to prior art references provided in this action are exemplary and should not be taken as limiting; each of the references as a whole is considered to provide disclosure relevant to the claimed invention and may be relied upon for all that it would have reasonably suggested to one of ordinary skill in the art.  See MPEP § 2123.
Applicant’s Amendment necessitated any new grounds of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Conrad Pack whose telephone number is (571) 270-7967 and fax number is (571) 270-8967.  The examiner can normally be reached on Monday through Friday, 9:30 to 6:00 Eastern Time.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sherief Badawi, can be reached on 571-272-9782.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Conrad Pack/
Examiner, Art Unit 2174
6/17/2022


/SHERIEF BADAWI/
Supervisory Patent Examiner, Art Unit 2174