DETAILED ACTION
This Office action is in response to the amendments filed on 2/28/2022. The previous Advisory Action indicated that claims 3-7 contained allowable subject matter. This Office action rescinds that indication. Claims 1-21 are rejected in this Office action.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Response to Arguments
Claims 3-7:
The previous Advisory Action indicated that claims 3-7 contained allowable subject matter. In this Office action, these claims are rejected over new references. Please see the relevant 35 U.S.C. 103 sections below.

Rejections under 35 USC 103
Claims 1 and 15
The applicant presents the same arguments that were previously found unpersuasive in the Final Office Action mailed 11/30/2021. That Office action stated: The applicant’s arguments have been considered but are not found persuasive. The applicant asserts on page 10, last paragraph, that Lee, at [0062], does not teach the recited “based on a sequence of detected phrases and the respective associated time stamps,” because in Lee the timestamps are associated with tasks, and not with the detected phrases. In fact, Lee [0062] recites “At the next phase of timestamp 2, the user provides an input utterance”, which unequivocally states that the timestamp is associated with the utterance.
Claims 2, 8, 9 and 16
The applicant presents the same arguments that were previously found unpersuasive in the Final Office Action mailed 11/30/2021, and the rationale provided there remains applicable. That Office action stated: The applicant’s arguments have been considered but are not persuasive. On page 11, para 3, the applicant argues that “a quantized time stamp for the detected phrase which is relative to a previously detected phrase” is not taught by Hardie or Lee. Timestamps mark moments in time, and they are always relative to one another and unique. A timestamp taken at any time is relative to the timestamp before it because time marches forward, as do timestamps. The example cited gives (light, 2) as indicative of a lower time stamp than an earlier time stamp (turn, 9), which simply indicates an instance in time further back. The cited example shows relative timestamps, as does any timestamp, which by definition increases as time progresses.
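By way of illustration only, the relative nature of time stamps discussed above can be sketched in a few lines of Python. The sketch assumes detection times given in integer milliseconds and a 100 ms quantum; the function name quantize_relative, the quantum, and the sample phrases are hypothetical and are not taken from Hardie, Lee, or the application.

```python
from typing import List, Tuple


def quantize_relative(
    detections: List[Tuple[str, int]], quantum_ms: int = 100
) -> List[Tuple[str, int]]:
    """Hypothetical helper: turn absolute detection times (ms) into
    quantized time stamps measured from the previously detected phrase."""
    result: List[Tuple[str, int]] = []
    previous_ms = None
    for phrase, time_ms in detections:
        # The first phrase has no predecessor, so its relative stamp is 0.
        delta_ms = 0 if previous_ms is None else time_ms - previous_ms
        # Integer division maps the elapsed time onto whole quanta.
        result.append((phrase, delta_ms // quantum_ms))
        previous_ms = time_ms
    return result


# Assumed sample input: three phrases detected at 1.0 s, 1.9 s and 2.1 s.
stamps = quantize_relative([("play", 1000), ("next", 1900), ("song", 2100)])
print(stamps)
```

With the assumed 100 ms quantum, the sample input yields the stamps 0, 9 and 2, each counted from the detection that precedes it.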
Claims 10 and 17
The arguments are moot as new references are used for rejection.
Claims 11 and 18
The arguments are moot as new references are used for rejection. See the 103 section below.
Claims 12, 14, 19 and 21
Claims 12, 14, 19 and 21 use new references for rejection. See the 103 section below.
Claims 13 and 20
In response to applicant's argument that the examiner has combined an excessive number of references, reliance on a large number of references in a rejection does not, without more, weigh against the obviousness of the claimed invention. See In re Gorman, 933 F.2d 982, 18 USPQ2d 1885 (Fed. Cir. 1991).
In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Claims 1-21 stand rejected in this office action.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 8, 9, 12, 15, 16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Hardie (US 10482904) in view of Lee (US 20180075847 A1).

With respect to claim 1, Hardie teaches a system/method/computer readable medium/electronic memory to store an electronic representation of an audio stream; an electronic processor coupled to the memory; and logic circuitry coupled to the electronic processor and the electronic memory, the logic circuitry (C29 3rd para “As described herein, memory 210 and/or 404 may include volatile and nonvolatile memory, removable and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program component, or other data. Such memory 210 and/or 404 includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, RAID storage systems, or any other medium which can be used to store the desired information and which can be accessed by a computing device. The memory 210 and/or 404 may be implemented as computer-readable storage media (“CRSM”), which may be any available physical media accessible by the processor(s) 204 and/or 400 to execute instructions stored on the memory 210 and/or 404. In one basic implementation, CRSM may include random access memory (“RAM”) and Flash memory. In other implementations, CRSM may include, but is not limited to, read-only memory (“ROM”), electrically erasable programmable read-only memory (“EEPROM”), or any other tangible medium which can be used to store the desired information and which can be accessed by the processor(s)”) to:
electronically detect a phrase in the stored electronic representation of the audio stream based on a pre-defined vocabulary (C18, para 1 “The wakeword may produce a wakeword confidence level”),
electronically associate a time stamp (C18, para1 “The wakeword detection 308 may also produce a timestamp indicating the time at which the wakeword was detected.”) with the detected phrase,
and electronically classify a spoken intent (C4 last para, “In various examples, the weighted confidence scores may not be higher than a threshold confidence score after performing ASR, and a third stage of analysis must be performed. In such examples, the remote speech processing service may perform natural language understanding (NLU) on the textual data determined using ASR on the audio signals to determine an intent expressed by the user in the speech utterance”) [[based on a sequence of detected phrases and the respective associated time stamps.]]
Hardie does not teach classifying the spoken intent based on a sequence of detected phrases and the respective associated time stamps.
Lee teaches based on a sequence of detected phrases and the respective associated time stamps ([0062] At the next phase of timestamp 2, the user provides an input utterance “I want to go to thai” 630. The web-based conversational agent 140 may then determine two possible tasks: a local restaurant task with a probability 0.6 and a travel task with a probability 0.4, because there is ambiguity about the intent of the user. The user may want to go to the country Thailand or may want to go to a restaurant of Thailand food. The local restaurant task has a constraint (food, thai, 0.8) and three results (restaurant 1), (restaurant 2), (restaurant 3) in the database that are matching the constraint. The travel task has a constraint (to, thai, 0.7) and no result matching this constraint in the database. The constraints may be determined based on parsing of the input utterance 630 and some features listed in FIG. 5, as well as the previous tasks in the task lineage, i.e. the task information obtained at timestamp 0 and timestamp 1. For example, based on the previous transit tasks “from edgewater to new york” at timestamp 0 and “from leonia to new york” at timestamp 1, the web-based conversational agent 140 may estimate with a low probability 0.4 that the user's intent at timestamp 2 is a travel task “to Thailand”. This probability may be even lower when timestamp 1 and timestamp 2 are very close in time, because it is unlikely for the user to change mind about a transit task so fast.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie to include the teachings of Lee, the motivation being that timestamps keep track of the task lineages (Lee, [0064]).
With respect to claim 8, Hardie teaches electronically detecting a phrase in an electronic representation of an audio stream based on a pre-defined vocabulary (C18, para 1 “The wakeword may produce a wakeword confidence level”);
electronically associating a quantized time stamp with the detected phrase which is relative to a previously detected phrase (C18, para 1 “The wakeword detection 308 may also produce a timestamp indicating the time at which the wakeword was detected.”); and
electronically classifying a spoken intent [[based on a sequence of detected phrases and the respective associated time stamps]] (C4 last para, “In various examples, the weighted confidence scores may not be higher than a threshold confidence score after performing ASR, and a third stage of analysis must be performed. In such examples, the remote speech processing service may perform natural language understanding (NLU) on the textual data determined using ASR on the audio signals to determine an intent expressed by the user in the speech utterance”).
Hardie does not teach classifying the spoken intent based on a sequence of detected phrases and the respective associated time stamps.
Lee teaches based on a sequence of detected phrases and the respective associated time stamps ([0062] At the next phase of timestamp 2, the user provides an input utterance “I want to go to thai” 630. The web-based conversational agent 140 may then determine two possible tasks: a local restaurant task with a probability 0.6 and a travel task with a probability 0.4, because there is ambiguity about the intent of the user. The user may want to go to the country Thailand or may want to go to a restaurant of Thailand food. The local restaurant task has a constraint (food, thai, 0.8) and three results (restaurant 1), (restaurant 2), (restaurant 3) in the database that are matching the constraint. The travel task has a constraint (to, thai, 0.7) and no result matching this constraint in the database. The constraints may be determined based on parsing of the input utterance 630 and some features listed in FIG. 5, as well as the previous tasks in the task lineage, i.e. the task information obtained at timestamp 0 and timestamp 1. For example, based on the previous transit tasks “from edgewater to new york” at timestamp 0 and “from leonia to new york” at timestamp 1, the web-based conversational agent 140 may estimate with a low probability 0.4 that the user's intent at timestamp 2 is a travel task “to Thailand”. This probability may be even lower when timestamp 1 and timestamp 2 are very close in time, because it is unlikely for the user to change mind about a transit task so fast.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie to include the teachings of Lee, the motivation being that timestamps keep track of the task lineages (Lee, [0064]).

With respect to claim 15, Hardie teaches electronically detect a phrase in an electronic representation of an audio stream based on a pre-defined vocabulary (C18, para 1 “The wakeword may produce a wakeword confidence level”);
electronically associate a time stamp with the detected phrase (C18, para1 “The wakeword detection 308 may also produce a timestamp indicating the time at which the wakeword was detected.”); and 
electronically classify a spoken intent [[based on a sequence of detected phrases and the respective associated time stamps]] (C4 last para, “In various examples, the weighted confidence scores may not be higher than a threshold confidence score after performing ASR, and a third stage of analysis must be performed. In such examples, the remote speech processing service may perform natural language understanding (NLU) on the textual data determined using ASR on the audio signals to determine an intent expressed by the user in the speech utterance”).
Hardie does not teach classifying the spoken intent based on a sequence of detected phrases and the respective associated time stamps.
Lee teaches based on a sequence of detected phrases and the respective associated time stamps ([0062] At the next phase of timestamp 2, the user provides an input utterance “I want to go to thai” 630. The web-based conversational agent 140 may then determine two possible tasks: a local restaurant task with a probability 0.6 and a travel task with a probability 0.4, because there is ambiguity about the intent of the user. The user may want to go to the country Thailand or may want to go to a restaurant of Thailand food. The local restaurant task has a constraint (food, thai, 0.8) and three results (restaurant 1), (restaurant 2), (restaurant 3) in the database that are matching the constraint. The travel task has a constraint (to, thai, 0.7) and no result matching this constraint in the database. The constraints may be determined based on parsing of the input utterance 630 and some features listed in FIG. 5, as well as the previous tasks in the task lineage, i.e. the task information obtained at timestamp 0 and timestamp 1. For example, based on the previous transit tasks “from edgewater to new york” at timestamp 0 and “from leonia to new york” at timestamp 1, the web-based conversational agent 140 may estimate with a low probability 0.4 that the user's intent at timestamp 2 is a travel task “to Thailand”. This probability may be even lower when timestamp 1 and timestamp 2 are very close in time, because it is unlikely for the user to change mind about a transit task so fast.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie to include the teachings of Lee, the motivation being that timestamps keep track of the task lineages (Lee, [0064]).

With respect to claims 2, 9, and 16, Hardie teaches wherein the logic circuitry is further to: electronically monitor a continuous audio stream (C9 last para, “In some examples, the speech interface devices 108A and 108B may continuously collect or monitor, using various sensors, the environment 102 and the device states, to collect and determine the metadata 116. In other examples, responsive to a wakeword, the speech interface devices 108A and 108B may use the various sensors to collect and determine the metadata while streaming the audio signals 114A and 114B to the remote speech processing service 110.”);
electronically detect the phrase in the continuous audio stream (C18, para 1 “The wakeword may produce a wakeword confidence level”); 
and electronically compute a quantized time stamp for the detected phrase which is relative to a previously detected phrase (C18 2nd para “The wakeword detection 308 may also produce a timestamp indicating the time at which the wakeword was detected.”).

With respect to claims 12 and 19, Hardie does not explicitly disclose, but Lee teaches, a second neural network trained to return a probability for each of two or more intent classifications based on detected phrases and time stamps respectively associated with the detected phrases as input features to the second neural network (Lee: [0062] At the next phase of timestamp 2, the user provides an input utterance “I want to go to thai” [detected phrases] 630. The web-based conversational agent 140 may then determine two possible tasks: a local restaurant task with a probability 0.6 and a travel task with a probability 0.4 [two or more intents], because there is ambiguity about the intent of the user. The user may want to go to the country Thailand or may want to go to a restaurant of Thailand food. The local restaurant task has a constraint (food, thai, 0.8) and three results (restaurant 1), (restaurant 2), (restaurant 3) in the database that are matching the constraint. The travel task has a constraint (to, thai, 0.7) and no result matching this constraint in the database. The constraints may be determined based on parsing of the input utterance 630 and some features listed in FIG. 5, as well as the previous tasks in the task lineage, i.e. the task information obtained at timestamp 0 and timestamp 1. For example, based on the previous transit tasks “from edgewater to new york” at timestamp 0 and “from leonia to new york” at timestamp 1, the web-based conversational agent 140 may estimate with a low probability 0.4 that the user's intent at timestamp 2 is a travel task “to Thailand”. This probability may be even lower when timestamp 1 and timestamp 2 are very close in time, because it is unlikely for the user to change mind about a transit task so fast; and Lee: [0065] The confidence scores referred to in these methods are typically obtained by training on logs of previous interactions with an (actual or simulated) dialog system, as in the dialog state tracking challenges. Common methods for training include minimizing negative log likelihood of log linear or neural network models.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie to include the teachings of Lee, the motivation being that timestamps keep track of the task lineages (Lee, [0064]).

Claims 3, 5, and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Hardie and Lee as applied to claims 1, 3, and 5, respectively, and further in view of Jaffari (US 20190251425 A1), Machado (US 20200089757 A1) and Fukuda (US 20190378006 A1).

With respect to claim 3, Hardie and Lee fail to explicitly disclose, but Jaffari teaches, wherein the logic circuitry comprises: an always-on phrase spotter circuit [[with a first neural network with an acoustic model and a hidden Markov model to detect the phrase in the audio stream]]; and
a selectively powered [[intent classification circuit]] coupled to the always-on phrase spotter circuit, wherein the selectively powered intent classification circuit is to power up in response to a signal from the always-on phrase spotter circuit to electronically classify the spoken intent (Jaffari: [0045] Thus, the main neural network [classification circuit] 1120 would only turn on to process the input after it receives a trigger signal from the always-on neural network 1110 that, for example, a preliminary recognition of the input has been conducted, and [0063] The always ON neural network processes the input data continuously to wake up the main neural network when the presence of features that need to be classified is detected... The proposed concept could be applied but not limited to various applications such as voice keyword detection, natural language processing, image detection and anomaly detection for biological signals.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie and Lee to include the teachings of Jaffari, the motivation being to provide a desired tradeoff between power consumption and reducing the probability of false negatives in detecting an event (Jaffari, [0046]).
Hardie, Lee, and Jaffari fail to explicitly disclose, but Machado teaches, an intent classification circuit (Machado: [0062] Intent service 386 [intent classification circuit] receives a natural language text string (e.g., as output by voice to text service 382) and parses the words and/or the phrases in the natural language text string to determine a desired intent of the natural language text string and/or a confidence in that determination. In some examples, the intent may be selected from a class of intents associated with desired data changes to a database and/or data store. In some examples, the class of intents may include one or more of inserting an entry (e.g., a record) in a database table, updating one or more fields of an entry (e.g., a record) in a database table, and/or the like. In some examples, intent service 386 may utilize one or more neural network classifiers to determine the intent and/or confidence level for the determined intent.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie, Lee, and Jaffari to include the teachings of Machado, the motivation being to allow a user to provide unstructured input in the form of spoken or written natural language (Machado, [0047]).
Hardie, Lee, Jaffari, and Machado fail to explicitly disclose, but Fukuda teaches, a first neural network with an acoustic model and a hidden Markov model to detect the phrase in the audio stream ([0031] In the speech recognition, a neural network (NN) model is typically used for an acoustic model to produce a probability distribution over HMM (Hidden Markov Model) states from a speech (audio) signal.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie, Lee, Jaffari, and Machado to include the teachings of Fukuda, the motivation being to train a mixed-band model that is capable of handling input signals in both of the broadband and narrowband systems (Fukuda, [0032]).

With respect to claim 5, Lee further teaches a second neural network trained to return a probability for each of two or more intent classifications based on detected phrases and time stamps respectively associated with the detected phrases as input features to the second neural network (Lee: [0062] At the next phase of timestamp 2, the user provides an input utterance “I want to go to thai” [detected phrases] 630. The web-based conversational agent 140 may then determine two possible tasks: a local restaurant task with a probability 0.6 and a travel task with a probability 0.4 [two or more intents], because there is ambiguity about the intent of the user. The user may want to go to the country Thailand or may want to go to a restaurant of Thailand food. The local restaurant task has a constraint (food, thai, 0.8) and three results (restaurant 1), (restaurant 2), (restaurant 3) in the database that are matching the constraint. The travel task has a constraint (to, thai, 0.7) and no result matching this constraint in the database. The constraints may be determined based on parsing of the input utterance 630 and some features listed in FIG. 5, as well as the previous tasks in the task lineage, i.e. the task information obtained at timestamp 0 and timestamp 1. For example, based on the previous transit tasks “from edgewater to new york” at timestamp 0 and “from leonia to new york” at timestamp 1, the web-based conversational agent 140 may estimate with a low probability 0.4 that the user's intent at timestamp 2 is a travel task “to Thailand”. This probability may be even lower when timestamp 1 and timestamp 2 are very close in time, because it is unlikely for the user to change mind about a transit task so fast; and Lee: [0065] The confidence scores referred to in these methods are typically obtained by training on logs of previous interactions with an (actual or simulated) dialog system, as in the dialog state tracking challenges. Common methods for training include minimizing negative log likelihood of log linear or neural network models.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie to include the teachings of Lee, the motivation being that timestamps keep track of the task lineages (Lee, [0064]).

With respect to claim 7, Hardie and Lee fail to explicitly disclose, but Jaffari teaches, wherein the phrase spotter circuit is further to: asynchronously signal the intent classification circuit to power up and trigger the [[second neural network when a sequence of detected phrases is ready for classification]] (Jaffari: [0045] Thus, the main neural network [classification circuit] 1120 would only turn on to process the input after it receives a trigger signal [asynchronously signal] from the always-on neural network 1110 that, for example, a preliminary recognition of the input has been conducted)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie and Lee to include the teachings of Jaffari, the motivation being to provide a desired tradeoff between power consumption and reducing the probability of false negatives in detecting an event (Jaffari, [0046]).
Hardie, Lee, and Jaffari fail to explicitly disclose, but Machado teaches, the second neural network when a sequence of detected phrases is ready for classification (Machado: [0062] Intent service 386 [intent classification circuit and second neural network] receives a natural language text string (e.g., as output by voice to text service 382) and parses the words and/or the phrases in the natural language text string to determine a desired intent of the natural language text string and/or a confidence in that determination. In some examples, the intent may be selected from a class of intents associated with desired data changes to a database and/or data store. In some examples, the class of intents may include one or more of inserting an entry (e.g., a record) in a database table, updating one or more fields of an entry (e.g., a record) in a database table, and/or the like. In some examples, intent service 386 may utilize one or more neural network classifiers to determine the intent and/or confidence level for the determined intent.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie, Lee, and Jaffari to include the teachings of Machado, the motivation being to allow a user to provide unstructured input in the form of spoken or written natural language (Machado, [0047]).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Hardie, Lee, Jaffari, Machado and Fukuda as applied to claim 3, and further in view of Fu (US 11024316 B1).
With respect to claim 4, Hardie, Lee, Jaffari, Machado, and Fukuda fail to explicitly disclose, but Fu teaches, wherein the acoustic model is further configured to: automatically add time stamp information to text data for the detected phrase (Col 16 ll 46-53: At process 10002, feeding one or more audios into one or more automated speech recognition stream servers, and/or generating one or more transcript words with one or more timestamps for one or more current windows are performed; and Col 18 ll 39-42: For example, the N automated speech recognition (ASR) systems are configured to use at least one acoustic model (AM) and/or at least one language model (LM).)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie, Lee, Jaffari, Machado and Fukuda to include the teachings of Fu, the motivation being that the ASR system 614 is updated and/or improved, for example, by feeding training data to the model (Fu, [0037]).

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Hardie, Lee, Jaffari, Machado and Fukuda as applied to claim 5, and further in view of McGann (US 20180338041).

With respect to claim 6, Hardie, Lee, Jaffari, Machado and Fukuda do not disclose, but McGann teaches, wherein the logic circuitry is further to: classify the spoken intent in accordance with a highest probability of the two or more intent classifications ([0015] “According to one embodiment of the invention, the selected intent has a highest probability of the computed probabilities”).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie, Lee, Jaffari, Machado and Fukuda to include the teachings of McGann, the motivation being to increase flexibility of automated speech-enabled systems to allow users to traverse a finite number of execution strategies (McGann, [0003-0004]).

Claims 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Hardie and Lee as applied to claims 8 and 15, respectively, and further in view of Fukuda (US 20190378006 A1).

With respect to claims 10 and 17, Hardie and Lee fail to explicitly disclose, but Fukuda teaches, wherein the logic circuitry comprises: a first neural network with an acoustic model which includes both an acoustic model and a hidden Markov model ([0031] In the speech recognition, a neural network (NN) model is typically used for an acoustic model to produce a probability distribution over HMM (Hidden Markov Model) states from a speech (audio) signal.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie and Lee to include the teachings of Fukuda, the motivation being to train a mixed-band model that is capable of handling input signals in both of the broadband and narrowband systems (Fukuda, [0032]).

Claims 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Hardie, Lee, and Fukuda as applied to claims 10 and 17, respectively, and further in view of Fu (US 11024316 B1).

With respect to claims 11 and 18, Hardie, Lee, and Fukuda fail to explicitly disclose, but Fu teaches, wherein the acoustic model is further configured to: automatically add time stamp information to text data for the detected phrase (Col 16 ll 46-53: At process 10002, feeding one or more audios into one or more automated speech recognition stream servers, and/or generating one or more transcript words with one or more timestamps for one or more current windows are performed; and Col 18 ll 39-42: For example, the N automated speech recognition (ASR) systems are configured to use at least one acoustic model (AM) and/or at least one language model (LM).)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie, Lee, and Fukuda to include the teachings of Fu, the motivation being that the ASR system 614 is updated and/or improved, for example, by feeding training data to the model (Fu, [0037]).


Claims 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hardie and Lee as applied to claims 12 and 19, respectively, and further in view of McGann (US 20180338041).

With respect to claims 13 and 20, Hardie and Lee do not disclose, but McGann teaches, wherein the logic circuitry is further to: classify the spoken intent in accordance with a highest probability of the two or more intent classifications ([0015] “According to one embodiment of the invention, the selected intent has a highest probability of the computed probabilities”).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie and Lee to include the teachings of McGann, the motivation being to increase flexibility of automated speech-enabled systems to allow users to traverse a finite number of execution strategies (McGann, [0003-0004]).
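The combination applied to claims 13 and 20 can be illustrated with a minimal Python sketch, offered only as an aid to reading and not as a characterization of how either reference is implemented. Lee [0062] assigns probabilities of 0.6 and 0.4 to two candidate tasks, and McGann [0015] selects the intent having the highest of the computed probabilities. The function name select_intent, the intent labels, and the dictionary layout are assumptions of the sketch.

```python
from typing import Dict


def select_intent(probabilities: Dict[str, float]) -> str:
    """Hypothetical helper: return the intent label with the highest probability."""
    if not probabilities:
        raise ValueError("at least one intent probability is required")
    # max() with key=probabilities.get picks the label whose value is largest.
    return max(probabilities, key=probabilities.get)


# Probabilities taken from the example in Lee [0062]; the labels are assumed.
scores = {"local_restaurant": 0.6, "travel": 0.4}
selected = select_intent(scores)
print(selected)
```

Under these assumptions the local restaurant task is selected, because 0.6 exceeds 0.4.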


Claims 14 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Hardie and Lee as applied to claims 12 and 19, respectively, and further in view of Jaffari (US 20190251425 A1) and Machado (US 20200089757 A1).

With respect to claims 14 and 21, Hardie and Lee fail to explicitly disclose, but Jaffari teaches, wherein phrase spotter circuit is further to: asynchronously signal the intent classification circuit to power up and trigger the [[second neural network when a sequence of detected phrases is ready for classification]] (Jaffari: [0045] Thus, the main neural network [classification circuit] 1120 would only turn on to process the input after it receives a trigger signal [asynchronously signal] from the always-on neural network 1110 that, for example, a preliminary recognition of the input has been conducted).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie and Lee to include the teachings of Jaffari, the motivation being to provide a desired tradeoff between power consumption and reducing the probability of false negatives in detecting an event (Jaffari, [0046]).
Hardie, Lee, and Jaffari fail to explicitly disclose, but Machado teaches, second neural network when a sequence of detected phrases is ready for classification (Machado: [0062] Intent service 386 [intent classification circuit and second neural network] receives a natural language text string (e.g., as output by voice to text service 382) and parses the words and/or the phrases in the natural language text string to determine a desired intent of the natural language text string and/or a confidence in that determination. In some examples, the intent may be selected from a class of intents associated with desired data changes to a database and/or data store. In some examples, the class of intents may include one or more of inserting an entry (e.g., a record) in a database table, updating one or more fields of an entry (e.g., a record) in a database table, and/or the like. In some examples, intent service 386 may utilize one or more neural network classifiers to determine the intent and/or confidence level for the determined intent.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Hardie, Lee, and Jaffari to include the teachings of Machado, the motivation being to allow a user to provide unstructured input in the form of spoken or written natural language (Machado, [0047]).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ATHAR N PASHA, whose telephone number is (408)918-7675. The examiner can normally be reached Monday-Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.   Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ATHAR N PASHA/ Examiner, Art Unit 2657

/DANIEL C WASHBURN/ Supervisory Patent Examiner, Art Unit 2657