DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
All objections/rejections not mentioned in this Office Action have been withdrawn by the Examiner.

Response to Amendments 
Applicant’s amendment filed on March 3, 2021 has been entered. 
In view of the amendment to the specification, the amendments to paragraphs 79-80, 82-87, 89, and 92-94 have been entered. 
In view of the amendments to paragraphs 79-80, 82-87, 89, and 92-94, the objections to paragraphs 79-80, 82-85, 87, 89, and 92-94 of the specification have been withdrawn. The objection to paragraph 84 of the specification is maintained, for the reasons provided in the response below.
In view of the amendment to the claims, the amendment of claims 1-3, 6, 10, 12, 15, and 17 and the cancellation of claims 5 and 14 have been acknowledged and entered.  
In view of the amendment to claims 12 and 17 and the cancellation of claims 5 and 14, the objection to claims 5, 12, 14, and 17 is withdrawn.
In view of the amendment to claims 3 and 12, the rejection of claims 3 and 12 under 35 U.S.C. §112 is withdrawn.
In view of the amendment to claim 1, the rejection of claims 1-9 under 35 U.S.C. §101 is withdrawn.
In light of the amended claims, new grounds for rejection under 35 U.S.C. §103 are provided in the response below. 

Response to Arguments
Applicant’s arguments regarding the prior art rejections under 35 U.S.C. §103, see pages 15-18 of the Response to Non-Final Office Action dated December 1, 2020, which was received on March 3, 2021 (hereinafter Response and Office Action, respectively), have been fully considered but they are not persuasive.
As Applicant has amended independent claim 1 to incorporate the limitations of parts of claim 2 and the entirety of claim 5 and independent claim 10 to incorporate the limitations of claim 14, the rejections of claims 1 and 10 have been amended to incorporate the rejection of the respective limitations of claims 2, 5, and 14, as appropriate. 
With respect to the rejection(s) of claim(s) 1 and 10 under 35 U.S.C. §103 in light of Sun (U.S. Pat. No. 10,460,722, hereinafter Sun) in view of Bocklet (U.S. Pat. App. Pub. No. 2018/0182388, hereinafter Bocklet) and Sakoe (U.S. Pat. No. 4,975,961, hereinafter Sakoe), Applicant asserts that the references do not provide for:
the neural network having a trigger neural path to output a trigger signal to the detector to control when the detector reviews the word output signals to recognize the word, wherein the trigger neural path controls when the detector reviews the word output signals based on the sequentially extracted feature vectors,
wherein the trigger neural path is configured to send a trigger signal having a review command to the detector to cause the detector to review the word output signals for a greatest signal value output signal, and
the detector to recognize the word of the plurality of words by detecting the word output signal having a greatest signal value during a period of time related to when the review command is received.

Specifically, Applicant asserts that Sakoe discloses “training the neural network to deal with different nonlinearities in a speech pattern by weighting coefficient at an output neuron that is assigned to the speech pattern under consideration.” (Response, pg. 16, citing Sakoe, col. 9, lines 1-25). Applicant further asserts that this section of Sakoe relates to training, and is not a “use case for recognizing words as claimed.” (Response, pg. 16-17). As well, Applicant asserts that this portion of the disclosure in Sakoe relates to “a weighting of a neuron and not a timing trigger signal as claimed.” (Response).
In response to Applicant’s argument that the references fail to show certain features of Applicant’s invention, it is noted that the features upon which Applicant relies (i.e., use exclusive of training embodiments, timing trigger signal) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
As presented previously and recited below, Sakoe teaches “the detector to recognize the word of the plurality of words by detecting the word output signal having a greatest signal value.” (Sakoe, Col. 12, lines 20-26). Applicant, in arguing that Sakoe fails to teach all limitations of claims 1 and 10 as presented, asserts that the disclosed embodiments of Sakoe are only related to “training and not a use.” (Response, page 16-17). However, the instant claims as presented here require no such limitation. The broadest reasonable interpretation of claim 1 as presented includes any type or form of use, including training or otherwise. Further, though the claims as presented call for “recognizing the word of the plurality of words… during a period of time related to when the review command is received,” claims 1 and 10 as presented do not require a “timing trigger signal,” as asserted in the Response. Applicant is invited to amend the claims to incorporate the above limitations, in light of specification support, such that Examiner can properly consider said limitations.
Further, Examiner respectfully disagrees that the disclosure of Sakoe is limited to training. Regarding use of the neural network, Examiner notes that, in the general neural network embodiment (see FIG. 3) of Sakoe relied on in the Office Action, “A maximum one of the output signal components indicates a result n of recognition of the input pattern being dealt with and is delivered to a utilization device (not shown), such as an input device of an electronic digital computer system.” (Sakoe, Col. 12, lines 23-26). The delivery of the result to a utilization device indicates use beyond mere training. As such, Applicant’s arguments regarding the limitations of the disclosure of Sakoe are not persuasive.
Applicant further argues that Bocklet fails to disclose “a trigger signal to select words as claimed.” Specifically, Applicant asserts that “the rejection model 501 is separate from the key phrase model 502” and that Bocklet “only uses the key phrase model 502 to identify whether a key phrase of multiple words exists.”  (Response, page 17, citing Bocklet, ¶ [0052]). Examiner respectfully disagrees with this characterization of Bocklet.
Though Bocklet recites “rejection model 501 is separate from key phrase model 502,” this is a discussion of the separation of specific elements “such that the rejection score corresponding to rejection model 501 may be determined separately from the scores of states 526 of key phrase model 502.” (Bocklet, ¶ [0052], emphasis added). The use of the word “separate” does not preclude the disclosure of Bocklet indicating “key phrase may be detected and a system may be woken (e.g., via system wake indicator 216) and an optional command may be issued (e.g., via system command 218) based on the detected key phrase,” where the system command 218 can cause the system to “perform an operation such as starting an application, generating or retrieving data, or the like.” (Bocklet, ¶ [0041]). Therefore, Applicant’s arguments regarding the deficiencies of Bocklet are not persuasive.
With respect to the rejection(s) of claim(s) 6 and 15 under 35 U.S.C. §103 in light of Sun in view of Bocklet and Sakoe, Applicant’s arguments in light of the amendments have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in light of Sun in view of Bocklet, Sakoe, and Hsu (U.S. Pat. App. Pub. No. 2015/0161989, hereinafter Hsu).
The Applicant has not provided any further statement and therefore, the Examiner directs the Applicant to the below rationale.

Specification
The disclosure is objected to because of the following informalities: 
In paragraph [0084], the phrase “training signal 355” should read “trigger label 355”.  
Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 7-8, 10-13, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Sun in view of Bocklet and Sakoe.

Regarding claim 1, Sun discloses a continuous automatic speech segmentation and recognition (ASR) system, (“spoken language processing system 100”; Sun, Col. 1, lines 63-67); comprising a processor and a memory storing instructions that when executed by the processor (“The approaches described above may be implemented [through] instructions stored on a non-transitory machine readable medium that when executed by a processor… perform some or all of the procedures described above.”; Sun, Col. 11, lines 20-25) implement: a detector; (spoken language processing system 100 includes “a device 110 and a spoken language processing server 190.” The server 190 includes a speech recognition engine 282 which “processes the feature vectors [from the neural network] to determine the words in the audio data”; Sun, Col. 1, lines 63-67); a neural network to perform speech recognition processing (spoken language processing system 100 includes a feature analyzer 150. “[The] feature vectors are provided to the feature analyzer 150…” where the feature analyzer 150 can be a neural network “to perform the transformation from feature vectors to the representations of linguistic content.”; Sun, Col. 4, lines 15-52) on feature vectors sequentially extracted from an audio data stream (spoken language processing system 100 includes a feature extractor 140 which “receives the digitized audio signal and produces one feature vector for each 10 milliseconds of the audio signal,” thus the feature vectors are sequentially extracted; Sun, Col. 3, lines 50-55) to attempt to recognize a word from a set of words of a predetermined vocabulary; (“the output of the feature analyzer 150 is a sequence of observation vectors, where each entry in a vector is associated with a particular part of a linguistic unit, for example, part of an English phoneme” and where the English language is a predetermined vocabulary; Sun, Col. 4, lines 15-52); and the neural network having a trigger neural path (“the outputs of the feature analyzer 150 are provided to the trigger detector 160,” where the trigger detector 160 is the trigger neural path; Sun, Col. 7, lines 47-52) to output a trigger signal to the detector to control when the detector reviews the word output signals to recognize the word (“and upon detection of the trigger [by the trigger detector 160] at a particular time (e.g., a time instance or interval), the device passes audio data (e.g., a digitized audio signal or some processed form of such a signal) to...” the detector (i.e., the spoken language processing server 190); Sun, Col. 7, lines 47-52; Col. 2, lines 15-20) wherein the trigger neural path controls when the detector reviews the word output signals based on the sequentially extracted feature vectors (the trigger detector 160 (the trigger neural path) controls when the spoken language processing server 190 (the detector) receives and reviews the audio data (the word output signals) based on feature vectors; Sun, Col. 1, line 62 - Col. 2, line 20). Further, Sun discloses a word neural path for key word detection (“output layer 850 with outputs corresponding to state of phonemes for a large vocabulary speech recognizer”; Sun, Col. 10). However, Sun fails to expressly recite wherein the trigger neural path is configured to send a trigger signal having a review command to the detector to cause the detector to review the word output signals for a greatest signal value output signal, and the detector to recognize the word of the plurality of words by detecting the word output signal having a greatest signal value during a period of time related to when the review command is received.

Bocklet discloses systems and methods of linear scoring for key phrase detection using neural networks. (Bocklet, ¶¶ [0022]-[0023]). Regarding claim 1, Bocklet teaches the neural network (acoustic scoring module 203 implements key phrase models as part of a deep neural network (DNN); Bocklet, ¶ [0039]) having word neural paths (“multiple (e.g., parallel) key phrase models may be provided in a single score-array” from the acoustic scoring module 203; Bocklet, ¶¶ [0050] and [0052]) to each output a word output signal (Bocklet discloses the plurality of word neural paths for a plurality of key words/phrases, the paths each output a score, such as scores S1, S2, S3, . . . , Si, . . . , SN-1, SN (i.e., a word output signal); Bocklet, ¶ [0053]; FIG. 6) to the detector for each of the set of words (in embodiments having a second or additional key phrases, “the key phrase detection decoder 204 may generate a key phrase score or scores for each of such key phrase models (and at multiple time instances) for evaluation by controller 206”; Bocklet, ¶ [0041])… wherein the trigger neural path is configured to send a trigger signal having a review command to the detector (“key phrase may be detected and a system may be woken (e.g., via system wake indicator 216) and an optional command may be issued (e.g., via system command 218) based on the detected key phrase.” The system command 218 can cause the system to “perform an operation such as starting an application, generating or retrieving data, or the like,” thus a review command; Bocklet, ¶ [0041]) to cause the detector to review the word output signals for a greatest signal value output signal (The detector performs the action based on the trigger signal and the review command; Bocklet, ¶¶ [0040]-[0041])… during a period of time related to when the review command is received (“a key phrase score 215 is generated at each time instance as associated with scores 214,” thus the detection is during the period of time when the review command is received; Bocklet, ¶ [0040]).

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the neural network of the spoken language processing system of Sun to incorporate the teachings of Bocklet to include the neural network having word neural paths to each output a word output signal to the detector for each of the set of words. As recognized by Bocklet, the described vectorized scoring for the key phrases may provide advantages in terms of computational efficiency and power usage. (Bocklet, ¶ [0024]). However, Sun and Bocklet fail to expressly recite the detector to recognize the word of the plurality of words by detecting the word output signal having a greatest signal value.

Sakoe teaches methods of establishing and training multi-layer neural networks, such as in speech recognition. (Sakoe, Col. 3, lines 26-45). Regarding claim 1, Sakoe teaches the detector to recognize the word of the plurality of words by detecting the word output signal having a greatest signal value (“The decision unit 23 compares, with one another, the output signal components... A maximum one of the output signal components indicates a result n of recognition of the input pattern being dealt with…” where the “particular word specified by the word identifier (n) ... is adjusted so that the n-th output signal component has the maximum intensity,” thus the decision unit is detecting the word of the plurality of words, based on the signal having the greatest signal value; Sakoe, Col. 12, lines 20-26; Col. 9, lines 24-30).
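
For illustration only, the maximum-selection behavior attributed above to the decision unit 23 of Sakoe may be sketched as follows. This is a minimal sketch and is not taken from Sakoe; the function name recognize_word, the example vocabulary, and the signal values are hypothetical and are used solely to show how a greatest-valued output signal component maps to a word identifier n.

```python
from typing import Sequence


def recognize_word(output_components: Sequence[float],
                   vocabulary: Sequence[str]) -> str:
    """Return the vocabulary word whose output signal component is greatest.

    Hypothetical helper: one output component per word, compared with
    one another, with the maximum one indicating the recognition result.
    """
    if len(output_components) != len(vocabulary) or not vocabulary:
        raise ValueError("need exactly one output component per word")
    # Index n of the maximum output signal component.
    n = max(range(len(output_components)), key=lambda i: output_components[i])
    return vocabulary[n]


# Hypothetical values: the second component is the greatest.
print(recognize_word([0.1, 0.7, 0.2], ["yes", "no", "stop"]))
```

In this sketch every component is compared against every other on each call; whether and when such a comparison is triggered is a separate question addressed in the mapping to Sun and Bocklet above.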

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the spoken language processing system of Sun as modified by the key phrase detection system of Bocklet to incorporate the teachings of Sakoe to include the detector to recognize the word of the plurality of words by detecting the word output signal having a greatest signal value. The described elements allow a neural network to be “readily trained so as to reliably recognize connected words”, as recognized by Sakoe. (Sakoe, Col. 2, lines 57-61).

Regarding claim 2, the rejection of claim 1 is incorporated. Sun further discloses wherein the detector is held in a quiescent state except when performing the speech recognition detecting under control of the trigger neural path (the spoken language processing server 190 or components thereof can be incorporated into the device 110, and “may be in a low-power mode until the trigger is detected... for the spoken language processing of the audio data,” where a low-power mode is understood as the quiescent state; Sun, Col. 11, lines 15-20).

Regarding claim 3, the rejection of claim 2 is incorporated. Sun further discloses the system of claim 2, wherein the detector is in the quiescent state when it is monitoring the trigger signal from the neural network for the review command (The detector becomes active when the communication interface 170 receives an indicator part of the input (the trigger signal) from the trigger detector 160. Without the trigger signal, which includes the period of time when monitoring for the trigger signal, the server 190 is in the low-power, or quiescent, state; Sun, Col. 8, lines 53-55; Col. 11, lines 15-20); and wherein the detector is in an active state when it is reviewing the word output signals from the neural network to recognize a word in response to detecting the review command (In response to the trigger signal, “the communication interface 170 [of the device 110] selects the part of the audio data (e.g., the sampled waveform) to send to the server 190.” Once sent, the server processes that portion of the audio data (i.e., active state); Sun, Col. 8, lines 50-55).

Regarding claim 4, the rejection of claim 2 is incorporated. Sun further discloses the system of claim 2, wherein the trigger neural path causes the detector to review the word output signals (“the outputs of the feature analyzer 150 are provided to the trigger detector 160,” where the trigger detector 160 is the trigger neural path, “and upon detection of the trigger at a particular time (e.g., a time instance or interval), the device passes audio data (e.g., a digitized audio signal or some processed form of such a signal) to...” the detector (i.e., the spoken language processing server 190); Sun, Col. 7, lines 47-52; Col. 2, lines 15-20) when the trigger neural path determines that the sequentially extracted feature vectors are likely to represent a word from the set of words (The detection logic 266 [of the trigger detector] uses the probability of a candidate trigger word, as provided by the HMM (“when the probability... reaches a local maximum above a threshold...”), “...to determine when a candidate trigger word occurs” in the feature vectors; Sun, Col. 7, lines 57-61; Col. 8, lines 23-25; FIG. 2D).
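
For illustration only, the “local maximum above a threshold” condition quoted above from Sun may be sketched as follows. This is a minimal sketch and is not the detection logic 266 of Sun; the function name find_trigger_times, the per-frame probability values, and the threshold are hypothetical, and the neighbor comparison shown is one assumed way of defining a local maximum.

```python
from typing import List, Sequence


def find_trigger_times(probabilities: Sequence[float],
                       threshold: float) -> List[int]:
    """Return frame indices where the probability peaks above the threshold.

    Assumption: a frame is a local maximum when it is strictly greater
    than the previous frame and not less than the next frame.
    """
    times = []
    for t in range(1, len(probabilities) - 1):
        p = probabilities[t]
        is_peak = p > probabilities[t - 1] and p >= probabilities[t + 1]
        if is_peak and p > threshold:
            times.append(t)
    return times


# Hypothetical per-frame probabilities of a candidate trigger word.
frames = [0.1, 0.4, 0.9, 0.6, 0.2, 0.95, 0.3]
print(find_trigger_times(frames, threshold=0.8))
```

Each returned index corresponds to a time instance at which, under these assumptions, a candidate trigger word would be declared and audio data would be passed on for further processing.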

Regarding claim 7, the rejection of claim 1 is incorporated. Sun discloses all of the elements of the current invention as stated above. However, Sun fails to expressly recite the system of claim 1, wherein the neural network comprises: a plurality of neurons, wherein at least some of the plurality of neurons store one or more prior states, and at least some of the plurality of neurons receive, as inputs, one or more stored prior states.
The relevance of Bocklet is described above with reference to claim 1. Regarding claim 7, Bocklet further discloses the system of claim 1, wherein the neural network comprises: a plurality of neurons (Bocklet describes a neural network having a plurality of nodes (neurons) 610; Bocklet, ¶¶ [0046] and [0053]), wherein at least some of the plurality of neurons store one or more prior states (Key phrase model 502 may include multiple states (or nodes), where states may be updated based on the initial states to create a final state; “key phrase models may be linearly stored with optional spare states between the key phrase models”; Bocklet, ¶¶ [0041] and [0070]), and at least some of the plurality of neurons (610) receive, as inputs, one or more stored prior states (“Each of states 526 may include or be updated by one or more self-…”; Bocklet, ¶ [0041]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the neural network of the spoken language processing system of Sun to incorporate the teachings of Bocklet to include the neural network having word neural paths to each output a word output signal to the detector for each of the set of words. As recognized by Bocklet, the described vectorized scoring for the key phrases may provide advantages in terms of computational efficiency and power usage. (Bocklet, ¶ [0024]).

Regarding claim 8, the rejection of claim 1 is incorporated. Sun and Bocklet disclose all of the elements of the current invention as stated above. Sun further discloses a plurality of neurons of the trigger path (Sun discloses a feature analyzer 150 and a trigger detector 160 which can be a neural network, the neural network including nodes; Sun, Col. 4, lines 15-52; Col. 5, lines 2-4; Col. 7, lines 47-52) having weights based on training (“a multi-task training stage (b) adds weights 346 to the final transformation 845B, providing in the output layer both output units corresponding to… units for the trigger detection task”; Sun, Col. 10, lines 28-34) to output a greater signal value for each of the set of words than for words that are not in the set of words (“upon detection of the trigger,” where the trigger is any of the set of words, “the device passes audio data (e.g., a digitized audio signal or some processed form of such a signal) to...” the detector (i.e., the spoken language processing server 190). Thus, a signal is sent when keywords are detected, which is greater than the signal when keywords are not detected (i.e., no signal, or a signal value of 0, for words which are not keywords); Sun, Col. 2, lines 15-20). However, Sun and Bocklet fail to expressly recite wherein the neural network comprises: a plurality of neurons of each word path having weights based on training to output a greatest signal value on one word path and a lower signal value on the other word paths for each of the set of words.

The relevance of Sakoe is described above, with reference to claim 1. Regarding claim 8, Sakoe teaches wherein the neural network comprises: a plurality of neurons of each word path having weights based on training (Sakoe teaches a neural network having “A weighting coefficient or factor is attributed to each of the input to intermediate and the intermediate to output connections.”; Sakoe, Col. 1, lines 51-55) to output a greatest signal value on one word path (“After repetition of training, the neural network eventually learns optimum intermediate and output weighting coefficients to produce the n-th output signal component as the sole significant component of the output signal when the input signal time sequence X represents the particular word,” where the sole significant component has the maximum intensity. Thus, the n-th output signal as the sole significant component, has a “maximum intensity”; Sakoe, Col. 7, line 67 - Col. 8, line 7) and a lower signal value on the other word paths for each of the set of words (“In an output signal produced by the neural network… first through N-th output signal components have different intensities which depend primarily on … the intermediate and the output weighting coefficients u and v.” Given that the n-th output signal has the maximum intensity, all other output signals having “different intensities” must necessarily be less than the maximum intensity; Sakoe, Col. 7, lines 41-54). 

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the spoken language processing system of Sun as modified by the key phrase detection system of Bocklet to incorporate the teachings of Sakoe to include a word memory to store the word output signals and a word comparator to compare each value of each stored word output signal for the period of time to one of a first threshold or each other value. The described elements allow a neural network to be “readily trained so as to reliably recognize connected words”, as recognized by Sakoe. (Sakoe, Col. 2, lines 57-61).

Regarding claim 10, Sun discloses a method for continuous automatic speech segmentation and recognition (ASR) (the method performed by “spoken language processing system 100”; Sun, Col. 1, lines 63-67), comprising: processing in a neural network, (spoken language processing system 100 includes a feature analyzer 150. “[The] feature vectors are provided to the feature analyzer 150…” where the feature analyzer 150 can be a neural network “to perform the transformation from feature vectors to the representations of linguistic content.”; Sun, Col. 4, lines 15-52) feature vectors sequentially extracted from an audio data stream (spoken language processing system 100 includes a feature extractor 140 which “receives the digitized audio signal and produces one feature vector for each 10 milliseconds of the audio signal,” thus the feature vectors are sequentially extracted; Sun, Col. 3, lines 50-55) to attempt to recognize a word from a set of words of a predetermined vocabulary (“the output of the feature analyzer 150 is a sequence of observation vectors, where each entry in a vector is associated with a particular part of a linguistic unit, for example, part of an English phoneme” where the English language is a predetermined vocabulary; Sun, Col. 4, lines 15-52); outputting from the neural network … to a detector … (spoken language processing system 100 includes “a device 110 and a spoken language processing server 190.” The server 190 includes a speech recognition engine 282 which “processes the feature vectors [from the neural network] to determine the words in the audio data”; Sun, Col. 1, lines 63-67); and outputting from the neural network a trigger output signal to the detector to control when the detector reviews the word output signals to recognize the word (“and upon detection of the trigger [by the trigger detector 160] at a particular time (e.g., a time instance or interval), the device passes audio data (e.g., a digitized audio signal or some processed form of such a signal) to...” the detector (i.e., the spoken language processing server 190); Sun, Col. 7, lines 47-52; Col. 2, lines 15-20). Further, Sun discloses a word neural path for key word detection (“output layer 850 with outputs corresponding to state of phonemes for a large vocabulary speech recognizer”; Sun, Col. 10). However, Sun fails to expressly recite wherein controlling when the detector reviews the word output signals to recognize the word comprises: sending a trigger signal having a review command to the detector to cause the detector to review the word output signals for a greatest signal value output signal, and the detector recognizing the word of the plurality of words by detecting the word output signal having a greatest signal value during a period of time related to when the review command is received.

Bocklet discloses systems and methods of linear scoring for key phrase detection using neural networks. (Bocklet, ¶¶ [0022]-[0023]). Regarding claim 10, Bocklet teaches outputting from the neural network (acoustic scoring module 203 implements key phrase models as part of a deep neural network (DNN); Bocklet, ¶ [0039]) respective word output signals (Bocklet discloses a plurality of word neural paths (paths extending from the nodes 610) for a plurality of key words/phrases, the paths each output a score, such as scores S1, S2, S3, . . . , Si, . . . , SN-1, SN (i.e., a word output signal); Bocklet, ¶ [0053]; FIG. 6) to a detector for each of the set of words (in embodiments having a second or additional key phrases, “the key phrase detection decoder 204 may generate a key phrase score or scores for each of such key phrase models (and at multiple time instances) for evaluation by controller 206”; Bocklet, ¶ [0041])... wherein controlling when the detector reviews the word output signals to recognize the word comprises: sending a trigger signal having a review command to the detector (“key phrase may be detected and a system may be woken (e.g., via system wake indicator 216) and an optional command may be issued (e.g., via system command 218) based on the detected key phrase.” The system command 218 can cause the system to “perform an operation such as starting an application, generating or retrieving data, or the like,” thus a review command; Bocklet, ¶ [0041]) to cause the detector to review the word output signals for a greatest signal value output signal (The detector performs the action based on the trigger signal and the review command; Bocklet, ¶¶ [0040]-[0041])… during a period of time related to when the review command is received (“a key phrase score 215 is generated at each time instance as associated with scores 214,” thus the detection is during the period of time when the review command is received; Bocklet, ¶ [0040]).

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the neural network of the spoken language processing system of Sun to incorporate the teachings of Bocklet to include outputting from the neural network respective word output signals to a detector for each of the set of words. As recognized by Bocklet, the described vectorized scoring for the key phrases may provide advantages in terms of computational efficiency and power usage. (Bocklet, ¶ [0024]). However, Sun and Bocklet fail to expressly recite the detector recognizing the word of the plurality of words by detecting the word output signal having a greatest signal value.

Sakoe teaches methods of establishing and training multi-layer neural networks, such as in speech recognition. (Sakoe, Col. 3, lines 26-45). Regarding claim 10, Sakoe teaches the detector recognizing the word of the plurality of words by detecting the word output signal having a greatest signal value (“The decision unit 23 compares, with one another, the output signal components... A maximum one of the output signal components indicates a result n of recognition of the input pattern being dealt with…” where the “particular word specified by the word identifier (n) ... is adjusted so that the n-th output signal component has the maximum intensity,” thus the decision unit is detecting the word of the plurality of words, based on the signal having the greatest signal value; Sakoe, Col. 12, lines 20-26; Col. 9, lines 24-30).

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the spoken language processing system of Sun as modified by the key phrase detection system of Bocklet to incorporate the teachings of Sakoe to include the detector recognizing the word of the plurality of words by detecting the word output signal having a greatest signal value. The described elements allow a neural network to be “readily trained so as to reliably recognize connected words”, as recognized by Sakoe. (Sakoe, Col. 2, lines 57-61).

Regarding claim 11, the rejection of claim 10 is incorporated. Claim 11 is substantially the same as claim 2 and is therefore rejected under the same rationale as above.

Regarding claim 12, the rejection of claim 11 is incorporated. Claim 12 is substantially the same as claim 3 and is therefore rejected under the same rationale as above.

Regarding claim 13, the rejection of claim 11 is incorporated. Claim 13 is substantially the same as claim 4 and is therefore rejected under the same rationale as above.

Regarding claim 16, the rejection of claim 10 is incorporated. Claim 16 is substantially the same as claim 7 and is therefore rejected under the same rationale as above.

Regarding claim 17, the rejection of claim 10 is incorporated. Claim 17 is substantially the same as claim 8 and is therefore rejected under the same rationale as above.

Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Sun in view of Bocklet and Sakoe as applied to claims 1 and 10 above, and in further view of Hsu.

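Regarding claim 6, the rejection of claim 1 is incorporated. Sun, Bocklet, and Sakoe disclose all of the elements of the current invention as stated above. However, Sun, Bocklet, and Sakoe fail to expressly recite wherein the detector includes: a word memory to store the word output signals; a word comparator to compare each value of each stored word output signal for the period of time to one of a first threshold or each other value; a trigger comparator to compare a value of the trigger signal for the period of time to a second threshold; and the period of time is one of before or after when the review command is received.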

Hsu discloses systems and methods for keyword detection including keyword prediction. (Hsu, ¶ [0004]). Regarding claim 6, Hsu teaches wherein the detector includes: a word memory to store the word output signals (The system includes “non-volatile programmable memory” as part of the “software and/or firmware to implement functions of the speech keyword detector 44, the activity predictor 48 and/or the decision maker 52”; Hsu, ¶ [0095]); a word comparator to compare each value of each stored word output signal for the period of time to one of a first threshold or each other value (The system includes “the result Sdm [from the decision maker 22] is obtained by checking if the result Skw [from the speech keyword detector 14] … is greater than a first threshold,” where the result Skw is “comparison results [from comparing] sound in the signal Snd… with the candidate keywords… to respectively obtain comparison results,” thus Skw is the value of each stored word output signal for a period of time; Hsu, ¶¶ [0056], [0050]); a trigger comparator to compare a value of the trigger signal for the period of time to a second threshold (The system further includes “the result Sdm [from the decision maker 22] is obtained by checking if… the result Sap [from the activity predictor 18]… is greater than a second threshold,” where the result Sap is “a probability or likelihood for whether a user is about to give voice keyword,” where the result Sdm is the review command and Sap is the trigger signal; Hsu, ¶¶ [0056], [0052]); and the period of time is one of before or after when the review command is received (The result Sap is a result of the comparison between the signal ssd and the activity template “where each activity template… can include standard, typical, representative and/or most frequently sensed result(s) of an indicative activity (movement or state) which happens before or when user is about to give voice keyword.”  The result Sap is then Hsu, ¶¶ [0043], [0056]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the spoken language processing system of Sun, as modified by the key phrase detection system of Bocklet and the dynamically programmable multi-layer neural network of Sakoe, to incorporate the teachings of Hsu to include wherein the detector includes: a word memory to store the word output signals; a word comparator to compare each value of each stored word output signal for the period of time to one of a first threshold or each other value; a trigger comparator to compare a value of the trigger signal for the period of time to a second threshold; and the period of time is one of before or after when the review command is received. The described elements are “capable of processing the activity prediction result and the preliminary keyword detection result to provide… an improved keyword detection result”, as recognized by Hsu. (Hsu, ¶ [0007]).
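
For illustration only, the two threshold comparisons cited above from Hsu may be sketched as follows. The function name `decide` and the numeric values are hypothetical. The sketch assumes that both comparisons must be satisfied for a positive result Sdm; the quoted passages are elided and do not state how the two comparisons are combined, so this conjunction is an assumption of the example and not a characterization of Hsu.

```python
def decide(skw: float, sap: float,
           first_threshold: float, second_threshold: float) -> bool:
    """Sketch of a decision result Sdm from two threshold checks.

    skw: keyword detection result (compared to the first threshold).
    sap: activity prediction result (compared to the second threshold).
    Assumption: both checks must pass; the combination rule is not
    taken from the cited reference.
    """
    keyword_ok = skw > first_threshold    # word comparator check
    trigger_ok = sap > second_threshold   # trigger comparator check
    return keyword_ok and trigger_ok


# Hypothetical values: the keyword result passes, the activity result fails.
sdm = decide(skw=0.8, sap=0.4, first_threshold=0.5, second_threshold=0.5)
```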

Regarding claim 15, the rejection of claim 10 is incorporated. Claim 15 is substantially the same as claim 6 and is therefore rejected under the same rationale as above.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Sun in view of Bocklet and Sakoe as applied to claims 1 and 10 above, and in further view of Skowronski et al. (M. D. Skowronski and J. G. Harris, “Noise-Robust Automatic Speech Recognition Using a Predictive Echo State Network,” in IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 5, pp. 1724-1730, July 2007, hereinafter Skowronski).

Regarding claim 9, the rejection of claim 1 is incorporated. Sun, Bocklet, and Sakoe disclose all of the elements of the current invention as stated above. However, Sun, Bocklet, and Sakoe fail to expressly recite where the neural network is an echo state network.

Skowronski teaches automatic speech recognition using a predictive echo state network. (Skowronski, Abstract). Regarding claim 9, Skowronski teaches where the neural network is an echo state network (teaches the use of an echo state network (ESN), to overcome “several of the limitations of conventional [artificial neural networks] ANNs” as related to automatic speech recognition; Skowronski, pg. 1724, ¶ 3).

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the spoken language processing system of Sun, as modified by the key phrase detection system of Bocklet and the dynamically programmable multi-layer neural network of Sakoe, to incorporate the teachings of Skowronski such that the neural network of the system of claim 1 is an echo state network. As recognized by Skowronski, ESNs are less computationally expensive and less complex than other ANNs for the purposes of automatic speech recognition. (Skowronski, pg. 1724, ¶¶ 2 and 3).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Edara et al. (U.S. Pat. App. Pub. No. 2014/0337131) discloses systems and methods for keyword determination based on trigger words.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
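A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.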
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sean E. Serraguard whose telephone number is (313)446-6627.  The examiner can normally be reached on 07:00-17:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel C. Washburn, can be reached at (571) 272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.







/Paras D Shah/Primary Examiner, Art Unit 2659
04/08/2021