Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 9/25/2019, 10/24/2019, 11/29/2019, 1/31/2020, 6/25/2020, 8/5/2020, 10/12020, 12/30/2020, and 3/17/2021 are being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 11, 18 and 20-23 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Independent claims 1, 22 and 23 recite a non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to:
receive an indication of a notification; in accordance with receiving the indication of the notification: 
obtain one or more data streams from one or more sensors; 
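determine, based on the one or more data streams, whether a user associated with the electronic device is speaking; and 
in accordance with a determination that the user is not speaking: cause an output associated with the notification to be provided.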
The limitations “receive”, “obtain”, “determine” and “cause”, as drafted, cover a method of organizing human activity in which a human hears a clap or a preamble signifying impending speech by a person, then pays attention to the impending speech and, based on what is heard, determines whether the person is speaking, and, if the person is not speaking, claps their hands to indicate that a notification has been received or repeats the preamble.
This judicial exception is not integrated into a practical application. In particular, claims 1 and 22 recite the additional elements “processor”, “memory” and “programs”, which are forms of generic computer equipment. In the as-filed specification (“[0007] Example electronic devices are disclosed herein. An example electronic device comprises one or more processors; a memory; and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: receiving an indication of a notification; in accordance with receiving the indication of the notification: obtaining one or more data streams from one or more sensors; determining, based on the one or more data streams, whether a user associated with the electronic device is speaking; and in accordance with a determination that the user is not speaking: causing an output associated with the notification to be provided.”), the elements “processor”, “memory” and “computer program” are all general-purpose computer components.
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a 
Claim 2 recites wherein the output associated with the notification includes a content of the notification. This amounts to the human uttering the preamble. No additional limitations are present.
Claim 3 recites in accordance with a determination that the user is speaking: forgo causing the output associated with the notification to be provided; and cause a second output associated with the notification to be provided, the second output being shorter in duration than the first output. This amounts to the human, on determining that there is speech, simply uttering a shorter version of the preamble. No additional limitations are present.
Claim 4 recites obtaining the one or more data streams for a predetermined duration. This amounts to a human paying attention to an utterance for no more than 10 seconds and to more than one stream at the same time. No additional limitations are present.
Claim 5 recites wherein the predetermined duration is based on a determined relevance score of the notification.  This amounts to a human deciding how urgent a notification is. No additional limitations are present.
Claim 6 recites wherein determining that the user is not speaking includes: determining, based on the one or more data streams, that the user is not speaking for a second predetermined duration.  This amounts to a human deciding there was no speech after waiting 2 seconds after the last speech that was detected. No additional limitations are present.
Claim 11 recites wherein determining whether the user associated with the electronic device is speaking includes: determining that the one or more data streams include a third portion indicating that the user is speaking; determining that a duration of the third portion is below a threshold duration; and in accordance with a determination that the duration of the third portion is below the threshold 
Claim 18 recites wherein while causing the output associated with the notification to be provided: receive an indication of a second notification; and in accordance with receiving the indication of the second notification:   cause an output associated with the second notification to be provided after the output associated with the notification is provided.  This amounts to a human who hears another preamble as he is reading out the first one. He finishes reading out the first preamble and then announces the second preamble. No additional limitations are present.
Claim 20 recites determine an importance score of the notification based on context information associated with the notification; and determine whether the importance score exceeds a first threshold; and wherein causing the output associated with the notification to be provided is performed in accordance with determining that the importance score exceeds the first threshold.  This amounts to a human deciding that repeating the preamble that dinner is ready is not warranted given the person is on the phone with his doctor. No additional limitations are present.
Claim 21 recites determine a timeliness score of the notification based on context information associated with the user; and determine whether the timeliness score exceeds a second threshold; and wherein causing the output associated with the notification to be provided is performed in accordance with determining that the timeliness score exceeds the second threshold. This amounts to a human deciding that the notification is urgent and waiting a threshold amount of time during an existing enunciation before saying the urgent notification out loud. No additional limitations are present.





Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
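(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.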

Claims 1-3, 11, 22 and 23 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Faborg (US-20150194165-A1).
With respect to claims 1, 22 and 23, Faborg teaches a non-transitory computer-readable storage medium storing one or more programs (and, correspondingly, an electronic device comprising processors and a method) ([0005] In another example, the disclosure is directed to a computer-readable storage medium encoded with instructions that, when executed by one or more processors of a computing device, cause the one or more processors to determine that a notification is scheduled for output by the computing device during a first time period. The instructions further cause the one or more processors to determine that a pattern of audio detected during the first time period is indicative of human speech. In response to determining that the pattern of audio detected during the first time is indicative of human speech, the instructions further cause the one or more processors to delay output of the notification during the first time period. The instructions further cause the one or more processors to determine that a pattern of audio detected during a second time period is not indicative of human speech and output at least a portion of the notification at an earlier in time of an end of the second time period or an expiration of a third time period.), the one or more programs comprising instructions ([0039] One or more processors 40 may implement functionality and/or execute instructions within computing device 2. For example, processors 40 on computing device 2 may receive and execute instructions stored by storage devices 60 that execute the functionality of UID module 6, notification module 10, and application modules 12. These instructions executed by processors 40 may cause computing device 2 to store information within storage devices 60 during program execution.
Processors 40 may execute instructions in UID module 6 and notification module 10 to cause one or more of application modules 12 to delay output of notifications at inopportune times (such as when a user is speaking or interacting with computing device 2).), which when executed by one or more processors of an electronic device, cause the electronic device to: 
receive an indication of a notification ([0059] The example operations include determining, by a computing device, that a notification is scheduled for output by the computing device during a first time period (202)); 
in accordance with receiving the indication of the notification: 
obtain one or more data streams from one or more sensors ([0035] Example sensor devices 48 include an accelerometer, a gyroscope, an ambient light sensor, a proximity sensor...); 
determine, based on the one or more data streams, whether a user associated with the electronic device is speaking  ([0060] The example operations further include determining, by the computing device, that a pattern of audio detected during the first time period is indicative of human speech (204). ); and 
in accordance with a determination that the user is not speaking ([0042] Notification module 10 may provide instructions to output the notification based on notification module 10 detecting the occurrence of one or more conditions. These conditions may be ... microphone 7 does not detect audio indicative of human speech, or a maximum delay time period is reached.): 
cause an output associated with the notification to be provided ([0042] In some examples, notification module 10 provides instructions for UID module 6 to output a notification...microphone 7 does not detect audio indicative of human speech, or a maximum delay time period is reached.).
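For illustration only, the sequence of steps recited in claim 1 (receive an indication, obtain sensor data streams, determine whether the user is speaking, and provide the output only if the user is not speaking) can be sketched as follows. This is a minimal sketch and not the implementation of the application or of Faborg; the names `handle_notification`, `energy_detector`, `read_streams` and `emit`, and the mean-energy threshold used as a speech detector, are all hypothetical assumptions.

```python
from typing import Callable, Iterable, List, Sequence

Stream = Sequence[float]


def energy_detector(threshold: float) -> Callable[[Iterable[Stream]], bool]:
    """Hypothetical detector: 'speaking' if any stream's mean energy exceeds threshold."""
    def detect(streams: Iterable[Stream]) -> bool:
        for samples in streams:
            if len(samples) == 0:
                continue
            mean_energy = sum(x * x for x in samples) / len(samples)
            if mean_energy > threshold:
                return True
        return False
    return detect


def handle_notification(
    content: str,
    read_streams: Callable[[], Iterable[Stream]],
    is_speaking: Callable[[Iterable[Stream]], bool],
    emit: Callable[[str], None],
) -> bool:
    """Called on receiving an indication of a notification.

    Returns True if the output associated with the notification was provided.
    """
    streams: List[Stream] = list(read_streams())  # obtain data streams from sensors
    if is_speaking(streams):                      # determine whether the user is speaking
        return False                              # user is speaking: no output here
    emit(content)                                 # user is not speaking: provide the output
    return True
```

The detector is passed in as a parameter so that the gating logic does not depend on any particular speech-detection technique.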
With respect to claim 2, Faborg teaches wherein the output associated with the notification includes a content of the notification ([0011] A notification may be any output (e.g., visual, auditory, tactile, etc.) that a computing device provides to convey information, and  [0063] The example operations may further include outputting, by the computing device, at least a portion of the notification at an earlier in time of an end of the second time period or an expiration of a third time period (210).).  
With respect to claim 3, Faborg teaches in accordance with a determination that the user is speaking ([0060] The example operations further include determining, by the computing device, that a pattern of audio detected during the first time period is indicative of human speech (204)):
 forgo causing the output associated with the notification to be provided ([0061] The example operations also include delaying, by the computing device, output of the notification during the first time period (206)); and 
cause a second output associated with the notification to be provided, the second output being shorter in duration than the first output. ([0066] Additionally, notification module 10 may determine that the pattern of audio detected during the second time period indicates the second time period is suitable for outputting the entire notification, wherein outputting at least the portion of the notification comprises outputting the entire notification. In some examples, outputting at least the portion of the notification further comprises outputting at least the portion of the notification when the ambient noise level is below a threshold noise level.)
With respect to claim 11, Faborg  teaches determining that the one or more data streams include a third portion indicating that the user is speaking ([0060] The example operations further include determining, by the computing device, that a pattern of audio detected during the first time period is indicative of human speech (204).); 
determining that a duration of the third portion is below a threshold duration ([0043]  In another example, speech pattern database 62 contains data representative of samples of human speech that notification module 10 may use to compare to detected audio to determine if the detected audio is indicative of human speech. Speech pattern database 62 may also include selected threshold levels for notification module 10 to match the detected audio to any particular speech pattern. Notification module 10 may use one or more of these selected threshold levels may to determine whether the detected audio is indicative of human speech… A threshold level may be any value determined by or set for computing device 2, and the threshold may be such that if the threshold is exceeded, it is likely that the detected audio is indicative of human speech.); and
in accordance with a determination that the duration of the third portion is below the threshold duration: determining that the user is not speaking ([0043] Notification module 10 may use one or more of these selected threshold levels may to determine whether the detected audio is indicative of human speech.).  
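The duration test recited in claim 11 can be sketched as below, purely as an illustration of the claim language: a portion of the data stream flagged as speech counts as the user speaking only if it lasts at least a threshold duration. The function name `user_is_speaking`, the per-frame boolean flags and the frame length are hypothetical assumptions and are not taken from the application or from Faborg.

```python
from typing import Sequence


def user_is_speaking(frames: Sequence[bool], frame_ms: int, threshold_ms: int) -> bool:
    """Return True only if some contiguous run of speech frames lasts >= threshold_ms.

    frames: hypothetical per-frame flags (True = frame looks like speech).
    A speech portion shorter than the threshold is treated as 'not speaking'.
    """
    run = 0  # length, in frames, of the current contiguous speech portion
    for flag in frames:
        run = run + 1 if flag else 0
        if run * frame_ms >= threshold_ms:
            return True
    return False
```

Under these assumptions, a two-frame burst of 20 ms frames (40 ms) falls below a 100 ms threshold and is treated as the user not speaking.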

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
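A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.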


Claims 4, 7, 8 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Faborg in further view of Goldstein (US-20150228292-A1).
With respect to claim 4, Faborg does not teach concurrently obtaining the one or more data streams for a predetermined duration.  
Goldstein teaches concurrently ([0017] … the close-talk detector (see FIG. 1) digitally processes the vibration sensor signal and one or more of the microphone signals, and detects or declares a close-talk event or close-talk state in the controller, that coincides with the user talking...) obtaining the one or more data streams for a predetermined duration.  (Abstract: A close-talk detector detects a near-end user's speech signal, and [0019] In one embodiment, when an initial close-talk event is declared, the declaration may then be held for a predefined minimum period of time (hold interval)).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Goldstein, motivation being to use dual channels consisting of vibration and microphone to automatically detect near-end speech (Goldstein [Abstract]).
With respect to claim 7, Faborg does not teach wherein the one or more sensors include a microphone and a vibration sensor; and the one or more data streams include a first data stream obtained from the microphone and a second data stream obtained from the vibration sensor.  
Goldstein teaches wherein: 
the one or more sensors include a microphone and a vibration sensor (Abstract: Upon detecting speech using a vibration sensor signal and one or more microphone signals); and 
the one or more data streams include a first data stream obtained from the microphone and a second data stream obtained from the vibration sensor ([0013] In addition, the housing contains a vibration sensor that may be rigidly mounted to the housing so as to perform non-acoustic pickup of the user's voice, such as through bone conduction.... A close-talk detector uses the vibration sensor and one or more microphone signals, which microphone signals are also being used by an ANC controller, to control different aspects of ANC controller. FIG. 1 shows two such aspects of such a controller.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Goldstein, motivation being to use dual channels consisting of vibration and microphone to automatically detect near-end speech (Goldstein [Abstract]).

With respect to claim 8, Faborg does not teach  
determining that the first data stream indicates that the user is speaking; 
determining that the second data stream indicates that the user is speaking; 
and in accordance with determining that the first data stream indicates that the user is speaking and in accordance with determining that the second data stream indicates that the user is speaking: determining that the user is speaking.
Goldstein teaches 
determining that the first data stream indicates that the user is speaking (Abstract: A close-talk detector detects a near-end user's speech signal, and [0017] ...the user speech is often picked-up by the error microphone 7… speech signal disturbs the adaptation of the filters W(z) and SA(z), possibly causing one or both of these adaptive filters to diverge from a solution, or become unstable. In order to prevent the divergence of these adaptive filters during user speech, the close-talk detector (see FIG. 1) digitally processes the vibration sensor signal and one or more of the microphone signals…); 
determining that the second data stream indicates that the user is speaking ([Abstract] Upon detecting speech using a vibration sensor signal, and [0017] ...the close-talk detector (see FIG. 1) digitally processes the vibration sensor signal and one or more of the microphone signals); 
and in accordance with determining that the first data stream indicates that the user is speaking and in accordance with determining that the second data stream indicates that the user is speaking: determining that the user is speaking.  ([0017]   In order to prevent the divergence of these adaptive filters during user speech, the close-talk detector (see FIG. 1) digitally processes the vibration sensor signal and one or more of the microphone signals, and detects or declares a close-talk event or close-talk state in the controller, that coincides with the user talking, in response to the close talk event being declared or detected, the controller slows down or freezes the filter adaptation.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Goldstein, motivation being to use dual channels consisting of vibration and microphone to automatically detect near-end speech (Goldstein [Abstract]).
With respect to claim 12, Faborg does not teach wherein the electronic device comprises a headset.  
Goldstein teaches wherein the electronic device comprises a headset ([0012] FIG. 1 is a block diagram of part of a consumer electronics personal listening device having an ANC system and in which an embodiment of the invention can be implemented...The housing may be, for example, that of a wired or wireless headset or earphone, a loose fitting ear bud housing...)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Goldstein, motivation being to use dual channels consisting of vibration and microphone to automatically detect near-end speech (Goldstein [Abstract]).

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in view of Goldstein and in further view of Larson (US-20140278444-A1).
With respect to claim 5, Faborg and Goldstein do not teach wherein the predetermined duration is based on a determined relevance score of the notification.  
Larson teaches wherein the predetermined duration is based on a determined relevance score of the notification ([0016] In some embodiments, determining if the device is currently receiving speech input from the user includes determining if a last speech input was received within a predetermined period of time. In some embodiments, the predetermined period of time is a function of a measure of a urgency of the output.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg and Goldstein to include the teachings of Larson, motivation being to prioritize delivery of outputs as a result of notifications based on the user context and interaction (Larson [0050]).

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in further view of Larson.
With respect to claim 6, Faborg does not teach wherein determining that the user is not speaking includes: determining, based on the one or more data streams, that the user is not speaking for a second predetermined duration.  
Larson teaches determining, based on the one or more data streams, that the user is not speaking for a second predetermined duration ([0015] In some embodiments, determining that the device is no longer receiving speech input from the user includes determining that a predefined amount of time has elapsed between a time of a last speech input and a current time.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Larson, motivation being to prioritize delivery of outputs as a result of notifications based on the user context and interaction (Larson [0050]).
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in view of Goldstein and in further view of Dusan (US-20180324518-A1).
With respect to claim 9, Faborg and Goldstein do not teach
determining that the first data stream indicates that the user is speaking; 
determining that the second data stream indicates that the user is not speaking; 
and in accordance with determining that the first data stream indicates that the user is speaking and determining that the second data stream indicates that the user is not speaking, determining that the user is not speaking.
Dusan teaches 
determining that the first data stream indicates that the user is speaking (Fig 3. Col 2 Acoustic Trigger Signal 212 row 3 shows microphone shows signal); 
determining that the second data stream indicates that the user is not speaking (Fig 3. Col 3 non-Acoustic Trigger Signal and row 2 shows stream indicating user not speaking); and 
in accordance with determining that the first data stream indicates that the user is speaking and determining that the second data stream indicates that the user is not speaking, determining that the user is not speaking.  (Fig 3 row 3 shows user not speaking, and [0046] Referring to FIG. 3, a table representing a combination of acoustic and non-acoustic triggers signals mapped to respective ASR trigger signals is shown in accordance with an embodiment. The table illustrates that acoustic trigger signal 212 and non-acoustic trigger signal 224 may have corresponding high or low digital signals (0 or 1 binary signals) depending on an event. A combination 302 of the trigger signals can be an output of an AND gate implemented by processor 214. The combination 302 may correspond to ASR trigger signal 202 sent by ASR triggering system 100 to the primary ASR server 200, and may be a high or low digital signal. Thus, processor 214 may generate ASR trigger signal 202 (or may output ASR trigger signal 202 as a binary “1” output) when acoustic trigger signal 212 and non-acoustic trigger signal 224 are simultaneously high digital signals. Similarly, when one or more acoustic trigger signal 212 or non-acoustic trigger signal 224 are low digital signals, processor 214 may not generate ASR trigger signal 202 (or may output ASR trigger signal 202 as a binary “0” output)) 
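The AND-gate combination described in the quoted paragraph [0046] of Dusan can be illustrated with a short sketch. The function name `asr_trigger` and the use of Python integers for the binary signals are assumptions made here for illustration; only the AND relationship itself comes from the quoted passage.

```python
def asr_trigger(acoustic: int, non_acoustic: int) -> int:
    """Combine two binary trigger signals (0 or 1) with a logical AND.

    acoustic:     1 if the microphone-derived trigger signal is high.
    non_acoustic: 1 if the non-acoustic (e.g. vibration-derived) trigger signal is high.
    The combined trigger is high only when both inputs are simultaneously high.
    """
    return acoustic & non_acoustic


# Enumerate all four input combinations as (acoustic, non_acoustic, combined) rows.
truth_table = [(a, n, asr_trigger(a, n)) for a in (0, 1) for n in (0, 1)]
```

In this sketch the case relied on for claim 9 is the row where the acoustic signal is high and the non-acoustic signal is low, for which the combined output is low.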

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in view of Goldstein in view of Dusan and in further view of Fawaz (US-10566007-B2).
With respect to claim 10, Faborg, Goldstein and Dusan do not teach
determining that a first portion of the first data stream indicates that the user is speaking;   
determining that the second data stream indicates that the user is not speaking includes determining that a second portion of the second data stream indicates that the user is not speaking;
the first portion and the second portion have a same duration; and 
the first portion and the second portion are obtained at a same time. 
Fawaz teaches determining that a first portion of the first data stream indicates that the user is speaking (Col 7 ll 37-44: The following description of the example matching algorithm will be illustrated with a running example of a male speaker recording the two words “cup” and “luck” with a short pause between them. Referring to FIG. 3, the example vibration data 305 may be, for example, recorded from an accelerometer device in contact with a user's sternum at 64 kHz. Further, the speech signal data 310 may be recorded from a built-in laptop microphone at 44.1 kHz, 50 cm away from the user.);  
determining that the second data stream indicates that the user is not speaking includes determining that a second portion of the second data stream indicates that the user is not speaking (Col 10 ll 7-17: ...The system may provide protection both when the user is speaking and silent. For example, if a user is not speaking, and thus no vibration data is available, the system may ignore all received speech signals. This removes the threat of stealthy attacks such as mangled voice attack and further from biometric override attacks such as impersonation, wireless-based replay, etc.);
the first portion and the second portion have a same duration (Claim 10:… receiving, via an accelerometer device, recorded vibration data, wherein the recorded vibration data corresponds to speech from a user corresponding to the accelerometer device; receiving, via a microphone, recorded speech signals, wherein the speech signals are recorded substantially at the same time as the vibration data); and 
the first portion and the second portion are obtained at a same time (Col 9 ll 32-34: The final post-processing step may include measuring the signal similarity between the accelerometer and microphone signals by using the normalized cross correlation. This indicates that the two signals are included within each other as shown in the plot 540 of FIG. 5.).  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg, Goldstein and Dusan to include the teachings of Fawaz, motivation being to use accelerometer vibration data to determine whether the speech signals originated from the user (Fawaz Abstract).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in view of Goldstein and further in view of Mulherkar (US 10365887 B1).
With respect to claim 13 Goldstein teaches wherein: the electronic device includes the one or more sensors ([0013] Examples of the vibration sensor include a multi-axis accelerometer, a gyroscopic sensor); 
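Neither Faborg nor Goldstein teaches receiving the indication of the notification includes 
receiving the indication from an external electronic device, the notification being received at the external electronic device; and
causing the output associated with the notification to be provided includes providing the output with a speaker of the electronic device.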
Mulherkar teaches receiving the indication of the notification includes 
receiving the indication from an external electronic device (Col 3 ll 32-38  As shown in FIG. 1, the device 110 receives audio 11 from the environment of the device 110. In one example, the device 110 may be configured to perform the processes described herein only when an external headset (e.g., earbuds or Bluetooth headset) are operating with or plugged into the device 110), the notification being received at the external electronic device (Col 3 ll 57-59 The server 120 receives the audio data and determines a command and corresponding notification based on the audio data (illustrated as 108)); and
causing the output associated with the notification to be provided includes providing the output with a speaker of the electronic device (Col 4 ll 3-8  The server 120 also determines an output type for presentment of a notification to the user 10 (illustrated as 112). The output may be visual (i.e., displayed on a display of the device 110), audible (e.g., conveyed via a speaker of the device 110 or earbuds/headphones connected to the device 110).)  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg and Goldstein to include the teachings of Mulherkar, motivation being to alert users who are using headphones to notifications received at the external devices (Mulherkar [Col 2 ll 47-60]).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in view of Goldstein in view of Mulherkar in further view of Larson.
With respect to claim 14, Faborg does not teach while providing the output: detect a speech input; and   
in response to detecting the speech input, terminate the output.
Larson teaches while providing the output: detect a speech input ([0115] In some embodiments, during provision (524) of the speech output, the device receives (526) speech input from the user); and   
in response to detecting the speech input, terminate the output ([0115] In some embodiments, during provision (524) of the speech output, the device receives (526) speech input from the user...In such embodiments, the device will discontinue (528) speech output.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg, Goldstein and Mulherkar to include the teachings of Larson, motivation being to prioritize delivery of outputs as a result of notifications based on the user context and interaction  (Larson [0050].)

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in view of Goldstein in view of Mulherkar in further view of Nurmi (US 20070297618 A1).
With respect to claim 15 Faborg, Goldstein, Mulherkar do not teach 
While providing the output: detect a signal indicative of device removal; and in response to detecting the signal, terminate the output.  
Nurmi teaches wherein while providing the output: 
detect a signal indicative of device removal ([0021] If the earpiece is not in use, then at an interrupt step 36, the earpiece 12 transmits a non-use interrupt to the mobile audio device 14) and 
in response to detecting the signal, terminate the output ([0021] The non-use interrupt causes the mobile audio device 14 to change its state to stop playing the audio content.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg, Goldstein and Mulherkar to include the teachings of 

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in view of Goldstein in view of Mulherkar in further view of Pisula (US 20100060586 A1).
With respect to claim 16 Faborg, Goldstein and Mulherkar do not teach while providing the output: detect a predetermined gesture performed at the electronic device; and in response to detecting the predetermined gesture, terminate the output.  
Pisula teaches while providing the output: detect a predetermined gesture performed at the electronic device; and in response to detecting the predetermined gesture, terminate the output ([0171] In some embodiments, in response to determining that the interaction by the user with the first physical button corresponds to the first predefined action, an audio status import of the workout by the user is provided (e.g., via speaker 111 or via headphones) (5015), and [0216] The device detects (1014) a finger gesture on the touch screen display (e.g., swipe gesture 406, FIG. 4Z). In response to detecting the finger gesture on the touch screen display, the device performs (1016) a control operation in the application while maintaining display of the same locked-mode user interface for the application. For example, in response to detecting swipe gesture 406 on the touch screen display, the device terminates play of an audio file and initiates play of a next audio file from a playlist while maintaining display of the same locked-mode user interface UI 400Z for the workout support application 142.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg, Goldstein and Mulherkar to include the teachings of Pisula, motivation being to allow user interaction when electronic devices with touchscreens are in a locked mode (Pisula [0008].)


Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in further view of Gruber (US-20140195252-A1).
With respect to claim 17 Faborg does not teach 
receiving the indication of the notification includes receiving the notification at the electronic device; 
obtaining the one or more data streams includes obtaining the one or more data streams from an external electronic device including the one or more sensors; and 
causing the output associated with the notification to be provided includes causing the external electronic device to provide the output.
Gruber teaches receiving the indication of the notification includes receiving the notification at the electronic device ([0185] In one embodiment, upon receiving text message 470 while in hands-free context, multimodal virtual assistant 1002 causes device 60 to output an audio indication, such as a beep or tone, indicating receipt of a text message.);
obtaining the one or more data streams includes obtaining the one or more data streams from an external electronic device including the one or more sensors ([0186] Pressing the button initiates a spoken dialog with assistant 1002, and allows the user to communicate with assistant 1002 via the BlueTooth connection and through a microphone and/or speaker installed in the vehicle.); and
causing the output associated with the notification to be provided includes causing the external electronic device to provide the output ([0187] Once the spoken dialog has been initiated, assistant 1002 listens for spoken input. In one embodiment, assistant 1002 acknowledges the spoken input by some output mechanism that is easily detected by the user while in the hands-free context. An example is an audio beep or tone, and/or visual output on a vehicle dashboard that is easily seen by the user even while driving, and/or by some other mechanism.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Gruber, motivation being to allow user interaction in a hands-free context (Gruber [0010].)

Claims 18 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Faborg in further view of Larson.
With respect to claim 18, Faborg does not teach 
while causing the output associated with the notification to be provided: receive an indication of a second notification; and 
in accordance with receiving the indication of the second notification:   cause an output associated with the second notification to be provided after the output associated with the notification is provided. 
 Larson teaches while causing the output associated with the notification to be provided ([0102] At a location designated by 405-1, a phone feature included on the same device as the digital assistant receives an incoming call, as indicated by ring-tone icon…, and [0106] As explained previously, the phone receives an incoming call, which the user answers in speech input SI2 by stating, "Hey John! Haven't heard from you in ages. How is the family?"): receive an indication of a second notification ([0106] During a speech input SI3, the user requests that the device inform the user of the Knicks' score whenever the game should end, stating, "Tell me the…); and 
in accordance with receiving the indication of the second notification:   cause an output associated with the second notification to be provided after the output associated with the notification is provided ([0106] In this example, speech output SO5 is not considered urgent because the Knicks' score will not change in the time that the user is speaking…For this reason, the device stays speech output SO5, as indicated by arrow 408, until the user has finished speaking, and then outputs speech output SO5).  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Larson, motivation being to prioritize delivery of outputs as a result of notifications based on the user context and interaction (Larson [0050].)
With respect to claim 21, Faborg does not teach 
determine a timeliness score of the notification based on context information associated with the user; and 
determine whether the timeliness score exceeds a second threshold; and   
wherein causing the output associated with the notification to be provided is performed in accordance with determining that the timeliness score exceeds the second threshold.
Larson teaches
determine a timeliness score ([0119]…the predetermined amount of time is (560) a monotonically decreasing function of the measure of the urgency of the speech output, thereby providing speech outputs with a greater measure of urgency in a lesser amount of time. For example, in these embodiments, the device waits a shorter amount of time before providing an urgent speech output after the user has finished speaking than if the speech output was less urgent.) of the notification based on context information associated with the user ([0118] Flow paths 553-1, 553-2, and 553-3 represent additional operation that are optionally performed upon determining that provision of the speech output is not urgent, in accordance with some embodiments of method 500. It should be understood that the various operations described with respect to flow paths 553 are not necessarily mutually exclusive and, in some circumstances, combined.); and
determine whether the timeliness score exceeds a second threshold ([0119]…the predetermined amount of time is (560) a monotonically decreasing function of the measure of the urgency of the speech output, thereby providing speech outputs with a greater measure of urgency in a lesser amount of time. For example, in these embodiments, the device waits a shorter amount of time before providing an urgent speech output after the user has finished speaking than if the speech output was less urgent); and
wherein causing the output associated with the notification to be provided is performed in accordance with determining that the timeliness score exceeds the second threshold ([0121] In some embodiments, when the device includes a display, upon determining that provision of the speech output is not urgent, the device provides (568) a displayed output corresponding to the speech output.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Larson, motivation being to prioritize delivery of outputs as a result of notifications based on the user context and interaction (Larson [0050].)


Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in further view of Ward (US-20170155965-A1).
With respect to claim 19, Faborg does not teach in accordance with determining that the second external electronic device has provided the second output associated with the notification: forgo causing the output associated with the notification to be provided; and
wherein causing the output associated with the notification to be provided is performed in accordance with determining that the second external electronic device has not provided the second output associated with the notification.
Ward teaches in accordance with determining that the second external electronic device has provided the second output associated with the notification: forgo causing the output associated with the notification to be provided ([0231] FIG. 18 illustrates one possible screen of implementing the option to prevent a further contextual menu from being displayed based on the received emergency alert. The media guidance application may receive, from a user, input indicating that a further contextual menu for the same emergency alert should not be presented to the user. This may be useful in instances where the user has already received all the information that the user desires regarding the emergency alert.); and
wherein causing the output associated with the notification to be provided is performed in accordance with determining that the second external electronic device has not provided the second output associated with the notification. ([0232] The media guidance application may also generate for display an option to navigate to the web page using a different device (e.g., a tablet or a smart phone). The media guidance application may also provide an option to monitor the web page for updates and alert the user when an update is available...If the timestamp is updated, the media guidance application may alert the user of an update. Additionally or alternatively, the media guidance application may compare the content of the webpage to the content of the web page retrieved during the last time interval. If the web pages match, then no update has been made. However, if the web pages do not match, the media guidance application may alert the user of the change)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Ward, motivation being to 
 

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Faborg in further view of Gruber (US-10078487-B2), hereinafter referred to as Gruber_2.
With respect to claim 20 Faborg does not teach determine an importance score of the notification based on context information associated with the notification; and determine whether the importance score exceeds a first threshold; and wherein causing the output associated with the notification to be provided is performed in accordance with determining that the importance score exceeds the first threshold.  
Gruber_2 teaches determine an importance score of the notification based on context information associated with the notification (Col 20 ll 58-65 If the user is simply watching TV, though, the digital assistant provides the audio prompt because the user's context suggests that interruptions or barge-ins will not be a nuisance. However, if an important communication is received (e.g., a text message or voicemail regarding a family emergency), the digital assistant determines that, even though the user is in an important meeting, the urgency of the communication warrants an interruption.); and 
determine whether the importance score exceeds a first threshold (Col 27 ll  50-65: For example, in some embodiments, the digital assistant determines a topic of importance to the user based on any of the following: historical data associated with the user (e.g., by determining that the user typically responds to communications about a certain topic quickly), an amount of notification items in the list of notification items that relate to that topic (e.g., by determining that the number of notification items relating to that topic satisfies a predetermined threshold, such as 2, 3, 5, or more notification items), a user-specified topic (e.g., the user requests to be alerted to any notifications relating to a particular topic), and the like. In some embodiments, the topic of importance is determined by the device automatically without human intervention, such as by determining a topic of importance based on historical data associated with the user, as described noted above.); and
wherein causing the output associated with the notification to be provided is performed in accordance with determining that the importance score exceeds the first threshold (Col 31 ll 28-31: 128 Returning to FIG. 6, upon determining that the adjusted urgency value satisfies the predetermined threshold (cf. 610), the digital assistant provides a first audio prompt to a user (612).).  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Faborg to include the teachings of Gruber_2, motivation being to ascertain urgency value of a notification from the context before outputting it (Gruber_2 [Abstract].)

Claims 24, 25, 37 and 38 are rejected under 35 U.S.C. 103 as being unpatentable over Gruber in further view of Aggarwal (US 20200342863 A1).
With respect to claims 24, 37 and 38, Gruber teaches a non-transitory computer-readable storage medium storing one or more programs, an electronic device comprising one or more processors, and a method, the one or more programs comprising instructions which, when executed by the one or more processors ([0027] In accordance with some implementations, an electronic device includes one or more processors, memory, and one or more programs; the one or more programs are stored in the memory and configured to be executed by the one or more processors and the one or more programs include instructions for performing the operations of any of the methods and/or techniques described herein. In accordance with some implementations, a computer readable storage medium has stored therein instructions, which, when executed by an electronic device, cause the device to perform the operations of any of the methods and/or techniques described herein. In accordance with some implementations, an electronic device includes means for performing the operations of any of the methods and/or techniques described herein. In accordance with some implementations, an information processing apparatus, for use in an electronic device includes means for performing the operations of any of the methods and/or techniques described herein.):
cause a first output associated with a received notification to be provided ([0279] The electronic device outputs an alert corresponding to an information item (1454). In some implementations, the alert is an audible alert (e.g., a beep, tone, ring, chime, etc.)...incoming text messages, emails, and application notifications may all cause the same sound to be output as an alert);
after the first output is provided: obtain one or more data streams from one or more sensors ([0285] In some implementations, in response to outputting the alert (and before the speech input is received at step (1458)), the electronic device initiates a listening mode for a first predetermined time period, and the speech input is received during the first predetermined time period (1456). The listening mode corresponds to a state in which the electronic device is monitoring and/or analyzing audio that is received by a microphone or transducer on the device. In some implementations, the predetermined time period is 2 seconds or less, 3 seconds or less, 5 seconds or less, 10 seconds or less, or any other appropriate duration.); 
determine, based on the one or more data streams, whether a user associated with the electronic device is speaking ([0285] ...the electronic device initiates a listening mode for a first predetermined time period, and the speech input is received during the first predetermined time period (1456), and [0289] The electronic device determines whether the speech input includes a request for information about the alert (1460). Thus, the electronic device differentiates between inadvertent speech inputs that may be received after an alert has been output);
Gruber does not teach in accordance with a determination that the user is speaking: provide at least a portion of the one or more data streams to an external electronic device, the portion including data representing a received speech input requesting performance of a task associated with the notification;
receive, from the external electronic device, an indication that the task has been initiated; and 
cause a second output based on the received indication to be provided.  
Aggarwal teaches in accordance with a determination that the user is speaking: provide at least a portion of the one or more data streams to an external electronic device ([0008] “Assistant, when do I need to buy new tires?” When the vehicle computing device has a network connection with the server device, audio data corresponding to the spoken utterance can be transmitted to the server device for processing), the portion including data representing a received speech input requesting performance of a task associated with the notification ([0008] “Assistant, when do I need to buy new tires?”);
receive, from the external electronic device, an indication that the task has been initiated ([0006] In response to the server device receiving the audio data and version information from the vehicle computing device, the server device can generate instructions and/or data for providing to the vehicle computing device in order to cause the vehicle computing device to no longer transmit audio data); and
cause a second output based on the received indication to be provided ([0008] For instance, if the user had the latest version of the computing device, the server device would generate data that characterizes the intent, and an action and/or slot value(s) to cause the automated assistant at the vehicle computing device to provide an estimate of when the user should replace their tires based on sensor output of their tire sensors (e.g., “You should change your tires in about 2 months, or in about 1600 miles.”).)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber to include the teachings of Aggarwal, motivation being to allow a server device to be responsive to a particular intent request despite the intent request being associated with an action that the computing device cannot execute (Aggarwal [Abstract]).
With respect to claim 25, Gruber teaches wherein the speech input does not include a trigger phrase for initiating a digital assistant ([0290]…In some implementations, the predetermined word or phrase is one of a plurality of predetermined words or phrases that indicate a user request for information about an alert. For example, the electronic device may be configured to respond to any of "What?," "SRI, what was that?," and "Read that to me" (and/or other appropriate words or phrases).).

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Gruber in view of Aggarwal in further view of Nishikawa (US-20160189715-A1).
With respect to claim 26, Gruber and Aggarwal do not teach in accordance with determining that the user is not speaking: forgo providing the one or more data streams to the external electronic device; and discard the one or more data streams.
Nishikawa teaches in accordance with determining that the user is not speaking: forgo providing the one or more data streams to the external electronic device ([0053] As seen above, if the section of the speech uttered by the user is not detected, noise contained in the first speech information is not removed. Further, the second speech information is not outputted, nor is the first speech information transmitted to the server. Thus, it is possible to prevent the performance of an unnecessary computation, as well as to prevent the transmission of unnecessary information.); 
discard the one or more data streams ([0053] Further, the second speech information is not outputted, nor is the first speech information transmitted to the server.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber and Aggarwal to include the teachings of Nishikawa, motivation being to improve accuracy of speech recognition (Nishikawa [0008]).

Claims 27, 28 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Gruber in view of Aggarwal in view of Goldstein (US-20150228292-A1).
With respect to claim 27, Gruber and Aggarwal do not teach concurrently obtaining the one or more data streams for a predetermined duration.
Goldstein teaches concurrently ([0017] … the close-talk detector (see FIG. 1) digitally processes the vibration sensor signal and one or more of the microphone signals, and detects or declares a close-talk event or close-talk state in the controller, that coincides with the user talking...) obtaining the one or more data streams for a predetermined duration (Abstract: A close-talk detector detects a near-end user's speech signal, and [0019] In one embodiment, when an initial close-talk event is declared, the declaration may then be held for a predefined minimum period of time (hold interval)).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber and Aggarwal to include the teachings of Goldstein, motivation being to use dual channels consisting of vibration and microphone to automatically detect near-end speech (Goldstein [Abstract]).
With respect to claim 28 Gruber and Aggarwal do not teach the one or more sensors include a microphone and a vibration sensor; and the one or more data streams include a first data stream obtained from the microphone and a second data stream obtained from the vibration sensor.  
Goldstein teaches wherein: 
the one or more sensors include a microphone and a vibration sensor (Abstract: Upon detecting speech using a vibration sensor signal and one or more microphone signals); and 
the one or more data streams include a first data stream obtained from the microphone and a second data stream obtained from the vibration sensor ([0013] In addition, the housing contains a vibration sensor that may be rigidly mounted to the housing so as to perform non-acoustic pickup of the user's voice, such as through bone conduction.... A close-talk detector uses the vibration sensor and one or more microphone signals, which microphone signals are also being used by an ANC controller, to control different aspects of ANC controller. FIG. 1 shows two such aspects of such a controller.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber and Aggarwal to include the teachings of Goldstein, motivation being to use dual channels consisting of vibration and microphone to automatically detect near-end speech (Goldstein [Abstract]). 
With respect to claim 32, Gruber and Aggarwal do not teach wherein the electronic device comprises a headset.
Goldstein teaches wherein the electronic device comprises a headset ([0012] FIG. 1 is a block diagram of part of a consumer electronics personal listening device having an ANC system and in which an embodiment of the invention can be implemented...The housing may be, for example, that of a wired or wireless headset or earphone, a loose fitting ear bud housing...)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber and Aggarwal to include the teachings of Goldstein, motivation being to use dual channels consisting of vibration and microphone to automatically detect near-end speech (Goldstein [Abstract]).

Claim 33 is rejected under 35 U.S.C. 103 as being unpatentable over Gruber in view of Aggarwal in view of Goldstein and in further view of Larson.
With respect to claim 33, Gruber, Aggarwal and Goldstein do not teach wherein:
the electronic device includes the one or more sensors; causing the first output to be provided includes providing the first output with a speaker of the electronic device; and 
causing the second output to be provided includes providing the second output with the speaker.  
Larson teaches wherein:
the electronic device includes the one or more sensors ([0042] For example, a motion sensor 210, a light sensor 212, and a proximity sensor 214 are coupled...);
causing the first output to be provided includes providing the first output with a speaker of the electronic device ([0043] An audio subsystem 226 is coupled to speakers 228 and a microphone 230 to facilitate voice-enabled functions, such as voice recognition, voice replication, digital recording, and telephony functions, and,  [0113] Upon determining that the device is not currently receiving speech input from the user, the device provides (518) the speech output to the user...In some embodiments, the device provides (520) audio data received from the remote user); and 
causing the second output to be provided includes providing the second output with the speaker ([0113] For example, in such embodiments, when the remote user (i.e., the other party) is talking during a telephone conversation, the device will nevertheless provide speech output from the digital assistant. In some embodiments, providing audio data (e.g., speech) received from the remote user and the speech output to the user contemporaneously means muting the audio data from the remote user temporarily while the speech output is provided).  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber, Aggarwal and Goldstein to include the teachings of Larson, motivation being to prioritize delivery of outputs as a result of notifications based on the user context and interaction (Larson [0050].)

Claim 34 is rejected under 35 U.S.C. 103 as being unpatentable over Gruber in view of Aggarwal in view of Goldstein in view of Larson and in further view of Kang (US-20210012775-A1).
With respect to claim 34 Gruber, Aggarwal, Goldstein and Larson do not teach  
receives, from the electronic device, the data representing the received speech input requesting performance of the task associated with the notification; 
sends the data to a second external electronic device; and
in accordance with sending the data to the second external electronic device: receives, from the second external electronic device, a result based on the initiation of the task; and in accordance with receiving the result based on the initiation of the task:
sends the indication that the task has been initiated to the electronic device.
Kang teaches wherein the external electronic device:
receives, from the electronic device, the data representing the received speech input requesting performance of the task associated with the notification ([0218] Referring to FIG. 5, in operation 510, for example, the first electronic device 201 may obtain first input speech data including a first request for performing a first task by using a second electronic device (e.g., second electronic devices 202) and [0223] In operation 520, for example, the first electronic device 201 may transmit the first input speech data to an intelligent server (e.g., an intelligent server 408)) (201 is the electronic device, 408 is external device 1, and 202 is external device 2);
sends the data to a second external electronic device ([0223] In operation 520, for example, the first electronic device 201 may transmit the first input speech data to an intelligent server (e.g., an intelligent server 408) (e.g., a speech recognition server) (e.g., Samsung Electronics' Bixby™ server) via a second network (e.g., a second network 299) (e.g., a long-range wireless communication). For example, the first electronic device 201 may transmit the first input speech data to the intelligent server 408 via the second network 299 by using a communication interface (e.g., a communication module 190) (e.g., a long-range wireless communication interface) of the first electronic device 201.); and 
in accordance with sending the data to the second external electronic device: receives, from the second external electronic device, a result based on the initiation of the task ([0224] In operation 530, for example, the first electronic device 201 may receive a first response related to adjustment of a state of the first electronic device 201 from the intelligent server 408.); and
in accordance with receiving the result based on the initiation of the task: sends the indication that the task has been initiated to the electronic device ([0225] According to one embodiment, the first electronic device 201 may receive, from the intelligent server 408, first task performing information for performing the first task which corresponds to the first request included in the first input speech).  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber, Aggarwal, Goldstein and Larson to include the teachings of Kang, motivation being to ascertain context from the environment for task control (Kang [0003].)

Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Gruber, Aggarwal and Goldstein as applied to claim 28 and in further view of Dusan (US-20180324518-A1).
With respect to claim 29 Gruber, Aggarwal, Goldstein do not teach determining whether the second data stream indicates that the user is speaking for a second predetermined duration.  
Dusan teaches wherein determining whether the user is speaking includes determining whether the second data stream indicates that the user is speaking for a second predetermined duration.  ([0088] In an embodiment, input command pattern 1302 includes a predetermined sequence of phonemes spoken by user...may be a phrase or series of phonemes such as in the word “sixty-two” that can be broken into the syllables “six-ty-two.” Each syllable, and the pauses between syllables, may have a predetermined duration...Processor 214 may monitor the accelerometer signal for voice activity that corresponds to the pre-trained sequence of phonemes to identify progression to a final state that triggers ASR server 200).
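It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber, Aggarwal, Goldstein to include the teachings of Dusan, motivation being to use non-acoustic signals to avoid false triggers that drain device power and frustrate the user (Dusan [Summary]).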

Claims 30 and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Gruber, Aggarwal, Goldstein as applied to claim 28 and in further view of Dusan and in further view of Basye (US 20140163978 A).
With respect to claim 30, Gruber, Aggarwal, Goldstein do not teach
determining that the second data stream includes a portion indicating that the user is speaking; 
determining whether a duration of the portion indicating that the user is speaking is below a threshold duration; and 
Dusan teaches 
determining that the second data stream includes a portion indicating that the user is speaking ([0028] The processor can also receive an acoustic trigger signal, based on an acoustic signal generated by a microphone of the ASR triggering system, e.g., microphone data representing acoustic vibrations of the sound from the user speaking or humming.); 
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber, Aggarwal, Goldstein to include the teachings of Dusan, motivation being to use non-acoustic signals to avoid false triggers that drain device power and frustrate the user (Dusan [Summary]).
None of Gruber, Aggarwal, Goldstein, or Dusan teaches
in accordance with determining that the duration is below the threshold duration: determining that the user is not speaking.
Basye teaches determining whether a duration of the portion indicating that the user is speaking is below a threshold duration ([0071] Alternately, the user computing device 200 may determine (via the speech detection module 108) that at least a threshold amount of time has passed since an audio input that includes speech has been obtained. ); and 
in accordance with determining that the duration is below the threshold duration: determining that the user is not speaking ([0071] Alternately, the user computing device 200 may determine (via the speech detection module 108) that at least a threshold amount of time has passed since an audio input that includes speech has been obtained.); 
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber, Aggarwal, Goldstein and Dusan to include the teachings of Basye, motivation being to manage device power (Basye [Abstract]).
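For illustration only, the claim 30 limitation discussed above (a detected speech portion shorter than a threshold duration is treated as the user not speaking) can be sketched as follows. The function name, parameter names, and the 0.5 second value are hypothetical and appear in neither the claims nor the cited references. The treatment of a portion at or above the threshold is an assumption, since the quoted limitation only addresses the below-threshold case.

```python
def user_is_speaking(portion_duration_s: float, threshold_s: float) -> bool:
    """Hypothetical sketch of the claim 30 duration test.

    A portion of the data stream indicating speech whose duration is below
    the threshold is treated as "user is not speaking". Treating longer
    portions as "speaking" is an assumption made for this sketch.
    """
    if portion_duration_s < threshold_s:
        return False  # below threshold: determine the user is not speaking
    return True       # assumption: otherwise treat the user as speaking


if __name__ == "__main__":
    THRESHOLD_S = 0.5  # illustrative value only
    for duration in (0.1, 0.5, 2.0):
        print(duration, user_is_speaking(duration, THRESHOLD_S))
```
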
With respect to claim 31, Gruber, Aggarwal, Goldstein do not teach
determining that the first data stream indicates that the user is speaking; 
determining that the second data stream indicates that the user is not speaking; 
and in accordance with determining that the first data stream indicates that the user is speaking and determining that the second data stream indicates that the user is not speaking, determining that the user is not speaking.
Dusan teaches 
determining that the first data stream indicates that the user is speaking (Fig. 3, column 2, Acoustic Trigger Signal 212, row 3 shows a microphone signal); 
determining that the second data stream indicates that the user is not speaking (Fig. 3, column 3, Non-Acoustic Trigger Signal, row 2 shows a stream indicating that the user is not speaking); and 
in accordance with determining that the first data stream indicates that the user is speaking and determining that the second data stream indicates that the user is not speaking, determining that the user is not speaking.  (Fig 3 row 3 shows user not speaking, and [0046] Referring to FIG. 3, a table representing a combination of acoustic and non-acoustic triggers signals mapped to respective ASR trigger signals is shown in accordance with an embodiment. The table illustrates that acoustic trigger signal 212 and non-acoustic trigger signal 224 may have corresponding high or low digital signals (0 or 1 binary signals) depending on an event. A combination 302 of the trigger signals can be an output of an AND gate implemented by processor 214. The combination 302 may correspond to ASR trigger signal 202 sent by ASR triggering system 100 to the primary ASR server 200, and may be a high or low digital signal. Thus, processor 214 may generate ASR trigger signal 202 (or may output ASR trigger signal 202 as a binary “1” output) when acoustic trigger signal 212 and non-acoustic trigger signal 224 are simultaneously high digital signals. Similarly, when one or more acoustic trigger signal 212 or non-acoustic trigger signal 224 are low digital signals, processor 214 may not generate ASR trigger signal 202 (or may output ASR trigger signal 202 as a binary “0” output)) 
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber, Aggarwal, Goldstein to include the teachings of Dusan, motivation being to use non-acoustic signals to avoid false triggers that drain device power and frustrate the user (Dusan [Summary]).
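For illustration only, the AND-gate combination described in the quoted passage of Dusan [0046] can be sketched as follows. The function and variable names are hypothetical and do not appear in Dusan. Reading a low combined trigger as "user is not speaking" follows the mapping applied in this rejection and is not stated in those words in the reference.

```python
def asr_trigger(acoustic_trigger: int, non_acoustic_trigger: int) -> int:
    """Combine two binary trigger signals with a logical AND.

    Per the quoted passage, the combined trigger is high (1) only when the
    acoustic and non-acoustic trigger signals are simultaneously high.
    """
    return acoustic_trigger & non_acoustic_trigger


if __name__ == "__main__":
    # Enumerate all four input combinations of the two binary signals.
    for acoustic in (0, 1):
        for non_acoustic in (0, 1):
            combined = asr_trigger(acoustic, non_acoustic)
            print(acoustic, non_acoustic, combined)
```

The case relied on for claim 31 is the one where the acoustic signal is high and the non-acoustic signal is low, for which the combined trigger is low.
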

Claims 35 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Gruber in view of Aggarwal in further view of Nurmi in further view of Zhang (US 20130326576 A1).
With respect to claim 35, Gruber and Aggarwal do not teach
causing the first output to be provided includes causing a second external electronic device to provide the first output, the second external electronic device including the one or more sensors; 
obtaining the one or more data streams includes obtaining the one or more data streams from the second external electronic device; and
causing the second output to be provided includes causing the second external electronic device to provide the second output.
Nurmi teaches causing the first output to be provided includes causing a second external electronic device to provide the first output, the second external electronic device including the one or more sensors ([0021] During play, at a digital signal-processing step 30, the mobile audio device receives or reads from memory a digital signal including audio content. At a digital signal-processing step 31, this digital signal is processed and sent to a digital/analog converter. At the conversion step 32, the digital/analog converter converts the digital signal to analog, and this analog signal is sent to the earpiece 12. During the analog signal reception step 33, the earpiece receives the signal and converts the signal to sound waves.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber, Aggarwal to include the teachings of Nurmi, motivation being to conserve power after non-use for a predetermined amount of time (Nurmi [0027].)
Gruber, Aggarwal and Nurmi do not teach 
obtaining the one or more data streams includes obtaining the one or more data streams from the second external electronic device; and
causing the second output to be provided includes causing the second external electronic device to provide the second output.  
Zhang teaches obtaining the one or more data streams includes obtaining the one or more data streams from the second external electronic device ([0002] The second playback device starts receiving content data that begins from the synchronized playback start time from a server. After the data that is memorized in a memory reaches a specified volume, the second playback device transmits a playback preparation completion notification to the first playback device); and 
causing the second output to be provided includes causing the second external electronic device to provide the second output ([0002] Upon receiving this command, the second playback device starts playing back content data that begins from the synchronized playback start time.). 

With respect to claim 36, Gruber does not teach receives, from the electronic device, the data representing the received speech input requesting performance of the task associated with the notification; determines whether the speech input is associated with an intent to perform an action associated the notification; and in accordance with determining that the speech input is associated with an intent to perform an action associated with the notification: initiates the task; and sends the indication that the task has been initiated to the electronic device.
Aggarwal teaches receives, from the electronic device, the data representing the received speech input requesting performance of the task associated with the notification ([0006] The spoken utterance can be, “Assistant, when do I need to buy new tires?”); 
determines whether the speech input is associated with an intent to perform an action associated the notification ([0006] The server device can process audio data corresponding to the spoken utterance and, based on the audio data and version information, determine that an action (e.g., retrieving data from four fluid sensors) corresponding to the intent is only supported by later versions (e.g., the second version) of the vehicle computing device.); and 
in accordance with determining that the speech input is associated with an intent to perform an action associated with the notification: initiates the task; and   sends the indication that the task has been initiated to the electronic device ([0006] The server device can process audio data corresponding to the spoken utterance and, based on the audio data and version information, determine that an action (e.g., retrieving data from four fluid sensors) corresponding to the intent is only supported by later versions (e.g., the second version) of the vehicle computing device. In response to the server device receiving the audio data and version information from the vehicle computing device, the server device can generate instructions and/or data for providing to the vehicle computing device in order to cause the vehicle computing device to no longer transmit audio data corresponding to similar types of intent requests to the server device. )  
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the teachings of Gruber to include the teachings of Aggarwal, motivation being that a server device can be responsive to a particular intent request despite the intent request being associated with an action that the computing device cannot execute (Aggarwal [Abstract]).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ATHAR N PASHA, whose telephone number is (408)918-7675. The examiner can normally be reached Monday-Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact 






/A.N.P./               Examiner, Art Unit 2657
/Paras D Shah/               Primary Examiner, Art Unit 2659
04/16/2021