DETAILED ACTION
Notice of Pre-AIA or AIA Status
           The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
           Applicant's arguments filed 5/9/22 have been fully considered but they are not persuasive. 
          Regarding the 35 U.S.C. 103 rejection of Claims 1, 8 and 15 over references Sun and Kusano, Applicant argues that the cited portions of Sun fail to disclose to “use a first algorithm to detect whether the captured audio content may include a wake phrase, wherein the wake phrase consists of a set of one or more words” and, as a result, further do not teach to “use a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content”. Applicant thus argues that Sun fails to disclose the limitations (i) “us[ing] a first algorithm to detect whether the captured audio content may include a wake phrase, wherein the wake phrase consists of a set of one or more words that cause the network microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase,” (ii) “us[ing] a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content,” and (iii) “when the second algorithm confirms that the first algorithm detected the wake phrase in captured audio content, transmit[ting], via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase.” (Amendment, pg. 16, sec. IV – pg. 18, second para).
          Examiner respectfully disagrees. Sun discloses using a front end detecting circuit 310 in a first detection phase (i.e., a first algorithm) to analyze received voice to determine whether it contains the word "Hi" of the eventual keyword "Hi-Patent" (para. [0044]), corresponding to the limitation to “use a first algorithm to detect whether the captured audio content may include a wake phrase, wherein the wake phrase consists of a set of one or more words”. After the front end detecting circuit 310 confirms (para. [0044]) that the voice contains the voice of the word "Hi", Sun’s system uses a second detection phase with a speech recognition processor 320 (i.e., a second algorithm) to judge whether the voice is the voice of the keyword "Hi-Patent" (para. [0047]), corresponding to the limitation to “use a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content”. Confirming that the voice/audio signal includes the subword "Hi" as part of judging whether the voice signal includes "Hi-Patent" implies confirming that the audio content includes the at least one wake word. What Sun does not explicitly disclose is the limitation “wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase”.
          Applicant’s arguments that, with respect to Claims 1, 8 and 15, Sun, Kusano and additional references Shires and Wood do not disclose the limitations (i) “us[ing] a first algorithm to detect whether the captured audio content may include a wake phrase, wherein the wake phrase consists of a set of one or more words that cause the network microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase,” (ii) “us[ing] a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content,” and (iii) “when the second algorithm confirms that the first algorithm detected the wake phrase in captured audio content, transmit[ting], via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase” (Amendment, pg. 16, sec. IV – pg. 19, sec. C) have been considered but are moot in light of the new grounds of rejection relying on reference Tulli, as presented below.

Response to Amendment
            The prior 35 U.S.C. 112 rejection of claims 1-20 (12/7/21) is hereby withdrawn in light of amendments to the independent claims.

Claim Objections
Claim 9 is objected to because of the following informalities: “wherein using the second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprises activating a first wake-word wake phrase engine and a second wake-word wake phrase engine, wherein the activated first wake-word wake phrase engine, when activated, is configured to use the second algorithm to determine whether the captured audio content includes a first wake-word wake phrase, and wherein the second wake-word wake phrase engine, when activated, is configured to use the second algorithm to determine whether the captured audio content includes a second wake-word wake phrase.” should read “wherein using the second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprises activating a first wake phrase engine and a second wake phrase engine, wherein the activated first wake phrase engine, when activated, is configured to use the second algorithm to determine whether the captured audio content includes a first wake phrase, and wherein the second wake phrase engine, when activated, is configured to use the second algorithm to determine whether the captured audio content includes a second wake phrase.” Appropriate correction is required.


Claim Rejections - 35 USC § 103
               In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
           The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


1.         Claims 1, 2, 6, 8, 10, 13, 15, 17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Sun US PGPUB 2016/0171976 A1 (“Sun”) in view of Tulli US PGPUB 2018/0330727 A1 (“Tulli”).
         Per Claim 1, Sun discloses a microphone device comprising: 
             one or more microphones (para. [0039]); 
             one or more processors (para. [0029]); 
            at least one tangible, non-transitory, computer-readable medium (para. [0040]-[0041]);    
           capture audio content via the one or more microphones (para. [0039]; para. [0041]); 
           use a first algorithm to detect whether the captured audio content may include a wake phrase, wherein the wake phrase consists of a set of one or more words (para. [0034]-[0036]; the front end detecting circuit 310 is in a first detection phase to judge whether the voice signal Sa contains the voice of the subword "Hi" according to the subword model parameters…, para. [0044]; para. [0046]); 
           after detecting via the first algorithm, that the captured audio content may include the wake phrase, use a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content, wherein the second algorithm is more computationally intensive than the first algorithm (para. [0013]; If the front end detecting circuit 310 confirms that the voice signal Sa contains the voice of the subword "Hi", the front end detecting circuit 310 generates a first interrupt signal INT1 to the speech recognition processor 320…, para. [0044]; In the second detection phase…the speech recognition processor 320 judges whether the voice signal Sa is the voice of the keyword "Hi-Patent" according to the keyword model parameters…, para. [0047], confirming that the voice/audio signal includes subword "Hi" as part of judging whether the voice signal includes "Hi-Patent" as implying confirming the audio content includes the at least one wake word, and extra processing of keyword “Patent” and utilization of more resources as implying a more computationally intensive algorithm than the first algorithm); and 
          when the second algorithm does not confirm that the first algorithm detected the wake phrase in the captured audio content, cease further processing of the detected audio content (para. [0047]; if the speech recognition processor 320 judges that the voice signal Sa is not the voice of the keyword "Hi-Patent", the speech recognition processor 320 does not assert the second interrupt signal INT2 to the main processor 330 and the speech recognition processor 320 is disabled again…, para. [0048], disabling processing as ceasing processing of the detected audio content)
           Sun does not explicitly disclose a network interface, program instructions stored on the at least one tangible, non-transitory computer-readable medium that are executable by the one or more processors, wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase, or when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmit via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase.
            However, these features are taught by Tulli:
            a network interface (Abstract; para. [0032])
           program instructions stored on the at least one tangible, non-transitory computer-readable medium that are executable by the one or more processors (para. [0028]-[0029]) such that the network microphone device is configured to
            wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase (the first processor placing a copy of the processed received audio signal into a circular buffer of a preselected size and the first processor executing a first voice recognition algorithmic model to detect the presence of a predefined wake word…, para. [0012]); 
          when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmit via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase (if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]).
            It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Tulli with Sun in arriving at the limitations “a network interface”, “program instructions stored on the at least one tangible, non-transitory computer-readable medium that are executable by the one or more processors”, “wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase” and “when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmit via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase”, because such a combination would have reduced average overall power consumption in devices communicating with services (Tulli, Abstract; para. [0041]).
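For illustration only, the two-phase control flow recited in Claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Sun, Tulli, or the application: audio is modeled as a list of already-recognized word tokens, and the names `first_algorithm`, `second_algorithm`, and `MicrophoneDevice` are hypothetical. The "Hi" / "Hi-Patent" example follows the keyword discussed in Sun, para. [0044] and [0047].

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: audio is modeled as a list of word tokens so that the control
# flow can be shown without any real signal processing.
Audio = List[str]


def first_algorithm(audio: Audio) -> bool:
    """Cheap first pass (hypothetical): flags audio that MAY contain the
    wake phrase by looking only for the subword "hi"."""
    return "hi" in audio


def second_algorithm(audio: Audio) -> bool:
    """Costlier second pass (hypothetical): confirms that the full phrase
    "hi patent" appears as two consecutive tokens."""
    return any(audio[i:i + 2] == ["hi", "patent"] for i in range(len(audio) - 1))


@dataclass
class MicrophoneDevice:
    # Hypothetical stand-in for a network interface: records what was sent.
    sent: List[Audio] = field(default_factory=list)

    def handle(self, audio: Audio) -> str:
        if not first_algorithm(audio):
            return "idle"         # second pass is never run
        if not second_algorithm(audio):
            return "ceased"       # not confirmed: cease further processing
        self.sent.append(audio)   # confirmed: forward to the voice service
        return "transmitted"
```

In this sketch the second pass runs only on audio the first pass has flagged, which is the ordering relied on in the mapping above.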
           Per Claim 2, Sun in view of Tulli discloses the network microphone device of claim 1, 
               Sun discloses the microphone device is configured to, before using the first algorithm to detect whether the captured audio content may include a wake phrase, use a voice activity detection algorithm to determine whether the captured audio content includes voice activity, wherein the microphone device is configured to use the first algorithm to detect whether the captured audio content may include a wake phrase in response to determining that the captured audio content includes voice activity (fig. 4A; para. [0052]).
            Tulli discloses program instructions stored on the at least one tangible, non-transitory computer-readable medium that are executable by the one or more processors such that the network microphone device is configured to (para. [0029]).
           Tulli discloses wherein the network microphone device is configured to use the first algorithm (Abstract; para. [0012]; para. [0026]).
           Per Claim 6, Sun in view of Tulli discloses the network microphone device of claim 1, 
               Tulli discloses wherein the network interface comprises one or more network communication components in a disabled state, and wherein the program instructions that are executable by the one or more processors such that the network microphone device is configured to transmit, via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content comprise program instructions that are executable by the one or more processors such that the network microphone device is configured to: enable the one or more network communication components (Abstract; if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]; a second processor which can operate in at least a low power/low clock rate mode and a high power/high clock rate mode. When the first processor determines the presence of the wake word, it causes the second processor to switch to the high power/high clock rate mode and to execute a tight algorithmic model …, para. [0015]; para. [0028]-[0029]; para. [0032], low power state of second processor as suggesting disabled state); and
            use the enabled one or more network communication components to transmit the at least a portion of the captured audio content to the voice service corresponding to the wake phrase (if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service located remote from the computerized device…, para. [0012]; para. [0015]; para. [0032]; para. [0078]-[0080]).
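For illustration only, the enable-then-transmit sequence recited in Claim 6 can be sketched as follows. This is a minimal sketch under stated assumptions, not Tulli's implementation: `NetworkComponent` and `on_wake_phrase_confirmed` are hypothetical names, and the "disabled state" is modeled as a simple flag rather than the low power/low clock rate mode described in Tulli, para. [0015].

```python
class NetworkComponent:
    """Hypothetical network communication component that starts disabled."""

    def __init__(self) -> None:
        self.enabled = False
        self.transmitted = []  # payloads sent so far

    def enable(self) -> None:
        self.enabled = True

    def transmit(self, payload: bytes) -> None:
        # A disabled component refuses to send anything.
        if not self.enabled:
            raise RuntimeError("network component is disabled")
        self.transmitted.append(payload)


def on_wake_phrase_confirmed(component: NetworkComponent, captured: bytes) -> None:
    """Assumed to be called only after the second algorithm has confirmed
    the wake phrase: enable the component, then transmit the audio."""
    if not component.enabled:
        component.enable()
    component.transmit(captured)
```

Keeping the component disabled until confirmation is what ties this step to the power-saving rationale cited for the combination.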
            Per Claim 8, Sun discloses tangible, non-transitory, computer-readable media storing instructions that, when executed by one or more processors, cause a microphone device to perform operations (para. [0040]-[0041]) comprising: 
              capturing audio content via one or more microphones of the microphone device (para. [0039]; para. [0041]); 
              using a first algorithm to detect whether the captured audio content may include a wake phrase (para. [0034]-[0036]; the front end detecting circuit 310 is in a first detection phase to judge whether the voice signal Sa contains the voice of the subword "Hi" according to the subword model parameters…, para. [0044]; para. [0046]); 
           after detecting via the first algorithm, that the captured audio content may include the wake phrase, using a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content, wherein the second algorithm is more computationally intensive than the first algorithm (para. [0013]; If the front end detecting circuit 310 confirms that the voice signal Sa contains the voice of the subword "Hi", the front end detecting circuit 310 generates a first interrupt signal INT1 to the speech recognition processor 320…, para. [0044]; In the second detection phase…the speech recognition processor 320 judges whether the voice signal Sa is the voice of the keyword "Hi-Patent" according to the keyword model parameters…, para. [0047], confirming that the voice/audio signal includes subword "Hi" as part of judging whether the voice signal includes "Hi-Patent" as implying confirming the audio content includes the at least one wake word, and extra processing of keyword “Patent” and utilization of more resources as implying a more computationally intensive algorithm than the first algorithm); and 
          when the second algorithm does not confirm that the first algorithm detected the wake phrase in the captured audio content, ceasing further processing of the detected audio content (para. [0047]; if the speech recognition processor 320 judges that the voice signal Sa is not the voice of the keyword "Hi-Patent", the speech recognition processor 320 does not assert the second interrupt signal INT2 to the main processor 330 and the speech recognition processor 320 is disabled again…, para. [0048], disabling processing as ceasing processing of the detected audio content)
           Sun does not explicitly disclose a network microphone device, wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase or when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmitting via a network interface of the network microphone device, at least a portion of the captured audio content to the voice service corresponding to the wake phrase
            However, these features are taught by Tulli:
            a network microphone device (para. [0026])
            wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase (the first processor placing a copy of the processed received audio signal into a circular buffer of a preselected size and the first processor executing a first voice recognition algorithmic model to detect the presence of a predefined wake word…, para. [0012]); 
          when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmitting via a network interface of the network microphone device, at least a portion of the captured audio content to the voice service corresponding to the wake phrase (if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]).
            It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Tulli with Sun in arriving at the limitations “a network microphone device”, “wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase” and “when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmitting via a network interface of the network microphone device, at least a portion of the captured audio content to the voice service corresponding to the wake phrase”, because such a combination would have reduced average overall power consumption in devices communicating with services (Tulli, Abstract; para. [0041]).
          Per Claim 10, Sun in view of Tulli discloses the tangible, non-transitory, computer-readable media of claim 8, 
             Sun discloses, before using the first algorithm to detect whether the captured audio content may include a wake phrase, using a voice activity detection algorithm to determine whether the captured audio content includes voice activity, wherein the microphone device is configured to use the first algorithm to detect whether the captured audio content may include a wake phrase in response to determining that the captured audio content includes voice activity (fig. 4A; para. [0052]).
          Tulli discloses wherein the network microphone device is configured to use the first algorithm (Abstract; para. [0012]; para. [0026]).
         Per Claim 13, Sun in view of Tulli discloses the tangible, non-transitory, computer-readable media of claim 8, 
            Tulli discloses wherein the network interface comprises one or more network communication components in a disabled state, transmitting, via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content comprises: enabling the one or more network communication components (Abstract; if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]; a second processor which can operate in at least a low power/low clock rate mode and a high power/high clock rate mode. When the first processor determines the presence of the wake word, it causes the second processor to switch to the high power/high clock rate mode and to execute a tight algorithmic model …, para. [0015]; para. [0032], low power state of second processor as suggesting disabled state); and
            using the enabled one or more network communication components to transmit the at least a portion of the captured audio content to the voice service corresponding to the wake phrase (if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service located remote from the computerized device…, para. [0012]; para. [0015]; para. [0032]; para. [0078]-[0080]).
        Per Claim 15, Sun discloses a method comprising: 
            capturing audio content via one or more microphones of a microphone device (Abstract; para. [0039]; para. [0041]); 
           using by the microphone device, a first algorithm to detect whether the captured audio content may include a wake phrase (para. [0034]-[0036]; the front end detecting circuit 310 is in a first detection phase to judge whether the voice signal Sa contains the voice of the subword "Hi" according to the subword model parameters…, para. [0044]; para. [0046]); 
           after detecting via the first algorithm, that the captured audio content may include the wake phrase, using a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content, wherein the second algorithm is more computationally intensive than the first algorithm (para. [0013]; If the front end detecting circuit 310 confirms that the voice signal Sa contains the voice of the subword "Hi", the front end detecting circuit 310 generates a first interrupt signal INT1 to the speech recognition processor 320…, para. [0044]; In the second detection phase…the speech recognition processor 320 judges whether the voice signal Sa is the voice of the keyword "Hi-Patent" according to the keyword model parameters…, para. [0047], confirming that the voice/audio signal includes subword "Hi" as part of judging whether the voice signal includes "Hi-Patent" as implying confirming the audio content includes the at least one wake word, and extra processing of keyword “Patent” and utilization of more resources as implying a more computationally intensive algorithm than the first algorithm); and 
          when the second algorithm does not confirm that the first algorithm detected the wake phrase in the captured audio content, ceasing further processing of the detected audio content (para. [0047]; if the speech recognition processor 320 judges that the voice signal Sa is not the voice of the keyword "Hi-Patent", the speech recognition processor 320 does not assert the second interrupt signal INT2 to the main processor 330 and the speech recognition processor 320 is disabled again…, para. [0048], disabling processing as ceasing processing of the detected audio content)
           Sun does not explicitly disclose a network microphone device, wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase or when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmitting via a network interface of the network microphone device, at least a portion of the captured audio content to the voice service corresponding to the wake phrase
            However, these features are taught by Tulli:
            a network microphone device (para. [0026])
            wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase (the first processor placing a copy of the processed received audio signal into a circular buffer of a preselected size and the first processor executing a first voice recognition algorithmic model to detect the presence of a predefined wake word…, para. [0012]); 
          when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmitting via a network interface of the network microphone device, at least a portion of the captured audio content to the voice service corresponding to the wake phrase (if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]).
            It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Tulli with Sun in arriving at the limitations “a network microphone device”, “wherein the wake phrase consists of a set of one or more words that cause the microphone device to provide at least a portion of the captured audio content to a voice service corresponding to the wake phrase” and “when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content, transmitting via a network interface of the network microphone device, at least a portion of the captured audio content to the voice service corresponding to the wake phrase”, because such a combination would have reduced average overall power consumption in devices communicating with services (Tulli, Abstract; para. [0041]).
      Per Claim 17, Sun in view of Tulli discloses the method of claim 15, 
              Sun discloses, before using the first algorithm to detect whether the captured audio content may include a wake phrase, using a voice activity detection algorithm to determine whether the captured audio content includes voice activity, wherein the microphone device is configured to use the first algorithm to detect whether the captured audio content may include a wake phrase in response to determining that the captured audio content includes voice activity (fig. 4A; para. [0052]).
             Tulli discloses wherein the network microphone device is configured to use the first algorithm (Abstract; para. [0012]; para. [0026]).
        Per Claim 19, Sun in view of Tulli discloses the method of claim 15, 
             Tulli discloses wherein when the network interface comprises one or more network communication components in a disabled state, transmitting, via the network interface, at least a portion of the captured audio content to the voice service corresponding to the wake phrase when the second algorithm confirms that the first algorithm detected the wake phrase in the captured audio content comprises: enabling the one or more network communication components (Abstract; if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]; a second processor which can operate in at least a low power/low clock rate mode and a high power/high clock rate mode. When the first processor determines the presence of the wake word, it causes the second processor to switch to the high power/high clock rate mode and to execute a tight algorithmic model …, para. [0015]; para. [0032], low power state of second processor as suggesting disabled state); and
            using the enabled one or more network communication components to transmit the at least a portion of the captured audio content to the voice service corresponding to the wake phrase (if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service located remote from the computerized device…, para. [0012]; para. [0015]; para. [0032]; para. [0078]-[0080]).

2.     Claims 4, 5, 11, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Sun in view of Tulli as applied to claims 1, 8 and 15 above, and further in view of Shires US PGPUB 2014/0229184 A1 (“Shires”).
         Per Claim 4, Sun in view of Tulli discloses the network microphone device of claim 3,
              Sun discloses wherein a first microphone of the one or more microphones is in an enabled state (para. [0039]) 
             to capture the audio content via the one or more microphones comprises to capture first audio content via the enabled first microphone (para. [0039]; para. [0041])
             wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first microphone (para. [0070])
             Tulli discloses a second microphone of the one or more microphones is in a disabled state, wherein the program instructions that are executable by the one or more processors such that the network microphone device is configured to capture the audio content via the one or more microphones comprise program instructions that are executable by the one or more processors such that the network microphone device is configured to capture first audio content via the enabled first microphone (Abstract; if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]; a second processor which can operate in at least a low power/low clock rate mode and a high power/high clock rate mode. When the first processor determines the presence of the wake word, it causes the second processor to switch to the high power/high clock rate mode and to execute a tight algorithmic model …, para. [0015]; para. [0032], low power state of second processor as suggesting disabled state), and 
              Sun in view of Tulli does not explicitly disclose wherein the program instructions stored on the at least one tangible, non-transitory computer-readable medium comprise further program instructions that are executable by the one or more processors such that the network microphone device is configured to: responsive to determining that the captured audio content includes voice activity, enable the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones
             However, this feature is taught by Shires:
             wherein the program instructions stored on the at least one tangible, non-transitory computer-readable medium comprise further program instructions that are executable by the one or more processors such that the network microphone device is configured to: responsive to determining that the captured audio content includes voice activity, enable the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones (para. [0013]-[0014]; The recognition results from the received main device input signals and the received secondary device signals may be compared to one another to confirm a recognition result…, para. [0025]; claim 16)
           At the time of the effective filing of the invention, it would have been obvious to one of ordinary skill in the art to combine the teachings of Shires with the device Sun in view of Tulli in arriving at limitation “wherein the program instructions stored on the at least one tangible, non-transitory computer-readable medium comprise further program instructions that are executable by the one or more processors such that the network microphone device is configured to: responsive to determining that the captured audio content includes voice activity, enable the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones”, because such combination would have resulted in conserving power usage (Shires, para. [0020]).
          Per Claim 5, Sun in view of Tulli discloses the network microphone device of claim 1, 
             Sun discloses wherein a first microphone of the one or more microphones is in an enabled state (para. [0039]) 
             wherein to capture the audio content via the one or more microphones comprises to capture first audio content via the enabled first microphone (para. [0039]; para. [0041])
              Sun in view of Tulli does not explicitly disclose wherein a second microphone of the one or more microphones is in a disabled state, wherein the program instructions that are executable by the one or more processors such that the network microphone device is configured to capture the audio content via the one or more microphones comprise program instructions that are executable by the one or more processors such that the network microphone device is configured to capture first audio content via the enabled first microphone or wherein the program instructions stored on the at least one tangible, non-transitory computer-readable medium comprise further program instructions that are executable by the one or more processors such that the network microphone device is configured to: responsive to determining that the captured audio content includes the wake phrase, enable the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones
            However, these features are taught by Shires:         
            wherein a second microphone of the one or more microphones is in a disabled state (para. [0013]-[0014]; para. [0022]; claim 16),
           wherein the program instructions that are executable by the one or more processors such that the network microphone device is configured to capture the audio content via the one or more microphones comprise program instructions that are executable by the one or more processors such that the network microphone device is configured to capture first audio content via the enabled first microphone (para. [0032]; para. [0036]), and 
            wherein the program instructions stored on the at least one tangible, non-transitory computer-readable medium comprise further program instructions that are executable by the one or more processors such that the network microphone device is configured to: responsive to determining that the captured audio content includes the wake phrase, enable the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones (para. [0013]-[0014]; para. [0022]; In response to receiving the trigger signals, the secondary devices 210A-F may transition from a low power or "sleep" state and transition to a higher power state …The recognition results from the received main device input signals and the received secondary device signals may be compared to one another to confirm a recognition result…, para. [0025]; claim 16)
           It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to combine the teachings of Shires with the device Sun in view of Tulli in arriving at limitations “wherein a second microphone of the one or more microphones is in a disabled state”, “wherein the program instructions that are executable by the one or more processors such that the network microphone device is configured to capture the audio content via the one or more microphones comprise program instructions that are executable by the one or more processors such that the network microphone device is configured to capture first audio content via the enabled first microphone” and “wherein the program instructions stored on the at least one tangible, non-transitory computer-readable medium comprise further program instructions that are executable by the one or more processors such that the network microphone device is configured to: responsive to determining that the captured audio content includes the wake phrase, enable the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones”, because such combination would have resulted in conserving power usage (Shires, para. [0020]).
          Per Claim 11, Sun in view of Tulli discloses the tangible, non-transitory, computer-readable media of claim 10,
            Sun discloses wherein when a first microphone of the one or more microphones is in an enabled state (para. [0039]) 
             capturing the audio content via the one or more microphones comprises capturing first audio content via the enabled first microphone (para. [0039]; para. [0041]; para. [0070])
             Tulli discloses wherein a second microphone of the one or more microphones is in a disabled state (Abstract; if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]; a second processor which can operate in at least a low power/low clock rate mode and a high power/high clock rate mode. When the first processor determines the presence of the wake word, it causes the second processor to switch to the high power/high clock rate mode and to execute a tight algorithmic model …, para. [0015]; para. [0032], low power state of second processor as suggesting disabled state), and 
              Sun in view of Tulli does not explicitly disclose responsive to determining that the captured audio content includes voice activity, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones
             However, this feature is taught by Shires:
             responsive to determining that the captured audio content includes voice activity, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones (para. [0013]-[0014]; The recognition results from the received main device input signals and the received secondary device signals may be compared to one another to confirm a recognition result… para. [0025])
           At the time of the effective filing of the invention, it would have been obvious to one of ordinary skill in the art to combine the teachings of Shires with the device Sun in view of Tulli in arriving at limitation “responsive to determining that the captured audio content includes voice activity, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones”, because such combination would have resulted in conserving power usage (Shires, para. [0020]).
        Per Claim 12, Sun in view of Tulli discloses the tangible, non-transitory, computer-readable media of claim 8,
             Sun discloses wherein a first microphone of the one or more microphones is in an enabled state (para. [0039]) 
             wherein capturing the audio content via the one or more microphones comprises capturing first audio content via the enabled first microphone (para. [0039]; para. [0041])
            Sun in view of Tulli does not explicitly disclose wherein a second microphone of the one or more microphones is in a disabled state, or responsive to determining that the captured audio content includes the wake phrase, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones
            However, these features are taught by Shires:         
             wherein a second microphone of the one or more microphones is in a disabled state (para. [0013]-[0014]; para. [0022]; claim 16),
            responsive to determining that the captured audio content includes the wake phrase, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones (para. [0013]-[0014]; para. [0022]; In response to receiving the trigger signals, the secondary devices 210A-F may transition from a low power or "sleep" state and transition to a higher power state …The recognition results from the received main device input signals and the received secondary device signals may be compared to one another to confirm a recognition result…, para. [0025]; claim 16)
           It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to combine the teachings of Shires with the device Sun in view of Tulli in arriving at limitations “wherein a second microphone of the one or more microphones is in a disabled state” and “responsive to determining that the captured audio content includes the wake phrase, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones”, because such combination would have resulted in conserving power usage (Shires, para. [0020]).
        Per Claim 18, Sun in view of Tulli discloses the method of claim 15, 
                Sun discloses wherein a first microphone of the one or more microphones is in an enabled state (para. [0039])
               capturing the audio content via the one or more microphones comprises capturing first audio content via the enabled first microphone (para. [0039]; para. [0041])
               Tulli discloses wherein a second microphone of the one or more microphones is in a disabled state (Abstract; if the second voice recognition algorithmic model determines that the predefined wake word is present in the second buffer, then forwarding the contents of the second buffer and the third buffer to a voice processing service…, para. [0012]; a second processor which can operate in at least a low power/low clock rate mode and a high power/high clock rate mode. When the first processor determines the presence of the wake word, it causes the second processor to switch to the high power/high clock rate mode and to execute a tight algorithmic model …, para. [0015]; para. [0032], low power state of second processor as suggesting disabled state),
             Sun in view of Tulli does not explicitly disclose responsive to determining that the captured audio content includes (i) voice activity or (ii) the wake phrase, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones
             However, this feature is taught by Shires:
             responsive to determining that the captured audio content includes (i) voice activity or (ii) the wake phrase, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones (para. [0013]-[0014]; The recognition results from the received main device input signals and the received secondary device signals may be compared to one another to confirm a recognition result…, para. [0025]; claim 16)
           It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to combine the teachings of Shires with the device Sun in view of Tulli in arriving at limitation “responsive to determining that the captured audio content includes (i) voice activity or (ii) the wake phrase, enabling the disabled second microphone, wherein capturing the audio content via the one or more microphones further comprises capturing second audio content via the enabled first and second microphones”, because such combination would have resulted in conserving power usage (Shires, para. [0020]).

3.     Claims 3, 9, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Sun in view of Tulli as applied to claims 1, 8 and 15 above, and further in view of Wood et al. US 2019/0066687 A1 (“Wood”).
        Per Claim 3, Sun in view of Tulli discloses the network microphone device of claim 1, 
            Tulli discloses: wherein the program instructions that are executable by the one or more processors such that the network microphone device is configured to use the second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprise program instructions that are executable by the one or more processors such that the network microphone device is configured to activate a first wake phrase engine and a second wake phrase engine, wherein the activated first wake phrase engine is configured to use the second algorithm to determine whether the captured audio content includes a first wake phrase (Abstract; para. [0012]; para. [0028]-[0029]), and
             Sun in view of Tulli does not explicitly disclose wherein the activated second wake phrase engine is configured to use the second algorithm to determine whether the captured audio content includes a second wake phrase that is different than the first wake phrase
            However, this feature is taught by Wood (In some embodiments, user interface and command module 128 may perform trigger word detection for a single trigger word. …, para. [0072]-[0073]; In some other embodiments, user interface and command module 128 may perform trigger word detection for multiple trigger words. For example, user interface and command module 128 may perform trigger word detection for the trigger words “Hey Roku” and “OK Google.” In some embodiments, different trigger words may correspond to different digital assistants 180…, para. [0074]; voice platform 192 may perform a secondary trigger word detection…, para. [0087], secondary trigger word detection as including performing trigger word detection for multiple trigger words)
            It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to combine the teachings of Wood with the device Sun in view of Tulli in arriving at limitation “wherein the activated second wake phrase engine is configured to use the second algorithm to determine whether the captured audio content includes a second wake phrase that is different than the first wake phrase”, because such combination would have resulted in improving trigger word detection accuracy (Wood, para. [0087]).
           Per Claim 9, Sun in view of Tulli discloses the tangible, non-transitory, computer-readable media of claim 8,
             Tulli discloses: wherein using the second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprises activating a first wake phrase engine and a second wake phrase engine, wherein the activated first wake phrase engine is configured to use the second algorithm to determine whether the captured audio content includes a first wake phrase (Abstract; para. [0012]), and
           Sun in view of Tulli does not explicitly disclose wherein the second wake phrase engine, when activated is configured to use the second algorithm to determine whether the captured audio content includes a second wake phrase
            However, this feature is taught by Wood (In some embodiments, user interface and command module 128 may perform trigger word detection for a single trigger word. …, para. [0072]-[0073]; In some other embodiments, user interface and command module 128 may perform trigger word detection for multiple trigger words. For example, user interface and command module 128 may perform trigger word detection for the trigger words “Hey Roku” and “OK Google.” In some embodiments, different trigger words may correspond to different digital assistants 180…, para. [0074]; voice platform 192 may perform a secondary trigger word detection…, para. [0087], secondary trigger word detection as including performing trigger word detection for multiple trigger words)
            It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to combine the teachings of Wood with the device Sun in view of Tulli in arriving at limitation “wherein the second wake phrase engine, when activated is configured to use the second algorithm to determine whether the captured audio content includes a second wake phrase”, because such combination would have resulted in improving trigger word detection accuracy (Wood, para. [0087]).
         Per Claim 16, Sun in view of Tulli discloses the method of claim 15,
            Tulli discloses: wherein using the second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprises activating a first wake phrase engine and a second wake phrase engine, wherein the activated first wake phrase engine is configured to use the second algorithm to determine whether the captured audio content includes a first wake phrase (Abstract; para. [0012]), and
           Sun in view of Tulli does not explicitly disclose wherein the second wake phrase engine, when activated is configured to use the second algorithm to determine whether the captured audio content includes a second wake phrase
            However, this feature is taught by Wood (In some embodiments, user interface and command module 128 may perform trigger word detection for a single trigger word. …, para. [0072]-[0073]; In some other embodiments, user interface and command module 128 may perform trigger word detection for multiple trigger words. For example, user interface and command module 128 may perform trigger word detection for the trigger words “Hey Roku” and “OK Google.” In some embodiments, different trigger words may correspond to different digital assistants 180…, para. [0074]; voice platform 192 may perform a secondary trigger word detection…, para. [0087], secondary trigger word detection as including performing trigger word detection for multiple trigger words)
            It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to combine the teachings of Wood with the device Sun in view of Tulli in arriving at limitation “wherein the second wake phrase engine, when activated is configured to use the second algorithm to determine whether the captured audio content includes a second wake phrase”, because such combination would have resulted in improving trigger word detection accuracy (Wood, para. [0087]).
         Per Claim 20, Sun in view of Tulli discloses the method of claim 15, 
           Tulli discloses wherein using a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprises: confirming that (i) the captured audio content includes a first wake phrase via a first wake phrase engine (para. [0012])
          Sun in view of Tulli does not explicitly disclose wherein using a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprises: confirming (ii) the captured audio content does not include a second wake phrase via a second wake phrase engine 
          However, this feature is taught by Wood (In some embodiments, user interface and command module 128 may perform trigger word detection for a single trigger word. …, para. [0072]-[0073]; In some other embodiments, user interface and command module 128 may perform trigger word detection for multiple trigger words. For example, user interface and command module 128 may perform trigger word detection for the trigger words “Hey Roku” and “OK Google.” In some embodiments, different trigger words may correspond to different digital assistants 180…, para. [0074]; voice platform 192 may perform a secondary trigger word detection…, para. [0087], secondary trigger word detection as including performing trigger word detection for multiple trigger words)
            It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to combine the teachings of Wood with the device Sun in view of Tulli in arriving at limitation “wherein using a second algorithm to confirm whether the first algorithm detected the wake phrase in the captured audio content comprises: confirming (ii) the captured audio content does not include a second wake phrase via a second wake phrase engine”, because such combination would have resulted in improving trigger word detection accuracy (Wood, para. [0087]).

Allowable Subject Matter
Claims 7 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
          The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO 892 form.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLUJIMI A ADESANYA whose telephone number is (571)270-3307. The examiner can normally be reached Monday-Friday 8:30-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/OLUJIMI A ADESANYA/Primary Examiner, Art Unit 2658