Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
1.	The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
DETAILED ACTION
Claim Rejections - 35 USC § 103
2.	Claims 1-2, 12-15, 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Daley (2018/0018965) in view of Naik et al. (US Patent 8,428,758) and further in view of Kirsch et al. (2015/0358730).
As to claim 1, Daley teaches a network interface (Fig. 1, interface to network 116); one or more microphones (Fig. 1, microphone 106); one or more speakers (loudspeaker 106); one or more processors ([0005]), the housing carrying at least the network interface, the one or more microphones, the audio stage, the one or more speakers ([0015]), the one or more processors, and data storage having stored therein instructions executable by the one or more processors to cause a device to perform functions comprising: while playing first audio in a given environment at a given loudness via the one or more speakers ([0011] – playing music and turning the volume up and down; [0015] – speaker 106 is playing music): detecting within the recorded audio a wake word to invoke a voice assistant ([0004, 0012, 0017]; Fig. 1 – 110 “Hey Bose”); in response to detecting the wake word, ducking the first audio ([0008, 0012]) while recording audio representing a voice input to the voice assistant ([0012, 0015]) and sending the voice input to the voice assistant ([0012-0013]); receiving, from the voice assistant in response to the voice input, second audio representing a spoken response to the voice input ([0003] – in a VUI, a user speaks commands and the system responds by speaking back; [0013] – a VUI serves the role of a virtual personal assistant and provides information to a user); and receiving the second audio representing the spoken response to the voice input ([0011-0013, 0003]).
Daley does not explicitly discuss an audio stage comprising an amplifier; recording, into the buffer, audio representing a voice input to the voice assistant; and, in response to receiving the second audio, ducking the first audio while playing back the ducked first audio concurrently with the second audio. While Daley does not clearly discuss ducking the first audio, in response to receiving the second audio, while playing back the response, Daley teaches ducking the first audio while playing back the response during the interaction between the user and the system, so that the speaker’s own microphone can hear the utterance, with the audio to be resumed when the interaction between the user and the system is done (Fig. 1 – 110 “Hey Bose”, [0015]), and teaches that the system responds to a user’s spoken commands by speaking back ([0003]). Hence Daley clearly suggests that the system responds by speaking back to the user with the second audio, and teaches ducking the first audio while priming the voice user interface to receive further input ([0012]) so that the speaker’s own microphone can hear the utterance ([0015]). It would have been obvious to one of ordinary skill in the art to duck the first audio while playing back the response during the interaction between the user and the system, with the audio to be resumed when the interaction between the user and the system is done.
Naik teaches a playback device comprising: a network interface (Fig. 2, network device 58); one or more microphones (col. 7, lines 36-40); an audio stage (Fig. 2, 42 and 62); one or more speakers (col. 7, lines 36-55); one or more processors (Fig. 2; processor(s) 50); a housing (Fig. 1, the phone), the housing carrying at least the network interface, the one or more microphones, the audio stage, the one or more speakers, the one or more processors, and data storage having stored therein instructions executable by the one or more processors to cause the playback device to perform functions (Figs. 1-2; at least col. 7, line 36 through col. 8, line 7) comprising: while playing back first audio in a given environment at a given loudness via the one or more speakers (col. 7, lines 43-62; col. 2, lines 12-23; col. 13, lines 56-60): recording, via the one or more microphones, audio into a buffer (col. 12, lines 36-43); ducking the first audio while recording, into the buffer, audio representing a voice feedback (col. 1, lines 32-39); receiving, during playback in response to a feedback event, a second audio stream that includes a voice announcement pertaining to the primary audio stream, wherein the primary audio data and the voice feedback data are analyzed to determine a loudness value (col. 2, lines 8-25; col. 9, line 64 through col. 10, line 19; col. 13, lines 43-52); and a ducking process in which a primary audio is ducked during playback in response to a feedback event, the feedback event being a request by a user to play associated voice feedback information, which is a second audio representing a spoken response coming back from the system to the user (col. 19, lines 17-37 and col. 20, lines 18-43).
The primary reference Daley teaches ducking the first audio while playing back the response during the interaction between the user and the system, so that the speaker’s own microphone can hear the utterance, with the audio to be resumed when the interaction between the user and the system is done (Fig. 1 – 110 “Hey Bose”, [0015]), and teaches that the system responds to a user’s spoken commands by speaking back ([0003]). Hence Daley clearly suggests that the system speaks back with the second audio representing the spoken response, and it would have been obvious to one of ordinary skill in the art to duck the first audio while playing back the response during the interaction between the user and the system, with the audio to be resumed when the interaction is done. It would further have been obvious to duck the first audio, i.e., to temporarily reduce the volume of the first audio, during a concurrent playback in which the voice feedback or second audio is mixed into the audio stream, in order to improve audio perceptibility from the viewpoint of a listener, as discussed by Naik in col. 1, lines 32-39.
Kirsch teaches adjusting the audio level of the playback audio signal then the first signal level exceeds the playback signal level by at least the target difference (at least abstract and [0033]); an audio stage (Fig. 1, voltage controlled amplifier 175) comprises an amplifier 175 ([0033-0035]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Naik and Kirsch into the teachings of Daley for the purpose of controlling the loudness of concurrently outputted audio streams by ducking the first audio while playing back the ducked first audio concurrently with the second audio, in order to improve audio perceptibility from the viewpoint of a listener.
As to claims 2 and 15, Naik teaches ducking the first audio to a first volume level, and wherein ducking the first audio while playing back the ducked first audio concurrently with the second audio representing the spoken response to the voice input comprises ducking the first audio to a second volume level that is different from the first volume level (col. 1, lines 32-39; col. 9, line 64 through col. 10, line 19; col. 13, lines 43-52 – audio ducking techniques rely on maintaining a relative loudness difference between the primary and secondary media streams based upon loudness values associated with each of the primary and secondary media items; where the primary media item has a relatively low loudness, the secondary media item is ducked instead in order to maintain the desired relative loudness difference). It would have been obvious, when ducking the volume of the first audio while the second media stream is being concurrently played back, that the second volume level of the first audio would be lower than the first volume level in order to improve audio perceptibility to a listener.
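For illustration only, the relative-loudness approach characterized above can be sketched as a short routine. The sketch is not taken from Naik or from the application; the function name, the decibel inputs, and the target_diff_db parameter are hypothetical and serve only to show how a target loudness difference might be maintained by ducking one stream or the other.

```python
def maintain_relative_loudness(primary_db, secondary_db, target_diff_db):
    """Return (primary_db, secondary_db) adjusted so that the secondary
    stream sits target_diff_db above the primary stream.

    Hypothetical sketch: all names and the decibel convention are
    assumptions, not drawn from any cited reference.
    """
    gap = secondary_db - primary_db
    if gap < target_diff_db:
        # Primary is too loud relative to the secondary: duck the primary.
        return primary_db - (target_diff_db - gap), secondary_db
    # Primary is already relatively quiet: duck the secondary instead.
    return primary_db, secondary_db - (gap - target_diff_db)


loud_primary = maintain_relative_loudness(70, 65, 10)
quiet_primary = maintain_relative_loudness(40, 65, 10)
```

Under these assumptions, a primary stream at 70 dB with a 65 dB announcement is ducked to 55 dB, whereas a primary stream at 40 dB is left alone and the announcement is lowered to 50 dB.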
As to claim 12, Daley teaches the playback device of claim 1 wherein the functions further comprise detecting that the voice input has been responded to ([0011]), in that the system continues to listen for wake words and, if it hears one through the noise, responds by ducking (reducing the volume of) the first audio and priming the VUI to receive further input, with the audio to be resumed when the interaction between the user and the system is done ([0012, 0015]); and Naik teaches ducking the first audio during the period of simultaneous playback such that a relative loudness difference is generally maintained with respect to the loudness of the first and second audio streams (abstract), and ducking one of the media files during the period of concurrent playback (col. 1, lines 6-10, lines 32-36, lines 40-43; col. 2, lines 18-23). It would have been obvious to duck the first audio until the spoken response to the voice input has been played back for the purpose of improving audio perceptibility to a listener; this would have been obvious because the system continues to listen for wake words and, if it hears one through the noise, ducks the volume of the first audio in order to prime the VUI to receive further input.
As to claims 13 and 19, Daley teaches the device of claim 1 and the method of claim 14, including sending the voice input to the voice assistant ([0012]). Naik teaches a voice synthesis program implemented on a server associated with the digital media content provider (col. 12, lines 16-21), and memory used for buffering during operation of the device (col. 8, lines 32-33). It would have been obvious to send the audio in the memory buffer to the voice assistant for the purpose of priming the voice assistant to receive further input.
Claims 14 and 20 are rejected for the same reasons discussed above with respect to claim 1. With respect to claim 20, Daley teaches a computer readable medium having stored therein instructions that, when executed by one or more processors, perform the steps of the claim ([0019]); and Naik further teaches a non-transitory computer readable medium having stored therein instructions that, when executed by one or more processors of a playback device, cause the playback device to perform the steps of the claim (col. 8, lines 34-47; col. 27, lines 29-34).

3.	Claims 3-4, 6, 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Daley, Naik, and Kirsch in view of Rand (2016/0036962).
As to claims 3 and 16, Naik teaches ducking the first audio to a first volume level, and wherein ducking the first audio while playing back the ducked first audio concurrently with the second audio representing the spoken response to the voice input comprises ducking the first audio to a second volume level that is different from the first volume level (col. 1, lines 32-39; col. 9, line 64 through col. 10, line 19; col. 13, lines 43-52 – audio ducking techniques rely on maintaining a relative loudness difference between the primary and secondary media streams based upon loudness values associated with each of the primary and secondary media items; where the primary media item has a relatively low loudness, the secondary media item is ducked instead in order to maintain the desired relative loudness difference). It would have been obvious, when ducking the volume of the first audio while the second media stream is being concurrently played back, that the second volume level of the first audio would be lower than the first volume level in order to improve audio perceptibility to a listener. Daley, Naik, and Kirsch do not explicitly discuss the playback device of claim 2 and the method of claim 15, wherein the functions further comprise determining a loudness of background noise in the given environment, and ducking the first audio to a particular volume level that is based on the loudness of background noise in the given environment.
Rand teaches a user controlling both the presence and level of ducking according to his or her preferences and to levels calculated based on background noise and other factors ([0009]); and background processing to ensure that the relative loudness is proportional, with auto ducking used such that background audio is ducked below its normal level when voice is detected, and adjusting the relative level of voice vs. overall background audio ([0136]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Rand into the teachings of Daley, Naik, and Kirsch for the purpose of reducing the level of one audio signal in the presence of another and setting the level of ducking to levels calculated based on background noise and other factors.
As to claims 4 and 17, Rand teaches the playback device of claim 3 and the method of claim 16, wherein ducking the first audio to the particular volume level that is based on the loudness of background noise in the given environment ([0009, 0136]) comprises ducking the first audio in proportion to a difference between the given loudness of the first audio and the determined loudness of the background noise and adjusting the relative level of voice vs. overall background audio ([0136]).
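For illustration only, the limitation of ducking "in proportion to a difference between the given loudness of the first audio and the determined loudness of the background noise" can be read as a simple rule of the following kind. The sketch is not drawn from Rand or from the application; the function name, the proportion factor, and the attenuation cap are hypothetical assumptions chosen to make the arithmetic concrete.

```python
def ducked_level_db(first_audio_db, noise_db, proportion=0.5,
                    max_attenuation_db=20.0):
    """Return the ducked level of the first audio, in dB.

    Hypothetical sketch: the attenuation is proportional to how far the
    first audio exceeds the background noise, and is capped. None of the
    parameter names or default values come from a cited reference.
    """
    # Only a positive difference (audio louder than noise) causes ducking.
    difference = max(first_audio_db - noise_db, 0.0)
    attenuation = min(proportion * difference, max_attenuation_db)
    return first_audio_db - attenuation


quiet_room = ducked_level_db(70.0, 50.0)
noisy_room = ducked_level_db(70.0, 80.0)
```

Under these assumptions, first audio at 70 dB in a 50 dB room is ducked by 10 dB to 60 dB, while in an 80 dB room it is not ducked at all, since ducking it further below the noise would not make the spoken response easier to hear.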
As to claim 6, Naik teaches loudness values measured in decibels (col. 14, line 63 through col. 15, line 15); and Rand teaches that each device couples to a microphone, with audio signals admitted through the microphones and identified as either human voice or not, while playing background audio from a source other than the device’s microphone (claim 53); microphones for recording the loudness of the background noise in the given environment, using samples of background noise obtained through periodic activation and recording of the microphone to set or change the voice activity detection threshold; and providing a delay at the beginning of a PTT message to measure background noise ([0206]). It would have been obvious to modify Rand so that the microphones are used for measuring the loudness of the recorded background noise in order to duck the first audio based on a difference between the loudness of the first audio and the measured loudness of the background noise.

4.	Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Daley, Naik, and Kirsch as applied to claim 1 in view of Odom (US Patent 5,740,260).
	As to claim 11, Naik teaches while playing back first audio in a given environment at a given loudness (col. 2, lines 17-18; col. 13, lines 56-60), and playing back the ducked first audio concurrently with the second audio (col. 10, lines 1-19). Daley, Naik, and Kirsch do not teach a first playback device of a group of playback devices and playing back the ducked first audio concurrently with the second audio in synchrony with one or more second playback devices.
	Odom teaches analog signal processors are more effective in equipment such as audio compressors, duckers, noise reduction systems, and the like (col. 2, lines 18-25); synchronizing outputs of digital and analog devices (col. 2, lines 47-53).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Odom into the teachings of Daley, Naik, and Kirsch for the purpose of having the entire ensemble of devices function as a single system, as discussed by Odom in col. 9, lines 10-15.
Allowable Subject Matter
5.	Claims 5, 7, 9, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Claim 8 is objected to because it depends on objected claim 7. Claim 10 is objected to because it depends on objected claim 9.
Double Patenting
6.	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

7.	Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 9,942,678. Although the claims at issue are not identical, they are not patentably distinct from each other because all the claimed limitations recited in the present application are transparently found in U.S. Patent 9,942,678 with obvious wording variations.
U.S. Patent Application 16/806,747	U.S. Patent 9,942,678
A playback device comprising:	A playback device comprising:
a network interface;	a network interface;
one or more microphones;	one or more microphones;
an audio stage comprising an amplifier;	an audio stage comprising an amplifier;
one or more speakers;	one or more speakers;
one or more processors;	one or more processors;
a housing, the housing carrying at least the network interface, the one or more microphones, the audio stage, the one or more speakers, the one or more processors, and data storage having stored therein instructions executable by the one or more processors to cause the playback device to perform functions comprising:	a housing, the housing carrying at least the network interface, the one or more microphones, the audio stage, the one or more speakers, the one or more processors, and computer readable media having stored therein instructions executable by the one or more processors to cause the playback device to perform operations comprising:
while playing back first audio in a given environment at a given loudness via the audio stage and the one or more speakers:	while playing back first audio in a given environment at a given loudness via the audio stage and the one or more speakers:
recording, via the one or more microphones, audio into a buffer;	capturing, via the one or more microphones, a voice input;
detecting, within the recorded audio, a wake word to invoke a voice assistant;	determining that the captured voice input includes audio data representing a wake word to invoke a voice assistant service;
in response to detecting the wake word: ducking the first audio while recording, into the buffer, audio representing a voice input to the voice assistant and sending, to the voice assistant, the recorded audio in the buffer representing the voice input to the voice assistant;	in response to determining that the captured voice input includes audio data representing the wake word to invoke the voice assistant service: sending, via the network interface to one or more servers of the voice assistant service, the voice input and determining a loudness of background noise in the given environment, wherein the background noise comprises ambient noise in the given environment;
receiving, from the voice assistant in response to the voice input, second audio representing a spoken response to the voice input; and	after determining the loudness of background noise, receiving, via the network interface from the one or more servers of the voice assistant service in response to the voice input, second audio data representing a spoken response to the voice input;
in response to receiving the second audio representing the spoken response to the voice input, ducking the first audio while playing back the ducked first audio concurrently with the second audio representing the spoken response to the voice input via the audio stage and the one or more speakers.	in response to receiving the second audio data representing the spoken response to the voice input, ducking the first audio in proportion to a difference between the given loudness of the first audio and the determined loudness of the background noise; and playing back the ducked first audio concurrently with the second audio representing the spoken response to the voice input via the audio stage and the one or more speakers.


The examiner also notes that claims 5, 6, 7, 8, 9, 10, 11, 12, 14, 18, and 20 of the ‘747 Application correspond respectively to claims 2, 6, 7, 8, 3, 4, 5, 9, 17, 18, and 11 of the ‘678 Patent.
Response to Arguments
8.	Applicant's arguments filed 5/18/22 have been fully considered but they are not persuasive. Applicant argues that with respect to claims 1, 14, and 20, the combination of Daley, Naik, and Kirsch fails to teach “in response to receiving the second audio representing the spoken response to the voice input, ducking the first audio… ”. Examiner respectfully submits that Daley teaches ducking the first audio while playing back the response during the interaction between the user and the system, so that the speaker’s own microphone can hear the utterance, with the audio to be resumed when the interaction between the user and the system is done (Fig. 1 – 110 “Hey Bose”, [0015]), and teaches that the system responds to a user’s spoken commands by speaking back ([0003]). Hence Daley clearly suggests that the system responds by speaking back to the user with the second audio, and teaches ducking the first audio while priming the voice user interface to receive further input ([0012]) so that the speaker’s own microphone can hear the utterance ([0015]). It would have been obvious to one of ordinary skill in the art to duck the first audio while playing back the response during the interaction between the user and the system, with the audio to be resumed when the interaction between the user and the system is done. In the same field of endeavor, Naik teaches a ducking process in which a primary audio is ducked during playback in response to a feedback event, the feedback event being a request by a user to play associated voice feedback information, which is a second audio representing a spoken response coming back from the system to the user (col. 19, lines 17-37 and col. 20, lines 18-43).
The motivation to combine Naik’s teachings with the teachings of Daley is to duck the first audio, i.e., to temporarily reduce the volume of the first audio, during a concurrent playback in which the voice feedback or second audio is mixed into the audio stream, in order to improve audio perceptibility from the viewpoint of a listener, as discussed by Naik in col. 1, lines 32-39.
	Applicant argues that Daley does teach that the system may respond to spoken commands by speaking back [0003]. However, like the spoken response from a user in [0003], Daley is silent as to any ducking while the system is speaking back. Examiner respectfully submits that Daley teaches ducking the first audio while playing back the response during the interaction between the user and the system, so that the speaker’s own microphone can hear the utterance, with the audio to be resumed when the interaction between the user and the system is done (Fig. 1 – 110 “Hey Bose”, [0015]); it would have been obvious to one of ordinary skill in the art to duck the first audio while playing back the response during the interaction between the user and the system, with the audio to be resumed when the interaction is done; and Naik teaches a ducking process in which a primary audio is ducked during playback in response to a feedback event, the feedback event being a request by a user to play associated voice feedback information, which is a second audio representing a spoken response coming back from the system to the user (col. 19, lines 17-37 and col. 20, lines 18-43). The motivation to combine Naik’s teachings with the teachings of Daley is to duck the first audio, i.e., to temporarily reduce the volume of the first audio, during a concurrent playback in which the voice feedback or second audio is mixed into the audio stream, in order to improve audio perceptibility from the viewpoint of a listener, as discussed by Naik in col. 1, lines 32-39.
Applicant argues that Naik teaches the secondary audio stream is played back upon detection of a feedback event, such as a user-initiated or system initiated track or playlist change; Naik does not include any teaching that the feedback event may amount to receiving the secondary audio stream, so as to teach or suggest the responsive relationship between receiving the second audio and ducking the first audio… (remarks page 10). In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Please refer to previous response with respect to the combination of Daley, Naik, and Kirsch.
Applicant argues that with regard to the 103 rejections of claims 3-4, 6, 16-17, Rand does not remedy the deficiencies of Daley, Naik, and Kirsch with respect to the base claims 1 and 14. Please refer to the above rejections and arguments with respect to claims 1 and 14.
Applicant argues that with regard to the 103 rejection of claim 11, Odom does not remedy the deficiencies of Daley, Naik, and Kirsch with respect to the base claim 1. Please refer to the above rejections and arguments with respect to claim 1.
Conclusion
9.	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

10.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to QUYNH H NGUYEN whose telephone number is (571)272-7489. The examiner can normally be reached Monday-Friday 7AM-3PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar can be reached on 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/QUYNH H NGUYEN/
Primary Examiner, Art Unit 2652