DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office action is based on the communications filed December 30, 2021. Claims 1 – 20 are currently pending and considered below.

Claim Objections
Claims 1 and 11 are objected to because of the following informalities: the first occurrence of VLI should be written out as Virtual Line In (VLI), in the same manner as the first occurrence of NMD, which is written out as network microphone device (NMD). Appropriate correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 6, 8 – 11, 16, and 18 – 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Torok et al. (US 2018/0234756 A1), hereinafter Torok.

Claim 1: Torok discloses a system comprising a network microphone device (NMD), a first playback device, and a second playback device (see at least, FIG. 1B, FIG. 9), wherein the NMD is configured to perform first functions comprising: 
detecting, via at least one microphone, a voice input (see at least, “The input audio (i.e., sound waves) corresponding to the natural language command may be captured by one or more microphone(s) of the device 104(4) due to the proximity of the device 104(4) to the user 102 when the utterance is spoken,” Torok [0039]); 
determining, via a voice assistant (see at least, “In some implementations, the device 104(4) may process the captured audio. In other implementations, some or all of the processing of the input audio may be performed by additional computing devices 120(1), 120(2), ... , 120(N) (collectively 120) of the remote system 114, which are accessible to the device 104(4) over the network(s) 116. In some configurations, the device 104(4) is configured to identify a predefined "wake word" (i.e., a predefined utterance),” Torok [0039], “Upon the device 104 identifying the user 102 speaking the predefined wake word (in some instances), the device 104 may begin uploading audio data (representing the audio captured in the environment 106) to the remote system 114 over the network 116. In response to receiving this audio data, one or more computing devices 120 of the remote system 114 may begin performing automated speech recognition (ASR) on the audio signal to generate text, and may perform natural language understanding (NLU) on the generated text to determine one or more voice commands. For instance, the remote system 114 may determine, based on the audio data received from the device 104(4) over the network 116, that the user 102 is requesting to create the group of devices 104 including all of the user's 102 registered audio playback devices 104,” Torok [0040]), that voice input includes a command to group the first playback device and a second playback device (see at least, “In any case, the user 102 can create groups of devices, and can control the groups of the devices 104 using his/her voice. In the example of FIG. 1B, the user 102 wishes to create a group of the devices 104 such that the devices 104 in the group are later controllable by individual voice commands,” Torok [0037], “Alternatively, the user 102 can speak a natural language command, such as "Create a group named 'Everywhere', including all of my audio playback devices,” Torok [0039]); and 
according to the command in the voice input, forming a VLI group that includes the first playback device and the second playback device (see at least, “The throughput test may complete upon determining a first device 104 that passes the throughput test, and forming the group with the first device 104 designated as the audio distribution master device without conducting any additional rounds or taking any additional data throughput measurements prior to group formation. In this manner, the throughput test can be conducted with very little latency as compared to existing throughput tests that spend any and all time necessary to test each and every device 104 in the to-be-formed group in order to determine the best audio distribution master,” Torok [0044]), 
wherein the first playback device is configured as a first VLI device in the VLI group to perform second functions comprising: streaming, via a network interface of the first playback device, the audio content from one or more servers (see at least, “An individual audio playback device 104 may detect input audio based on an utterance spoken by the user 102, send audio data to the remote system 114, and the device 104, or another device 104, may receive a command from the remote system 114 in response to sending the audio data. Upon the device 104 receiving the command, the device 104 (or a group of the devices 104) can operate in a particular manner, such as by outputting audio (e.g., audio of an audio file corresponding to an artist requested by the user 102, audio of a text-to-speech (TTS) translation of a text response to a query made by the user 102, etc.). An audio file corresponding to audio content, such as music, may be retrievable from a content source(s) 119, which may be remotely located from the environment 106. Such remote (or cloud-based) content sources 119 are commonly known as content streaming sources where the user 102 subscribes to a service allowing the user 102 to access a library of audio files made available to the user 102 from the content sources 119. The content source(s) 119 may be part of the same system as the remote system 114, or the content source(s) 119 may be a separate system 119 that is made accessible to the remote system 114. Additionally, or alternatively, the content source(s) 119 may be located in the environment 106, such as a personal database of audio files that the user 102 can access for playback via one or more of the devices 104 in the environment 106. As such, receiving content from the content source(s) 119, as described herein, can comprise receiving the content directly from the content source(s) 119, or via the remote system 114, and possibly over the network 116 via the WAP 117,” Torok [0036], see also Torok [0126], “Device C, being the audio distribution master device of the "Everywhere" group, receives the command (either directly from the remote system 114 at block 902 when it is the master receiver, or otherwise forwarded from the master receiver). The command may instruct the audio distribution master to retrieve a first audio file 905 from a content source. At 904, the audio distribution master receives (e.g., by following the link in the first command) a first audio file 905 from the content source 119 and via the WAP 117 in the environment 106. The audio file 905 corresponds to the content identifier in the first command. The first content identifier in the first command may be a link (e.g., a Uniform Resource Locator (URL)) pointing to the content source 119 where the audio file 905 is to be obtained, and the audio distribution master device 104 may use the link to retrieve the audio file 905,” Torok [0128]); 
sending, via the network interface of the first playback device, a VLI domain audio stream representing the streamed audio content to one or more VLI receivers of the VLI group (see at least, “At 906, one or more slaves in the group of devices that are to engage in synchronized audio playback of the audio file 905 receive the first audio file 905 from the audio distribution master device (e.g., device C),” Torok [0129]); and 
playing back the VLI domain audio stream via at least one speaker (see at least, “At 908, the devices 104 in the "Everywhere" group, which now possess the first audio file 905, can output audio of the first audio file 905 in a synchronized manner,” Torok [0130]), and wherein the second playback device is configured to perform third functions comprising: 
as a VLI receiver in the VLI group, receiving, via a network interface of the second playback device, the VLI domain audio stream representing the streamed audio content (see at least, “At 906, one or more slaves in the group of devices that are to engage in synchronized audio playback of the audio file 905 receive the first audio file 905 from the audio distribution master device (e.g., device C),” Torok [0129]); 
converting, via one or more processors, the VLI domain audio stream to a native domain audio stream (see at least, “The time synch module 265 is configured to synchronize time between the device 104 and one or more other devices 104 in a group 316. The time synch protocol may run separate from the rest of the audio system, and keeps the audio pipeline 255 clocks of all grouped devices 104 in sync. One device 104 can act as a time master (typically a different device as the audio distribution master). The time master exchanges timestamp information with slaves so that all slave devices can calculate and correct the time differences (Skew, drift=dSkew/dt) between themselves and the time master. Time synchronization establishes a common time base between the master device and the slaves. The devices 104 have their own crystal oscillators that run at slightly different frequencies. For example, the crystals on respective devices 104 can be off by 20 PPM slow or fast (e.g., 20 μs per second). Two devices can therefore differ by up to 40 PPM. If this 40 PPM is not corrected, the phase coherence between speakers will be off by more than 150 μs in only 4 seconds, and will be off by more than 5 ms in about 2 minutes,” Torok [0134], “Therefore, the relative offset between clocks (skew) and the relative change in skew over time (drift) can be measured and use to resample audio rates to match the master device's audio playback rate, thereby correcting the differences between respective device 104 clocks,” Torok [0135]); and 
playing back the native domain audio stream in synchrony with playback of the VLI domain audio stream by the first playback device (see at least, “At 906, one or more slaves in the group of devices that are to engage in synchronized audio playback of the audio file 905 receive the first audio file 905 from the audio distribution master device (e.g., device C),” Torok [0129]).
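
For illustration only, and not as part of the cited disclosure: the clock-drift figures quoted from Torok [0134] in the mapping above can be checked with a short calculation. The sketch below assumes that a frequency difference of 1 PPM accumulates 1 μs of offset per second and that the two crystals err in opposite directions (one 20 PPM fast, one 20 PPM slow). The function names are hypothetical and do not appear in Torok.

```python
# Illustrative arithmetic only; names and structure are assumptions,
# not taken from Torok.

def relative_ppm(offset_a_ppm, offset_b_ppm):
    """Relative frequency difference between two crystals, in PPM."""
    return abs(offset_a_ppm - offset_b_ppm)

def drift_us(rel_ppm, seconds):
    """Accumulated offset in microseconds (1 PPM = 1 microsecond per second)."""
    return rel_ppm * seconds

def seconds_to_reach(target_us, rel_ppm):
    """Time in seconds for an uncorrected offset to grow to target_us."""
    return target_us / rel_ppm

worst_case = relative_ppm(+20, -20)          # one device fast, one slow
print(worst_case)                            # 40
print(drift_us(worst_case, 4))               # 160 (microseconds after 4 s)
print(seconds_to_reach(5000, worst_case))    # 125.0 (seconds to reach 5 ms)
```

Under these assumptions the uncorrected offset is 160 μs after 4 seconds, which is consistent with the quoted "more than 150 μs," and it reaches 5 ms after 125 seconds, which is consistent with the quoted "about 2 minutes."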

Claim 6: Torok discloses the system of claim 1, wherein the second playback device is configured to perform additional third functions comprising: receiving a VLI playback command from the first playback device; converting the VLI playback command to a corresponding native domain instruction; and performing the native domain instruction (see at least, “Synchronized output of audio begins with audio distribution. For instance, all of the devices 104 in a group 316 can receive the same audio file. A streaming protocol can be implemented that allows an audio distribution master device to send messages to slave devices instructing the slaves to "play this audio file at this time." The audio distribution master device can be responsible for coordinating audio distribution from the content source(s) 119 to the slave devices 104,” Torok [0132], “The time synch module 265 is configured to synchronize time between the device 104 and one or more other devices 104 in a group 316. The time synch protocol may run separate from the rest of the audio system, and keeps the audio pipeline 255 clocks of all grouped devices 104 in sync. One device 104 can act as a time master (typically a different device as the audio distribution master). The time master exchanges timestamp information with slaves so that all slave devices can calculate and correct the time differences (Skew, drift=dSkew/dt) between themselves and the time master. Time synchronization establishes a common time base between the master device and the slaves. The devices 104 have their own crystal oscillators that run at slightly different frequencies. For example, the crystals on respective devices 104 can be off by 20 PPM slow or fast (e.g., 20 μs per second). Two devices can therefore differ by up to 40 PPM. If this 40 PPM is not corrected, the phase coherence between speakers will be off by more than 150 μs in only 4 seconds, and will be off by more than 5 ms in about 2 minutes,” Torok [0134], “Therefore, the relative offset between clocks (skew) and the relative change in skew over time (drift) can be measured and use to resample audio rates to match the master device's audio playback rate, thereby correcting the differences between respective device 104 clocks,” Torok [0135]).

Claim 8: Torok discloses the system of claim 1, wherein determining that voice input includes the command to group the first playback device and the second playback device comprises: streaming, via the network interface of the first playback device to one or more servers of the voice assistance, data representing the voice input (see at least, “Upon the device 104 identifying the user 102 speaking the predefined wake word (in some instances), the device 104 may begin uploading audio data (representing the audio captured in the environment 106) to the remote system 114 over the network 116. In response to receiving this audio data, one or more computing devices 120 of the remote system 114 may begin performing automated speech recognition (ASR) on the audio signal to generate text, and may perform natural language understanding (NLU) on the generated text to determine one or more voice commands. For instance, the remote system 114 may determine, based on the audio data received from the device 104(4) over the network 116, that the user 102 is requesting to create the group of devices 104 including all of the user's 102 registered audio playback devices 104,” Torok [0040]).

Claim 9: Torok discloses the system of claim 1, wherein the first playback device comprises the voice assistant, and wherein determining that voice input includes the command to group the first playback device and the second playback device comprises: processing the voice input locally on the first playback device via the voice assistant (see at least, “In some implementations, the device 104(4) may process the captured audio,” Torok [0039], “Following ASR processing, the ASR results may be sent by the speech recognition engine 858 to other processing components, which may be local to the device performing ASR and/or distributed across the network(s) 116,” Torok [0103]).

Claim 10: Torok discloses the system of claim 1, wherein the first playback device comprises the NMD (see at least, “The input audio (i.e., sound waves) corresponding to the natural language command may be captured by one or more microphone(s) of the device 104(4) due to the proximity of the device 104(4) to the user 102 when the utterance is spoken,” Torok [0039], “audio playback devices 104 (or devices A-D) acting as a master device for distributing audio to one or more slave devices in the group,” Torok [0061]).

Claims 11, 16, and 18 – 20 are directed to a method to be performed by a system substantially similar in scope to claims 1, 6, and 8 – 10, respectively, and therefore are rejected for the same reasons.

Allowable Subject Matter
Claims 2 – 5, 7, 12 – 15, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSEPH SAUNDERS, whose telephone number is (571)270-1063. The examiner can normally be reached Monday-Thursday, 9:00 a.m. - 4:00 p.m. EST.
Examiner interviews are available via telephone, in person, and by video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at (571)272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JOSEPH SAUNDERS JR/
Primary Examiner, Art Unit 2652