DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Remarks
The present Office Action is in response to Applicant’s amendment filed on 9/7/2021.  Claims 1-7 and 10-16 are now pending in the present application.  Claims 8 and 9 have been canceled by the Applicant.  This Action is made FINAL.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
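As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph: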
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.  The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.  The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
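Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.  Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.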

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: a computation device as recited in the steps of receiving, by a computation device, first audio streams from the plurality of client devices; generating, by the computation device from the first audio streams, second audio streams for rendering by respective client devices among the plurality of client devices; wherein the specific audio stream is generated by the computation device for the specific client device from a subset of the first audio streams and sent by the computation device to the specific client device; and outputting, by the computation device, the generated second audio streams to the respective client devices for rendering in claims 1 and 16.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
Applicant’s specification, as published, at paragraphs 0005 and 0047 discloses the computation device being a host and the host may be a network-based or cloud-based host (e.g., 
If Applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, Applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office Action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors.  In considering patentability of the claims the Examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the Examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 4, 6, 11, 13, 15, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over LaFata et al. (U.S. Patent Application Publication No. 2016/0014373 A1) (hereinafter LaFata) in view of Lyren (U.S. Patent Application Publication No. 2019/0261126 A1) (hereinafter Lyren).

Regarding claim 1, LaFata discloses a method of hosting a teleconference among a plurality of client devices arranged in two or more acoustic spaces, each client device having an audio capturing capability and/or an audio rendering capability (Figure 1 and paragraph 0027 disclose the conference call is being held with participants at four different locations--Locale A, Locale B, Locale C and Locale D.  In the present example, locales A, B and C each has a single participant connecting to the conference call, each user using his or her individual user device.  
grouping the plurality of client devices into two or more groups based on their belonging to respective acoustic spaces, wherein the two or more groups include a first group into which two or more client devices in the plurality of client devices are grouped, wherein the two or more client devices belong to a first acoustic space in the two or more acoustic spaces (Paragraph 0029 discloses a "locale" refers to the physical space where audio feedback from multiple active devices used by multiple conference participants to connect to the same conference call would create feedback problems or would create unsynchronized speaker signals or would inhibit the proper functioning of acoustic echo cancellation used in conventional cloud-based conferencing systems.  A "shared locale" refers to a locale or physical location occupied by two or more participants of a conference call.  Figure 2 and paragraphs 0041 and 0042 disclose a same-locale multiple-device conferencing method 200 starts by detecting the presence of two or more audio clients connecting to the conference call from the same physical location or same locale (202).  With the detection of two or more audio client connections at the same physical location, the method 200 then organizes the audio client nodes at the same physical location (204));
receiving, by a computation device over one or more networks, first audio streams from the plurality of client devices (Figure 4 and paragraphs 0053 and 0057 disclose the aggregation 
generating, by the computation device from the first audio streams, second audio streams for rendering by respective client devices among the plurality of client devices, wherein the second audio streams are generated based on the grouping of the plurality of client devices into the two or more groups (Figure 4 and paragraphs 0054, 0056, 0057, 0059 and 0060 disclose with inbound microphone packet streams 30 being received from multiple audio client nodes 1-3 in the shared locale, the aggregation node 25 stores the inbound microphone data packets into respective jitter buffers 70.  The aggregation node 25 pulls microphone data packets out of each jitter buffer and from the local audio client node (audio client 4) and processes the microphone data packets through the respective delay line 74.  The delay lines 74 introduce delays that are specific to each audio client node to each inbound microphone packet stream.  The microphone data packets for each inbound microphone packet stream go through its own delay line 74.  With each delay line 74 applying the audio-client-specific delay, the microphone signals from all the 
outputting, by the computation device, the generated second audio streams to the respective client devices for rendering (Figure 4 and paragraphs 0056, 0057, and 0060 disclose the outbound microphone packet stream 80 from the aggregation node 25 is then provided through the network socket 82 to the audio conferencing server.  The network socket 82 of the aggregation node 25 receives inbound speaker packet stream 84 from the audio conferencing server destined to the locale associated with the aggregation node.  In operation, the aggregation node 25 receives the inbound speaker packet stream 84 and generates separated speaker data packets destined for each audio client.  The aggregation node 25 processes the separated speaker data packets for each audio client through the respective output delay line 94.  The speaker data packets for each audio client node go through its own delay line 94.  With each delay line 94 applying the audio-client-specific playout delay, the speaker signals for all the audio clients will become dealigned.  The network socket 42 pulls the delay-adjusted speaker data packets from the delay lines and sends the speaker data packets to the respective audio client as the audio client receive stream.  The delay-adjusted speaker data packets form the outbound speaker packet stream 50.  The outbound speaker packet stream 50 is then provided through the network socket 42 to the respective audio client nodes.  Figure 3 and paragraph 0049 disclose audio client 20 also processes incoming audio data packets received either from the audio conferencing server 15 or the audio client aggregation node 25 that are to be played out on the speaker 54 of the host device.  At the host device, incoming audio data packets are received by the network socket 42.  The audio client 20 receives incoming audio data packets from the network socket 42 in the form of an audio client receive stream 50.  The audio client receive stream 50 is also referred to as a speaker packet stream).
LaFata does not explicitly disclose for each client device among the plurality of client devices: detecting whether the respective client device renders audio via headphone loudspeakers, and for each client device that is determined to render audio via headphone loudspeakers, assigning active sound sources to virtual source locations in a virtual listening environment to be rendered on the client device.

In analogous art, Lyren discloses for each client device among the plurality of client devices: detecting whether the respective client device renders audio via headphone loudspeakers, and for each client device that is determined to render audio via headphone loudspeakers, assigning active sound sources to virtual source locations in a virtual listening environment to be rendered on the client device (Figure 2 and paragraphs 0063 and 0065 disclose block 210 makes a determination as to whether headphones, earphones, or another electronic device capable of providing binaural sound to the listener are connected and/or available.  This determination includes determining whether the electronic device (headphones, earphones, etc.) is connected to the electronic device providing the sound.  For instance, determine whether the headphones or earphones are plugged into an audio port of a laptop, desktop, smartphone, or another electronic device.  For instance, determine whether the headphones or earphones are in wireless communication with the electronic device providing the sound.  Paragraphs 00168, 0173, and 0178 disclose from the point-of-view of the listener, the sound originates or emanates from an object, point, area, or direction.  This location for the origin of the sound is the sound localization point (SLP).  By way of example, the SLP can be an actual point in space (e.g., an empty point in space 1-2 meters away from the head of the listener) or a point on or at a physical or virtual object (e.g., a mouth or head of an augmented reality (AR) or virtual reality (VR) image).  The indication shows the user the location of the sound 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate detecting whether headphones are providing sound to a listener and creating a virtual sound location point, as described in Lyren, with mixing signals from microphones in various locations for a teleconference, as described in LaFata, because doing so is using a known technique to improve a similar method in the same way.  Combining detecting whether headphones are providing sound to a listener and creating a virtual sound location point of Lyren with mixing signals from microphones in various locations for a teleconference of LaFata was within the ordinary ability of one of ordinary skill in the art based on the teachings of Lyren.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata and Lyren to obtain the invention as specified in claim 1.

Regarding claim 4, as applied to claim 1 above, LaFata, as modified by Lyren, further discloses wherein generating the second audio streams comprises:
for an active sound source in a given acoustic space, applying a signal processing technique to the first audio streams from client devices that are grouped in a group corresponding to the given acoustic space, to generate a source audio stream that represents captured audio for 
generating the second audio streams from the source audio stream (Figure 4 and paragraph 0056 disclose the aligned microphone signals are mixed together by mixer 76 to form a mixed audio data packet 78 containing the audio signals of all of the audio clients connected to the aggregation node 25.  The mixed audio data packets form an outbound microphone packet stream 80).
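For illustration only, the delay-then-mix operation attributed above to delay lines 74 and mixer 76 can be sketched as follows.  This is a minimal sketch under stated assumptions and is not a characterization of LaFata's actual implementation: the function name delay_and_mix, the representation of each stream as a list of samples, and the expression of each delay as a whole number of samples are all hypothetical.

```python
from typing import Dict, List


def delay_and_mix(streams: Dict[str, List[float]],
                  delays: Dict[str, int]) -> List[float]:
    """Apply a per-client delay (in samples), then sum the streams sample-wise.

    Hypothetical illustration: client ids, sample lists, and integer delays
    are assumptions, not taken from the cited reference.
    """
    # Prepend silence to each stream according to that client's own delay.
    delayed = {
        cid: [0.0] * delays[cid] + list(samples)
        for cid, samples in streams.items()
    }
    # The mixed output is as long as the longest delayed stream.
    length = max(len(s) for s in delayed.values())
    mixed = [0.0] * length
    for samples in delayed.values():
        for i, value in enumerate(samples):
            mixed[i] += value
    return mixed


if __name__ == "__main__":
    # The same impulse reaches client "b" one sample later than client "a",
    # so "a" is delayed by one sample to line the two streams up.
    streams = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]}
    print(delay_and_mix(streams, {"a": 1, "b": 0}))
```

In this sketch the two impulses coincide after the per-client delay is applied, so the mixed stream carries a single combined peak rather than two offset copies.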

Regarding claim 6, as applied to claim 1 above, LaFata, as modified by Lyren, further discloses wherein the second audio streams are generated to be the same for all client devices in a given group of client devices (Figure 4 and paragraph 0057 disclose the inbound speaker audio signals 86 are coupled to a splitter 88 to produce copies of the inbound speaker audio signals 86 for each audio client.  The splitter 88 produces individual speaker data packets 89 destined for each audio client node (for example, audio client nodes 1-4).  In particular, each audio client node receives speaker audio signals from all other conference participants).

Regarding claim 11, as applied to claim 1 above, LaFata, as modified by Lyren, further discloses wherein grouping the plurality of client devices based on their belonging to respective acoustic spaces involves at least one of: acoustic watermarking; receiving a user input indicative of a list of client devices present in at least one acoustic space; proximity detection using Bluetooth communication between client devices; and visual inspection using one or more video cameras (Paragraph 0089 discloses the remote audio conferencing server identifies a locale by sampling over a preview period the background noise signature received from each audio client requesting to join a conference call.  In some embodiments, the remote audio conferencing server may generate an Acoustic Background Spectrum of each audio client connecting to the conference call.  The Acoustic Background Spectrum of each audio client can be thought of as a room fingerprint and would enable the remote audio conferencing server to identify if two or more audio clients may be in the same locale).

Regarding claim 13, as applied to claim 1 above, LaFata, as modified by Lyren, further discloses for at least one group of client devices, determining a transmission latency between each of the client devices in the at least one group of client devices and a device hosting the teleconference (Paragraph 0065 discloses the audio client aggregation node 25 provides the common timebase for aligning the microphone audio streams from all the audio client nodes.  The audio client aggregation node accepts connections from other audio client devices in the room and uses the audio client timestamp information provided in the audio clients' microphone data packets to provide the aligned mix of all the microphone audio signals in the shared locale to the remote audio conferencing server 15.  A clock offset for each audio client relative to the 
adding respective delays to the second audio streams for the client devices in the at least one group of client devices based on the determined transmission latencies, to time-synchronize the second audio streams for the client devices in the at least one group of client devices (Paragraph 0604 discloses that, in the present invention, the audio clients' microphone audio packets are timestamped by the respective host OS audio API.  The timing information is used by the input signal aligner 72 at the aggregation node 25 to align the audio clients' microphone audio streams.  Paragraph 0055 discloses the input delay lines 74 are controlled by an input signal aligner 72 which is configured to generate delay lines adjustment values for each of the input delay lines 74.  Paragraph 0071 discloses the input signal aligner 72 generates input delay lines adjustment values for the delay lines 74 based on the difference between the audio client timestamp and the reference timestamp).
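For illustration only, one way delay adjustment values could be derived from the difference between each audio client timestamp and a reference timestamp is sketched below.  This is a sketch under stated assumptions and is not a characterization of LaFata's actual implementation: the function name delay_adjustments, the millisecond units, and the convention of delaying every stream up to the latest one are hypothetical.

```python
from typing import Dict


def delay_adjustments(client_timestamps_ms: Dict[str, int],
                      reference_timestamp_ms: int) -> Dict[str, int]:
    """Return a non-negative delay per client that lines all streams up.

    Hypothetical illustration: each timestamp is assumed to mark when the
    same audio event appears in that client's stream.
    """
    # Offset of each client relative to the common reference timebase.
    offsets = {
        cid: ts - reference_timestamp_ms
        for cid, ts in client_timestamps_ms.items()
    }
    # Delay every stream up to the latest one so no delay is negative.
    latest = max(offsets.values())
    return {cid: latest - offset for cid, offset in offsets.items()}


if __name__ == "__main__":
    stamps = {"client1": 1000, "client2": 1012, "client3": 1005}
    print(delay_adjustments(stamps, reference_timestamp_ms=1000))
```

Under these assumptions the stream with the latest timestamp receives no added delay, and every other stream is delayed by its shortfall relative to that stream.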

Regarding claim 15, LaFata discloses a computation device (Paragraph 0022 discloses the invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor) comprising:
a computer processor (Paragraph 0022 discloses a processor); and
a non-transitory computer-readable storage medium storing a computer program (Paragraph 0022 discloses the invention can be implemented in numerous ways, including as a 
grouping the plurality of client devices into two or more groups based on their belonging to respective acoustic spaces, wherein the two or more groups include a first group into which two or more client devices in the plurality of client devices are grouped, wherein the two or more client devices belong to a first acoustic space in the two or more acoustic spaces (Paragraph 0029 discloses a "locale" refers to the physical space where audio feedback from multiple active devices used by multiple conference participants to connect to the same conference call would create feedback problems or would create unsynchronized speaker signals or would inhibit the proper functioning of acoustic echo cancellation used in conventional cloud-based conferencing systems.  A "shared locale" refers to a locale or physical location occupied by two or more participants of a conference call.  Figure 2 and paragraphs 0041 and 0042 disclose a same-locale multiple-device conferencing method 200 starts by detecting the presence of two or more audio clients connecting to the conference call from the same physical location or same locale (202).  With the detection of two or more audio client connections at the same physical location, the method 200 then organizes the audio client nodes at the same physical location (204));
receiving, by the computation device, first audio streams from the plurality of client devices (Figure 4 and paragraphs 0053 and 0057 disclose the aggregation node 25 is configured 
generating, by the computation device from the first audio streams, second audio streams for rendering by respective client devices among the plurality of client devices, wherein the second audio streams are generated based on the grouping of the plurality of client devices into the two or more groups (Figure 4 and paragraphs 0054, 0056, 0057, 0059 and 0060 disclose with inbound microphone packet streams 30 being received from multiple audio client nodes 1-3 in the shared locale, the aggregation node 25 stores the inbound microphone data packets into respective jitter buffers 70.  The aggregation node 25 pulls microphone data packets out of each jitter buffer and from the local audio client node (audio client 4) and processes the microphone data packets through the respective delay line 74.  The delay lines 74 introduce delays that are specific to each audio client node to each inbound microphone packet stream.  The microphone data packets for each inbound microphone packet stream go through its own delay line 74.  With each delay line 74 applying the audio-client-specific delay, the microphone signals from all the 
outputting, by the computation device, the generated second audio streams to the respective client devices for rendering (Figure 4 and paragraphs 0056, 0057, and 0060 disclose the outbound microphone packet stream 80 from the aggregation node 25 is then provided through the network socket 82 to the audio conferencing server.  The network socket 82 of the aggregation node 25 receives inbound speaker packet stream 84 from the audio conferencing server destined to the locale associated with the aggregation node.  In operation, the aggregation node 25 receives the inbound speaker packet stream 84 and generates separated speaker data packets destined for each audio client.  The aggregation node 25 processes the separated speaker data packets for each audio client through the respective output delay line 94.  The speaker data packets for each audio client node go through its own delay line 94.  With each delay line 94 applying the audio-client-specific playout delay, the speaker signals for all the audio clients will become dealigned.  The network socket 42 pulls the delay-adjusted speaker data packets from the delay lines and sends the speaker data packets to the respective audio client as the audio client receive stream.  The delay-adjusted speaker data packets form the outbound speaker packet stream 50.  The outbound speaker packet stream 50 is then provided through the network socket 42 to the respective audio client nodes.  Figure 3 and paragraph 0049 disclose audio client 20 also processes incoming audio data packets received either from the audio conferencing server 15 or the audio client aggregation node 25 that are to be played out on the speaker 54 of the host device.  At the host device, incoming audio data packets are received by the network socket 42.  The audio client 20 receives incoming audio data packets from the network socket 42 in the form of an audio client receive stream 50.  The audio client receive stream 50 is also referred to as a speaker packet stream).
LaFata does not explicitly disclose for each client device among the plurality of client devices: detecting whether the respective client device renders audio via headphone loudspeakers, and for each client device that is determined to render audio via headphone loudspeakers, assigning at least one active sound source to at least one virtual source location in a virtual listening environment to be rendered on the client device.

In analogous art, Lyren discloses for each client device among the plurality of client devices: detecting whether the respective client device renders audio via headphone loudspeakers, and for each client device that is determined to render audio via headphone loudspeakers, assigning at least one active sound source to at least one virtual source location in a virtual listening environment to be rendered on the client device  (Figure 2 and paragraphs 0063 and 0065 disclose block 210 makes a determination as to whether headphones, earphones, or another electronic device capable of providing binaural sound to the listener are connected and/or available.  This determination includes determining whether the electronic device (headphones, earphones, etc.) is connected to the electronic device providing the sound.  For instance, determine whether the headphones or earphones are plugged into an audio port of a laptop, desktop, smartphone, or another electronic device.  For instance, determine whether the headphones or earphones are in wireless communication with the electronic device providing the sound.  Paragraphs 00168, 0173, and 0178 disclose from the point-of-view of the listener, the sound originates or emanates from an object, point, area, or direction.  This location for the origin of the sound is the sound localization point (SLP).  By way of example, the SLP can be an actual point in space (e.g., an empty point in space 1-2 meters away from the head of the listener) or a point on or at a physical or virtual object (e.g., a mouth or head of an augmented reality (AR) or virtual reality (VR) image).  The indication shows the user the location of the sound 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate detecting whether headphones are providing sound to a listener and creating a virtual sound location point, as described in Lyren, with mixing signals from microphones in various locations for a teleconference, as described in LaFata, because doing so is using a known technique to improve a similar method in the same way.  Combining detecting whether headphones are providing sound to a listener and creating a virtual sound location point of Lyren with mixing signals from microphones in various locations for a teleconference of LaFata was within the ordinary ability of one of ordinary skill in the art based on the teachings of Lyren.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata and Lyren to obtain the invention as specified in claim 15.

Regarding claim 16, LaFata discloses a non-transitory computer-readable storage medium storing a computer program (Paragraph 0022 discloses the invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a 
grouping the plurality of client devices into two or more groups based on their belonging to respective acoustic spaces, wherein the two or more groups include a first group into which two or more client devices in the plurality of client devices are grouped, wherein the two or more client devices belong to a first acoustic space in the two or more acoustic spaces (Paragraph 0029 discloses a "locale" refers to the physical space where audio feedback from multiple active devices used by multiple conference participants to connect to the same conference call would create feedback problems or would create unsynchronized speaker signals or would inhibit the proper functioning of acoustic echo cancellation used in conventional cloud-based conferencing systems.  A "shared locale" refers to a locale or physical location occupied by two or more participants of a conference call.  Figure 2 and paragraphs 0041 and 0042 disclose a same-locale multiple-device conferencing method 200 starts by detecting the presence of two or more audio clients connecting to the conference call from the same physical location or same locale (202).  With the detection of two or more audio client connections at the same physical location, the method 200 then organizes the audio client nodes at the same physical location (204));
receiving, by a computation device, first audio streams from the plurality of client devices (Figure 4 and paragraphs 0053 and 0057 disclose the aggregation node 25 is configured to receive inbound microphone data packets from four shared locale audio clients.  In the present example, the aggregation node 25 is configured on the host device of audio client node 4.  Accordingly, audio client nodes 1-3 are connected to the aggregation node 25 through a local 
generating, by the computation device from the first audio streams, second audio streams for rendering by respective client devices among the plurality of client devices, wherein the second audio streams are generated based on the grouping of the plurality of client devices into the two or more groups (Figure 4 and paragraphs 0054, 0056, 0057, 0059 and 0060 disclose with inbound microphone packet streams 30 being received from multiple audio client nodes 1-3 in the shared locale, the aggregation node 25 stores the inbound microphone data packets into respective jitter buffers 70.  The aggregation node 25 pulls microphone data packets out of each jitter buffer and from the local audio client node (audio client 4) and processes the microphone data packets through the respective delay line 74.  The delay lines 74 introduce delays that are specific to each audio client node to each inbound microphone packet stream.  The microphone data packets for each inbound microphone packet stream go through its own delay line 74.  With each delay line 74 applying the audio-client-specific delay, the microphone signals from all the audio clients will be lined up after the delay lines.  The aligned microphone signals are mixed together by mixer 76 to form a mixed audio data packet 78 containing the audio signals of all of the audio clients connected to the aggregation node 25.  The mixed audio data packets form an 
outputting, by the computation device, the generated second audio streams to the respective client devices for rendering (Figure 4 and paragraphs 0056, 0057, and 0060 disclose 
LaFata does not explicitly disclose for each client device among the plurality of client devices: detecting whether the respective client device renders audio via headphone loudspeakers, and for each client device that is determined to render audio via headphone 
In analogous art, Lyren discloses for each client device among the plurality of client devices: detecting whether the respective client device renders audio via headphone loudspeakers, and for each client device that is determined to render audio via headphone loudspeakers, assigning at least one active sound source to at least one virtual source location in a virtual listening environment to be rendered on the client device (Figure 2 and paragraphs 0063 and 0065 disclose block 210 makes a determination as to whether headphones, earphones, or another electronic device capable of providing binaural sound to the listener are connected and/or available.  This determination includes determining whether the electronic device (headphones, earphones, etc.) is connected to the electronic device providing the sound.  For instance, determine whether the headphones or earphones are plugged into an audio port of a laptop, desktop, smartphone, or another electronic device.  For instance, determine whether the headphones or earphones are in wireless communication with the electronic device providing the sound.  Paragraphs 0168, 0173, and 0178 disclose from the point-of-view of the listener, the sound originates or emanates from an object, point, area, or direction.  This location for the origin of the sound is the sound localization point (SLP).  By way of example, the SLP can be an actual point in space (e.g., an empty point in space 1-2 meters away from the head of the listener) or a point on or at a physical or virtual object (e.g., a mouth or head of an augmented reality (AR) or virtual reality (VR) image).  The indication shows the user the location of the sound source or SLP where the binaural sound will originate to the listener.  This location can be a physical or virtual object, a point, an area, or a direction.  The SLP can be an actual point in space (e.g., an empty point in space 1-2 meters away from the head of the listener) or a point on 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate detecting whether headphones are providing sound to a listener and creating a virtual sound location point, as described in Lyren, with mixing signals from microphones in various locations for a teleconference, as described in LaFata, because doing so is using a known technique to improve a similar method in the same way.  Combining detecting whether headphones are providing sound to a listener and creating a virtual sound location point of Lyren with mixing signals from microphones in various locations for a teleconference of LaFata was within the ordinary ability of one of ordinary skill in the art based on the teachings of Lyren.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata and Lyren to obtain the invention as specified in claim 16.
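By way of illustration only, the delay-and-mix processing attributed to LaFata above (client-specific delay lines 74 followed by mixer 76) may be sketched as follows.  The sketch is not taken from LaFata; the function name delay_and_mix, the representation of each stream as a list of samples, and the assumption that each client-specific delay is already known as a whole number of samples are hypothetical choices made for this example.

```python
from typing import Dict, List


def delay_and_mix(streams: Dict[str, List[float]], delays: Dict[str, int]) -> List[float]:
    """Shift each client's samples by its own delay, then sum the shifted signals.

    streams: microphone samples per client (hypothetical representation).
    delays:  client-specific delay in samples, assumed to be known in advance.
    """
    # The output must be long enough to hold the longest delayed stream.
    length = max(len(samples) + delays[client] for client, samples in streams.items())
    mixed = [0.0] * length
    for client, samples in streams.items():
        offset = delays[client]
        for i, value in enumerate(samples):
            # Each sample lands 'offset' positions later, which lines the streams up.
            mixed[i + offset] += value
    return mixed
```

In this sketch a client whose signal arrives early is given a larger delay, so that the same acoustic event falls on the same output sample for every client before the signals are summed.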

Claims 2, 3, and 5 are rejected under 35 U.S.C. 103 as being unpatentable over LaFata in view of Lyren as applied to claim 1 above, and further in view of Virolainen et al. (U.S. Patent Application Publication No. 2009/0264114 A1) (hereinafter Virolainen).

Regarding claim 2, as applied to claim 1 above, LaFata, as modified by Lyren, discloses the claimed invention except explicitly disclosing wherein generating the second audio streams comprises: for an active sound source in a given acoustic space, determining the client device in the given acoustic space that is closest to the active sound source; generating a source audio 
In analogous art, Virolainen discloses wherein generating the second audio streams comprises: for an active sound source in a given acoustic space, determining the client device in the given acoustic space that is closest to the active sound source; generating a source audio stream that represents captured audio for the currently active sound source based on the first audio stream from the determined client device, disregarding the first audio streams from any other client devices in the same group as the determined client device; and generating the second audio streams from the source audio stream (Paragraph 0050 discloses the mixer 202 may employ a dynamic mixing algorithm.  The dynamic mixing algorithm may enable calculation of various audio features for the microphone signals T1(t), T2(t), T3(t) . . . TN(t) and, based on these features, the dynamic mixing algorithm may attempt to mix signal(s) from microphone(s) that have (or typically have) the highest energy or best signal-to-noise ratio as compared to other signals.  As such, for example, the mixer 202 (e.g., via the dynamic mixing algorithm) may be configured to select one of the microphone signals T1(t), T2(t), T3(t) . . . TN(t) at any given time for inclusion as the downmixed signal 204 on the basis of which one of the signals has better properties than the other signals.  Thus, in some examples, if a speaker (e.g., a speaking person) is picked up on more than one microphone among the devices in a room, the microphone closest to the speaker (or at least having the best audio properties) may be selected as the signal to be included in the downmixed signal 204.  As an example, if one slave terminal 142 is closest to a speaker, but the other slave terminals 144 and 146 and/or the master device 140 also picks up the 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate the microphone closest to the speaker selected as the signal to be included in the downmixed signal, as described in Virolainen, with mixing signals from microphones, as described in LaFata, as modified by Lyren, because doing so is using a known technique to improve a similar method in the same way.  Combining the microphone closest to the speaker selected as the signal to be included in the downmixed signal of Virolainen with mixing signals from microphones of LaFata, as modified by Lyren, was within the ordinary ability of one of ordinary skill in the art based on the teachings of Virolainen.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata, Lyren, and Virolainen to obtain the invention as specified in claim 2.

Regarding claim 3, as applied to claim 2 above, LaFata, as modified by Lyren, discloses the claimed invention except explicitly disclosing wherein determining the client device in the given acoustic space that is closest to the active sound source is based on at least one of: measuring sound volumes of audio events in first audio streams from client devices in a group corresponding to the given audio space; and measuring times of arrival of audio events in first audio streams from client devices in a group corresponding to the given audio space.
In analogous art, Virolainen discloses wherein determining the client device in the given 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate selecting the microphone closest to the speaker as the microphone having the highest energy as compared to other signals, as described in Virolainen, with mixing signals from microphones, as described in LaFata, as modified by Lyren, because doing so is using a known technique to improve a similar method in the same way.  Combining selecting the microphone closest to the speaker as the microphone having the highest energy as compared to other signals of Virolainen with mixing signals from microphones of LaFata, as modified by Lyren, was within the ordinary ability of one of ordinary skill in the art based on the teachings of Virolainen.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata, Lyren, and Virolainen to obtain the invention as specified in claim 3.
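By way of illustration only, selecting the microphone signal having the highest energy, which the rejection above treats as corresponding to the microphone closest to the speaker, may be sketched as follows.  The sketch is not Virolainen's dynamic mixing algorithm; the function names, the device labels, and the use of a sum of squared samples over one frame as the energy measure are assumptions made for this example.

```python
from typing import Dict, List


def frame_energy(samples: List[float]) -> float:
    """Energy of one frame, taken here as the sum of squared samples (an assumption)."""
    return sum(x * x for x in samples)


def select_closest_device(frames: Dict[str, List[float]]) -> str:
    """Return the device whose current frame has the highest energy.

    Highest energy is used only as a proxy for proximity to the active talker.
    """
    return max(frames, key=lambda device: frame_energy(frames[device]))
```

Under this sketch, only the frame from the selected device would be carried into the downmixed signal, and frames from the other devices in the same room would be disregarded for that interval.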

Regarding claim 5, as applied to claim 1 above, LaFata, as modified by Lyren, discloses the claimed invention except explicitly disclosing wherein for a given group of client devices, first audio streams from client devices in the given group of client devices are not used for generating second audio streams for the client devices in the given group of client devices.
In analogous art, Virolainen discloses wherein for a given group of client devices, first audio streams from client devices in the given group of client devices are not used for generating second audio streams for the client devices in the given group of client devices (Paragraph 0006 discloses the conference switch 100, also referred to as a conference bridge, mixes incoming speech signals from each site and sends the mixed signal back to each site.  The speech signal coming from the current site is usually removed from the mixed signal that is sent back to this same site.  Figure 1 and paragraph 0037 disclose the conference switch 148 may be configured to mix incoming speech signals from each site and send the mixed signal back to each site, except that the speech signal coming from the current site may be removed from the mixed signal that is sent back to the current site).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate removing speech signals from a site from a mixed signal sent back to the site, as described in Virolainen, with mixing signals from microphones in various locations for a teleconference, as described in LaFata, as modified by Lyren, because doing so is using a known technique to improve a similar method in the same way.  Combining removing speech signals from a site from a mixed signal sent back to the site of Virolainen with mixing signals from microphones in various locations for a teleconference of LaFata, as modified by Lyren, was within the ordinary 
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata, Lyren, and Virolainen to obtain the invention as specified in claim 5.
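By way of illustration only, the behavior cited from paragraphs 0006 and 0037 of Virolainen, in which the speech signal coming from a site is removed from the mixed signal returned to that same site, may be sketched as follows.  The function name mix_minus, the site labels, and the assumption that all site signals are equal-length lists of samples are hypothetical and are not taken from the reference.

```python
from typing import Dict, List


def mix_minus(site_signals: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """For each site, mix the signals of all other sites and omit the site's own signal.

    All signals are assumed to have the same number of samples.
    """
    length = len(next(iter(site_signals.values())))
    returned: Dict[str, List[float]] = {}
    for site in site_signals:
        # Sum only the signals that did not originate at this site.
        others = [sig for name, sig in site_signals.items() if name != site]
        returned[site] = [sum(sig[i] for sig in others) for i in range(length)]
    return returned
```

Applied to groups of client devices rather than sites, the same arrangement would leave the first audio streams of a given group out of the second audio streams returned to that group.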

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over LaFata in view of Lyren as applied to claim 1 above, and further in view of Cartwright et al. (U.S. Patent Application Publication No. 2015/0244869 A1) (hereinafter Cartwright).

Regarding claim 7, as applied to claim 1 above, LaFata, as modified by Lyren, discloses the claimed invention except explicitly disclosing determining a linear mapping function for mapping the first audio streams to the second audio streams based on the grouping of the plurality of client devices into the two or more groups; and generating the second audio streams from the first audio streams by applying the linear mapping function to the first audio streams.
In analogous art, Cartwright discloses determining a linear mapping function for mapping the first audio streams to the second audio streams based on the grouping of the plurality of client devices into the two or more groups; and generating the second audio streams from the first audio streams by applying the linear mapping function to the first audio streams (Paragraph 0033 discloses it is desirable to provide a multi-party audio conference system which allows to overlay a plurality of audio signals originating from a plurality of different terminals or endpoints of the audio conference system, such that a listener is provided with spatial cues regarding the different talkers at the plurality of different terminals.  Paragraph 0038 discloses a 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate mapping a soundfield using linear transformation, as described in Cartwright, with transforming audio signals, as described in LaFata, as modified by Lyren, because doing so is combining prior art elements according to known methods to yield predictable results.  Combining mapping a soundfield using linear transformation of Cartwright with transforming audio signals of LaFata, as modified by Lyren, was within the ordinary ability of one of ordinary skill in the art based on the teachings of Cartwright.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata, Lyren, and Cartwright to obtain the invention as specified in claim 7.
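By way of illustration only, the claim 7 limitation of a linear mapping function determined from the grouping may be pictured as a matrix of weights applied to the first audio streams.  The sketch below is not taken from Cartwright or from the application; the function names, the use of one sample per client device, and the choice of a weight of zero between devices in the same group and a weight of one otherwise are assumptions made solely for this example.

```python
from typing import List


def build_mapping(groups: List[int]) -> List[List[float]]:
    """Build a weight matrix from group labels (one label per client device).

    Row j holds the weights used to form the output for device j.  A weight of
    0.0 is assumed when source and destination share a group, 1.0 otherwise.
    """
    n = len(groups)
    return [[0.0 if groups[i] == groups[j] else 1.0 for i in range(n)] for j in range(n)]


def apply_mapping(matrix: List[List[float]], inputs: List[float]) -> List[float]:
    """Apply the linear mapping to one sample from each first audio stream."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in matrix]
```

Because every output is a weighted sum of the inputs, the mapping is linear, and changing the grouping changes only the weights in the matrix.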

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over LaFata in view of Lyren as applied to claim 1 above, and further in view of Zhang et al. (U.S. Patent Application Publication No. 2010/0074433 A1) (hereinafter Zhang).

Regarding claim 10, as applied to claim 1 above, LaFata, as modified by Lyren, discloses the claimed invention except explicitly disclosing at least one of: performing single-channel echo cancellation for at least one client device among the plurality of client devices to suppress a representation of the second audio stream received by the at least one client device in the first audio stream output by the at least one client device; and performing multi-channel echo 
In analogous art, Zhang discloses at least one of: performing single-channel echo cancellation for at least one client device among the plurality of client devices to suppress a representation of the second audio stream received by the at least one client device in the first audio stream output by the at least one client device; and performing multi-channel echo cancellation for at least one group of client devices to suppress representations of the second audio streams received by the client devices in the at least one group of client devices in the first audio streams output by the client devices in the at least one group of client devices (Figure 1 and paragraph 0027 disclose a cellular phone comprising a transmission device configured to facilitate communication with other remote participants could incorporate the MC-AEC system 100 for improved echo reduction while operating in a speakerphone mode.  Paragraph 0006 discloses audio signals captured by one or more microphones comprised within an audio conferencing system are adjusted to provide an improvement in acoustic echo cancellation (AEC).  More particularly, a multi-party spatial audio conferencing system (e.g., a conferencing system that gives a listener the impression that remote participants are dispersed throughout a three dimensional virtual environment) comprises a speaker array configured to output spatialized audio signals and one or more microphones configured to capture and relay a sound signal comprising an echo of the spatialized audio signal to a multi-channel acoustic echo cancellation (MC-AEC) unit having a plurality of echo cancellers).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate 
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata, Lyren, and Zhang to obtain the invention as specified in claim 10.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over LaFata in view of Lyren as applied to claim 1 above, and further in view of Ahgren et al. (U.S. Patent Application Publication No. 2016/0050491 A1) (hereinafter Ahgren).

Regarding claim 12, as applied to claim 1 above, LaFata, as modified by Lyren, discloses the claimed invention except explicitly disclosing for at least one group of client devices, determining a relative spatial arrangement of the client devices in the respective group of client devices, wherein generating the second audio streams is further based on the determined relative spatial arrangement of client devices in the at least one group of client devices.
In analogous art, Ahgren discloses for at least one group of client devices, determining a relative spatial arrangement of the client devices in the respective group of client devices (Paragraph 0096 discloses a communication client application can detect that the user terminal 
wherein generating the second audio streams is further based on the determined relative spatial arrangement of client devices in the at least one group of client devices (Paragraphs 0115 and 0116 disclose the method may further comprise encoding the output audio signal at an encoder of said network entity to produce an encoded output audio signal and transmitting the encoded output audio signal over the communications network to said user device.  In exemplary embodiments the method is performed based on detecting that the user device and the at least one further user device are located in a common acoustic space).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate determining whether devices are co-located based on known locations of the devices and generating output signals based on the locations of the devices, as described in Ahgren, with 
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata, Lyren, and Ahgren to obtain the invention as specified in claim 12.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over LaFata in view of Lyren as applied to claim 1 above, and further in view of Hirsch et al. (U.S. Patent Application Publication No. 2012/0260232 A1) (hereinafter Hirsch).

Regarding claim 14, as applied to claim 1 above, LaFata, as modified by Lyren, discloses the claimed invention except explicitly disclosing wherein the grouping the plurality of client devices into two or more groups is further based on at least one of: operating systems of the client devices; and CPU availabilities of the client devices.
In analogous art, Hirsch discloses wherein the grouping the plurality of client devices into two or more groups is further based on at least one of: operating systems of the client devices; and CPU availabilities of the client devices (Paragraph 0085 discloses each mobile device category may be associated with the group of mobile devices that run a particular mobile 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to incorporate grouping mobile devices by operating system, as described in Hirsch, with using mobile devices in a conference call, as described in LaFata, as modified by Lyren, because doing so is combining prior art elements according to known methods to yield predictable results.  Combining grouping mobile devices by operating system of Hirsch with using mobile devices in a conference call of LaFata, as modified by Lyren, was within the ordinary ability of one of ordinary skill in the art based on the teachings of Hirsch.
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains to combine the teachings of LaFata, Lyren, and Hirsch to obtain the invention as specified in claim 14.
Response to Arguments
Applicant’s arguments with respect to claims 1-7 and 10-16 have been considered but are moot in view of the new grounds of rejection.
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure.
Osada (U.S. Patent Application Publication No. 2014/0112505 A1) discloses information processing system, computer-readable non-transitory storage medium having stored therein information processing program, information processing control method, and information .

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office Action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the Examiner should be directed to MARK G. PANNELL whose telephone number is (303) 297-4245.  The Examiner can normally be reached on Monday through Friday 8:00 am to 3:00 pm (Mountain Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool.  To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at (866) 217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call (800) 786-9199 (IN USA OR CANADA) or (571) 272-1000.

/Mark G. Pannell/Examiner, Art Unit 2642