Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

a.	Claims 1-2, 5-14, and 17-30 are pending in the present application:
	- claims 1, 13, 25, and 28 are amended
	- claims 3-4 and 15-16 are cancelled
b.	This is a final action on the merits based on Applicant’s claims submitted on 04/11/2022.


Response to Arguments

Regarding independent claims 1, 13, 25, and 28, previously rejected under 35 U.S.C. § 103: Applicant's arguments (see “The cited portions of Sun, Kuhr, Thapa, Yeh, and Clavel individually or in combination, fail to disclose the specific combination of claim 1. For example, the cited portions of Sun, Kuhr, Thapa, Yeh, and Clavel, individually or in combination, fail to disclose “based upon a first user indication, (a) determine that the first participant corresponds to a preferred participant (b) provide a control signal to a network device to adjust a first bit rate of the first audio stream and (c) increase a first gain of the first decoded audio” as in claim 1.” on page 13, filed on 04/11/2022) with respect to Sun et al. US Pub 2016/0142851 (hereinafter “Sun”), in view of Kuhr et al. US Pub 2011/0135098 (hereinafter “Kuhr”) and of Thapa US Pub 2009/0088880 (hereinafter “Thapa”), and further in view of Yeh US Pub 2005/0147261 (hereinafter “Yeh”) and of Clavel et al. US Pub 2014/0173467 (hereinafter “Clavel”), have been fully considered but are moot with respect to the limitation “provide a control signal to a network device to adjust a first bit rate of the first audio stream”. Said limitation is newly added to amended claims 1, 13, 25, and 28 and has been addressed in the instant Office action, as shown in the 35 U.S.C. § 103 rejection below, with a newly identified prior art teaching from the newly found reference Hourunranta et al. US Patent 6704281 (hereinafter “Hourunranta”) in combination with the previously applied references, thus rendering Applicant’s arguments moot.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked. Such claim limitations are: “means for providing” in claims 28-30.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f):
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claims 1, 10, 13, 23, 25, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Sun et al. US Pub 2016/0142851 (hereinafter “Sun”), in view of Kuhr et al. US Pub 2011/0135098 (hereinafter “Kuhr”) and of Thapa US Pub 2009/0088880 (hereinafter “Thapa”), and further in view of Yeh US Pub 2005/0147261 (hereinafter “Yeh”), of Hourunranta et al. US Patent 6704281 (hereinafter “Hourunranta”), and of Clavel et al. US Pub 2014/0173467 (hereinafter “Clavel”).
Regarding claim 1 (Currently Amended)
Sun discloses in Fig. 7 a mobile device for managing audio during a conference, the mobile device comprising:
a spatial steering processor (i.e. “generating unit 603” in Fig. 6; “the apparatus 600 comprises a generating unit 603 configured to generate the surround sound field from the received audio signals at least partially based on the estimated topology.” [0073] and furthermore “As shown in FIG. 3, upon receipt of the audio signals captured by a group of audio capturing devices 101 at step S301, the topology of these audio capturing devices are estimated at step S302. Estimating the topology of positions of audio capturing devices 101 within the group is important to the subsequent spatial processing, which has direct impact on reproducing the sound field.” [0031]) configured to:
steer first decoded audio to be projected from a speaker at a first angle (“represents the spatial location of an audio capturing device with distance to the center of R and angle of φ_M, and α represents the source location at angle φ” [0056-0057]), and steer second decoded audio (repeated operation depending on how many audio capturing devices 101) to be projected from the speaker at a second angle (“For example, assume that the group contains three audio capturing devices 101 having angles of π/2, 3π/4, and 3π/2 and same distance to the center at 4 cm.” [0058]), and
the speaker configured to: project the first decoded audio at the first angle; and project the second decoded audio at the second angle (“To this end, in accordance with embodiments of the present invention, the weights for respective audio signals, represented as the mapping matrix, may be dynamically adapted based on the topology of audio capturing devices as estimated at step S303. Still considering the above example topology where three audio capturing devices 101 have angles of π/2, 3π/4, and 3π/2 and same distance to the center at 4 cm, if the mapping matrix is adapted according to this specific topology,” [0059-0063]);
receive a signal indicating detection of head movement (i.e. “head-related transfer functions (HRTF)”) associated with a user of the mobile device (“binaural rendering, in which audio is played back through a pair of earphones or headphones, may be desired since users are expected to listen to the audio files on mobile devices. B-format to binaural conversion can be achieved approximately by summing loudspeaker array feeds that are each filtered by a head-related transfer functions (HRTF) matching the loudspeaker position.” [0067]), wherein the user of the mobile device is distinct from the first participant and second participant (“FIG. 7 is a block diagram illustrating a user terminal 700 for implementing example embodiments of the present invention. The user terminal 700 may operate as the audio capturing device 101 as discussed herein. In some embodiments, the user terminal 700 may be embodied as a mobile phone.” [0080]; Fig. 7. The first and second participants are regular user terminals while a third participant is a mobile phone.);
Sun does not specifically teach shift the first angle and the second angle by a shift amount in response to receiving the signal.
In an analogous art, Kuhr discloses receive a signal (see Figs. 5 and 10B) indicating detection of head movement associated with a user of the mobile device (“As described above with reference to FIG. 5, a head tracker HT may be incorporated into a headset, and the head position signal therefrom may be used by an audio processing unit to compensate for the movement of the head and thereby maintain the illusion of a number of immobile virtual sound sources.” [0121]); and shift the first angle and the second angle by a shift amount in response to receiving the signal (“As indicated above, this can be done by switching or interpolating the applied filters and/or equalizers as a function of the listener's head movements. In one embodiment, this can be done by determining the azimuth angular movement from the head tracker HT data, and by effectively mathematically moving the virtual sound sources by an azimuth angle of the opposite value (e.g., if the head moves by Δθ, the sources are moved by -Δθ). This mathematical movement can be achieved by rotating the angle that is used to select filter data from a HRTF for a particular source, or by shifting the source angles in the parameter tables/databases of the filters.” [0121] and furthermore “According to another embodiment of the present invention, the perceived movement of the sources can be compensated for by mapping the current desired source angle (or current measured head angle) to a modified source angle (or modified head angle) that yields a perception closest to the desired direction. The mapping function can be determined from angular localization errors for each direction within the tracked range if these errors are known. As another approach, controls may be provided to the user to allow adjustment to the mapping function so as to minimize the perceived motion of the sources. FIG. 10A shows an exemplary mapping function that relates the modified source angle (or negative of the modified head angle) to the current desired source angle (or negative of the measured head angle). Also shown in FIG. 10A is a dashed straight line for the case where the modified angle would be equal to the input angle (desired angle). As can be seen by comparing the exemplary mapping to the straight line, there is some compression of the modified angle (e.g., slope less than 1) near a source angle of zero and 180 degrees (e.g., front and back). In other instances, there may be some expansion of the modified angle (e.g., slope greater than 1) near a source angle of zero and 180 degrees (e.g., front and back).” [0122]).
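For illustration only, the head-movement compensation quoted from Kuhr [0121] (if the head moves by Δθ, the sources are moved by -Δθ) can be sketched as below. This is a minimal sketch, not Kuhr's implementation: the function name `compensate_head_rotation`, the use of degrees, and the wrap to the range [0, 360) are assumptions made for the example.

```python
def compensate_head_rotation(source_angles_deg, head_delta_deg):
    """Shift every virtual source azimuth by the negative of the head rotation.

    source_angles_deg: azimuths of the virtual sources, in degrees (assumed unit).
    head_delta_deg: measured head rotation (delta theta), in degrees.
    Returns the shifted azimuths wrapped into [0, 360).
    """
    return [(angle - head_delta_deg) % 360.0 for angle in source_angles_deg]


# Two sources at 30 and 330 degrees; the head turns by 45 degrees.
shifted = compensate_head_rotation([30.0, 330.0], 45.0)
print(shifted)
```

Both angles are shifted by the same amount, which corresponds to the claimed shifting of the first angle and the second angle by a shift amount in response to the head-movement signal.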
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, to include Kuhr’s method for providing surround audio signals, in order to provide an efficient audio conference system (Kuhr [0010]).
	Sun and Kuhr do not specifically teach the first decoded audio corresponding to a decoded version of a first audio stream from a first device associated with a first participant of the conference, the second decoded audio corresponding to a decoded version of a second audio stream from a second device associated with a second participant of the conference, and the first decoded audio synchronized with the second decoded audio.
In an analogous art, Thapa discloses a mobile device (i.e. “client device 110C” in Fig. 2; “For example, client devices other than a computer running client software can be used. Examples include PDAs, mobile phones, web-enabled TV, and SIP phones and terminals (i.e., phone-type devices using the SIP protocol that typically have a small video screen and audio capability).” [0026]) for managing audio during a conference (“In particular, the present invention is directed towards synchronization and/or mixing of audio and video streams during a networked video conference call.” [0003]),
the first decoded audio (i.e. “output audio stream 392” in Fig. 3) corresponding to a decoded version of a first audio stream (i.e. “input audio streams 302” in Fig. 3) from the first device (i.e. “client device 110A” in Fig. 2); 
the second decoded audio corresponding to a decoded version of a second audio stream from a second device (similarly audio decoding activities this time for “client device 110B” in Fig. 2) associated with a second participant (i.e. “Sender 102B” in Fig. 2) of the conference (“it will be assumed that each sender client 110A-B creates the data streams for its respective participant 102A-B; that these data streams are sent to server 120 which retransmits them to the receiver client 110C, and that the receiver client 110C is responsible for synchronizing and mixing the data streams to produce the appropriate data streams for display to the receiver 102C.  That is, in this example, all synchronization and mixing are performed locally at the client 110C.” [0028]).
Thapa further discloses a video conference method including participants in different locations (“This is a multi-point example since the three participants are at different network locations.” [0023]); hence, different participants at different locations must use different decoders (“The client receives audio streams 302 and video streams 304 from the various sender clients 110A-B (via the server 120) and produces an output audio stream 392 (typically, only one) and output video stream(s) 394 (possibly, more than one) for display on the receiver client 110C.” [0032]).
Thapa further discloses in Fig. 3 a first decoder (i.e. “audio decoder 320”) configured to decode a first audio stream (i.e. “input audio streams 302”) from a first device (“To allow full video and audio capability, each client 110 preferably includes at least one camera (for video capture), display (for video play back), microphone (for audio capture) and speaker (for audio play back).” [0021]; [0025]) associated with a first participant (e.g. “participant 102A” in Fig. 1) of the conference ("FIG. 3 is a block diagram of one example of a client for synchronizing and mixing audio and video streams according to the invention. The client includes audio buffers 310, audio stream decoders 320, audio mixer 330 and audio output module 340. The client also includes video buffers 350, video stream decoders 360, optional video mixer 370 and video output module 380." [0032]; Fig. 3) to generate a first decoded audio (i.e. “output audio stream 392” [0032-0033]; Fig. 3); 
a second decoder configured to decode a second audio stream from a second device (duplicate setup similar to the first decoder in Fig. 3) associated with a second participant (e.g. “participant 102B” in Fig. 1) of the conference to generate a second decoded audio (similarly to “output audio stream 392” [0032-0033]; Fig. 3).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, to include Thapa’s method of synchronization and/or mixing of audio and video streams during a networked video conference call, in order to provide an efficient audio conference system (Thapa [0006-0007]).
Sun, Kuhr, and Thapa do not specifically teach based upon a first user indication, (a) determine that the first participant corresponds to a preferred participant (b) provide a control signal to a network device to adjust a first bit rate of the first audio stream and (c) increase a first gain of the first decoded audio; steer first decoded audio to be projected from a speaker at a first angle, the first angle based upon a second user indication of an angle at which audio is to be projected, the first decoded audio corresponding to a decoded version of the first audio stream from the first device; steer second decoded audio to be projected from the speaker at a second angle, the second decoded audio corresponding to a decoded version of the second audio stream from a second device associated with a second participant of the conference, and the first decoded audio synchronized with the second decoded audio.
In an analogous art, Yeh discloses based upon a first user indication (i.e. “angle of incidence” [0036]), (a) determine that the first participant (i.e. “human speaker 102” in Fig. 7) corresponds to a preferred participant (i.e. “speaker 102” and not “listener 126” [0036]; also Fig. 7)
and (c) increase a first gain of the first decoded audio (i.e. increase sound volume as distance between speaker and listener increases “An additional spatial filter effect that follows is to lower the intensity, or volume, attenuate the higher frequencies, and add some forms of reverberation, for example, whereby the listener perceives the audio source increasing in distance from the listener.  Again, this perceived effect is adjustable by the listener.  Thus, the perceived audio source can be translated to the left for example 132, translated in added distance 130 or a combination of left translation and added distance 134.” [0005]);
steer the first decoded audio to be projected from a speaker at a first angle (“derives an angle of incidence, φ, for the voice of the human speaker 102 preferably relative to the microphone 402 or center of the microphone array 502, 602” [0036]), the first angle based upon a second user indication of an angle at which audio is to be projected (“an angle of incidence via listener inputs 121 at the listener interface 120” [0036]), the first decoded audio corresponding to a decoded version of the first audio stream from the first device (“encapsulated along with the voice data 709 with an extended VoIP 710, accommodating this data, and the data is transmitted as packets 140, 150 via a network 110” [0036]);
steer the second decoded audio to be projected from the speaker at a second angle (“The spatial filtering of the second VoIP processing device 112 includes the angle of incidence information by interpolating 716 the selected HRTFs to account for an angle of incidence if not already overridden by the listener via listener inputs 121 at the listener interface 120. In this example, the human speaker 102 is left of center of a microphone assembly 402 or array 502, 602. With the listener 126 having set the source preference to be that the human speaker acoustical image is nominally facing the listener when the listener is facing the audio speaker array 122, 124, then the resulting "imaged" audio source 728 is perceived to be right of center of the audio speaker array 122, 124.” [0036]), the second decoded audio corresponding to a decoded version of the second audio stream from a second device associated with a second participant of the conference (“the resulting "imaged" audio source 728 is perceived to be right of center of the audio speaker array 122, 124.” [0036]), and the first decoded audio synchronized with the second decoded audio (“As illustrated in FIG. 8, the first transmitted angle of incidence, the second transmitted angle of incidence substantially orthogonal to the first transmitted angle of incidence, or a relative distance setting or any combination 717 is used to drive the interpolation 804 of the HRTF database to a solution of filter coefficients between previously quantified incident angles, i.e., those having filter coefficient arrays based on acoustical measurements, so that the convolution includes the spatial filters adjusted for one or both of the transmitted incidence angles. In embodiments having planar implementations, the HRTFs may be a function of frequency and azimuth angle. In a horizontal plane HRTF interpolation example, the interpolation can be a linear interpolation of the HRTF coefficients for the stored azimuth angles of incidence that bound the derived azimuth angle of incidence.” [0037]);
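For illustration only, the linear interpolation quoted from Yeh [0037], between the stored azimuth angles that bound the derived azimuth angle, can be sketched as below. This is a minimal sketch under stated assumptions, not Yeh's implementation: the function name `interpolate_hrtf`, the dictionary layout of the HRTF table, and the clamping of angles outside the stored range are all hypothetical choices made for the example.

```python
from bisect import bisect_right


def interpolate_hrtf(table, azimuth_deg):
    """Linearly interpolate filter coefficients between bounding stored azimuths.

    table: dict mapping a stored azimuth (degrees) to a list of filter
           coefficients; every list is assumed to have the same length.
    azimuth_deg: derived azimuth angle of incidence, in degrees.
    Angles outside the stored range are clamped to the nearest stored entry
    (a simplification; no wrap-around is attempted).
    """
    angles = sorted(table)
    if azimuth_deg <= angles[0]:
        return list(table[angles[0]])
    if azimuth_deg >= angles[-1]:
        return list(table[angles[-1]])
    upper_idx = bisect_right(angles, azimuth_deg)
    lower, upper = angles[upper_idx - 1], angles[upper_idx]
    weight = (azimuth_deg - lower) / (upper - lower)
    return [(1.0 - weight) * a + weight * b
            for a, b in zip(table[lower], table[upper])]


# Hypothetical two-tap filters stored at 0 and 30 degrees.
hrtf_table = {0.0: [1.0, 0.0], 30.0: [0.0, 1.0]}
print(interpolate_hrtf(hrtf_table, 15.0))
```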
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr and Thapa, to include Yeh’s method of producing, adjusting and maintaining natural sounds, e.g., speaking voices, in a telecommunication environment, in order to sync up multiple audio streams during a conferencing session (Yeh [Abstract]).
Sun, Kuhr, Thapa, and Yeh do not specifically teach (b) provide a control signal to a network device to adjust a first bit rate of the first audio stream.
In an analogous art, Hourunranta discloses (b) provide a control signal (i.e. “input control information”) to a network device (i.e. “speech bit-rate control element 115” in Fig. 5) to adjust a first bit rate of the first audio stream (e.g. “speech input signal 114” in Fig. 5; “Correspondingly the speech encoder 110 is provided with a speech bit-rate control element 115 that controls the operation of the speech encoder 110 according to the input control information. Further to the prior art solution of FIG. 1, the terminal also comprises an input element 130 for transferring preference information 131 that defines the preferred proportions between different media types in the multiplexed bit-stream 123. The information is preferably transformed into control information 132 that is input directly or indirectly to the control elements 115, 103 of the encoders.” Col. 5, lines 53-62 and also “second encoder is arranged to adjust the bit-rate of the second bit-stream according to the received control signal from the controller.” [Claim 1 text]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr, Thapa, and Yeh, to include Hourunranta’s method for bit-rate control in a multimedia device, in order to sync up multiple audio streams during a conferencing session (Hourunranta, col. 2, lines 10-27).
Sun, Kuhr, Thapa, Yeh, and Hourunranta do not specifically teach decode a first audio stream, based upon a determination to prioritize the first audio stream, and the determination to prioritize the first audio stream being based upon a frame energy or an active level.
In an analogous art, Clavel discloses decode a first audio stream, based upon a determination to prioritize the first audio stream (“Presenting the chat group audio can include selecting a volume limit based on the chat group priority, mixing the constituent audio streams of the chat group into a chat group audio stream, processing the mixed chat group audio stream to meet the volume limit, and playing the processed chat group audio stream at the user device.” [0044]), and the determination to prioritize the first audio stream being based upon a frame energy or an active level (The Specification defines active level similar to the volume of the audio stream [0042]; “The volume of each chat group audio stream preferably varies directly with the assigned priorities, wherein the highest priority chat groups are preferably the loudest (e.g., has the highest amplitude limit) and the lowest priority chat groups are preferably the quietest (e.g., has the lowest amplitude limit).” [0043] and furthermore “Determining the audio settings for each chat group preferably includes determining the discernability of the chat group audio stream.  While the audio streams for all chat groups are preferably played to the user, the audio streams are preferably processed to varying degrees to adjust the discernibility of the chat group audio stream, dependent on the chat group priority.  The discernibility of the chat group audio streams preferably varies directly with the assigned priorities, wherein the discernability of the chat group audio stream increases with increasing priority (e.g., the highest priority chat groups are preferably the most discernible and the lowest priority chat groups are preferably the least discernible), an example of which is shown in FIG. 2” [0041]).
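For illustration only, the sequence quoted from Clavel [0044] (select a volume limit based on priority, mix the constituent streams, process the mix to meet the limit) can be sketched as below. This is a minimal sketch under stated assumptions, not Clavel's implementation: the function name `mix_with_volume_limit`, the priority labels, the peak-amplitude limits, and the use of simple peak scaling to meet the limit are hypothetical choices made for the example.

```python
def mix_with_volume_limit(streams, priority, limits):
    """Mix equal-length sample lists and scale the mix to a priority-based limit.

    streams: list of sample lists (floats), all assumed to be the same length.
    priority: key into `limits` (hypothetical labels such as "high" or "low").
    limits: dict mapping a priority label to a peak-amplitude limit.
    """
    # Mix by summing the streams sample by sample.
    mixed = [sum(samples) for samples in zip(*streams)]
    limit = limits[priority]
    peak = max((abs(sample) for sample in mixed), default=0.0)
    if peak <= limit:
        return mixed
    # Scale the whole mix down so that its peak equals the limit.
    scale = limit / peak
    return [sample * scale for sample in mixed]


# Higher priority gets a higher amplitude limit, so it plays louder.
limits = {"high": 1.0, "low": 0.5}
group = [[0.5, -0.5], [0.5, -0.5]]
print(mix_with_volume_limit(group, "high", limits))
print(mix_with_volume_limit(group, "low", limits))
```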
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr, Thapa, Yeh, and Hourunranta, to include Clavel’s method of audio parameter setting determination based on the chat group priority, in order to sync up multiple audio streams during a conferencing session (Clavel [0034]). Thus, a person of ordinary skill would have appreciated the ability to incorporate Clavel’s method of audio parameter setting determination based on the chat group priority into Sun’s method for generating adaptive audio content using HRTF technology since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.

Regarding claim 10
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, discloses the mobile device of claim 1.
Sun further discloses an antenna configured to receive the first audio stream (“As shown, the user terminal 700 includes an antenna(s) 712 in operable communication with a transmitter 714 and a receiver 716.” [0081]; Fig. 7).

Regarding claim 13 (Currently Amended)
A method for managing audio during a conference, the method comprising:
decoding, at a first decoder of a mobile device, a first audio stream, based upon a determination to prioritize the first audio stream, received from a first device associated with a first participant of the conference to generate a first decoded audio, the determination to prioritize the first audio stream being based upon a frame energy or an active level;
decoding, at a second decoder of the mobile device, a second audio stream received from a second device associated with a second participant of the conference to generate a second decoded audio;
determining that a first device is associated with a first participant of the conference;
based upon a first user indication, (a) determine that the first participant corresponds to a preferred participant (b) provide a control signal to a network device to adjust a first bit rate of the first audio stream and (c) increase a first gain of the first decoded audio;
steering, at the spatial steering processor, the first decoded audio to be projected from a speaker at a first angle, the first angle based upon a second user indication of an angle at which audio is to be projected, the first decoded audio corresponding to a decoded version of a first audio stream from the first device; and
steering, at the spatial steering processor, the second decoded audio to be projected from the speaker at a second angle, the second decoded audio corresponding to a decoded version of a second audio stream from a second device associated with a second participant of the conference, and the first decoded audio synchronized with the second decoded audio;
receiving a signal indicating detection of head movement associated with a user of the mobile device, wherein the user of the mobile device is distinct from the first participant and second participant; and
shifting the first angle and the second angle by a shift amount in response to receiving the signal.
The scope and subject matter of method claim 13 are drawn to the method of using the corresponding apparatus claimed in claim 1. Therefore, method claim 13 corresponds to apparatus claim 1 and is rejected for the same reasons of obviousness as set forth in the rejection of claim 1 above.

Regarding claim 23
The method of claim 13, wherein the first audio stream is received via an antenna of the mobile device.
The scope and subject matter of method claim 23 are drawn to the method of using the corresponding apparatus claimed in claim 10. Therefore, method claim 23 corresponds to apparatus claim 10 and is rejected for the same reasons of obviousness as set forth in the rejection of claim 10 above.

Regarding claim 25 (Currently Amended)
A non-transitory computer-readable medium comprising instructions for managing audio during a conference, the instructions, when executed by a spatial steering processor in a mobile device, cause the spatial steering processor to perform operations comprising:
decoding, at a first decoder of a mobile device, a first audio stream, based upon a determination to prioritize the first audio stream, received from a first device associated with a first participant of the conference to generate a first decoded audio, the determination to prioritize the first audio stream being based upon a frame energy or an active level;
decoding, at a second decoder of the mobile device, a second audio stream received from a second device associated with a second participant of the conference to generate a second decoded audio;
determining that a first device is associated with a first participant of the conference;
based upon a first user indication, (a) determine that the first participant corresponds to a preferred participant (b) provide a control signal to a network device to adjust a first bit rate of the first audio stream and (c) increase a first gain of the first decoded audio;
steering the first decoded audio to be projected from a speaker at a first angle, the first angle based upon a second user indication of an angle at which audio is to be projected, the first decoded audio corresponding to a decoded version of a first audio stream from the first device;
steering the second decoded audio to be projected from the speaker at a second angle, the second decoded audio corresponding to a decoded version of a second audio stream from a second device associated with a second participant of the conference, and the first decoded audio synchronized with the second decoded audio; and
receiving a signal indicating detection of head movement associated with a user of the mobile device, wherein the user of the mobile device is distinct from the first participant and second participant; and
shifting the first angle and the second angle by a shift amount in response to receiving the signal.
The scope and subject matter of non-transitory computer readable medium claim 25 is drawn to the computer program product of using the corresponding apparatus claimed in claim 1. Therefore computer program product claim 25 corresponds to apparatus claim 1 and is rejected for the same reasons of obviousness as used in claim 1 rejection above.

Regarding claim 28 (Currently Amended)
An apparatus for managing audio during a conference, the apparatus comprising:
means for decoding, at a first decoder of a mobile device, a first audio stream received from a first device, based upon a determination to prioritize the first audio stream, associated with a first participant of the conference to generate a first decoded audio, the determination to prioritize the first audio stream being based upon a frame energy or an active level;
means for decoding, at a second decoder of the mobile device, a second audio stream received from a second device associated with a second participant of the conference to generate a second decoded audio;
means for determining that a first device is associated with a first participant of the conference;
means for determining, based upon a first user indication that the first participant corresponds to a preferred participant;
means for providing, based upon the first user indication, a control signal to a network device to adjust a first bit rate of the first audio stream;
means for increasing, based upon the first user indication, a first gain of the first decoded audio;
means for steering the first decoded audio to be projected from means for projecting at a first angle, the first angle based upon a second user indication of an angle at which audio is to be projected, and steering the second decoded audio to be projected from the means for projecting at a second angle, the first decoded audio corresponding to a decoded version of a first audio stream from the first device, the second decoded audio corresponding to a decoded version of a second audio stream from a second device associated with a second participant of the conference, and the first decoded audio synchronized with the second decoded audio; and
the means for projecting; and
means for receiving a signal indicating detection of head movement associated with a user of the mobile device, wherein the user of the mobile device is distinct from the first participant and second participant; and
means for shifting the first angle and the second angle by a shift amount in response to receiving the signal.
The scope and subject matter of apparatus claim 28 is similar to the scope and subject matter as claimed in apparatus claim 1. Therefore apparatus claim 28 corresponds to apparatus claim 1 and is rejected for the same reasons of obviousness as used in claim 1 rejection above.

Claims 2, 14, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Sun, in view of Kuhr, Thapa, Yeh, Hourunranta, and Clavel, and further in view of Oh et al. US Pub 2010/0145711 (hereinafter “Oh”). 
Regarding claim 2
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, previously discloses the mobile device of claim 1, wherein the spatial steering processor is configured to:
Sun further discloses apply a first head-related transfer function (HRTF) to the first decoded audio to steer the first decoded audio; and apply a second HRTF to the second decoded audio to steer the second decoded audio (“binaural rendering, in which audio is played back through a pair of earphones or headphones, may be desired since users are expected to listen to the audio files on mobile devices.  B-format to binaural conversion can be achieved approximately by summing loudspeaker array feeds that are each filtered by a head-related transfer functions (HRTF) matching the loudspeaker position.” [0067-0068]).
Sun/Kuhr/Thapa/Yeh/Hourunranta/Clavel do not specifically teach select, based on determining that the first participant corresponds to the preferred participant, a first head-related transfer function (HRTF) for the first device, the first HRTF corresponding to the first angle and a first gain;
select, based on determining that the second participant does not correspond to the preferred participant, a second HRTF for the second device, the second HRTF corresponding to the second angle and a second gain, wherein the first gain is higher than the second gain:
apply the first HRTF head related transfer function (HRTF) to the first decoded audio to steer the first decoded audio; and
apply the second HRTF to the second decoded audio to steer the second decoded audio.
In an analogous art, Oh discloses select based on determining that the first participant corresponds to the preferred participant (as previously taught by Yeh Fig. 4), a first head-related transfer function (HRTF) for the first device, the first HRTF corresponding to the first angle and a first gain (“the first gain information generating unit 114a is able to generate first gain information.  Furthermore, in case that a downmix signal (DMX) is a mono signal, when the extra multi-channel information generating unit 116 does not generate HRTF information for a binaural mode, the first gain information generating unit 114a is able to generate first gain information.” [0052]);
select, based on determining that the second participant does not correspond to the preferred participant (as previously taught by Yeh Fig. 4), a second HRTF for the second device, the second HRTF corresponding to the second angle and a second gain (“In case that HTRF information for a binaural mode is generated, second gain information for adjusting an object gain can be included within the HRTF information.  So, if the first gain information for adjusting a gain of object is generated, generation and transport of gain information may be overlapped.  Details for the binaural mode and the like will be explained later together with the extra multi-channel generating unit 116.” [0052]), wherein the first gain is higher than the second gain (the gain is a user-defined parameter: “Another object of the present invention is to provide an apparatus for processing an audio signal and method thereof, by which gain and panning of an object can be controlled based on a selection made by a user.” [0007]; therefore, one skilled in the art could readily apply Oh’s teaching to configure the first gain to be higher than the second gain);
apply the first HRTF head related transfer function (HRTF) to the first decoded audio to steer the first decoded audio; and apply the second HRTF to the second decoded audio to steer the second decoded audio (“In this case, an extra multi-channel parameter (EMI) includes HRTF (head-related transfer functions) information for a binaural mode and second gain information.  Meanwhile, details for the object information (OI), the mix information (MXI), the first gain information, the extra multi-channel information (EMI) and the like will be explained later with reference to FIG. 2.  Moreover, in case of generating the first gain information, the information generating unit 110 transfers multi-channel information (MI) including the first gain information to the multi-channel decoder 130.  In case of not generating the first gain information, the information generating unit 110 transfers multi-channel information (MI) excluding the first gain information and the extra multi-channel information (EMI) to the multi-channel decoder 130.  Its details will be explained later with reference to FIG. 2.  In addition, the information generating unit 110 is capable of generating downmix processing information (DPI) using the object information (OI) and the mix information (MXI).” [0044]).
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, to include Oh’s method for decoding an audio signal, in order to sync up multiple audio streams during a conferencing session “Generally, while downmixing several audio objects to be a mono or stereo signal, parameters from the individual object signals can be extracted.  These parameters can be used in a decoder of an audio signal, and positioning/panning of the individual sources can be controlled by user' selection.” Oh [0002]. Thus, a person of ordinary skill would have appreciated the ability to incorporate Oh’s method for decoding an audio signal into Sun’s method for generating adaptive audio content using HRTF technology since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
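For illustration only, and not as a characterization of how Sun or Oh implements these limitations, the operation of applying an HRTF with an associated gain to a decoded audio stream can be sketched as follows. The sketch assumes the HRTF is represented as a pair of short time-domain impulse responses (left and right ear); the function names `convolve` and `steer`, and all numeric values, are hypothetical.

```python
def convolve(signal, impulse):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def steer(samples, hrir_left, hrir_right, gain):
    """Scale decoded audio by a gain, then filter it with a left/right
    impulse-response pair standing in for the HRTF of one angle.
    (Hypothetical helper; not taken from any cited reference.)"""
    scaled = [gain * s for s in samples]
    return convolve(scaled, hrir_left), convolve(scaled, hrir_right)


# Assumed example: the preferred participant gets the higher gain.
first_left, first_right = steer([1.0, 0.0], [1.0, 0.5], [0.5, 0.25], gain=2.0)
second_left, second_right = steer([1.0, 0.0], [0.5, 0.25], [1.0, 0.5], gain=1.0)
```

In this sketch the first and second decoded audio are steered with different impulse-response pairs (different angles), and the first gain is higher than the second gain.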

Regarding claim 14
The method of claim 13, further comprising:
applying a first head-related transfer function (HRTF) to the first decoded audio to steer the first decoded audio; and
applying a second HRTF to the second decoded audio to steer the second decoded audio.
The scope and subject matter of method claim 14 is drawn to the method of using the corresponding apparatus claimed in claim 2. Therefore method claim 14 corresponds to apparatus claim 2 and is rejected for the same reasons of obviousness as used in claim 2 rejection above.

Regarding claim 26
The non-transitory computer-readable medium of claim 25, wherein the operations further comprise:
applying a first head-related transfer function (HRTF) to the first decoded audio to steer the first decoded audio; and
applying a second HRTF to the second decoded audio to steer the second decoded audio.
The scope and subject matter of non-transitory computer readable medium claim 26 is drawn to the computer program product of using the corresponding apparatus claimed in claim 2. Therefore computer program product claim 26 corresponds to apparatus claim 2 and is rejected for the same reasons of obviousness as used in claim 2 rejection above.

Claims 5, 9, 17, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Sun, in view of Kuhr, Thapa, Yeh, Hourunranta, and Clavel, and further in view of Knappe et al. US Patent 6,850,496 (hereinafter “Knappe”). 
Regarding claim 5
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, previously discloses the mobile device of claim 1, further comprising:
Sun/Kuhr/Thapa/Yeh/Hourunranta/Clavel do not specifically teach a first buffer configured to receive the first audio stream from the first device; a second buffer configured to receive the second audio stream from the second device; and
a delay controller configured to generate a control signal, the control signal provided to the first buffer and to the second buffer to synchronize first buffered audio that is output from the first buffer with second buffered audio that is output from the second buffer, wherein the first decoded audio is synchronized with the second decoded audio based on the synchronization of the first buffered audio and the second buffered audio.
In an analogous art, Knappe discloses a method for managing audio during a conference (“This present invention relates generally to voice conferencing, and more particularly to systems and methods for use with packet voice conferencing to create a perception of spatial separation between conference callers.” Col. 1, lines 6-10), including:
a first buffer (i.e. “jitter buffer 90” in Fig. 6) configured to receive the first audio stream from the first device (“Endpoint A also receives two packet voice data streams, one over virtual channel 68 from endpoint B, and one over virtual channel 74 from endpoint C.” col. 6, lines 11-13);
a second buffer (i.e. “jitter buffer 94” in Fig. 6) configured to receive the second audio stream from the second device (“Endpoint A also receives two packet voice data streams, one over virtual channel 68 from endpoint B, and one over virtual channel 74 from endpoint C.” col. 6, lines 11-13); and
a delay controller (i.e. “controller 88” in Fig. 6) configured to generate a control signal, the control signal provided to the first buffer and to the second buffer to synchronize first buffered audio that is outputted from the first buffer with second buffered audio that is output from the second buffer (“Controller 88 manages each channel mapper by providing mapping instructions, e.g., the number of input voice data channels, the number of output presentation mixing channels, and the presentation sound field sector that should be occupied by the presentation mixing channels.  This last instruction can be replaced by more specific instructions, e.g., delay the left channel 2 ms, mix 50% of the left channel into the right channel, etc., to accomplish the mapping.  In the former case, the channel mapper itself contains the ability to calculate a mapping to a desired sound field; in the latter case, these computations reside in the controller, and the channel mapper performs basic signal processing functions such as channel delaying, mixing, phase shifting, etc., as instructed.” Col. 7, lines 22-36),
wherein the first decoded audio is synchronized with the second decoded audio based on the synchronization of the first buffered audio and the second buffered audio (“In the embodiment shown in FIG. 6, controller 88 controls jitter buffer synchronization by manipulating the relative delays of the buffers” col. 7, lines 18-19, also lines 6-17).
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, to include Knappe’s method for packet voice conferencing in order to minimize disruptions during a conferencing session “A system and method are disclosed for packet voice conferencing.  The system and method divide a conferencing presentation sound field into sectors, and allocate one or more sectors to each conferencing endpoint.  At some point between capture and playout, the voice data from each endpoint is mapped into its designated sector or sectors.  Thereafter, when the voice data from a plurality of participants from multiple endpoints is combined, a listener can identify a unique apparent location within the presentation sound field for each participant.  The system allows a conference participant to increase their comprehension when multiple participants speak simultaneously, as well as alleviate confusion as to who is speaking at any given time.” Knappe, [Abstract]. Thus, a person of ordinary skill would have appreciated the ability to incorporate Knappe’s method for packet voice conferencing into Sun’s method for generating adaptive audio content using HRTF technology since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.

Regarding claim 9
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, Clavel, and Knappe, previously discloses the mobile device of claim 5, 
Knappe further discloses wherein the first buffer comprises a first de-jitter buffer (i.e. “jitter buffer 90” in Fig. 6), and wherein the second buffer comprises a second de-jitter buffer (i.e. “jitter buffer 94” in Fig. 6; “Jitter buffers 90, 92, and 94 receive the voice data streams output by decoders 84 and 86.  The purpose of the jitter buffers is to provide for smooth audio playout, i.e., to account for the normal fluctuations in voice data sample arrival rate from the decoders (both due to network delays and to the fact that many samples arrive in each packet).  Each jitter buffer ideally attempts to insert as little delay in the transmission path as possible, while ensuring that audio playout is rarely, if ever, starved for samples.  Those skilled in the art recognize that various methods of jitter buffer management are well known, and the selection of a particular method is left as a design choice.  In the embodiment shown in FIG. 6, controller 88 controls jitter buffer synchronization by manipulating the relative delays of the buffers.” Col. 7, lines 6-19).

Regarding claim 17
The method of claim 13, further comprising:
receiving, at a first buffer of the mobile device, the first audio stream from the first device;
receiving, at a second buffer of the mobile device, the second audio stream from the second device; and
generating a control signal at a delay controller for the mobile device, the control signal provided to the first buffer and to the second buffer to synchronize first buffered audio that is outputted from the first buffer with second buffered audio that is output from the second buffer,
wherein the first decoded audio is synchronized with the second decoded audio based on the synchronization of the first buffered audio and the second buffered audio.
The scope and subject matter of method claim 17 is drawn to the method of using the corresponding apparatus claimed in claim 5. Therefore method claim 17 corresponds to apparatus claim 5 and is rejected for the same reasons of obviousness as used in claim 5 rejection above.

Regarding claim 21
The method of claim 17, wherein the first buffer comprises a first de-jitter buffer, and wherein the second buffer comprises a second de-jitter buffer.
The scope and subject matter of method claim 21 is drawn to the method of using the corresponding apparatus claimed in claim 9. Therefore method claim 21 corresponds to apparatus claim 9 and is rejected for the same reasons of obviousness as used in claim 9 rejection above.

Claims 6-8 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sun, in view of Kuhr, Thapa, Yeh, Hourunranta, Clavel, and Knappe, and further in view of Klingbeil et al. US Pub 2016/0105473 (hereinafter “Klingbeil”).
Regarding claim 6
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, Clavel, and Knappe, previously discloses the mobile device of claim 5, wherein the delay controller is further configured to:
The combination of Sun/Kuhr/Thapa/Yeh/Hourunranta/Clavel/Knappe does not specifically teach wherein the delay controller is further configured to: compare a first time stamp of the first audio stream with a second time stamp of the second audio stream, the first time stamp and the second time stamp based on a common clock source; and determine a time difference between the first time stamp and the second time stamp.
In an analogous art, Klingbeil discloses compare (“First, the method inserts any incoming packets into the jitter buffer. There may be zero or more packets available. Next, the method records the arrival time of the very first incoming packet, which is referred to as time TO. The subsequent audio packets will have an ideal arrival time relative to time TO at each fixed clock intervals. Next, on subsequent packet arrivals, the method compares the actual arrival time to the "ideal" arrival time relative to time TO.” [0028-0030]) a first time stamp of the first audio stream (i.e. “arrival time of the very first incoming packet”) with a second time stamp of the second audio stream (i.e. “subsequent packet arrivals”), the first time stamp and the second time stamp based on a common clock source (“In some embodiments of the present invention, the jitter buffer of the receive process of either the audio client node or the audio conferencing server operates based on a fixed clock interval.  The fixed clock intervals represent the times an audio packet is expected to arrive at the destination audio client.  The receive process may compute an expected arrival time for audio packets in a call session.” [0026]); and
determine a time difference between the first time stamp and the second time stamp (“The difference in the actual arrival time to the ideal arrival time is a measurement of the packet queuing delay.  Some packets may appear to arrive early relative to the ideal time since the establishment of the ideal time base is also subject to any jitter present at time TO.  Others packets may arrive later relative to the ideal time.” [0030]. One skilled in the art can use Klingbeil’s teaching to manipulate how the different time stamps of packet arrival can be measured and compared to one another).
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr, Thapa, Yeh, Hourunranta, Clavel, and Knappe, to include Klingbeil’s adaptive audio stream with latency compensation in order to sync up multiple audio streams using jitter buffer “In order to present a continuous unbroken audio stream to the listener, a jitter buffer at the destination node can be used to absorb the delay variation.  A jitter buffer is a specialized priority queue where the incoming audio packets are ordered by increasing audio timestamp.” Klingbeil [0003]. Thus, a person of ordinary skill would have appreciated the ability to incorporate Klingbeil’s adaptive audio stream with latency compensation into Sun’s method for generating adaptive audio content using HRTF technology since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
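For illustration only, the arrival-time comparison quoted from Klingbeil [0026], [0028-0030] can be sketched as follows: the arrival time of the first packet is recorded as the time base, each subsequent packet has an ideal arrival time at a fixed clock interval from that base, and the difference between actual and ideal arrival time is the measured queuing delay. The function name `queuing_delays`, the millisecond units, and the sample values are assumptions and do not appear in the reference.

```python
def queuing_delays(arrival_times_ms, interval_ms):
    """Return actual-minus-ideal arrival time for each packet.

    The first packet's arrival defines the time base; packet n is ideally
    expected at time_base + n * interval_ms. A negative value means the
    packet arrived early relative to the ideal time.
    (Hypothetical helper; names and units are assumed.)
    """
    if not arrival_times_ms:
        return []
    time_base = arrival_times_ms[0]
    return [
        actual - (time_base + n * interval_ms)
        for n, actual in enumerate(arrival_times_ms)
    ]


# Assumed example: packets nominally spaced 20 ms apart.
delays = queuing_delays([100, 121, 139, 165], 20)
# delays == [0, 1, -1, 5]
```

The third packet in this assumed example arrives 1 ms early, consistent with the quoted observation that some packets may appear to arrive early relative to the ideal time.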

Regarding claim 7
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, Clavel, Knappe, and Klingbeil, previously discloses the mobile device of claim 6, 
Knappe further discloses wherein the control signal (generated by controller 88 in Fig. 6) indicates to the first buffer to delay outputting the first buffered audio (“In the embodiment shown in FIG. 6, controller 88 controls jitter buffer synchronization by manipulating the relative delays of the buffers.” Col. 7, lines 18-19) by the time difference if the first time stamp indicates an earlier time than the second time stamp (specifically implemented using time stamp difference in Klingbeil’s teaching: “The jitter buffer is implemented as a priority queue in which the audio packets stored therein are sorted or ordered by increasing audio timestamp.  Accordingly, incoming audio packets, which may have unpredictable and out-of-order arrival times, are stored in the jitter buffer 16 in sorted order based on the audio timestamp from oldest to newest.” [0017]; [0043]).

Regarding claim 8
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, Clavel, Knappe, and Klingbeil, previously discloses the mobile device of claim 6, 
Knappe further discloses wherein the control signal (generated by controller 88 in Fig. 6) indicates to the second buffer to delay outputting the second buffered audio (“In the embodiment shown in FIG. 6, controller 88 controls jitter buffer synchronization by manipulating the relative delays of the buffers.” Col. 7, lines 18-19) by the time difference if the second time stamp indicates an earlier time than the first time stamp (specifically implemented using time stamp difference in Klingbeil’s teaching: “The jitter buffer is implemented as a priority queue in which the audio packets stored therein are sorted or ordered by increasing audio timestamp.  Accordingly, incoming audio packets, which may have unpredictable and out-of-order arrival times, are stored in the jitter buffer 16 in sorted order based on the audio timestamp from oldest to newest.” [0017]; [0043]).
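For illustration only, the delay logic recited in claims 7 and 8 (delay the buffer whose stream carries the earlier time stamp by the time difference) can be sketched as follows. This is a minimal sketch assuming both time stamps are expressed in the same units against a common clock; the function name `buffer_delays` and the sample values are hypothetical and are not taken from Knappe or Klingbeil.

```python
def buffer_delays(first_time_stamp, second_time_stamp):
    """Return (first_buffer_delay, second_buffer_delay).

    The buffer whose time stamp is earlier is delayed by the time
    difference so that both buffered outputs line up; the other buffer
    is not delayed. Equal time stamps require no delay.
    (Hypothetical helper; not taken from any cited reference.)
    """
    difference = abs(first_time_stamp - second_time_stamp)
    if first_time_stamp < second_time_stamp:
        return (difference, 0)   # claim 7 case: delay the first buffer
    if second_time_stamp < first_time_stamp:
        return (0, difference)   # claim 8 case: delay the second buffer
    return (0, 0)


# Assumed example values, in milliseconds.
control_signal = buffer_delays(1000, 1040)
# control_signal == (40, 0)
```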

Regarding claim 18
The method of claim 17, further comprising:
comparing, at the mobile device, a first time stamp of the first audio stream with a second time stamp of the second audio stream, the first time stamp and the second time stamp based on a common clock source; and
determining, at the mobile device, a time difference between the first time stamp and the second time stamp.
The scope and subject matter of method claim 18 is drawn to the method of using the corresponding apparatus claimed in claim 6. Therefore method claim 18 corresponds to apparatus claim 6 and is rejected for the same reasons of obviousness as used in claim 6 rejection above.

Regarding claim 19
The method of claim 18, wherein the control signal indicates to the first buffer to delay outputting the first buffered audio by the time difference if the first time stamp indicates an earlier time than the second time stamp.
The scope and subject matter of method claim 19 is drawn to the method of using the corresponding apparatus claimed in claim 7. Therefore method claim 19 corresponds to apparatus claim 7 and is rejected for the same reasons of obviousness as used in claim 7 rejection above.

Regarding claim 20
The method of claim 18, wherein the control signal indicates to the second buffer to delay outputting the second buffered audio by the time difference if the second time stamp indicates an earlier time than the first time stamp.
The scope and subject matter of method claim 20 is drawn to the method of using the corresponding apparatus claimed in claim 8. Therefore method claim 20 corresponds to apparatus claim 8 and is rejected for the same reasons of obviousness as used in claim 8 rejection above.

Claims 11, 24, 29, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Sun, in view of Kuhr, Thapa, Yeh, Hourunranta, and Clavel, and further in view of Chan et al. US Pub 2009/0299739 (hereinafter “Chan”). 
Regarding claim 11
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, previously discloses the mobile device of claim 1, further comprising a modem that includes the spatial steering processor.
Sun/Kuhr/Thapa/Yeh/Hourunranta/Clavel do not specifically teach the mobile device of claim 1, further comprising a modem that includes the spatial steering processor.
In an analogous art, Chan discloses the mobile device (i.e. “communications device D50”), further comprising a modem (i.e. “mobile station modem (MSM)”) that includes the spatial steering processor (“FIG. 34 shows a block diagram of a communications device D50 that is an implementation of device D10.  Device D50 includes a chip or chipset CS10 (e.g., a mobile station modem (MSM) chipset) that includes apparatus MF100.  Chip/chipset CS10 may include one or more processors, which may be configured to execute all or part of apparatus MF100 (e.g., as instructions).  Chip/chipset CS10 includes a receiver, which is configured to receive a radio-frequency (RF) communications signal and to decode and reproduce an audio signal encoded within the RF signal, and a transmitter, which is configured to encode an audio signal that is based on the processed multichannel signal produced by apparatus MF100 and to transmit an RF communications signal that describes the encoded audio signal.  One or more processors of chip/chipset CS10 may be configured to perform a spatial processing operation as described above on the processed multichannel signal (e.g., one or more operations that determine the distance between the audio sensing device and a particular sound source, reduce noise, enhance signal components that arrive from a particular direction, and/or separate one or more sound components from other environmental sounds), such that the encoded audio signal is based on the spatially processed signal.” [0172]).
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, to include Chan’s method of processing a multichannel audio signal in order to sync up multiple audio streams “A method for processing a multichannel audio signal may be configured to control the amplitude of one channel of the signal relative to another based on the levels of the two channels.  One such example uses a bias factor, which is based on a standard orientation of an audio sensing device relative to a directional acoustic information source, for amplitude control of information segments of the signal.” Chan [Abstract]. Thus, a person of ordinary skill would have appreciated the ability to incorporate Chan’s method of processing a multichannel audio signal into Sun’s method for generating adaptive audio content using HRTF technology since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.

Regarding claim 24
The method of claim 13, wherein the spatial steering processor is included in a modem of the mobile device.
The scope and subject matter of method claim 24 is drawn to the method of using the corresponding apparatus claimed in claim 11. Therefore method claim 24 corresponds to apparatus claim 11 and is rejected for the same reasons of obviousness as used in claim 11 rejection above.

Regarding claim 29
The apparatus of claim 28, wherein the means for steering the first decoded audio and the second decoded audio is included in a modem of a mobile device.
The scope and subject matter of apparatus claim 29 is similar to the scope and subject matter as claimed in apparatus claim 11. Therefore apparatus claim 29 corresponds to apparatus claim 11 and is rejected for the same reasons of obviousness as used in claim 11 rejection above.

Regarding claim 30
The apparatus of claim 29, wherein the first audio stream is received via an antenna of the mobile device.
The scope and subject matter of apparatus claim 30 is similar to the scope and subject matter as claimed in apparatus claim 10. Therefore apparatus claim 30 corresponds to apparatus claim 10 and is rejected for the same reasons of obviousness as used in claim 10 rejection above.

Claims 12, 22, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Sun, in view of Kuhr, Thapa, Yeh, Hourunranta, and Clavel, and further in view of Heikkinen et al. US Patent 7,394,833 (hereinafter “Heikkinen”). 
Regarding claim 12
Sun, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, previously discloses the mobile device of claim 1, 
Sun/Kuhr/Thapa/Yeh/Hourunranta/Clavel do not specifically teach wherein the mobile device, the first device, and the second device each comprise a user equipment (UE) that is compatible with a Third Generation Partnership Project (3GPP) standard.
In an analogous art, Heikkinen discloses wherein the mobile device, the first device, and the second device each comprise a user equipment (UE) (“The device could be a cellular telephone or a personal communicator, where the packetized encoded speech data is received through a wireless communications channel.” Col. 4, lines 14-16) that is compatible with a Third Generation Partnership Project (3GPP) standard (“3GPP Technical specification Group Services and System Aspects, 3G TS 26.090 V3.0.1, September 1999, 61 p.), which decodes the audio data and returns a decoded audio data frame to the RTP module 16 at step (5).” Col. 6, lines 4-8).
	Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Sun’s method for generating adaptive audio content using HRTF technology, as modified by Kuhr, Thapa, Yeh, Hourunranta, and Clavel, to include Heikkinen’s system and method for performing synchronization through the use of a modified speech decoder, in order to sync up multiple audio streams during a conferencing session “The device includes a jitter buffer for storing speech data and a jitter buffer controller, and the unit that generates the synchronization delay adjustment request comprises the jitter buffer controller.  The jitter buffer controller may determine an average amount of time that a frame resides in the jitter buffer; and can then adjust the synchronization delay so that the average duration approaches a desired jitter buffer residency duration.” Heikkinen, col. 4, lines 19-26. Thus, a person of ordinary skill would have appreciated the ability to incorporate Heikkinen’s system and method for performing synchronization through the use of a modified speech decoder into Sun’s method for generating adaptive audio content using HRTF technology since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.

Regarding claim 22
The method of claim 13, wherein the mobile device, the first device, and the second device each comprise a user equipment (UE) that is compatible with a Third Generation Partnership Project (3GPP) standard.
The scope and subject matter of method claim 22 are drawn to the method of using the corresponding apparatus claimed in claim 12. Therefore, method claim 22 corresponds to apparatus claim 12 and is rejected for the same reasons of obviousness as set forth in the rejection of claim 12 above.

Regarding claim 27
The non-transitory computer-readable medium of claim 25, wherein the mobile device, the first device, and the second device each comprise a user equipment (UE) that is compatible with a Third Generation Partnership Project (3GPP) standard.
The scope and subject matter of non-transitory computer-readable medium claim 27 are drawn to the computer program product for use with the corresponding apparatus claimed in claim 12. Therefore, computer-readable medium claim 27 corresponds to apparatus claim 12 and is rejected for the same reasons of obviousness as set forth in the rejection of claim 12 above.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHUONG M NGUYEN whose telephone number is (571)272-8184. The examiner can normally be reached M-F 10:00am - 6:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Derrick Ferris, can be reached at 571-272-3123. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHUONG M NGUYEN/Patent Examiner, Art Unit 2411
/GARY MUI/Primary Examiner, Art Unit 2464