DETAILED ACTION
Notice of Pre-AIA or AIA Status
In the present application, filed on or after March 16, 2013, claims 1-2, 4-11, 13-22, and 26-27 have been considered and examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 11/02/2020 has been entered.

Response to Applicant’s Arguments/Remarks
Applicant’s arguments, see Remarks, filed 11/02/2020, with respect to the rejection(s) of claims 1-2, 4-11, and 13-26, have been fully considered but are not deemed persuasive. 
On pages 13-17 of Applicant’s remarks, Applicant argues that the combination of Lee, Classen, Kraemer, and Jahnke does not teach or suggest the claimed invention because none of the references discloses determining locations of a first mobile device and a second device relative to each other during recording of at least one first audio object, where the first mobile device uses that location in order to mix the audio object. Applicant further argues that, if the teachings of Lee and Kraemer are combined, the location of the rendering device relative to the recording device at the time the audio object was recorded does not matter.
However, Kraemer is relied upon to disclose a method for mixing audio sources to create an audio object based on location information of the audio sources during the recording (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown)), wherein the audio sources/rendering devices comprise audio data that can be recorded or otherwise obtained, including dialog, background music, and sounds generated by any item (such as a car, an airplane, or any prop), or any audio clip (Kraemer: Abstract, column 2 lines 32-56, column 3 lines 39-column 4 lines 4, and FIG. 9).
Thus, the locations of the sound sources/rendering devices relative to the recording device are considered location attributes of the sound sources/rendering devices. As a result, Applicant’s arguments are not deemed persuasive, and the previous rejections pertaining to the previous set of claims are sustained (see the reiterated rejections below for details).

Claim Rejections - 35 USC § 103
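In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.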
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 4-5, 9-11, 13-14, 18-22, and 26-27 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (Lee – US 2008/0207115 A1) in view of Classen (Classen – US 8,068,105 B1), Kraemer et al. (Kraemer – US 8,396,576 B2), and Jahnke (Jahnke – US 2005/0179701 A1).

As to claim 1, Lee discloses a method comprising:
the plurality of mobile devices comprising at least the first mobile device and a second mobile device (Lee: Abstract and FIG. 1 the first audio player 100 and the second audio player 200), where the first mobile device provides a user interface (Lee: [0027] and FIG. 2 the display unit 160);
receiving, with the first mobile device, at least one first audio object from the second mobile device (Lee: Abstract, [0010], [0013], [0030], [0054], and FIG. 4-5: According to an example embodiment of the present invention, an example of sharing the audio file based on a user at a first mobile device (i.e., a first audio player 100) and the other party at a second mobile device (i.e., a second audio player 200) is described), 
determining, with the first mobile device, the location of the second mobile device relative to the first mobile device (Lee: Abstract, [0010], [0013], [0030], [0054], and FIG. 4-5: the play-management unit is also used to output the audio file based on the moving direction of the other mobile device, to control the volume of the audio file based on the sensed relative location of the other mobile device, to play a predetermined sound audio, if the sensed relative location of the other mobile device is within outside a certain distance, and to terminate the currently played audio file, if the sensed relative location of the other mobile device is within a certain distance).

Lee does not explicitly disclose the method steps of 
initiating, with a first mobile device, a mixing session to create a spatial audio mix using data transfer between a plurality of mobile devices to form an audio scene;
wherein a location of the second mobile device relative to the first mobile device is determined based upon locations of the first mobile device and the second mobile device relative to each other during recording of the at least one first audio object;
providing, with the first mobile device, at least one input with the user interface of the first mobile device, where the at least one input is configured to be used to modify the at least one first audio object to form at least one modified first audio object; and
mixing, with the first mobile device, at least the at least one modified first audio object with at least one second audio object to create the spatial audio mix, where the mixing is based, at least partially, upon the determined location of the second mobile device relative to the first mobile device, where modification of the at least one first audio object is configured to control at least one spatial aspect of the audio scene, where the spatial audio mix is configured to be perceived from a listening position corresponding to a location of the first mobile device in the audio scene, where the at least one first audio object and the at least one second audio object correspond, at least partially, to parts of the audio scene represented with the spatial audio mix.

However, it has been known in the art of audio processing to implement the method steps of providing, with the first mobile device, at least one input with the user interface of the first mobile device, where the at least one input is configured to be used to modify the at least one first audio object to form at least one modified first audio object, as suggested by Classen, which discloses providing, with the first mobile device, at least one input with the user interface of the first mobile device (Classen: column 7 lines 16-30 and FIG. 1-4 the fader 475), where the at least one input is configured to be used to modify the at least one first audio object to form at least one modified first audio object (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5: the control data of an audio mixer can be processed in order to dynamically display changes in the audio mixer controls instead of the corresponding audio properties. In a mixing environment, audio properties can be adjusted continuously or at regular or irregular intervals using audio mixer controls. For example, in addition to fading in and fading out of a signal, the signal intensity of an object can change over time).
Therefore, in view of the teachings of Lee and Classen, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement, in the audio system of Lee, the method step of providing, with the first mobile device, at least one input with the user interface of the first mobile device, where the at least one input is configured to be used to modify the at least one first audio object to form at least one modified first audio object, as suggested by Classen. The motivation for this is to adjust audio properties of audio objects using an input device.

The combination of Lee and Classen does not explicitly disclose the mixing method steps as claimed.

However, it has been known in the art of spatial audio to implement the method steps as claimed, as suggested by Kraemer, which discloses the method steps of 
initiating, with a first mobile device, a mixing session to create a spatial audio mix using data transfer between a plurality of mobile devices to form an audio scene (Kraemer: FIG. 9 the microphone 920 and the location tracking devices 912);
wherein a location of the second mobile device relative to the first mobile device is determined based upon locations of the first mobile device and the second mobile device relative to each other during recording of the at least one first audio object (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by an individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor);
mixing, with the first mobile device, at least the at least one modified first audio object (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5) with at least one second audio object to create the spatial audio mix (Kraemer: column 16 lines 35-36: The audio signals are audio objects to which position data is applied, i.e. the audio signals are audio objects even prior to application of the position data), where the mixing is based, at least partially, upon the determined location of the second mobile device relative to the first mobile device (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by an individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor), where modification of the at least one first audio object is configured to control at least one spatial aspect of the audio scene (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5 and Kraemer: column 17 lines 51-58 and FIG. 9-10), where the at least one first audio object and the at least one second audio object correspond, at least partially, to parts of the audio scene represented with the spatial audio mix (Kraemer: column 17 lines 51-58 and FIG. 9-10: When a renderer receives two linked objects, the renderer can choose to render the two objects separately or together. Thus, instead of rendering a marching band as a single point source on one speaker, for instance, a renderer can render the marching band as a sound field of audio objects together on a variety of speakers. As the band moves in a video, for instance, the renderer can move the sound field across the speakers).
Therefore, in view of the teachings of Lee, Classen, and Kraemer, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement, in the audio system of Lee and Classen, the method steps as claimed, as suggested by Kraemer. The motivation for this is to create audio objects having location attributes.

The combination of Lee, Classen, and Kraemer does not explicitly disclose where the spatial audio mix is configured to be perceived from a listening position corresponding to a location of the first mobile device in the audio scene.
However, it has been known in the art of signal processing to implement where the spatial audio mix is configured to be perceived from a listening position corresponding to a location of the first mobile device in the audio scene, as suggested by Jahnke, which discloses where the spatial audio mix is configured to be perceived from a listening position corresponding to a location of the first mobile device in the audio scene (Jahnke: Abstract, [0048]-[0050], [0057], and FIG. 2).
Therefore, in view of the teachings of Lee, Classen, Kraemer, and Jahnke, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the audio system of Lee, Classen, and Kraemer so that the spatial audio mix is configured to be perceived from a listening position corresponding to a location of the first mobile device in the audio scene, as suggested by Jahnke. The motivation for this is to dynamically generate sound effects based on listener positions.

As to claim 2, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 further comprising the method as in claim 1, wherein
the at least one second audio object comprises at least one of:
an audio object received by the first mobile device from a third mobile device of the plurality of mobile devices (Kraemer: Abstract, column 3 lines 23-34, column 17 lines 38-65, and FIG. 9-11 the plurality of sound source location data 1002: When a renderer receives two linked objects, the renderer can choose to render the two objects separately or together. Thus, instead of rendering a marching band as a single point source on one speaker, for instance, a renderer can render the marching band as a sound field of audio objects together on a variety of speakers); or
an audio object comprising audio captured via at least one microphone of the first mobile device (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by an individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor).

As to claim 4, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 further comprising the method as in claim 1, further comprising coupling at least the first mobile device and the second mobile device with at least one wireless link, where the at least one first audio object is received via the wireless link (Lee: Abstract, [0010], [0024]-[0025], [0032], [0054], and FIG. 4-5: The wireless communication unit 130 receives the audio file transmitted from the second audio player 200. Here, as a result of the sensing from the location-sensing unit 120, the audio file can be transmitted from the second audio player 200, only when the second audio player 200 is located within a certain range (i.e., when the second audio player 200 is located within a short distance). The wireless communication can be part of a wireless network, and can utilize wireless fidelity (WiFi), ultra-wide band (UWB), near field communication (NFC), Bluetooth, or infrared communication to establish communication between mobile devices, i.e., the first audio player 100 and the second audio player 200).

As to claim 5, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 except for the claimed limitations of the method as in claim 1, further comprising rendering, with the first mobile device, the spatial audio mix while the mixing is being performed.
However, Kraemer discloses a mixed audio object (column 17 lines 51-58 and FIG. 9-10) is adapted to be selectively played back by speakers (Abstract, column 5 lines 23-41, column 7 lines 11-29, column 8 lines 8-22, column 14 lines 11-26, column 15 lines 35-43, column 16 lines 16 – column 17 lines 50, FIG. 2, FIG. 8 and FIG. 10).
Therefore, in view of the teachings of Lee, Classen, Kraemer, and Jahnke, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement, in the audio system of Lee, Classen, and Jahnke, the method step of rendering, with the first mobile device, the spatial audio mix while the mixing is being performed, as suggested by Kraemer. The motivation for this is to enhance listener immersion in the acoustic environment as well as to implement positioning of 3D objects that correspond accurately to their position in the visual field.

As to claim 9, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 further comprising the method as in claim 1, wherein the at least one first audio object corresponds to another part of the audio scene (Lee: [0011], [0035], [0055], [0063], and FIG. 4-6: The play-management unit 140 can control the output direction of the audio file based on the moving direction of the second audio player 200 sensed by the location-sensing unit 120, and can also control the volume of the audio file played in proportion to the distance between the user who uses the first audio player 100 and the other party who uses the second audio player 200 (i.e., operation of the third step, shown in FIG. 4)).

As to claim 10, Lee, Classen, Kraemer, and Jahnke disclose all of the first mobile device limitations as claimed, which mirror the method steps of claim 1; thus, claim 10 is interpreted and rejected for the reasons set forth above in the consideration and rejection of claim 1, with details as follows:
a first mobile device (Lee: Abstract and FIG. 1 the first audio player 100 and the second audio player 200 and Kraemer: FIG. 9 the microphone 920 and the location tracking devices 912) comprising:
at least one processor, and
at least one non-transitory memory comprising computer program code, the at least one non-transitory memory and the computer program code configured to, with the at least one processor, cause the first mobile device to perform operations, the operations comprising:
initiating, at the first mobile device, a mixing session (Kraemer: column 16 lines 35-36: The audio signals are audio objects to which position data is applied, i.e. the audio signals are audio objects even prior to application of the position data)  to create a spatial audio mix using data transfer between a plurality of mobile devices to form an audio scene (Kraemer: FIG. 9 the microphone 920 and the location tracking devices 912), the plurality of mobile devices comprising at least the first mobile device and a second mobile device (Lee: Abstract and FIG. 1 the first audio player 100 and the second audio player 200), where the first mobile device provides a user interface (Lee: [0027] and FIG. 2 the display unit 160);
allowing receiving, at the first mobile device, of at least one first audio object from the second mobile device (Lee: Abstract, [0010], [0013], [0030], [0054], and FIG. 4-5: According to an example embodiment of the present invention, an example of sharing the audio file based on a user at a first mobile device (i.e., a first audio player 100) and the other party at a second mobile device (i.e., a second audio player 200) is described), wherein a location of the second mobile device relative to the first mobile device is determined based upon locations of the first mobile device and the second mobile device relative to each other during recording of the at least one first audio object (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by an individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor);
determining, at the first mobile device, the location of the second mobile device relative to the first mobile device (Lee: Abstract, [0010], [0013], [0030], [0054], and FIG. 4-5: the play-management unit is also used to output the audio file based on the moving direction of the other mobile device, to control the volume of the audio file based on the sensed relative location of the other mobile device, to play a predetermined sound audio, if the sensed relative location of the other mobile device is within outside a certain distance, and to terminate the currently played audio file, if the sensed relative location of the other mobile device is within a certain distance);
providing, at the first mobile device, at least one input with the user interface of the first mobile device, where the at least one input is configured to be used to modify the at least one first audio object to form at least one modified first audio object (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5: the control data of an audio mixer can be processed in order to dynamically display changes in the audio mixer controls instead of the corresponding audio properties. In a mixing environment, audio properties can be adjusted continuously or at regular or irregular intervals using audio mixer controls. For example, in addition to fading in and fading out of a signal, the signal intensity of an object can change over time); and
cause mixing, at the first mobile device, of at least the at least one modified first audio object (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5) with at least one second audio object to create the spatial audio mix (Kraemer: column 16 lines 35-36: The audio signals are audio objects to which position data is applied, i.e. the audio signals are audio objects even prior to application of the position data), where the mixing is based, at least partially, upon the determined location of the second mobile device relative to the first mobile device (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by an individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor), where modification of the at least one first audio object is configured to control at least one spatial aspect of the audio scene (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5 and Kraemer: column 17 lines 51-58 and FIG. 9-10), where the spatial audio mix is configured to be perceived from a listening position corresponding to a location of the first mobile device in the audio scene (Jahnke: Abstract, [0048]-[0050], [0057], and FIG. 2), where the at least one first audio object and the at least one second audio object correspond, at least partially, to parts of the audio scene represented with the spatial audio mix (Kraemer: column 17 lines 51-58 and FIG. 9-10: When a renderer receives two linked objects, the renderer can choose to render the two objects separately or together. Thus, instead of rendering a marching band as a single point source on one speaker, for instance, a renderer can render the marching band as a sound field of audio objects together on a variety of speakers. As the band moves in a video, for instance, the renderer can move the sound field across the speakers).

As to claim 11, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 10 further comprising the first mobile device as in claim 10, wherein the at least one second audio object comprises at least one of:
an audio object received by the first mobile device from a third mobile device of the plurality of mobile devices (Kraemer: Abstract, column 3 lines 23-34, column 17 lines 38-65, and FIG. 9-11 the plurality of sound source location data 1002: When a renderer receives two linked objects, the renderer can choose to render the two objects separately or together. Thus, instead of rendering a marching band as a single point source on one speaker, for instance, a renderer can render the marching band as a sound field of audio objects together on a variety of speakers); or
an audio object comprising audio captured via at least one microphone of the first mobile device (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by an individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor).

As to claim 13, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 10 further comprising the first mobile device as in claim 10, wherein the operations further comprise: coupling at least the first mobile device and the second mobile device with at least one wireless link, where the at least one first audio object is received via the wireless link (Lee: Abstract, [0010], [0024]-[0025], [0032], [0054], and FIG. 4-5: The wireless communication unit 130 receives the audio file transmitted from the second audio player 200. Here, as a result of the sensing from the location-sensing unit 120, the audio file can be transmitted from the second audio player 200, only when the second audio player 200 is located within a certain range (i.e., when the second audio player 200 is located within a short distance). The wireless communication can be part of a wireless network, and can utilize wireless fidelity (WiFi), ultra-wide band (UWB), near field communication (NFC), Bluetooth, or infrared communication to establish communication between mobile devices, i.e., the first audio player 100 and the second audio player 200).

As to claim 14, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 10 except for the claimed limitations of the first mobile device as in claim 10, wherein the operations further comprise:
rendering, with the first mobile device, the spatial audio mix while the mixing is being performed.
However, Kraemer discloses a mixed audio object (column 17 lines 51-58 and FIG. 9-10) is adapted to be selectively played back by speakers (Abstract, column 5 lines 23-41, column 7 lines 11-29, column 8 lines 8-22, column 14 lines 11-26, column 15 lines 35-43, column 16 lines 16 – column 17 lines 50, FIG. 2, FIG. 8 and FIG. 10).
Therefore, in view of the teachings of Lee, Classen, Kraemer, and Jahnke, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement, in the audio system of Lee, Classen, and Jahnke, the operation of rendering, with the first mobile device, the spatial audio mix while the mixing is being performed, as suggested by Kraemer. The motivation for this is to enhance listener immersion in the acoustic environment as well as to implement positioning of 3D objects that correspond accurately to their position in the visual field.

As to claim 18, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 10 further comprising the first mobile device as in claim 10, wherein the at least one first audio object corresponds to another part of the audio scene (Lee: [0011], [0035], [0055], [0063], and FIG. 4-6: The play-management unit 140 can control the output direction of the audio file based on the moving direction of the second audio player 200 sensed by the location-sensing unit 120, and can also control the volume of the audio file played in proportion to the distance between the user who uses the first audio player 100 and the other party who uses the second audio player 200 (i.e., operation of the third step, shown in FIG. 4)).

As to claim 19, Lee, Classen, Kraemer, and Jahnke disclose all of the limitations of the non-transitory computer readable medium comprising program instructions for causing a first mobile device to perform operations as claimed, which mirror the method steps of claim 1; thus, claim 19 is interpreted and rejected for the reasons set forth above in the consideration and rejection of claim 1, with details as follows:
a non-transitory computer readable medium comprising program instructions for causing a first mobile device to perform at least the following:
initiating, at the first mobile device, a mixing session (Kraemer: column 16 lines 35-36: The audio signals are audio objects to which position data is applied, i.e. the audio signals are audio objects even prior to application of the position data) to create a spatial audio mix using data transfer between a plurality of mobile devices to form an audio scene (Kraemer: FIG. 9 the microphone 920 and the location tracking devices 912), the plurality of mobile devices comprising at least the first mobile device and a second mobile device (Lee: Abstract and FIG. 1 the first audio player 100 and the second audio player 200), where the first mobile device provides a user interface (Lee: [0027] and FIG. 2 the display unit 160);
receiving, at the first mobile device, at least one first audio object from the second mobile device (Lee: Abstract, [0010], [0013], [0030], [0054], and FIG. 4-5: According to an example embodiment of the present invention, an example of sharing the audio file based on a user at a first mobile device (i.e., a first audio player 100) and the other party at a second mobile device (i.e., a second audio player 200) is described), 
wherein a location of the second mobile device relative to the first mobile device is determined based upon locations of the first mobile device and the second mobile device relative to each other during recording of the at least one first audio object (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor);
determining, at the first mobile device, the location of the second mobile device relative to the first mobile device (Lee: Abstract, [0010], [0013], [0030], [0054], and FIG. 4-5: the play-management unit is also used to output the audio file based on the moving direction of the other mobile device, to control the volume of the audio file based on the sensed relative location of the other mobile device, to play a predetermined sound audio, if the sensed relative location of the other mobile device is outside a certain distance, and to terminate the currently played audio file, if the sensed relative location of the other mobile device is within a certain distance);
providing, at the first mobile device, at least one input with the user interface of the first mobile device, where the at least one input is configured to be used to modify the at least one first audio object to form at least one modified first audio object (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5: the control data of an audio mixer can be processed in order to dynamically display changes in the audio mixer controls instead of the corresponding audio properties. In a mixing environment, audio properties can be adjusted continuously or at regular or irregular intervals using audio mixer controls. For example, in addition to fading in and fading out of a signal, the signal intensity of an object can be changing over time); and
mixing, at the first mobile device, at least the at least one modified first audio object (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5) with at least one second audio object to create the spatial audio mix (Kraemer: column 16 lines 35-36: The audio signals are audio objects to which position data is applied, i.e. the audio signals are audio objects even prior to application of the position data), where the mixing is based, at least partially, upon the determined location of the second mobile device relative to the first mobile device (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor), where modification of the at least one first audio object is configured to control at least one spatial aspect of the audio scene (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5 and Kraemer: column 17 lines 51-58 and FIG. 9-10), where the spatial audio mix is configured to be perceived from a listening position corresponding to a location of the first mobile device in the audio scene (Jahnke: Abstract, [0048]-[0050], [0057], and FIG. 2), where the at least one first audio object and the at least one second audio object correspond, at least partially, to parts of the audio scene represented with the spatial audio mix (Kraemer: column 17 lines 51-58 and FIG. 9-10: When a renderer receives two linked objects, the renderer can choose to render the two objects separately or together. Thus, instead of rendering a marching band as a single point source on one speaker, for instance, a renderer can render the marching band as a sound field of audio objects together on a variety of speakers. As the band moves in a video, for instance, the renderer can move the sound field across the speakers).

As to claim 20, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 19 further comprising the computer readable medium as in claim 19, wherein the at least one second audio object comprises at least one of:
an audio object received by the first mobile device from a third mobile device of the plurality of mobile devices (Kraemer: Abstract, column 3 lines 23-34, column 17 lines 38-65, and FIG. 9-11 the plurality of sound source location data 1002: When a renderer receives two linked objects, the renderer can choose to render the two objects separately or together. Thus, instead of rendering a marching band as a single point source on one speaker, for instance, a ; or
an audio object comprising audio captured via at least one microphone of the first mobile device (Kraemer: column 15 lines 63 – column 16 lines 40, and FIG. 9-11: FIG. 9 illustrates an example scene 900 for object-oriented audio capture. The scene 900 represents a simplified view of an audio-visual scene such as may be constructed for a movie, television, or other video. In the scene 900, two actors 910 are performing, and their sounds and actions are recorded by a microphone 920 and camera 930 respectively. For simplicity, a single microphone 920 is illustrated, although in some cases the actors 910 may wear individual microphones. Similarly, individual microphones can also be supplied for props (not shown); thus, it is reasonable/obvious to interpret that the sound source(s) generated/collected by individual microphone is updated during the mix based on the location of the individual microphone associated with a corresponding actor).

As to claim 21, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 further comprising the method as in claim 1 further comprising storing the spatial audio mix in at least one non-transitory memory (Kraemer: column 3 lines 53-column 4 lines 4, column 12 lines 46-57, and FIG. 1: The object creation module 114 can store the audio objects in an object data repository 116, which can include a database or other data storage).

As to claim 22, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 21 further comprising the method as in claim 21 further comprising rendering the stored spatial audio mix (Kraemer: column 17 lines 51-58 and FIG. 9-10: When a renderer receives two .

As to claim 26, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 further comprising the method as in claim 1, where the at least one input provided with the user interface of the first mobile device is configured to be used to modify at least one of:
a direction,
a location,
a distance,
a playback level (Classen: Abstract, column 2 lines 41-53, column 4 lines 44-column 5 lines 13, column 8 lines 36-64, column 10 lines 29-45, and FIG. 1-5: the control data of an audio mixer can be processed in order to dynamically display changes in the audio mixer controls instead of the corresponding audio properties. In a mixing environment, audio properties can be adjusted continuously or at regular or irregular intervals using audio mixer controls. For example, in addition to fading in and fading out of a signal, the signal intensity of an object can be changing over time), or
a reverberation level of the at least one first audio object to form the at least one modified first audio object.

As to claim 27, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 further comprising the method as in claim 1, wherein the receiving, with the first mobile device, of the at least one first audio object from the second mobile device comprises receiving the at least one first audio object via a short range communication system of the first mobile device (Lee: Abstract, [0010], [0024]-[0025], [0032], [0054], and FIG. 4-5: The wireless communication unit 130 receives the audio file transmitted from the second audio player 200. Here, as a result of the sensing from the location-sensing unit 120, the audio file can be transmitted from the second audio player 200, only when the second audio player 200 is located within a certain range (i.e., when the second audio player 200 is located within a short distance). The wireless communication can be part of a wireless network, and can utilize wireless fidelity (WiFi), ultra-wide band (UWB), near field communication (NFC), Bluetooth, or infrared communication to establish communication between mobile devices, i.e., the first audio player 100 and the second audio player 200).

Claims 6-7 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (Lee – US 2008/0207115 A1) in view of Classen  (Classen – US 8,068,105 B1), Kraemer et al. (Kraemer – US 8,396,576 B2) and Jahnke (Jahnke – US 2005/0179701 A1) and further in view of Barry (Barry – US 2009/0132075 A1).

As to claim 6, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 except for the claimed limitations of the method as in claim 1, wherein the user interface of the first mobile device is configured to receive a user input, wherein the user input causes at least one of:
the mixing session to be initiated, or the mixing session to be stopped.
However, it has been known in the art of audio processing to implement wherein a user interface of the first mobile device is configured to receive the user input, wherein the user input causes at least one of: the mixing session to be initiated, or the mixing session to be stopped, as suggested by Barry, which discloses wherein a user interface of the first mobile device is configured to receive the user input, wherein the user input causes at least one of: the mixing session to be initiated, or the mixing session to be stopped (Abstract, [0010]-[0011], [0083]-[0087], [0106], [0109], [0118], [0125], and FIG. 19-20:  When a channel with video media is enabled it will start playback from the point at which the file was last enabled and play once through only (default setting that can be changed in the Advanced Audio screen). A mix is finished when the `Stop Record` button 62 is selected and activated (19.2 (A))).
Therefore, in view of the teachings of Lee, Classen, Kraemer, Jahnke and Barry, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the audio system of Lee, Classen, Kraemer, and Jahnke to include the method steps as claimed, as suggested by Barry. The motivation for this is to selectively control a mixing process.

As to claim 7, Lee, Classen, Kraemer, Jahnke and Barry disclose the limitations of claim 6 further comprising the method as in claim 6, further comprising:
in response to the user input to stop the mixing session, sending a request to each of the plurality of mobile devices to stop the mixing session (Barry: Abstract, [0010]-[0011], [0083]-[0087], [0106], [0109], [0118], [0125], and FIG. 19-20:  When a channel with video media is enabled it will start playback from the point at which the file was last enabled and play once .

As to claim 15, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 10 except for the claimed limitations of the first mobile device as in claim 10, wherein the user interface of the first mobile device is configured to receive a user input, wherein the user input causes at least one of:
the mixing session to be initiated, or
the mixing session to be stopped.
However, it has been known in the art of audio processing to implement wherein a user interface of the first mobile device is configured to receive the user input, wherein the user input causes at least one of: the mixing session to be initiated, or the mixing session to be stopped, as suggested by Barry, which discloses wherein a user interface of the first mobile device is configured to receive the user input, wherein the user input causes at least one of: the mixing session to be initiated, or the mixing session to be stopped (Abstract, [0010]-[0011], [0083]-[0087], [0106], [0109], [0118], [0125], and FIG. 19-20:  When a channel with video media is enabled it will start playback from the point at which the file was last enabled and play once through only (default setting that can be changed in the Advanced Audio screen). A mix is finished when the `Stop Record` button 62 is selected and activated (19.2 (A))).
Therefore, in view of the teachings of Lee, Classen, Kraemer, Jahnke and Barry, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the audio system of Lee, Classen, Kraemer, and Jahnke to include the limitations as claimed, as suggested by Barry. The motivation for this is to selectively control a mixing process.

As to claim 16, Lee, Classen, Kraemer, Jahnke and Barry disclose the limitations of claim 15 further comprising the first mobile device as in claim 15, wherein the operations further comprise:
in response to the user input to stop the mixing session, sending a request to each of the plurality of mobile devices to stop the mixing session (Barry: Abstract, [0010]-[0011], [0083]-[0087], [0106], [0109], [0118], [0125], and FIG. 19-20:  When a channel with video media is enabled it will start playback from the point at which the file was last enabled and play once through only (default setting that can be changed in the Advanced Audio screen). A mix is finished when the `Stop Record` button 62 is selected and activated (19.2 (A))).

Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (Lee – US 2008/0207115 A1) in view of Classen  (Classen – US 8,068,105 B1), Kraemer et al. (Kraemer – US 8,396,576 B2) and Jahnke (Jahnke – US 2005/0179701 A1) and further in view of Ahya et al. (Ahya – US 2009/0298419 A1).

As to claim 8, Lee, Classen, Kraemer, and Jahnke disclose the limitations of claim 1 except for the claimed limitations of the method as in claim 1, further comprising displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device.
However, it has been known in the art of location tracking to implement the method steps of displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device, as suggested by Ahya, which discloses the method steps of displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device (Abstract, [0019]-[0025], and FIG. 1).
Therefore, in view of the teachings of Lee, Classen, Kraemer, Jahnke and Ahya, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the audio system of Lee, Classen, Kraemer and Jahnke to include the method steps of displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device, as suggested by Ahya. The motivation for this is to allow a user to visualize other users in proximity to the user via a display of a mobile device of the user.

As to claim 17, Lee, Classen, Kraemer and Jahnke disclose the limitations of claim 10 except for the claimed limitations of the first mobile device as in claim 10, wherein the operations further comprise:
displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device.
However, it has been known in the art of location tracking to implement the method steps of displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device, as suggested by Ahya, which discloses the method steps of displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device (Abstract, [0019]-[0025], and FIG. 1).
Therefore, in view of the teachings of Lee, Classen, Kraemer, Jahnke and Ahya, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the audio system of Lee, Classen, Kraemer and Jahnke to include the method steps of displaying, on a display of the first mobile device, the determined location of at least the second mobile device relative to the first mobile device, as suggested by Ahya. The motivation for this is to allow a user to visualize other users in proximity to the user via a display of a mobile device of the user.

Citation of Pertinent Art
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Korhonen, US 2008/0045140 A1, discloses audio system employing multiple mobile devices in concert.
Kim et al., US 2014/0146984 A1, discloses constrained dynamic amplitude panning in collaborative sound systems.
Allen et al., US 9,763,280 B1, discloses mobile device assignment within wireless sound system based on device specifications.
Filev et al., US 8,712,328 B1, discloses surround sound effects provided by cell phones.
Algazi et al. discloses Immersive Spatial Sound for Mobile Multimedia.
Bleidt et al. discloses Object Based Audio Opportunities for Improved Listening Experience and Increased Listener Involvement. 
Lee et al. discloses Cocktail Party on the Mobile. 
Thalmann et al. discloses The Mobile Audio Ontology Experiencing Dynamic Music Objects on Mobile Devices. 
SMPTE discloses Metadata based audio production for Next Generation Audio formats.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QUANG PHAM whose telephone number is (571)-270-3668.  The examiner can normally be reached on Monday - Thursday 9:30 AM - 5:00 PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, QUAN-ZHEN WANG can be reached on (571)-272-3114.  The fax phone number for the organization where this application or proceeding is assigned is (571)-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/QUANG PHAM/Primary Examiner, Art Unit 2684