DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 23-25, 27, 29-31, 33, 35-37 and 39 is/are rejected under 35 U.S.C. 103 as being unpatentable over Seok U.S. PAP 2016/0054976 A1 in view of McNeeney UK Patent Application GB 2506404.

Regarding claim 23 Seok teaches a method, comprising: 
receiving, by way of an input user interface of a media player, a selection of a play button (playing a multi-track audio file, see par. [0006]; the input unit 170 of the apparatus 100 for producing media contents includes an input panel to which various control commands for the apparatus 100 for producing media contents are input from the user, see par. [0028]);
playing simultaneously, by way of an output user interface of the media player, and in response to the selection, two different versions of the same musical song (reproducing, by a media file editing apparatus, a multi-track audio file in which an accompaniment track and a vocal track corresponding to the accompaniment track are synthesized with each other, see par. [0006]); 
receiving, by way of the input user interface, a volume adjustment (receiving, by the media file editing apparatus, a volume control command of the vocal track from a user, see par. [0006]).
However Seok does not teach in response to receiving the volume adjustment, increasing a volume of one of the versions being played and decreasing a volume of the other of the versions being played.  
In the same field of endeavor, McNeeney teaches a computer-implemented method of transitioning, also known as cross-fading, from a first audio track to a second audio track, see page 1 lines 1-6. "Blending" occurs when a DJ transitions from a first song to a second song. To make the process aurally pleasing, they may gradually "fade in" the new track, while "fading out" the old one, increasing the volume of the new track while decreasing the volume of the current track. This is an example of a "blend": two tracks that are played sequentially, with a transition between the two. Although blends are often achieved manually by DJs, there have been a number of attempts to perform the task automatically, on a computer system, see page 1 lines 9-17.
It would have been obvious to one of ordinary skill in the art to combine the Seok invention with the teachings of McNeeney for the benefit of making the sound editing aurally pleasing, see page 1 lines 9-17.
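For illustration only, the behavior recited in the limitation at issue (one volume adjustment that raises one version while lowering the other) may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Seok, McNeeney, or the applicant: the function names `complementary_gains` and `mix_sample`, the control range of 0.0 to 1.0, and the linear gain law are all hypothetical.

```python
from typing import Tuple


def complementary_gains(position: float) -> Tuple[float, float]:
    """Map one control position in [0.0, 1.0] to gains for two versions.

    Hypothetical linear law: moving the control toward 1.0 raises the
    gain of version B and lowers the gain of version A by the same amount.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    gain_a = 1.0 - position  # decreases as the control moves up
    gain_b = position        # increases as the control moves up
    return gain_a, gain_b


def mix_sample(sample_a: float, sample_b: float, position: float) -> float:
    """Mix one sample from each simultaneously playing version."""
    gain_a, gain_b = complementary_gains(position)
    return gain_a * sample_a + gain_b * sample_b
```

Under these assumptions, a single adjustment from position 0.25 to position 0.75 lowers the gain of version A and raises the gain of version B while both versions continue to play.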
Regarding claim 24, Seok teaches the method of claim 23, wherein one of the versions includes a mixture of instrumental and vocal components (multi-track audio file in which an accompaniment track and a vocal track corresponding to the accompaniment track are synthesized with each other, see par. [0010]; the user himself or herself may produce media contents recorded with an original singer's voice of a user's favorite song and the user's voice in a duet mode, see abstract);
and wherein the other of the versions includes one and not the other of an instrumental component or a vocal component (user's voice data, see par. [0007]).
Regarding claim 25 McNeeney teaches the method of claim 23, wherein the increasing increases the volume of the one of the versions being played in inverse proportion to the volume of the other of the versions being played (the transition from the first audio track to the second audio track comprises decreasing the volume of the first audio track from a first level to a second level over a transition period and simultaneously increasing the volume of the second audio track from a third level to a fourth level over the transition period, see claim 19).  
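For illustration only, the transition quoted above from McNeeney claim 19 (the first track moving from a first level to a second level while the second track moves from a third level to a fourth level over the same transition period) may be sketched as follows. This is a minimal sketch under stated assumptions: the function name `transition_volumes`, its parameter names, and the use of linear interpolation are hypothetical, since the quoted passage does not specify the shape of the volume curves.

```python
from typing import Tuple


def transition_volumes(
    elapsed: float,
    period: float,
    first: float,
    second: float,
    third: float,
    fourth: float,
) -> Tuple[float, float]:
    """Return (first-track volume, second-track volume) at time `elapsed`.

    Hypothetical linear interpolation: the first track moves from `first`
    to `second` and the second track moves from `third` to `fourth`, both
    over the same transition period.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    # Clamp progress so volumes hold at their end levels outside the period.
    fraction = min(max(elapsed / period, 0.0), 1.0)
    first_track = first + (second - first) * fraction
    second_track = third + (fourth - third) * fraction
    return first_track, second_track
```

With a first level of 1.0, a second level of 0.0, a third level of 0.0, and a fourth level of 1.0, the first track fades out while the second track fades in over the same period.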
Regarding claim 27 Seok teaches the method of claim 23, wherein the playing plays in synchrony the two different versions of the same musical song (reproducing unit 130 simultaneously reproduces independently the vocal track and the accompaniment track in the multi-track audio file stored in the storage unit 110, and the vocal track and the accompaniment track independently reproduced by the reproducing unit 130 are simultaneously output through the output unit 150 to be transmitted to a user, see par. [0027]).  

Regarding claim 29 Seok teaches a media player system, comprising: 
at least one user interface including an input user interface and an output user interface; 
a memory storing a program; and a processor coupled to the at least one user interface and the memory, the processor being controllable by the program to perform a method including: receiving, by way of an input user interface of a media player, a selection of a play button (playing a multi-track audio file, see par. [0006]; the input unit 170 of the apparatus 100 for producing media contents includes an input panel to which various control commands for the apparatus 100 for producing media contents are input from the user, see par. [0028]);
playing simultaneously, by way of an output user interface of the media player, and in response to the selection, two different versions of the same musical song (reproducing, by a media file editing apparatus, a multi-track audio file in which an accompaniment track and a vocal track corresponding to the accompaniment track are synthesized with each other, see par. [0006]); 
receiving, by way of the input user interface, a volume adjustment (receiving, by the media file editing apparatus, a volume control command of the vocal track from a user, see par. [0006]).
However Seok does not teach in response to receiving the volume adjustment, increasing a volume of one of the versions being played and decreasing a volume of the other of the versions being played.  
In the same field of endeavor, McNeeney teaches a computer-implemented method of transitioning, also known as cross-fading, from a first audio track to a second audio track, see page 1 lines 1-6. "Blending" occurs when a DJ transitions from a first song to a second song. To make the process aurally pleasing, they may gradually "fade in" the new track, while "fading out" the old one, increasing the volume of the new track while decreasing the volume of the current track. This is an example of a "blend": two tracks that are played sequentially, with a transition between the two. Although blends are often achieved manually by DJs, there have been a number of attempts to perform the task automatically, on a computer system, see page 1 lines 9-17.
It would have been obvious to one of ordinary skill in the art to combine the Seok invention with the teachings of McNeeney for the benefit of making the sound editing aurally pleasing, see page 1 lines 9-17.

Regarding claim 30, Seok teaches the system of claim 29, wherein one of the versions includes a mixture of instrumental and vocal components (multi-track audio file in which an accompaniment track and a vocal track corresponding to the accompaniment track are synthesized with each other, see par. [0010]; the user himself or herself may produce media contents recorded with an original singer's voice of a user's favorite song and the user's voice in a duet mode, see abstract);
and wherein the other of the versions includes one and not the other of an instrumental component or a vocal component (user's voice data, see par. [0007]).
Regarding claim 31 McNeeney teaches the system of claim 29, wherein the increasing increases the volume of the one of the versions being played in inverse proportion to the volume of the other of the versions being played (the transition from the first audio track to the second audio track comprises decreasing the volume of the first audio track from a first level to a second level over a transition period and simultaneously increasing the volume of the second audio track from a third level to a fourth level over the transition period, see claim 19).   
Regarding claim 33 Seok teaches the system of claim 29, wherein the playing plays in synchrony the two different versions of the same musical song (reproducing unit 130 simultaneously reproduces independently the vocal track and the accompaniment track in the multi-track audio file stored in the storage unit 110, and the vocal track and the accompaniment track independently reproduced by the reproducing unit 130 are simultaneously output through the output unit 150 to be transmitted to a user, see par. [0027]).    
Regarding claim 35, Seok teaches a non-transitory computer-readable medium storing instructions which, when executed by a computer processor, causes the computer processor to perform a method, the method comprising: receiving, by way of an input user interface of a media player, a selection of a play button (playing a multi-track audio file, see par. [0006]; the input unit 170 of the apparatus 100 for producing media contents includes an input panel to which various control commands for the apparatus 100 for producing media contents are input from the user, see par. [0028]);
playing simultaneously, by way of an output user interface of the media player, and in response to the selection, two different versions of the same musical song (reproducing, by a media file editing apparatus, a multi-track audio file in which an accompaniment track and a vocal track corresponding to the accompaniment track are synthesized with each other, see par. [0006]); 
receiving, by way of the input user interface, a volume adjustment (receiving, by the media file editing apparatus, a volume control command of the vocal track from a user, see par. [0006]).
However Seok does not teach in response to receiving the volume adjustment, increasing a volume of one of the versions being played and decreasing a volume of the other of the versions being played.  
In the same field of endeavor, McNeeney teaches a computer-implemented method of transitioning, also known as cross-fading, from a first audio track to a second audio track, see page 1 lines 1-6. "Blending" occurs when a DJ transitions from a first song to a second song. To make the process aurally pleasing, they may gradually "fade in" the new track, while "fading out" the old one, increasing the volume of the new track while decreasing the volume of the current track. This is an example of a "blend": two tracks that are played sequentially, with a transition between the two. Although blends are often achieved manually by DJs, there have been a number of attempts to perform the task automatically, on a computer system, see page 1 lines 9-17.
It would have been obvious to one of ordinary skill in the art to combine the Seok invention with the teachings of McNeeney for the benefit of making the sound editing aurally pleasing, see page 1 lines 9-17.

Regarding claim 36, Seok teaches the non-transitory computer-readable medium of claim 35, wherein one of the versions includes a mixture of instrumental and vocal components (multi-track audio file in which an accompaniment track and a vocal track corresponding to the accompaniment track are synthesized with each other, see par. [0010]; the user himself or herself may produce media contents recorded with an original singer's voice of a user's favorite song and the user's voice in a duet mode, see abstract);
and wherein the other of the versions includes one and not the other of an instrumental component or a vocal component (user's voice data, see par. [0007]).
Regarding claim 37 McNeeney teaches the non-transitory computer-readable medium of claim 35, wherein the increasing increases the volume of the one of the versions being played in inverse proportion to the volume of the other of the versions being played (the transition from the first audio track to the second audio track comprises decreasing the volume of the first audio track from a first level to a second level over a transition period and simultaneously increasing the volume of the second audio track from a third level to a fourth level over the transition period, see claim 19).    
Regarding claim 39 Seok teaches the non-transitory computer-readable medium of claim 35, wherein the playing plays in synchrony the two different versions of the same musical song (reproducing unit 130 simultaneously reproduces independently the vocal track and the accompaniment track in the multi-track audio file stored in the storage unit 110, and the vocal track and the accompaniment track independently reproduced by the reproducing unit 130 are simultaneously output through the output unit 150 to be transmitted to a user, see par. [0027]).    

Claim(s) 26, 32 and 38 is/are rejected under 35 U.S.C. 103 as being unpatentable over Seok U.S. PAP 2016/0054976 A1 in view of McNeeney UK Patent Application GB 2506404, further in view of Cohen.
Regarding claim 26 Seok in view of McNeeney does not teach the method of claim 23, wherein the volume adjustment is performed by sliding a sliding bar volume control.  
In a similar field of endeavor, Cohen teaches systems and methods for providing streaming, dynamically editable social media content, such as songs, music videos, or other such content. Audio may be delivered to a computing device of a user in a multi-track format, or as separate audio files for each track. The computing device may instantiate a plurality of synchronized audio players and simultaneously playback the separate audio files. The user may individually adjust parameters for each audio player, allowing dynamic control over the media content during use, see abstract. In various implementations, the user interface may include toggle buttons, switches, volume controls, panning controls, equalizer dials, sliders, or other elements to allow the user to interact with a track, see par. [0055].
It would have been obvious to one of ordinary skill in the art to combine the Seok in view of McNeeney invention with the teachings of Cohen for the benefit of allowing a user to dynamically control media content during use, see abstract.
Regarding claim 32, Seok in view of McNeeney does not teach the system of claim 29, wherein the volume adjustment is performed by sliding a sliding bar volume control.
In a similar field of endeavor, Cohen teaches systems and methods for providing streaming, dynamically editable social media content, such as songs, music videos, or other such content. Audio may be delivered to a computing device of a user in a multi-track format, or as separate audio files for each track. The computing device may instantiate a plurality of synchronized audio players and simultaneously playback the separate audio files. The user may individually adjust parameters for each audio player, allowing dynamic control over the media content during use, see abstract. In various implementations, the user interface may include toggle buttons, switches, volume controls, panning controls, equalizer dials, sliders, or other elements to allow the user to interact with a track, see par. [0055].
It would have been obvious to one of ordinary skill in the art to combine the Seok in view of McNeeney invention with the teachings of Cohen for the benefit of allowing a user to dynamically control media content during use, see abstract.

Regarding claim 38, Seok in view of McNeeney does not teach the non-transitory computer-readable medium of claim 35, wherein the volume adjustment is performed by sliding a sliding bar volume control.
In a similar field of endeavor, Cohen teaches systems and methods for providing streaming, dynamically editable social media content, such as songs, music videos, or other such content. Audio may be delivered to a computing device of a user in a multi-track format, or as separate audio files for each track. The computing device may instantiate a plurality of synchronized audio players and simultaneously playback the separate audio files. The user may individually adjust parameters for each audio player, allowing dynamic control over the media content during use, see abstract. In various implementations, the user interface may include toggle buttons, switches, volume controls, panning controls, equalizer dials, sliders, or other elements to allow the user to interact with a track, see par. [0055].
It would have been obvious to one of ordinary skill in the art to combine the Seok in view of McNeeney invention with the teachings of Cohen for the benefit of allowing a user to dynamically control media content during use, see abstract.

Claim(s) 28, 34 and 40 is/are rejected under 35 U.S.C. 103 as being unpatentable over Seok U.S. PAP 2016/0054976 A1 in view of McNeeney UK Patent Application GB 2506404 further in view of Karras U.S. PAP 2019/0171936.

Regarding claim 28 Seok in view of McNeeney does not teach the method of claim 23, further comprising pairing the two different versions using a U-net neural network.  
In a similar field of endeavor Karras teaches a U-net neural network topology, the neural network 110 shown in FIG. 1A comprises a U-net neural network. In an embodiment, the U-net neural network is trained to denoise images, processing corrupted image data to generate clean image data. Although training of the U-net neural network is described in the context of image processing, the U-net neural network may be trained to generate other output data using the progressive modification technique. Depending on the task, the network output data may be audio data, or video data, see par. [0068].
It would have been obvious to one of ordinary skill in the art to combine the Seok in view of McNeeney invention with the teachings of Karras for the benefit of using an improved U-net model with a high level of accuracy for audio processing, see par. [0067].

Regarding claim 34, Seok in view of McNeeney does not teach the system of claim 29, wherein the method includes pairing the two different versions using a U-net neural network.
In a similar field of endeavor Karras teaches a U-net neural network topology, the neural network 110 shown in FIG. 1A comprises a U-net neural network. In an embodiment, the U-net neural network is trained to denoise images, processing corrupted image data to generate clean image data. Although training of the U-net neural network is described in the context of image processing, the U-net neural network may be trained to generate other output data using the progressive modification technique. Depending on the task, the network output data may be audio data, or video data, see par. [0068].
It would have been obvious to one of ordinary skill in the art to combine the Seok in view of McNeeney invention with the teachings of Karras for the benefit of using an improved U-net model with a high level of accuracy for audio processing, see par. [0067].


Regarding claim 40 Seok in view of McNeeney does not teach the non-transitory computer-readable medium of claim 35, wherein the method includes pairing the two different versions using a U-net neural network.
In a similar field of endeavor Karras teaches a U-net neural network topology, the neural network 110 shown in FIG. 1A comprises a U-net neural network. In an embodiment, the U-net neural network is trained to denoise images, processing corrupted image data to generate clean image data. Although training of the U-net neural network is described in the context of image processing, the U-net neural network may be trained to generate other output data using the progressive modification technique. Depending on the task, the network output data may be audio data, or video data, see par. [0068].
It would have been obvious to one of ordinary skill in the art to combine the Seok in view of McNeeney invention with the teachings of Karras for the benefit of using an improved U-net model with a high level of accuracy for audio processing, see par. [0067].
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The pertinent prior art is available on form 892.
Chen ‘294 teaches an audio ducking method which adjusts loudness levels in audio content, specifically between two tracks, see abstract.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Ortiz-Sanchez whose telephone number is (571)270-3711. The examiner can normally be reached Monday-Friday 9AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL ORTIZ-SANCHEZ/
Primary Examiner, Art Unit 2656