DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.

Response to Arguments
Applicant’s arguments with respect to claim(s) 24-26, 29-32, and 34-37 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 24-26, 29-32, and 34-37 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Eppolito, US 2008/0253577 A1 (previously cited), further in view of Kim et al., US 2012/0008789 A1 (previously cited in IDS received 9/23/2014, and hereafter Kim), and Tanner, Jr. et al., US 6,307,941 B1 (previously cited and hereafter Tanner).
Regarding claim 24, Eppolito teaches a method for multi-channel panning where the “panner can support an arbitrary number of input channels” (see Eppolito, abstract).  In one example, Eppolito teaches the input channel audio signals are in a 5.1 surround sound format where “five visual elements 120a-120e… represent five different source audio channels” and teaches an example of the user interface with “the visual elements 120… in a default position… each… in front of a speaker 112” (see Eppolito, ¶ 0026 and figure 1, units 112a-112e and 120a-120e).  Eppolito then expands upon this example for any multichannel sound format, where in a user interface “each visual element 120 corresponds to one source channel” (see Eppolito, ¶ 0043).  Eppolito teaches input channels as “source 120 corresponds to one source channel”, and when the number of input channels is different from the number of output speakers “there will still be one visual element 120 corresponds per source channel” (see Eppolito, ¶¶ 0026, 0043, and 0055, and figure 1, units 120a-120e).  Next, Eppolito teaches a 10.2 surround having two height speakers, and further teaches that the “visual elements 120”, or input channels, are displayed in a 3D user interface in order to show the position from which the sound originates (see Eppolito, ¶ 0103).  Therefore, in another embodiment, Eppolito teaches receiving input channel audio signals in a 10.2 surround sound format with two height input channel signals being displayed in a 3D user interface, where the 10.2 surround sound format will have at least one input channel configuration, such as a default input channel configuration where the 10 audio channels each correspond to the positions where the 10 speakers should be positioned, and this teaches the feature of “receiving input channel audio signals including at least one height input channel signal and an input channel configuration” (see Eppolito, ¶¶ 0026, 0043, and 0103). 
Additionally, Eppolito teaches that the number of physical output channels can be different from the number of input audio channels, such as when a 5.1 source signal is mapped to a stereo output system (see Eppolito, ¶¶ 0033, 0045, and 0054-0055).  It is clear that this mapping operation is applied to any combination of the number of source channels and number of physical speakers, such that an input 10.2 source signal is able to be mapped to a 5.1 speaker setup (see Eppolito, ¶¶ 0032, 0054-0055, and 0103).  The mapping operation changes the balance (re-balances and/or re-positions) of the channels by adjusting gains (amplification and/or attenuation) associated with the channels to map the input sources to a given number of physical speakers (see Eppolito, ¶¶ 0034, 0067-0070, 0112, 0118-0119, 0122, and 0143-0146, figures 9, 10, and 13).  Therefore, Eppolito also discloses the feature of “obtaining gains based on the input channel configuration and an output channel configuration;” (see Eppolito, ¶¶ 0032-0034, 0054-0055, 0067-0068, 0112, 0118-0119, 0122, and 0143-0146, and figures 9, 
Kim discloses a method for producing 3D sound, wherein head related transfer functions (HRTF) are used to give audio signals a perceived elevation by filtering the input audio signals with the appropriate HRTF (see Kim, abstract).  Similar to the cited prior art, Kim teaches “receiving input channel audio signals including at least one height input channel signal” where the output channel configuration is a 5.1 speaker layout (see Kim, ¶¶ 0121-0128 and figure 6).  Importantly, Kim teaches a filter unit that uses a head related transfer function (HRTF) filter to model a sound being generated from an elevation higher than the physical speakers, where the physical speakers are arranged on a level surface (see Kim, ¶¶ 0033-0036 and figure 1, unit 110).  Furthermore, the HRTF filter is calculated by dividing HRTF2 by HRTF1, where HRTF2 is the transfer function from the virtual sound source (i.e., the sound being generated from an elevation) to the listener’s ears, and HRTF1 is the transfer function from the physical speakers to the listener’s ears (see Kim, ¶ 0037).  It is further noted that Kim teaches that “each of the at least one height input channel signal is outputted to at least two of output channel audio signals via at least two output speakers located on the horizontal plane” where each left and right top channel signal is filtered through respective HRTF filters and gain blocks to be output in the front left, rear left, rear right, and front right channels (see Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, figure 6, units 602, 605, 611, 612, 614, and 615 and figures 7-8).  
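For illustration only, the HRTF filter relationship discussed above (HRTF2 divided by HRTF1) can be sketched in the frequency domain as follows.  This is a minimal sketch under stated assumptions and is not a reproduction of any implementation in Kim: the function names `hrtf_filter` and `apply_filter`, the per-bin list representation, the sample values, and the `eps` guard against division by a near-zero bin are all hypothetical.

```python
from typing import List


def hrtf_filter(hrtf2: List[complex], hrtf1: List[complex],
                eps: float = 1e-9) -> List[complex]:
    """Return the per-bin ratio HRTF2 / HRTF1.

    hrtf2: assumed response from the virtual (elevated) source to the ear.
    hrtf1: assumed response from the physical speaker to the ear.
    eps:   hypothetical floor that avoids dividing by a near-zero bin.
    """
    if len(hrtf2) != len(hrtf1):
        raise ValueError("HRTF2 and HRTF1 must have the same number of bins")
    result = []
    for h2, h1 in zip(hrtf2, hrtf1):
        denom = h1 if abs(h1) > eps else complex(eps, 0.0)
        result.append(h2 / denom)
    return result


def apply_filter(spectrum: List[complex], filt: List[complex]) -> List[complex]:
    """Filter a signal spectrum by multiplying it bin by bin with the filter."""
    return [x * f for x, f in zip(spectrum, filt)]


# Hypothetical three-bin example values.
hrtf1 = [1 + 0j, 0.5 + 0j, 2j]
hrtf2 = [2 + 0j, 0.25 + 0j, 4j]
elevation_filter = hrtf_filter(hrtf2, hrtf1)

# A signal filtered this way and then played through the physical speaker
# (modelled here as a multiplication by HRTF1) reaches the ear as if it
# had travelled the HRTF2 path.
signal = [1 + 1j, 3 + 0j, -2j]
at_ear = [y * h for y, h in zip(apply_filter(signal, elevation_filter), hrtf1)]
```

The division cancels the physical speaker-to-ear path so that only the virtual source-to-ear path remains at the listener, which is the effect attributed to the HRTF filter in the discussion above.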
It would have been obvious to one of ordinary skill in the art at the time of the invention to modify Eppolito with the features of Kim for the purpose of providing virtual speakers for height channels when a sound system does not have physical loudspeakers in those positions (see Eppolito, ¶¶ 0054-0055 and 0103-0105, makes obvious that input sound channels corresponding to off-plane elevation channels can be mapped to physical speakers arranged in-plane, such as 5.1 surround speakers or stereo speakers, in view of Kim, ¶¶ 0036-0040 and 0146, 
Tanner discloses a method for localizing virtual sound, wherein the clarity and perceived location is improved for virtual sources (see Tanner, abstract).  Tanner discloses spatial cues, such as the interaural time difference (ITD) and interaural intensity difference (IID), where a listener localizes a sound primarily through ITDs below 1.5 kHz and localizes a sound primarily through IIDs in higher frequency ranges (see Tanner, column 1, lines 20-50).  Tanner also teaches that the HRTFs describe the frequency dependent amplitude (i.e., intensity) and time-delay differences associated with sound originating from a specific direction, such that HRTFs provide ITD and IID spatial cues (see Tanner, column 1, lines 51-65).  In particular, Tanner discloses a modulation of the binaural cues to create a barely perceptible movement in the virtual sound sources to “trick” the listener into hearing improved localization (see Tanner, column 4, lines 40-54, and lines 64-65, column 5, lines 13-18, 22-26, 37-43, and 52-58, and column 6, lines 6-13).  Tanner further gives an example of a virtual sound system that modifies the spatial cues in pairs of HRTFs (i.e., a DSIR is a pair of HRTFs) and the modification may be performed in the time or frequency domain (see Tanner, column 4, lines 51-63 and column 11, line 47 - column 12, line 22, and figures 9 and 10).  One of ordinary skill in the art at the time of the invention would have found it obvious to use Tanner’s modulation of cues in the audible range to improve localization when using HRTFs to create a virtual speaker position, as suggested by Kim, and it would have been obvious to one of ordinary skill in the art at the time of the invention to combine Eppolito, Kim, and Tanner to improve the localization of height cues.  Therefore the combination makes obvious:    
“An immersive three-dimensional (3D) sound reproducing method comprising: 
receiving input channel audio signals including at least one height input channel signal and an input channel configuration;” (see Eppolito, abstract, ¶¶ 0026, 0043, and 0103, which teaches a re-configurable multi-channel sound space with different number and positions of sound sources, such as 

“obtaining gains based on the input channel configuration and an output channel configuration;” (see Eppolito, ¶¶ 0032-0034, 0054-0055, 0067-0068, 0112, 0118-0119, 0122, and 0143-0146, and figures 9, 10, and 13, which teaches the calculation of gains for each input channel based on the number of input sources, input source positions, the number of physical output speakers, and the output speakers positions when the input configuration (number of sources and/or positions) do not match the output configuration; also see Kim, ¶¶ 0129-0138 and figure 7, where gain values are adjusted to position the virtual sound sources);

“obtaining a first head-related transfer function (HRTF) based on the input channel configuration, to provide a sense of elevation using the output channel configuration indicating a plurality of output speakers located on a horizontal plane;” (see Eppolito, ¶¶ 0026, 0043, 0054-0055, 0070, 0103, 0112, 0118-0119, 0122, and 0143-0146, and figures 9, 10, and 13, in view of Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, and figures 6-8, where Eppolito teaches the input channel configuration with height speakers and Kim teaches HRTFs that represent the transfer functions from physical speakers to the listener, such that first HRTFs represent physical speakers on a horizontal plane and do not include the physical height speakers);

“obtaining a second HRTF used according to an input channel audio signal at a predetermined position, the input channel audio signal being output through at least two speakers at positions different from the predetermined position, based on the input channel configuration, the output channel configuration, and a frequency range of dynamic cue;” (see Eppolito, ¶¶ 0032-0034, 0043, 0054-0055, 0070, and 0143-0146, in view of Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, and figures 6-8, where Eppolito teaches mapping input channels for which there are no physical speakers to two or more physical output channels, and Kim makes obvious the second HRTF that has the transfer function from the virtual sound source to the listener, where the virtual sound source is located at a different elevation to the listener and/or physical speaker, and the second HRTFs with varying elevations are used to emulate a top left speaker and a top right speaker; also see Tanner, column 1, lines 20-65, column 4, lines 40-65, column 5, lines 13-18, 22-26, 37-43, and 52-58, column 6, lines 6-13, and column 11, line 47 - column 12, line 22, and figures 9 and 10, wherein the ITDs and IIDs, being the spatial cues of the HRTFs, are modulated in an audible range, such as from 500 Hz to 10 kHz, to improve localization);

“obtaining a HRTF filter by dividing the second HRTF by the first HRTF; and” (see Kim, ¶¶ 0036-0040, 0131, and 0146, figures 6-8, where the HRTF filter is comprised of the second HRTF divided by the first HRTF); 

“elevation rendering the input channel audio signals based on the gains, the HRTF filter, to provide the sense of elevation using the output channel configuration,” (see Eppolito, ¶¶ 0043, 0054-0055, 0067-0070, 0103, 0112, 0118-0119, 0122, and 0143-0146, and figures 9, 10, and 13, in view of Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, and figures 6-8, wherein Eppolito teaches mapping input height channels for which there are no physical speakers to the physical output channels arranged in non-elevated positions, and Kim makes obvious filtering the input height channels through different HRTF filters and gains, respectively, to emulate, or virtualize, the missing left and right height speakers),

, and” (see Kim, ¶¶ 0036-0040, in view of Tanner, column 1, lines 20-65, column 4, lines 40-65, column 5, lines 13-18, 22-26, 37-43, and 52-58, and column 6, lines 6-13, where Kim teaches that the HRTF is a function in the frequency domain and the input signals, or height channels, are filtered in the frequency domain by frequency bands, and Tanner illustrates that the HRTFs include coefficients for these bands, wherein the binaural cues are modulated to improve localization), and

“wherein each of the at least one height input channel signal is outputted to at least two of output channel audio signals via at least two output speakers located on the horizontal plane.” (see Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, figure 6, units 602, 605, 611, 612, 614, and 615 and figures 7-8, where each left and right top channel signal is filtered through respective HRTF filters and gain blocks to be output in the front left, rear left, rear right, and front right channels).  
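For illustration only, the elevation rendering mapped above (a height input channel filtered by the HRTF filter, scaled by gains, and output through at least two speakers on the horizontal plane) can be sketched as follows.  This is a minimal sketch under stated assumptions and does not reproduce the method of Eppolito, Kim, or the claims: the function name `render_height_channel`, the speaker labels, the gain values, and the per-bin spectrum representation are hypothetical.

```python
from typing import Dict, List


def render_height_channel(height_spectrum: List[complex],
                          elevation_filter: List[complex],
                          gains: Dict[str, float]) -> Dict[str, List[complex]]:
    """Distribute one filtered height channel over horizontal-plane speakers.

    height_spectrum:  spectrum of the height input channel (assumed layout).
    elevation_filter: per-bin HRTF filter, e.g. HRTF2 / HRTF1.
    gains:            hypothetical per-speaker gains keyed by speaker label.
    """
    if len(height_spectrum) != len(elevation_filter):
        raise ValueError("spectrum and filter must have the same number of bins")
    # Only speakers with a non-zero gain receive the height channel.
    active = [name for name, gain in gains.items() if gain != 0.0]
    if len(active) < 2:
        raise ValueError("height channel must feed at least two output speakers")
    # Apply the elevation filter once, then scale it for each active speaker.
    filtered = [x * f for x, f in zip(height_spectrum, elevation_filter)]
    return {name: [gains[name] * value for value in filtered] for name in active}


# Hypothetical example: a top-left channel sent to front-left and rear-left.
outputs = render_height_channel(
    height_spectrum=[1 + 0j, 2 + 0j],
    elevation_filter=[2 + 0j, 0.5 + 0j],
    gains={"front_left": 0.5, "rear_left": 0.5, "front_right": 0.0},
)
```

The check for at least two active speakers mirrors the final limitation quoted above; a complete renderer would also sum these contributions with the remaining input channels for each output speaker.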
 
Regarding claim 25, see the preceding rejection with respect to claim 24 above.  The combination makes obvious “the method of claim 24, wherein the dynamic cue represents speaker-to-listener orientation.” (see Kim, ¶¶ 0036-0037 and 0057-0058, discloses the use of HRTFs, where ITDs and gain values are used to emulate a virtual speaker position; and see Tanner, column 1, lines 20-65, discloses dynamic cues, such as the ITD and the IID based on location, and see Tanner, column 4, lines 40-63, wherein these dynamic cues are modulated to better localize a virtualized sound source).
Regarding claim 26, see the preceding rejection with respect to claim 24 above.  The combination makes obvious “the method of claim 24, wherein the second HRTF is determined based on spatial locations of an output channel signal and an input channel signal located at a predetermined elevation.” (see Eppolito, ¶¶ 0032-0034, 0043, 0054-0055, 0103, and 0143-0146, in view of Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, and figures 6-8, which makes obvious emulating a virtual loudspeaker with a different elevation, or height, than a physical loudspeaker in order to emulate a 3D sound space input configuration, such as a 10.2 surround signal having two height channels, using a 2D loudspeaker layout, such as a 2.0 or 5.1 loudspeaker layout).
Regarding claim 29, see the preceding rejection with respect to claim 24 above.  The combination makes obvious the method of claim 24, and likewise makes obvious the apparatus with these features (see the rejection of claim 24, which makes obvious this apparatus).
Regarding claims 30 and 31, see the preceding rejection with respect to claims 25 and 26 above.  The combination makes obvious the method of claims 25 and 26, and likewise makes obvious the apparatus with these features (see the rejection of claims 25 and 26, respectively for claims 30 and 31, which make obvious these features in the apparatus of claim 29).  Additionally, the combination makes obvious that the input and output signals are located “on a same plane” for virtualization of a speaker (see Eppolito, ¶¶ 0054-0055 and 0118, in view of Kim, ¶¶ 0036-0037 and 0146). 
Regarding claim 32, see the preceding rejection with respect to claim 29 above.  The combination makes obvious “the apparatus of claim 31, wherein the same plane is the horizontal plane.” (see Eppolito, ¶¶ 0054-0055, and 0118, wherein the source channel that is positioned between two speakers is mapped to two or more speakers, and see Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, and figures 6-8, wherein Kim teaches using two or more physical loudspeakers in the horizontal plane to emulate at least two different virtual loudspeaker positions, such that the combination makes obvious a virtual loudspeaker at a different azimuth, such as when there are more source channels than physical output speakers).
Regarding claim 34, see the preceding rejection with respect to claim 24 above.  The combination makes obvious “a non-transitory computer readable recording medium having embodied thereon a computer program, which when executed by a processor, performs the method of claim 24.” (see Eppolito, ¶ 0147, which makes obvious this medium in the combination).
Regarding claim 35, see the preceding rejection with respect to claims 24 and 32 above.  The combination makes obvious “the method of claim 24, wherein the second HRTF is based on spatial locations of an output channel signal and an input channel signal located on the horizontal plane.” (see Eppolito, ¶¶ 0054-0055, and 0118, wherein the source channel that is positioned between two speakers is mapped to two or more speakers, and see Kim, ¶¶ 0036-0037, 0121, 0131, 0138, and 0146, and figures 6-8, wherein Kim teaches using two or more physical loudspeakers in the horizontal plane to 
Regarding claim 36, see the preceding rejection with respect to claim 24 above.  The combination makes obvious the “method of claim 24, wherein the first HRTF indicates information regarding paths from a spatial location of the plurality of output speakers to ears of an audience, and the second HRTF indicates information regarding paths from a spatial location of a virtual speaker located at a predetermined elevation to ears of the audience”, where Kim teaches a filter unit that uses an HRTF filter to model a sound being generated from an elevation higher than the physical speakers that are arranged on a level surface, and the HRTF filter is calculated by dividing HRTF2 by HRTF1, where the first HRTF (HRTF1) is the transfer function from the physical speakers to the listener’s ears and the second HRTF (HRTF2) is the transfer function from the virtual sound source (i.e., the sound being generated from an elevation) to the listener’s ears (see Kim, ¶¶ 0035-0037 and 0146).  
Regarding claim 37, see the preceding rejection with respect to claim 29 above.  The combination makes obvious the “apparatus of claim 29, wherein the first HRTF indicates information regarding paths from a spatial location of the plurality of output speakers to ears of an audience, and the second HRTF indicates information regarding paths from a spatial location of a virtual speaker located at a predetermined elevation to ears of the audience”, where Kim teaches a filter unit that uses an HRTF filter to model a sound being generated from an elevation higher than the physical speakers that are arranged on a level surface, and the HRTF filter is calculated by dividing HRTF2 by HRTF1, where the first HRTF (HRTF1) is the transfer function from the physical speakers to the listener’s ears and the second HRTF (HRTF2) is the transfer function from the virtual sound source (i.e., the sound being generated from an elevation) to the listener’s ears (see Kim, ¶¶ 0035-0037 and 0146).   

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Chung, US 9,504,933 B1, (previously cited), teaches a 3D sound system using HRTFs (see abstract and column 3, lines 46-63);
Kasai et al., US 9,504,934 B1, (previously cited), teaches a method for localizing an audio source (see abstract and figure 2);
Van Den Berghe et al., US 2010/0027819 A1, (previously cited), teaches an encoder for combining channels and a decoder for decoding the combined channels, such as recovering a height channel in a combined stream (see abstract and ¶¶ 0059 and 0064-0065);
Chabanne, US 2011/0164755 A1, (previously cited), teaches a method for enhancing audio reproduction using height channels, wherein matrix encoding and/or decoding is used to create audio signals for the height channels (see abstract and ¶ 0040); 
Jot et al., US 7,231,054 B1, (previously cited), teaches a method for 3D audio using loudspeakers and HRTFs (see abstract);
Lee et al., US 2011/0222693 A1, (previously cited), teaches left and right vertical direction virtual speakers that are emulated by front left, front right, left surround, and right surround speakers (see abstract, ¶¶ 0031-0036, and figure 1);
Pulkki, “Virtual Sound Source Positioning Using Vector Base Amplitude Panning”, (previously cited), teaches the VBAP method for two dimensions and three dimensions in order to create virtual sources in 2D or 3D (see abstract and pp. 458-461, sections 1.2 through 2.3);

Lee et al., “Virtual Height Speaker Rendering for Samsung 10.2-channel Vertical Surround System”, (previously cited), teaches the use of HRTFs to create virtual height speakers when reproduced on a 7.1 channel loudspeaker system (see abstract).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Daniel R Sellers whose telephone number is (571)272-7528. The examiner can normally be reached Mon - Fri 10:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fan S Tsang, can be reached on (571)272-7547. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Daniel R Sellers/
Examiner, Art Unit 2653