Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on December 15, 2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Response to Arguments and Amendments
The amendment filed on May 3, 2022 has been entered. Claims 1-30 remain pending in the application. 
The applicant asserts that the prior art of record does not specifically teach the limitation of “performing spatial audio encoding with respect to the ambisonic audio data to obtain a foreground audio signal and a corresponding spatial component, the spatial component defining spatial characteristics of the foreground audio signal”, as recited by claim 11. However, upon further review following the interview, the Examiner respectfully disagrees with this assertion. The limitation is mapped to paragraphs [0072] and [0208] of Ghido. The statement that the “audio signal is a spatial audio object transport channel or a High Order Ambisonics transport channel” can be interpreted as the spatial audio encoding with respect to the ambisonic audio data. Furthermore, the “coder components including bandwidth extension and parametric spatial coding tools” can be interpreted as the spatial characteristics of the foreground audio signal.
The applicant also asserts that the prior art of record does not specifically teach the limitation of “performing a gain and shape analysis with respect to the foreground audio signal to obtain a gain and a shape representative of the foreground audio signal”, as recited by claim 11. However, upon further review following the interview, the Examiner respectfully disagrees with this assertion. The limitation is mapped to paragraphs [0255] and [0256] of Ghido. The statement “The side information comprises low pass (LP) shape information and scalar gains that are estimated within an HREP analysis block (not depicted)” can be interpreted as the gain and shape analysis.
The arguments made above with respect to independent claim 11 also apply to independent claims 1, 12, and 23, as well as to the dependent claims.
Hence, the applicant’s arguments are not persuasive.
Claim Rejections - 35 USC § 102
	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
	
Claims 1, 3, 11-12, 14-15, 22-23, and 25 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ghido (U.S. Publication No. 20180190303).
Regarding claim 1, Ghido discloses a device configured to encode ambisonic audio data, ([0208] - the audio pre-processor 200 performs a pre-processing of each SAOC transport channel or each High Order Ambisonics (HOA) transport channel separately. [0238] – a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein… Figure 9A – pre-processor 200, core encoder 900) the device comprising:
a memory configured to store the scene-based audio data ([0231] – The inventive encoded audio signal can be stored on a digital storage medium); and
one or more processors configured to (Figure 9A – audio pre-processor 200, Figure 10C – HREP preprocessor 100):
perform spatial audio encoding with respect to the ambisonic audio data to obtain a foreground audio signal and a corresponding spatial component, the spatial component defining spatial characteristics of the foreground audio signal ([0072] - Mapping of applause classification result onto MGC and HREP settings. HREP is a stand-alone pre-/post processor which embraces all other coder components including bandwidth extension and parametric spatial coding tools. [0208] - the audio pre-processor 200 performs a pre-processing of each SAOC transport channel or each High Order Ambisonics (HOA) transport channel separately as illustrated in FIG. 10A. In this case, the audio signal is a spatial audio object transport channel or a High Order Ambisonics transport channel);
perform a gain and shape analysis with respect to the foreground audio signal to obtain a gain and a shape representative of the foreground audio signal ([0255] - FIG. 9C displays the signal flow inside the HREP processor within the encoder. The preprocessing is applied by splitting the input signal into a low pass (LP) part and a high pass (HP) part. [0256] - The side information comprises low pass (LP) shape information and scalar gains that are estimated within an HREP analysis block (not depicted));
encode the gain and the shape to obtain a coded gain and a coded shape ([0014] - The principle of this approach is illustrated in FIG. 12. The dynamics of the input signal is reduced by a gain modification (multiplicative pre-processing) prior to its encoding. In this way, “peaks” in the signal are attenuated prior to encoding. [0051] - HREP (High Resolution Envelope Processing) is a tool for improved coding of signals that predominantly consist of many dense transient events, such as applause, rain drop sounds, etc. At the encoder side, the tool works as a pre-processor with high temporal resolution before the actual perceptual audio codec by analyzing the input signal, attenuating and thus temporally flattening the high frequency part of transient events, and generating a small amount of side information (1-4 kbps for stereo signals)); and
specify, in a bitstream, the coded gain and the coded shape ([0014] - The parameters of the gain modification are transmitted in the bitstream. [0025] - Depending on the shape of the gain modification function the frequency response of the analysis filters is altered according to the composite window function).
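For illustration only, the terms used in the mapping above can be read against a generic gain and shape decomposition, in which a block of samples is factored into a scalar gain and a unit-norm shape. The sketch below is an assumption made for exposition; it is not taken from Ghido or from the claims, and the function names gain_shape_analysis and gain_shape_synthesis are hypothetical.

```python
import math


def gain_shape_analysis(block):
    """Split a block of samples into (gain, shape).

    Hypothetical illustration: gain is the Euclidean norm of the block,
    and shape is the block divided by that norm (unit-norm vector).
    """
    gain = math.sqrt(sum(v * v for v in block))
    if gain == 0.0:
        # An all-zero block has no direction; return a zero shape.
        return 0.0, [0.0] * len(block)
    return gain, [v / gain for v in block]


def gain_shape_synthesis(gain, shape):
    """Rebuild the block by scaling the shape with the gain."""
    return [gain * s for s in shape]


gain, shape = gain_shape_analysis([3.0, 4.0])
rebuilt = gain_shape_synthesis(gain, shape)
```

Under this reading, the gain carries the level of the block and the shape carries its normalized fine structure; synthesis multiplies the two back together.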
Regarding claim 3, Ghido discloses the device of claim 1, 
wherein the one or more processors are further configured to quantize the gain to obtain a quantized gain as the coded gain ([0156] - The value of gfloat [k] is quantized and clipped to the range allowed by the chosen value of the extendedGain Range configuration option).
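The operation quoted from [0156], quantizing a floating-point gain and clipping it to an allowed range, might be sketched as follows. The rounding rule and the bounds g_min and g_max are assumptions chosen for illustration; Ghido ties the actual range to the extendedGainRange option, whose values are not reproduced here.

```python
import math


def quantize_and_clip(g_float, g_min, g_max):
    """Round a floating-point gain to an integer index, then clip it.

    Hypothetical sketch: rounds half up via floor(x + 0.5) and limits
    the result to the inclusive range [g_min, g_max].
    """
    index = math.floor(g_float + 0.5)
    return max(g_min, min(g_max, index))


# Example with an assumed allowed range of -4..3.
in_range = quantize_and_clip(2.4, -4, 3)    # rounds to 2, no clipping
too_high = quantize_and_clip(7.9, -4, 3)    # rounds to 8, clipped to 3
too_low = quantize_and_clip(-9.2, -4, 3)    # rounds to -9, clipped to -4
```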
Regarding claim 11, Ghido discloses a method of encoding ambisonic audio data, the method comprising:
performing spatial audio encoding with respect to the ambisonic audio data to obtain a foreground audio signal and a corresponding spatial component, the spatial component defining spatial characteristics of the foreground audio signal ([0072] - Mapping of applause classification result onto MGC and HREP settings. HREP is a stand-alone pre-/post processor which embraces all other coder components including bandwidth extension and parametric spatial coding tools. [0208] - the audio pre-processor 200 performs a pre-processing of each SAOC transport channel or each High Order Ambisonics (HOA) transport channel separately as illustrated in FIG. 10A. In this case, the audio signal is a spatial audio object transport channel or a High Order Ambisonics transport channel. Figure 9A – pre-processor 200, core encoder 900);
performing a gain and shape analysis with respect to the foreground audio signal to obtain a gain and a shape representative of the foreground audio signal ([0255] - FIG. 9C displays the signal flow inside the HREP processor within the encoder. The preprocessing is applied by splitting the input signal into a low pass (LP) part and a high pass (HP) part. [0256] - The side information comprises low pass (LP) shape information and scalar gains that are estimated within an HREP analysis block (not depicted));
encoding the gain and the shape to obtain a coded gain and a coded shape ([0014] - The principle of this approach is illustrated in FIG. 12. The dynamics of the input signal is reduced by a gain modification (multiplicative pre-processing) prior to its encoding. In this way, “peaks” in the signal are attenuated prior to encoding. [0051] - HREP (High Resolution Envelope Processing) is a tool for improved coding of signals that predominantly consist of many dense transient events, such as applause, rain drop sounds, etc. At the encoder side, the tool works as a pre-processor with high temporal resolution before the actual perceptual audio codec by analyzing the input signal, attenuating and thus temporally flattening the high frequency part of transient events, and generating a small amount of side information (1-4 kbps for stereo signals));  and
specifying, in a bitstream, the coded gain and the coded shape ([0014] - The parameters of the gain modification are transmitted in the bitstream. [0025] - Depending on the shape of the gain modification function the frequency response of the analysis filters is altered according to the composite window function).
Regarding claim 12, Ghido discloses a device configured to decode a bitstream representative of encoded ambisonic audio data, ([0238] – a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein… Figure 9B – post-processor 100, core decoder 930), the device comprising:
a memory configured to store the bitstream, the bitstream including an encoded audio object and a corresponding spatial component that defines spatial characteristics of the encoded foreground audio signal, the encoded foreground audio signal including a coded gain and a coded shape (bitstream: [0013] - The principle of this approach is illustrated in FIG. 12. The dynamics of the input signal is reduced by a gain modification (multiplicative pre-processing) prior to its encoding. In this way, “peaks” in the signal are attenuated prior to encoding. encoded audio object: [0030] - a core decoder for decoding the core encoded signal using the core side information to obtain a decoded core signal; and a post-processor for post-processing the decoded core signal using the time-variable high frequency gain information. Spatial characteristics: [0223] - …and a High Order Ambisonics (HOA) decoder 596 are provided. Memory: [0231] – The inventive encoded audio signal can be stored on a digital storage medium); and
one or more processors configured to (Figure 9B – audio post-processor 100, Figure 10C – HREP postprocessor 100):
	perform a gain and shape synthesis with respect to the coded gain and the coded shape to obtain a foreground audio signal ([0257] - The decoder side processing is outlined in Fig . The side information on HP shape information and scalar gains are parsed from the bit stream (not depicted) and applied to the signal resembling a decoder post-processing inverse to that of the encoder pre-processing); and
reconstruct, based on the foreground audio signal and the spatial component, the ambisonic audio data ([0048] - these signal portions are reconstructed by the audio post-processing subsequent to the decoder operation).
Regarding claim 14, Ghido discloses the device of claim 12, 
wherein the coded gain comprises a gain difference ([0008] - and a second layer coding section that inputs the difference signal thereto, selects a second quantization target band of the difference signal from the plurality of sub-bands to obtain second band information, and obtains a second gain of the difference signal of the second quantization target band and to generate second coded information including the second band information and second gain coded information obtained by encoding the second gain);
wherein the one or more processors are further configured to (Figure 9B – audio post-processor 100, Figure 10C – HREP postprocessor 100):
obtain, from the bitstream (Figure 10C – bitstream), a reference coded gain (Figure 1 – gain information. Note: Figure 1 shows a detailed look into audio post-processor 100 while Figure 10C shows a more broad and general look at the workings of the device (and includes the same post-processor 100)); and
add the reference coded gain to the gain difference to obtain a gain of the ambisonic audio data ([0171] - the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs[f] that depends on the time-variable high frequency gain information for the corresponding block. [0178] - side information decoder 620 that generates and calculates a decoded gain 621 and/or a decoded gain compensation value 622 based on the corresponding gain precision information and the corresponding compensation precision information. [0208] - In this case, the audio signal is a spatial audio object transport channel or a High Order Ambisonics transport channel).
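As a reading aid, the claimed relationship between a reference coded gain and a gain difference corresponds to ordinary differential coding. The following minimal sketch is hypothetical and is not drawn from Ghido; the function names and the integer gain indices are assumptions used only to show the arithmetic.

```python
def encode_gain_differences(gain_indices, reference):
    """Express each gain index as a difference from a reference index."""
    return [g - reference for g in gain_indices]


def decode_gains(reference, differences):
    """Recover each gain index by adding the reference to its difference."""
    return [reference + d for d in differences]


# Assumed example values: a reference index of 10 and three gain indices.
reference = 10
differences = encode_gain_differences([12, 9, 10], reference)
recovered = decode_gains(reference, differences)
```

The decoder side needs only the reference and the differences; adding them restores the original gain indices.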
Regarding claim 15, Ghido discloses the device of claim 12, 
wherein the one or more processors are further configured to dequantize the coded gain and the coded shape to obtain a gain and a shape ([0193] - The parameter beta_factor (which is an improved parameterization of parameter beta) is used to expand the gains after dequantization during post–processing. [0211] - Additionally, the audio decoding apparatus has the post processor 100 for post-processing the decoded core signal 102 using the time-variable high frequency gain information 104. [0257] - The decoder side processing is outlined in Fig . The side information on HP shape information and scalar gains are parsed from the bit stream (not depicted) and applied to the signal resembling a decoder post-processing inverse to that of the encoder pre-processing), and	
wherein the one or more processors are configured to perform the gain and shape synthesis with respect to the gain and the shape to obtain the audio object (Figure 4 – analysis windower 115, DFT processor 116. [0129] - The DFT processor 116 has an output connected to an input of a low pass shaper 117. The low pass shaper 117 actually performs the low pass filtering action, and the output of the low pass shaper 117 is connected to a DFT inverse processor 118 for generating a sequence of blocks of low pass time domain sampling values. Finally, a synthesis windower 119 is provided at an output of the DFT inverse processor for windowing the sequence of blocks of low pass time domain sampling values using a synthesis window. The output of the synthesis windower 119 is a time domain low pass signal. Thus, blocks 115 to 119 correspond to the “low pass filter” block 111 of FIG. 2, and blocks 121 and 113 correspond to the “subtractor” 113 of FIG. 2. Thus, in the embodiment illustrated in FIG. 4, the band extractor further comprises the audio signal windower 121 for windowing the audio signal 102 using the analysis window and the synthesis window to obtain a sequence of windowed blocks of audio signal values. [0117] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs[f] that depends on the time-variable high frequency gain information for the corresponding block. An implementation of the shaping function rs[f] has been discussed before, but alternative functions can be used as well).
Regarding claim 22, Ghido discloses the device of claim 12,
wherein the one or more processors are further configured to render the ambisonic audio data to one or more speaker feeds ([0212] - The channel-individual post processor 100 outputs post-processed channels that can be output to a digital/analog converter and subsequently connected loudspeakers or that can be output to some kind of further processing or storage or any other suitable procedure for processing audio objects or audio channels), and
wherein the device comprises one or more speakers configured to reproduce, based on the speaker feeds, a soundfield represented by the scene-based audio data ([0212] - The channel-individual post processor 100 outputs post-processed channels that can be output to a digital/analog converter and subsequently connected loudspeakers or that can be output to some kind of further processing or storage or any other suitable procedure for processing audio objects or audio channels).
Regarding claim 23, Ghido discloses a method of decoding a bitstream representative of ambisonic audio data, the method comprising: 
obtaining, from the bitstream, an encoded foreground audio signal and a corresponding spatial component that defines spatial characteristics of the encoded foreground audio signal, the encoded foreground audio signal including a coded gain and a coded shape (bitstream: [0013] - The parameters of the gain modification are transmitted in the bitstream. encoded audio signal: [0030] - a core decoder for decoding the core encoded signal using the core side information to obtain a decoded core signal; and a post-processor for post-processing the decoded core signal using the time-variable high frequency gain information. spatial characteristics: [0223] - …and a High Order Ambisonics (HOA) decoder 596 are provided);
performing a gain and shape synthesis with respect to the coded gain and the coded shape to obtain a foreground audio signal ([0257] - The decoder side processing is outlined in Fig . The side information on HP shape information and scalar gains are parsed from the bit stream (not depicted) and applied to the signal resembling a decoder post-processing inverse to that of the encoder pre-processing. Figure 4 – analysis windower 115, DFT processor 116. [0129] - The DFT processor 116 has an output connected to an input of a low pass shaper 117. The low pass shaper 117 actually performs the low pass filtering action, and the output of the low pass shaper 117 is connected to a DFT inverse processor 118 for generating a sequence of blocks of low pass time domain sampling values. Finally, a synthesis windower 119 is provided at an output of the DFT inverse processor for windowing the sequence of blocks of low pass time domain sampling values using a synthesis window. The output of the synthesis windower 119 is a time domain low pass signal. Thus, blocks 115 to 119 correspond to the “low pass filter” block 111 of FIG. 2, and blocks 121 and 113 correspond to the “subtractor” 113 of FIG. 2. Thus, in the embodiment illustrated in FIG. 4, the band extractor further comprises the audio signal windower 121 for windowing the audio signal 102 using the analysis window and the synthesis window to obtain a sequence of windowed blocks of audio signal values. [0117] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs[f] that depends on the time-variable high frequency gain information for the corresponding block. An implementation of the shaping function rs[f] has been discussed before, but alternative functions can be used as well); and
reconstructing, based on the foreground audio signal and the spatial component, the ambisonic audio data ([0048] - these signal portions are reconstructed by the audio post-processing subsequent to the decoder operation).
Regarding claim 25, Ghido discloses the method of claim 23, 
wherein the coded gain comprises a gain difference, and wherein the method further comprises: obtaining, from the bitstream, a reference coded gain (Figure 1 – gain information. Figure 10C – bitstream. Note: Figure 1 shows a detailed look into audio post-processor 100 while Figure 10C shows a more broad and general look at the workings of the device (and includes the same post-processor 100). [0171] - the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs[f] that depends on the time-variable high frequency gain information for the corresponding block. [0178] - side information decoder 620 that generates and calculates a decoded gain 621 and/or a decoded gain compensation value 622 based on the corresponding gain precision information and the corresponding compensation precision information); and
adding the reference coded gain to the gain difference to obtain a gain of the ambisonic audio data ([0178] - side information decoder 620 that generates and calculates a decoded gain 621 and/or a decoded gain compensation value 622 based on the corresponding gain precision information and the corresponding compensation precision information. [0208] - In this case, the audio signal is a spatial audio object transport channel or a High Order Ambisonics transport channel).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 13, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Ghido (U.S. Publication No. 20180190303) in view of Wylie (apt-X100: Low-Delay, Low Bit-Rate Subband ADPCM Digital Audio Coding).
Regarding claim 2, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1, 
wherein the one or more processors are configured to perform the gain and shape analysis according to an AptX compression algorithm with respect to the foreground audio signal to obtain the gain and the shape representative of the audio object.
Wylie does teach the device of claim 1, 
wherein the one or more processors are configured to perform the gain and shape analysis according to an AptX compression algorithm with respect to the foreground audio signal to obtain the gain and the shape representative of the audio object (Pg. 83 – The apt-X100 algorithm aims to code… digital audio signals… It provides at the output of the coder a 16-bit word… thus achieving 4:1 compression. Figure 4 – Variation of quantizer gain with subbands).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Wylie in order to implement the device of claim 1, wherein the one or more processors are configured to perform the gain and shape analysis according to an AptX compression algorithm with respect to the foreground audio signal to obtain the gain and the shape representative of the audio object. Doing so allows for digital audio and video recorders to use up to 16 channels, which allows for multichannel or multilingual recording (Wylie – Page 83).
Regarding claim 13, Ghido discloses all of the limitations as in claim 12, above.
However, Ghido does not disclose the device of claim 12,
wherein the one or more processors are configured to perform the gain and shape synthesis according to an AptX decompression algorithm to obtain the foreground audio signal.
Wylie does teach the device of claim 12,
wherein the one or more processors are configured to perform the gain and shape synthesis according to an AptX decompression algorithm to obtain the foreground audio signal (Pg. 83 – The apt-X100 algorithm aims to code… digital audio signals… It provides at the output of the coder a 16-bit word… thus achieving 4:1 compression. Figure 4 – Variation of quantizer gain with subbands). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Wylie in order to implement the device of claim 12, wherein the one or more processors are configured to perform the gain and shape synthesis according to an AptX decompression algorithm to obtain the foreground audio signal. Doing so allows for digital audio and video recorders to use up to 16 channels, which allows for multichannel or multilingual recording (Wylie – Page 83).
Regarding claim 24, Ghido discloses all of the limitations as in claim 23, above.
However, Ghido does not disclose the method of claim 23,
wherein performing the gain and shape synthesis comprises performing the gain and shape synthesis according to an AptX decompression algorithm to obtain the foreground audio signal.
Wylie does teach the method of claim 23,
wherein performing the gain and shape synthesis comprises performing the gain and shape synthesis according to an AptX decompression algorithm to obtain the foreground audio signal (Pg. 83 – The apt-X100 algorithm aims to code… digital audio signals… It provides at the output of the coder a 16-bit word… thus achieving 4:1 compression. Figure 4 – Variation of quantizer gain with subbands). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Wylie in order to implement the device of claim 23, wherein performing the gain and shape synthesis comprises performing the gain and shape synthesis according to an AptX decompression algorithm to obtain the foreground audio signal. Doing so allows for digital audio and video recorders to use up to 16 channels, which allows for multichannel or multilingual recording (Wylie – Page 83).
Claims 4, 16, 26, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Ghido (U.S. Publication No. 20180190303) in view of Svedberg (U.S. Publication No. 20160088297).
Regarding claim 4, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1,
wherein the one or more processors are further configured to recursively quantize the gain as a coarse quantized gain and one or more fine quantized residuals, the coarse quantized gain and the one or more fine quantized residuals representative of the coded gain.
Svedberg does teach the device of claim 1,
wherein the one or more processors are further configured to recursively quantize the gain as a coarse quantized gain and one or more fine quantized residuals, the coarse quantized gain and the one or more fine quantized residuals representative of the coded gain ([0004] - Gain and shape components are then encoded using a shape quantizer which is tuned for the normalized shape input and a gain quantizer which handles the dynamics of the signal. This structure is well used in e.g. audio coding since the division into dynamics and shape (or fine structure) fits well with the perceptual auditory model. [0275] - A quantized gain parameter with precision derived from the current allocation is entropy coded to represent the relative gains of each side of the split, and the entire decoding process is recursively applied).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Svedberg in order to implement the device of claim 1, wherein the one or more processors are further configured to recursively quantize the gain as a coarse quantized gain and one or more fine quantized residuals, the coarse quantized gain and the one or more fine quantized residuals representative of the coded gain. Doing so allows the device to limit the maximum size of bits to 32 in order to avoid the need for multi-precision calculations when decoding vectors ([0275]).
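For context, the claimed arrangement of a coarse quantized gain followed by one or more fine quantized residuals can be pictured as a generic multi-stage scalar quantizer. The sketch below is an assumption for exposition only; it is not Svedberg's or Ghido's method, and the step-halving rule, function names, and parameters are hypothetical.

```python
import math


def recursive_quantize(gain, coarse_step, num_fine_stages):
    """Quantize a gain as one coarse index plus fine residual indices.

    Hypothetical scheme: the coarse stage uses coarse_step, and each
    fine stage quantizes the remaining residual with half the previous
    step size.
    """
    coarse = math.floor(gain / coarse_step + 0.5)
    residual = gain - coarse * coarse_step
    step = coarse_step
    fine = []
    for _ in range(num_fine_stages):
        step /= 2
        index = math.floor(residual / step + 0.5)
        fine.append(index)
        residual -= index * step
    return coarse, fine


def recursive_dequantize(coarse, fine, coarse_step):
    """Rebuild the gain from the coarse index and the fine residuals."""
    value = coarse * coarse_step
    step = coarse_step
    for index in fine:
        step /= 2
        value += index * step
    return value


coarse, fine = recursive_quantize(5.3, coarse_step=2.0, num_fine_stages=3)
approx = recursive_dequantize(coarse, fine, coarse_step=2.0)
```

In this sketch each added fine stage halves the worst-case reconstruction error, so the coarse index and the residual indices together represent the coded gain.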
Regarding claim 16, Ghido discloses all of the limitations as in claim 12, above. 
Ghido discloses the device of claim 12, 
wherein the one or more processors are configured to perform the gain and shape synthesis with respect to the gain and the coded shape to obtain the audio object (Figure 4 – analysis windower 115, DFT processor 116. [0129] - The DFT processor 116 has an output connected to an input of a low pass shaper 117. The low pass shaper 117 actually performs the low pass filtering action, and the output of the low pass shaper 117 is connected to a DFT inverse processor 118 for generating a sequence of blocks of low pass time domain sampling values. Finally, a synthesis windower 119 is provided at an output of the DFT inverse processor for windowing the sequence of blocks of low pass time domain sampling values using a synthesis window. The output of the synthesis windower 119 is a time domain low pass signal. Thus, blocks 115 to 119 correspond to the “low pass filter” block 111 of FIG. 2, and blocks 121 and 113 correspond to the “subtractor” 113 of FIG. 2. Thus, in the embodiment illustrated in FIG. 4, the band extractor further comprises the audio signal windower 121 for windowing the audio signal 102 using the analysis window and the synthesis window to obtain a sequence of windowed blocks of audio signal values. [0117] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs[f] that depends on the time-variable high frequency gain information for the corresponding block. An implementation of the shaping function rs[f] has been discussed before, but alternative functions can be used as well).
However, Ghido does not disclose the device of claim 12,
wherein the coded gain comprises a coarsely quantized gain and one or more fine quantized residuals,
wherein the one or more processors are further configured to dequantize, based on the coarse quantized gain and the one or more fine quantized residuals, the coded gain to obtain a gain.
Svedberg does teach the device of claim 12,
wherein the coded gain comprises a coarsely quantized gain and one or more fine quantized residuals ([0004] - Gain and shape components are then encoded using a shape quantizer which is tuned for the normalized shape input and a gain quantizer which handles the dynamics of the signal. This structure is well used in e.g. audio coding since the division into dynamics and shape (or fine structure) fits well with the perceptual auditory model. [0275] - A quantized gain parameter with precision derived from the current allocation is entropy coded to represent the relative gains of each side of the split, and the entire decoding process is recursively applied),
wherein the one or more processors are further configured to dequantize, based on the coarse quantized gain and the one or more fine quantized residuals, the coded gain to obtain a gain ([0004] - Gain and shape components are then encoded using a shape quantizer which is tuned for the normalized shape input and a gain quantizer which handles the dynamics of the signal. This structure is well used in e.g. audio coding since the division into dynamics and shape (or fine structure) fits well with the perceptual auditory model. [0075] - A norm de-quantizer 66 uses the NORMO-bits to provide a norm factor g. The norm factor is then used to form the final output vector 2 being a reconstructed Sample 3 of the original audio/video sample).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Svedberg in order to implement the device of claim 12, wherein the coded gain comprises a coarsely quantized gain and one or more fine quantized residuals, wherein the one or more processors are further configured to dequantize, based on the coarse quantized gain and the one or more fine quantized residuals, the coded gain to obtain a gain. Doing so allows the device to limit the maximum size of bits to 32 in order to avoid the need for multi-precision calculations when decoding vectors ([0275]).
Regarding claim 26, Ghido discloses all of the limitations as in claim 23, above.
Ghido discloses the method of claim 23,
wherein performing the gain and shape synthesis comprises performing the gain and shape synthesis with respect to the gain and the shape to obtain the audio object (Figure 4 – analysis windower 115, DFT processor 116 [0129] - The DFT processor 116 has an output connected to an input of a low pass shaper 117. The low pass shaper 117 actually performs the low pass filtering action, and the output of the low pass shaper 117 is connected to a DFT inverse processor 118 for generating a sequence of blocks of low pass time domain sampling values. Finally, a synthesis windower 119 is provided at an output of the DFT inverse processor for windowing the sequence of blocks of low pass time domain sampling values using a synthesis window. The output of the synthesis windower 119 is a time domain low pass signal. Thus, blocks 115 to 119 correspond to the “low pass filter” block 111 of FIG. 2, and blocks 121 and 113 correspond to the “subtractor” 113 of FIG. 2. Thus, in the embodiment illustrated in FIG. 4, the band extractor further comprises the audio signal windower 121 for windowing the audio signal 102 using the analysis window and the synthesis window to obtain a sequence of windowed blocks of audio signal values. [0117] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs [f] that depends on the time-variable high frequency gain information for the corresponding block. An implementation of the shaping function rs [f] has been discussed before, but alternative functions can be used as well).
However, Ghido does not disclose the method of claim 23, further comprising:
dequantizing the coded gain and the coded shape to obtain a gain and a shape.
Svedberg does teach the method of claim 23, further comprising:
dequantizing the coded gain and the coded shape to obtain a gain and a shape ([0008] - gain-shape VQ [0075] – MPVQ deindexing…A norm de-quantizer 66 uses the NORMO-bits to provide a norm factor g. The norm factor is then used to form the final output vector 2 being a reconstructed Sample 3 of the original audio/video sample [0142] - The MPVQ deindexing starts in step 250. In step 260, VQ dimensions N and the number of unit pulses K are achieved from the codec bit allocation loop. In step 270, a size and offsets are found. In step 280, a leading sign is extracted from the incoming bit stream and in step 285, the MPVQ index is obtained from the incoming bit stream. These quantities are utilized in step 290, where the MPVQ index is decomposed).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Svedberg in order to implement the method of claim 23, further comprising dequantizing the coded gain and the coded shape to obtain a gain and a shape. Doing so allows the device to limit the maximum size of bits to 32 in order to avoid the need for multi-precision calculations when decoding vectors ([0275]).
Regarding claim 27, Ghido discloses all of the limitations as in claim 23, above.
Ghido discloses the method of claim 23, wherein performing the gain and shape synthesis comprises performing the gain and shape synthesis with respect to the gain and the coded shape to obtain the audio object (Figure 4 – analysis windower 115, DFT processor 116 [0129] - The DFT processor 116 has an output connected to an input of a low pass shaper 117. The low pass shaper 117 actually performs the low pass filtering action, and the output of the low pass shaper 117 is connected to a DFT inverse processor 118 for generating a sequence of blocks of low pass time domain sampling values. Finally, a synthesis windower 119 is provided at an output of the DFT inverse processor for windowing the sequence of blocks of low pass time domain sampling values using a synthesis window. The output of the synthesis windower 119 is a time domain low pass signal. Thus, blocks 115 to 119 correspond to the “low pass filter” block 111 of FIG. 2, and blocks 121 and 113 correspond to the “subtractor” 113 of FIG. 2. Thus, in the embodiment illustrated in FIG. 4, the band extractor further comprises the audio signal windower 121 for windowing the audio signal 102 using the analysis window and the synthesis window to obtain a sequence of windowed blocks of audio signal values. [0171] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs [f] that depends on the time-variable high frequency gain information for the corresponding block. [0117] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs [f] that depends on the time-variable high frequency gain information for the corresponding block. An implementation of the shaping function rs [f] has been discussed before, but alternative functions can be used as well).
However, Ghido does not disclose the method of claim 23,
wherein the coded gain comprises a coarsely quantized gain and one or more fine quantized residuals,
wherein the method further comprises dequantizing, based on the coarsely quantized gain and the one or more fine quantized residuals, the coded gain to obtain a gain.
Svedberg does teach the method of claim 23,
wherein the coded gain comprises a coarsely quantized gain and one or more fine quantized residuals ([0004] - Gain and shape components are then encoded using a shape quantizer which is tuned for the normalized shape input and a gain quantizer which handles the dynamics of the signal. This structure is well used in e.g. audio coding since the division into dynamics and shape (or fine structure) fits well with the perceptual auditory model. [0275] - A quantized gain parameter with precision derived from the current allocation is entropy coded to represent the relative gains of each side of the split, and the entire decoding process is recursively applied),
wherein the method further comprises dequantizing, based on the coarsely quantized gain and the one or more fine quantized residuals, the coded gain to obtain a gain ([0004] - Gain and shape components are then encoded using a shape quantizer which is tuned for the normalized shape input and a gain quantizer which handles the dynamics of the signal. This structure is well used in e.g. audio coding since the division into dynamics and shape (or fine structure) fits well with the perceptual auditory model. [0075] - A norm de-quantizer 66 uses the NORMO-bits to provide a norm factor g. The norm factor is then used to form the final output vector 2 being a reconstructed Sample 3 of the original audio/video sample).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Svedberg in order to implement the method of claim 23, wherein the coded gain comprises a coarsely quantized gain and one or more fine quantized residuals, wherein the method further comprises dequantizing, based on the coarsely quantized gain and the one or more fine quantized residuals, the coded gain to obtain a gain. Doing so allows the device to limit the maximum size of bits to 32 in order to avoid the need for multi-precision calculations when decoding vectors (Svedberg [0275]).
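For reference, the relationship recited above between a coarsely quantized gain and one or more fine quantized residuals can be sketched as a successive-refinement scalar quantizer. This is a minimal sketch under stated assumptions: the function names, the 6 dB coarse step, and the rule that each fine stage uses one quarter of the previous step are hypothetical choices made for illustration and are not taken from the application, Ghido, or Svedberg.

```python
def quantize_gain(gain_db, coarse_step=6.0, num_fine=2):
    """Split a gain (in dB) into a coarse index and a list of fine residual indices.

    All parameters are illustrative assumptions, not values from any cited reference.
    """
    coarse_index = round(gain_db / coarse_step)
    residual = gain_db - coarse_index * coarse_step
    fine_indices = []
    step = coarse_step
    for _ in range(num_fine):
        step /= 4.0  # assumed refinement rule: each stage is 4x finer
        idx = round(residual / step)
        fine_indices.append(idx)
        residual -= idx * step
    return coarse_index, fine_indices


def dequantize_gain(coarse_index, fine_indices, coarse_step=6.0):
    """Rebuild the gain from the coarse index plus each fine residual in turn."""
    gain_db = coarse_index * coarse_step
    step = coarse_step
    for idx in fine_indices:
        step /= 4.0  # must mirror the encoder's refinement rule
        gain_db += idx * step
    return gain_db


coarse, fine = quantize_gain(13.3)
reconstructed = dequantize_gain(coarse, fine)
```

In this sketch the decoder needs only the coarse index and the residual indices to recover the gain, and each added residual reduces the worst-case reconstruction error to half of that stage's step size.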
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Ghido (U.S. Publication No. 20180190393) in view of Yamanashi (U.S. Publication No. 20120221344).
Regarding claim 5, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1,
wherein the one or more processors are further configured to determine a difference between the gain and a gain of a different foreground audio signal, and
wherein the one or more processors are configured to encode the difference to obtain the coded gain.
Yamanashi does teach the device of claim 1,
wherein the one or more processors are further configured to determine a difference between the gain and a gain of a different foreground audio signal ([0008] – inputs an input signal… generates first coded information including the first band information and first gain coded information obtained by encoding the first gain and generates a difference signal between a decoded signal obtained by performing decoding using the first coded information and the input signal), and
wherein the one or more processors are configured to encode the difference to obtain the coded gain ([0008] - and a second layer coding section that inputs the difference signal thereto, selects a second quantization target band of the difference signal from the plurality of sub-bands to obtain second band information, and obtains a second gain of the difference signal of the second quantization target band and to generate second coded information including the second band information and second gain coded information obtained by encoding the second gain).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Yamanashi in order to implement the device of claim 1, wherein the one or more processors are further configured to determine a difference between the gain and a gain of a different foreground audio signal, and wherein the one or more processors are configured to encode the difference to obtain the coded gain. Doing so allows for improved quality of the decoded signal in the hierarchical coding scheme in which the band of the coding target is selected in each hierarchy (Yamanashi [0007]).
Claims 6, 9-10, and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Ghido (U.S. Publication No. 20180190393) in view of Zotkin (U.S. Publication No. 20210297780).
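The limitation of claim 5 discussed above, in which a difference between two gains is determined and that difference is encoded to obtain the coded gain, can be sketched as simple differential coding with a uniform quantizer. This is a minimal sketch only: the function names and the 0.5 dB step size are hypothetical and are not drawn from the application, Ghido, or Yamanashi.

```python
def encode_gain_difference(gain_db, reference_gain_db, step=0.5):
    """Quantize the difference between a gain and a reference gain.

    reference_gain_db stands in for the gain of a different foreground audio
    signal; the step size is an assumption made for this example.
    """
    return round((gain_db - reference_gain_db) / step)


def decode_gain(diff_index, reference_gain_db, step=0.5):
    """Recover the gain from the coded difference and the reference gain."""
    return reference_gain_db + diff_index * step


coded = encode_gain_difference(-11.8, -12.5)
recovered = decode_gain(coded, -12.5)
```

Because gains of related signals tend to be close in value, the difference typically spans a smaller range than the gain itself, which is the usual motivation for coding the difference rather than the absolute value.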
Regarding claim 6, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1, 
wherein the one or more processors are configured to perform a linear invertible transform with respect to the ambisonic audio data to obtain the foreground audio signal and the corresponding spatial component.
Zotkin does teach the device of claim 1, 
wherein the one or more processors are configured to perform a linear invertible transform with respect to the ambisonic audio data to obtain the foreground audio signal and the corresponding spatial component ([0017] - determining a plane-wave transfer function for a spatial-audio recording device including a number of microphones based on a physical shape of the spatial-audio recording device. [0080] - An arbitrary 3D spatial acoustic field in the time domain can be converted to the frequency domain using known techniques of segmentation of time signals followed by Fourier transform).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Zotkin in order to implement the device of claim 1, wherein the one or more processors are configured to perform a linear invertible transform with respect to the ambisonic audio data to obtain the foreground audio signal and the corresponding spatial component. Doing so allows the device to determine time harmonic acoustic fields provided by microphones (Zotkin [0080]).
Regarding claim 9, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1,
wherein the ambisonic audio data comprises audio data defined in a spherical harmonic domain.
Zotkin does teach the device of claim 1,
wherein the ambisonic audio data comprises audio data defined in a spherical harmonic domain ([0068] - As described above, embodiments for recovery of the incident acoustic field using a microphone array mounted on an arbitrarily-shaped scatterer are provided for. The scatterer influence on the field is characterized through an HRTF-like transfer function, which is computed in spherical harmonics domain using numerical methods, enabling one to obtain spherical spectra of the incident field from the microphone potentials directly via least-squares fitting. Incidentally, said spherical spectra include ambisonics representation of the field, allowing for use of such array as a HOA recording device).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Zotkin in order to implement the device of claim 1, wherein the ambisonic audio data comprises audio data defined in a spherical harmonic domain. Doing so allows for use of such arrays as an HOA recording device and shows robustness to noise (Zotkin [0068]).
Regarding claim 10, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1, 
wherein the foreground audio signal comprises a foreground audio signal defined in the spherical harmonic domain, and
wherein the spatial component comprises a spatial component defined in the spherical harmonic domain.
Zotkin does teach the device of claim 1, 
wherein the foreground audio signal comprises a foreground audio signal defined in the spherical harmonic domain ([0063] - retrieve a plurality of signals captured by the microphones, determine spherical-harmonics coefficients for an audio signal based on the plurality of captured signals and the spherical harmonics transfer function, and generate the audio signal based on the determined spherical-harmonics coefficients), and
wherein the spatial component comprises a spatial component defined in the spherical harmonic domain ([0017] - a method of generating an audio signal includes determining a plane-wave transfer function for a spatial-audio recording device including a number of microphones based on a physical shape of the spatial-audio recording device, and expanding the plane wave transfer function to generate a spherical-harmonics transfer function corresponding to the plane-wave transfer function).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Zotkin in order to implement the device of claim 1, wherein the foreground audio signal comprises a foreground audio signal defined in the spherical harmonic domain, and wherein the spatial component comprises a spatial component defined in the spherical harmonic domain. Doing so allows for use of such arrays as an HOA recording device and shows robustness to noise (Zotkin [0068]).
Regarding claim 20, Ghido discloses all of the limitations as in claim 12, above.
However, Ghido does not disclose the device of claim 12,
wherein the ambisonic audio data comprises audio data defined in a spherical harmonic domain.
Zotkin does teach the device of claim 12,
wherein the ambisonic audio data comprises audio data defined in a spherical harmonic domain ([0068] - As described above, embodiments for recovery of the incident acoustic field using a microphone array mounted on an arbitrarily-shaped scatterer are provided for. The scatterer influence on the field is characterized through an HRTF-like transfer function, which is computed in spherical harmonics domain using numerical methods, enabling one to obtain spherical spectra of the incident field from the microphone potentials directly via least-squares fitting. Incidentally, said spherical spectra include ambisonics representation of the field, allowing for use of such array as a HOA recording device).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Zotkin in order to implement the device of claim 12, wherein the ambisonic audio data comprises audio data defined in a spherical harmonic domain. Doing so allows for use of such arrays as an HOA recording device and shows robustness to noise (Zotkin [0068]).
Regarding claim 21, Ghido discloses all of the limitations as in claim 12, above.
However, Ghido does not disclose the device of claim 12, 
wherein the foreground audio signal comprises a foreground audio signal defined in the spherical harmonic domain, and
wherein the spatial component comprises a spatial component defined in the spherical harmonic domain.
Zotkin does teach the device of claim 12, 
wherein the foreground audio signal comprises a foreground audio signal defined in the spherical harmonic domain ([0063] - retrieve a plurality of signals captured by the microphones, determine spherical-harmonics coefficients for an audio signal based on the plurality of captured signals and the spherical harmonics transfer function, and generate the audio signal based on the determined spherical-harmonics coefficients), and
wherein the spatial component comprises a spatial component defined in the spherical harmonic domain ([0017] - a method of generating an audio signal includes determining a plane-wave transfer function for a spatial-audio recording device including a number of microphones based on a physical shape of the spatial-audio recording device, and expanding the plane wave transfer function to generate a spherical-harmonics transfer function corresponding to the plane-wave transfer function).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Zotkin in order to implement the device of claim 12, wherein the foreground audio signal comprises a foreground audio signal defined in the spherical harmonic domain, and wherein the spatial component comprises a spatial component defined in the spherical harmonic domain. Doing so allows for use of such arrays as an HOA recording device and shows robustness to noise (Zotkin [0068]).
Claims 7-8, 18-19, and 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Ghido (U.S. Publication No. 20180190393) in view of Vilkamo (U.S. Publication No. 20210337338).
Regarding claim 7, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1, 
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one.
Vilkamo does teach the device of claim 1, 
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one ([0065] - Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Vilkamo in order to implement the device of claim 1, wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
Regarding claim 8, Ghido discloses all of the limitations as in claim 1, above.
However, Ghido does not disclose the device of claim 1,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero.
Vilkamo does teach the device of claim 1,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero ([0065] - Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Vilkamo in order to implement the device of claim 1, wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
Regarding claim 18, Ghido discloses all of the limitations as in claim 12, above.
However, Ghido does not disclose the device of claim 12,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one.
Vilkamo does teach the device of claim 12, 
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one ([0065] - Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Vilkamo in order to implement the device of claim 12, wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
Regarding claim 19, Ghido discloses all of the limitations as in claim 12, above.
However, Ghido does not disclose the device of claim 12,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero.
Vilkamo does teach the device of claim 12,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero ([0065] - Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Vilkamo in order to implement the device of claim 12, wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
Regarding claim 29, Ghido discloses all of the limitations as in claim 23, above.
However, Ghido does not disclose the method of claim 23,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one.
Vilkamo does teach the method of claim 23,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one ([0065] - Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Vilkamo in order to implement the method of claim 23, wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than one. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
Regarding claim 30, Ghido discloses all of the limitations as in claim 23, above.
However, Ghido does not disclose the method of claim 23,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero.
Vilkamo does teach the method of claim 23,
wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero ([0065] - Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Vilkamo in order to implement the method of claim 23, wherein the ambisonic audio data comprises ambisonic coefficients corresponding to an order greater than zero. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
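The channel counts quoted from Vilkamo [0065] (9 channels for 2nd order, 16 channels for 3rd order) follow the standard relationship for full three-dimensional ambisonics, in which an order-N representation has (N + 1)^2 coefficients. The short sketch below illustrates that relationship; the function name is hypothetical and the code is not taken from any cited reference.

```python
def ambisonic_channel_count(order):
    """Number of spherical-harmonic channels in a full 3D ambisonic signal of a given order."""
    if order < 0:
        raise ValueError("ambisonic order must be non-negative")
    return (order + 1) ** 2


# Orders 0 through 3: a zeroth-order signal has a single omnidirectional channel,
# first order (FOA) has 4, and the higher orders match the counts quoted above.
counts = [ambisonic_channel_count(n) for n in range(4)]
```

Under this relationship, an order greater than zero implies at least 4 coefficients and an order greater than one implies at least 9.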
Claims 17 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Ghido (U.S. Publication No. 20180190393) in view of Zotkin (U.S. Publication No. 20210297780), further in view of Yamanashi (U.S. Publication No. 20120221344), and further in view of Vilkamo (U.S. Publication No. 20210337338).
Regarding claim 17, Ghido discloses all of the limitations as in claim 12, above.
Ghido does teach the device of claim 12,
wherein the one or more processors are further configured to (Figure 9B – audio post-processor 100, Figure 10C – HREP postprocessor 100):
obtain, from the bitstream, a difference between the coded gain and a coded gain (Figure 1 – gain information, Figure 10C – bitstream [0014] - The parameters of the gain modification are transmitted in the bitstream. [0025] - Depending on the shape of the gain modification function the frequency response of the analysis filters is altered according to the composite window function);
wherein the one or more processors are configured to perform the gain and shape synthesis with respect to the gain and the coded shape (Figure 4 – analysis windower 115, DFT processor 116 [0129] - The DFT processor 116 has an output connected to an input of a low pass shaper 117. The low pass shaper 117 actually performs the low pass filtering action, and the output of the low pass shaper 117 is connected to a DFT inverse processor 118 for generating a sequence of blocks of low pass time domain sampling values. Finally, a synthesis windower 119 is provided at an output of the DFT inverse processor for windowing the sequence of blocks of low pass time domain sampling values using a synthesis window. The output of the synthesis windower 119 is a time domain low pass signal. Thus, blocks 115 to 119 correspond to the “low pass filter” block 111 of FIG. 2, and blocks 121 and 113 correspond to the “subtractor” 113 of FIG. 2. Thus, in the embodiment illustrated in FIG. 4, the band extractor further comprises the audio signal windower 121 for windowing the audio signal 102 using the analysis window and the synthesis window to obtain a sequence of windowed blocks of audio signal values. [0117] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs [f] that depends on the time-variable high frequency gain information for the corresponding block. An implementation of the shaping function rs [f] has been discussed before, but alternative functions can be used as well).
However, Ghido does not disclose the device of claim 12,
wherein the ambisonic audio data includes first ambisonic coefficients corresponding to a first spherical basis function and second ambisonic coefficients corresponding to a second spherical basis function,
wherein the coded gain represents the first ambisonic coefficients,
obtain a difference between the coded gain and a coded gain representative of the second ambisonic coefficients; and
determine, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients; and
wherein the device obtains the first ambisonic coefficients.
Zotkin does teach the device of claim 12,
wherein the ambisonic audio data includes first ambisonic coefficients corresponding to a first spherical basis function and second ambisonic coefficients corresponding to a second spherical basis function ([0003] - Field recovery in terms of the spherical basis allows the generation of a higher-order ambisonics representation of the spatial audio scene. [0048] - The number p-1 is called order of ambisonics recording (even though it refers to the maximum degree of the spherical harmonics used). Older works used p = 2 (first order); since then, higher-order ambisonics (HOA) techniques has been developed for as high as 8);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Zotkin in order to implement the device of claim 12, wherein the ambisonic audio data includes first ambisonic coefficients corresponding to a first spherical basis function and second ambisonic coefficients corresponding to a second spherical basis function and wherein the device obtains the first ambisonic coefficients. Doing so allows the device to determine time harmonic acoustic fields provided by microphones (Zotkin [0080]).
However, Ghido in view of Zotkin does not disclose the device of claim 12,
wherein the coded gain represents the first ambisonic coefficients,
obtains a difference between the coded gain and a coded gain representative of the second ambisonic coefficients; and
determine, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients.
Yamanashi does teach the device of claim 12,
wherein the coded gain represents the first ambisonic coefficients ([0008] - generates first coded information including the first band information and first gain coded information obtained by encoding the first gain. [0056] - Using the spectrum (MDCT coefficient) corresponding to the band indicated by the first layer band information in the input spectrum inputted from band selecting section 301, shape coding section 302 encodes the shape information to generate first layer shape coded information. Next, shape coding section 302 outputs the generated first layer shape coded information to multiplexing section 305. Furthermore, shape coding section 302 outputs an ideal gain (gain information) calculated during shape encoding to gain coding section 304),
obtains a difference between the coded gain and a coded gain representative of the second ambisonic coefficients ([0008] - obtains a second gain of the difference signal of the second quantization target band and to generate second coded information including the second band information and second gain coded information obtained by encoding the second gain. [0056] - Using the spectrum (MDCT coefficient) corresponding to the band indicated by the first layer band information in the input spectrum inputted from band selecting section 301, shape coding section 302 encodes the shape information to generate first layer shape coded information. Next, shape coding section 302 outputs the generated first layer shape coded information to multiplexing section 305. Furthermore, shape coding section 302 outputs an ideal gain (gain information) calculated during shape encoding to gain coding section 304).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido in view of Zotkin to incorporate the teachings of Yamanashi in order to implement the device of claim 12, wherein the coded gain represents the first ambisonic coefficients, and obtains a difference between the coded gain and a coded gain representative of the second ambisonic coefficients. Doing so allows for improved quality of the decoded signal in the hierarchical coding scheme in which the band of the coding target is selected in each hierarchy (Yamanashi [0007]).
However, Ghido in view of Zotkin in view of Yamanashi does not disclose the device of claim 12,
wherein the device determines, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients.
Vilkamo does teach the device of claim 12,
wherein the device determines, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients ([0065] - The coordinate order y, z, x is applied herein because it is the same order as the 1st order coefficients of the typical Ambisonic Channel Number (ACN) channel ordering in Ambisonic signals. Since Ambisonics represents an audio scene in terms of spatial beam patterns, the following examples that refer to Ambisonic FOA channels (or signals) readily generalize into any spatial audio format that represents spatial audio using a corresponding set of spatial beam patterns. Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis. [0097] - The ratio modifier 414 may be arranged to derive a direct-gain parameter f(k, n) for the frequency sub-band k and time index n on basis of the scaling factor α(n) and the angular difference β(k, n) obtained for the frequency sub-band k and time index n. Figure 10 – Angle Diff. 612, Gain Determ. 614).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido in view of Zotkin in view of Yamanashi to incorporate the teachings of Vilkamo in order to implement the device of claim 12, wherein the device determines, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
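For illustration only, the gain determination recited in the limitation above can be read as a form of differential coding. The following is a minimal sketch under stated assumptions: the coded gains are treated as integer quantization indices and the difference as a plain subtraction. Neither assumption comes from the claim language or from Ghido, Zotkin, Yamanashi, or Vilkamo, and the function names are hypothetical.

```python
# Hypothetical sketch: differential coding of two gains.
# Assumption: coded gains are integer quantization indices and the
# "difference" is a simple subtraction; the claim does not specify either.

def encode_gain_difference(first_coded_gain: int, second_coded_gain: int) -> int:
    """Return the difference between the first and second coded gains."""
    return first_coded_gain - second_coded_gain


def decode_first_gain(difference: int, second_coded_gain: int) -> int:
    """Recover the first gain from the difference and the second coded gain."""
    return second_coded_gain + difference


# Round trip: the decoder recovers the first gain without receiving it directly.
difference = encode_gain_difference(first_coded_gain=7, second_coded_gain=4)
recovered = decode_first_gain(difference, second_coded_gain=4)
```

Under these assumptions, only the second coded gain and the difference need to be available for the first gain to be determined.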
Regarding claim 28, Ghido teaches all of the limitations as in claim 23, above.
Ghido does teach the method of claim 23,
obtaining, from the bitstream, a difference between the coded gain and a coded gain (Figure 1 – gain information, Figure 10C – bitstream [0014] - The parameters of the gain modification are transmitted in the bitstream. [0025] - Depending on the shape of the gain modification function the frequency response of the analysis filters is altered according to the composite window function);
wherein the one or more processors are configured to perform the gain and shape synthesis with respect to the gain and the coded shape (Figure 4 – analysis windower 115, DFT processor 116 [0129] - The DFT processor 116 has an output connected to an input of a low pass shaper 117. The low pass shaper 117 actually performs the low pass filtering action, and the output of the low pass shaper 117 is connected to a DFT inverse processor 118 for generating a sequence of blocks of low pass time domain sampling values. Finally, a synthesis windower 119 is provided at an output of the DFT inverse processor for windowing the sequence of blocks of low pass time domain sampling values using a synthesis window. The output of the synthesis windower 119 is a time domain low pass signal. Thus, blocks 115 to 119 correspond to the “low pass filter” block 111 of FIG. 2, and blocks 121 and 113 correspond to the “subtractor” 113 of FIG. 2. Thus, in the embodiment illustrated in FIG. 4, the band extractor further comprises the audio signal windower 121 for windowing the audio signal 102 using the analysis window and the synthesis window to obtain a sequence of windowed blocks of audio signal values. [0117] - Furthermore, the low pass shaper consisting of 117b and 117a of FIG. 5A in the embodiment is configured to apply the shaping function rs [f] that depends on the time-variable high frequency gain information for the corresponding block. An implementation of the shaping function rs [f] has been discussed before, but alternative functions can be used as well).
However, Ghido does not disclose the method of claim 23,
wherein the ambisonic audio data includes first ambisonic coefficients corresponding to a first spherical basis function and second ambisonic coefficients corresponding to a second spherical basis function,
wherein the coded gain represents the first ambisonic coefficients, wherein the method further comprises
obtaining a difference between the coded gain and a coded gain representative of the second ambisonic coefficients; and
determining, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients; and
wherein the device obtains the first ambisonic coefficients.
Zotkin does teach the method of claim 23,
wherein the ambisonic audio data includes first ambisonic coefficients corresponding to a first spherical basis function and second ambisonic coefficients corresponding to a second spherical basis function ([0003] - Field recovery in terms of the spherical basis allows the generation of a higher-order ambisonics representation of the spatial audio scene. [0048] - The number p-1 is called order of ambisonics recording (even though it refers to the maximum degree of the spherical harmonics used). Older works used p = 2 (first order); since then, higher-order ambisonics (HOA) techniques has been developed for as high as 8);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido to incorporate the teachings of Zotkin in order to implement the method of claim 23, wherein the ambisonic audio data includes first ambisonic coefficients corresponding to a first spherical basis function and second ambisonic coefficients corresponding to a second spherical basis function and wherein the device obtains the first ambisonic coefficients. Doing so allows the device to determine time harmonic acoustic fields provided by microphones (Zotkin [0080]).
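For reference, the relationship between ambisonic order and the number of coefficients mentioned in the cited passages can be sketched as follows. This is an illustrative aid, not part of any cited reference: it uses the standard count of (N + 1)^2 coefficients for order N and the standard Ambisonic Channel Number (ACN) index l^2 + l + m for a spherical basis function of degree l and index m. The function names are hypothetical.

```python
# Illustrative sketch: ambisonic order, coefficient count, and ACN indexing.
# Function names are hypothetical and not taken from any cited reference.

def num_coefficients(order: int) -> int:
    """Number of ambisonic coefficients (channels) for a given order N."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def acn_index(degree: int, m: int) -> int:
    """ACN channel index for the spherical basis function of degree l, index m."""
    if degree < 0 or abs(m) > degree:
        raise ValueError("require degree >= 0 and -degree <= m <= degree")
    return degree * degree + degree + m


# First, second, and third order yield 4, 9, and 16 channels respectively.
counts = [num_coefficients(n) for n in (1, 2, 3)]

# The three first-order basis functions occupy ACN channels 1, 2, and 3.
first_order_channels = [acn_index(1, m) for m in (-1, 0, 1)]
```

In Zotkin's notation, where p - 1 is the order, the same count is p^2.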
However, Ghido in view of Zotkin does not disclose the method of claim 23,
wherein the coded gain represents the first ambisonic coefficients,
obtaining a difference between the coded gain and a coded gain representative of the second ambisonic coefficients; and
determining, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients.
Yamanashi does teach the method of claim 23,
wherein the coded gain represents the first ambisonic coefficients ([0008] - generates first coded information including the first band information and first gain coded information obtained by encoding the first gain. [0056] - Using the spectrum (MDCT coefficient) corresponding to the band indicated by the first layer band information in the input spectrum inputted from band selecting section 301, shape coding section 302 encodes the shape information to generate first layer shape coded information. Next, shape coding section 302 outputs the generated first layer shape coded information to multiplexing section 305. Furthermore, shape coding section 302 outputs an ideal gain (gain information) calculated during shape encoding to gain coding section 304),
obtains a difference between the coded gain and a coded gain representative of the second ambisonic coefficients ([0008] - obtains a second gain of the difference signal of the second quantization target band and to generate second coded information including the second band information and second gain coded information obtained by encoding the second gain. [0056] - Using the spectrum (MDCT coefficient) corresponding to the band indicated by the first layer band information in the input spectrum inputted from band selecting section 301, shape coding section 302 encodes the shape information to generate first layer shape coded information. Next, shape coding section 302 outputs the generated first layer shape coded information to multiplexing section 305. Furthermore, shape coding section 302 outputs an ideal gain (gain information) calculated during shape encoding to gain coding section 304).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido in view of Zotkin to incorporate the teachings of Yamanashi in order to implement the method of claim 23, wherein the coded gain represents the first ambisonic coefficients, and obtains a difference between the coded gain and a coded gain representative of the second ambisonic coefficients. Doing so allows for improved quality of the decoded signal in the hierarchical coding scheme in which the band of the coding target is selected in each hierarchy (Yamanashi [0007]).
However, Ghido in view of Zotkin in view of Yamanashi does not disclose the method of claim 23,
wherein the device determines, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients.
Vilkamo does teach the method of claim 23,
wherein the device determines, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients ([0065] - The coordinate order y, z, x is applied herein because it is the same order as the 1st order coefficients of the typical Ambisonic Channel Number (ACN) channel ordering in Ambisonic signals. Since Ambisonics represents an audio scene in terms of spatial beam patterns, the following examples that refer to Ambisonic FOA channels (or signals) readily generalize into any spatial audio format that represents spatial audio using a corresponding set of spatial beam patterns. Moreover, the following examples that refer to Ambisonic FOA channels (or signals) further generalize into a higher order Ambisonic (HOA) signals, such as 2nd order Ambisonics with 9 channels or 3rd order Ambisonics with 16 channels, mutatis mutandis. [0097] - The ratio modifier 414 may be arranged to derive a direct-gain parameter f(k, n) for the frequency sub-band k and time index n on basis of the scaling factor α(n) and the angular difference β(k, n) obtained for the frequency sub-band k and time index n. Figure 10 – Angle Diff. 612, Gain Determ. 614).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ghido in view of Zotkin in view of Yamanashi to incorporate the teachings of Vilkamo in order to implement the method of claim 23, wherein the device determines, based on the difference and the coded gain representative of the second ambisonic coefficients, a gain of the first ambisonic coefficients. Doing so allows for a spatial audio format that provides a relatively straightforward and well-defined representation of a spatial-audio signal (Vilkamo [0010]).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Batke (U.S. Publication No. 20200273470) teaches the method and device for decoding an audio soundfield representation. Bleidt (U.S. Publication No. 20190373294) teaches decoder for decoding a media signal and encoder for encoding secondary media data comprising metadata or control data for primary media data. Bosnjak (U.S. Publication No. 20190379994) teaches applications and format for immersive spatial sound. Hoogeboom (U.S. Publication No. 20200184595) teaches the method and device for digital image, audio or video data processing. Laaksonen (U.S. Publication No. 20210400413) teaches ambience audio representation and associated rendering. Pihlajakuja (U.S. Publication No. 20210319799) teaches spatial parameter signaling. Rämö (U.S. Publication No. 20210392434) teaches processing audio signals. Shahbazi Mirzahasanloo (U.S. Publication No. 20190371349) teaches audio coding based on audio pattern recognition. Shahbazi Mirzahasanloo (U.S. Publication No. 20190341064) teaches cooperative pyramid vector quantizers for scalable audio coding. Shahbazi Mirzahasanloo (U.S. Publication No. 20190371348) teaches perceptual audio coding as sequential decision-making problems. Svedberg (U.S. Publication No. 20190362730) teaches methods, encoder, and decoder for handling envelope representation coefficients. Vasilache (U.S. Publication No. 20210407525) teaches determination of spatial audio parameter encoding and associated decoding.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETHAN DANIEL KIM whose telephone number is (571) 272-1405.  The examiner can normally be reached on Monday - Friday 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ETHAN DANIEL KIM/
Examiner, Art Unit 2658
/RICHEMOND DORVIL/
Supervisory Patent Examiner, Art Unit 2658