DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to the Applicant’s amendment filed on September 1, 2022, in which the Applicant amended claims 1 and 20-24, cancelled claim 2, and added new independent claims 25-26.
By virtue of this communication, claims 1 and 3-26 are currently pending in this Office Action.
With respect to the objection to claims 1-24 due to formality issues, as set forth in the previous Office Action, the Applicant’s amendment, including the cancellation of claim 2, and argument (see paragraph 2 of page 14 of the Remarks filed on September 1, 2022) have been fully considered, and the argument is persuasive. Therefore, the objection to claims 1-24 due to the formality issues, as set forth in the previous Office Action, has been withdrawn.

Claim Objections
Claim 2 is objected to because of the following informality:
Claim 2 recites “… from the encoded audio data..” which should be --… from the encoded audio data.[[.]]--. 
Appropriate correction is required.

Examiner Comments

The IDS submitted on December 13, 2021 was missing “abstract”, “equivalent to US …”, etc., under the title FOREIGN DOCUMENT, where applicable, and was missing the marker “(year)” for some of the NPL documents listed under the title NON-PATENT LITERATURE DOCUMENTS. Appropriate correction is expected.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1-16 are rejected under 35 U.S.C. 112(a), first paragraph, as based on a disclosure which is not enabling. The disclosed mode controller and its functions, which are critical or essential to the practice of the invention but not included in the claims, are not enabled by the disclosure. For example, the mode controller allows the “audio decoder” to operate “when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects” or when it “comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects”, and to perform the claimed “either bypass the object processor … when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects”. See In re Mayhew, 527 F.2d 1229, 188 USPQ 356 (CCPA 1976).
Claim 1 recites “when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects” and “when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects”, “wherein the audio decoder is configured to either bypass the object processor … or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor (not bypass)…”, but the application specification discloses “Advantageously, the indication whether mode 1 or mode 2 is to be applied is included in the encoded audio data and then the mode controller 1600 analyses the encoded data to detect a mode indication. Mode 1 is used when the mode indication indicates that the encoded audio data comprises encoded channels and encoded objects and mode 2 is applied when the mode indication indicates that the encoded audio data does not contain any audio objects,” etc., which is essential to enable the claimed features “when …” and “bypass …” above to be practiced. Claims 2-14 are rejected due to their dependency on claim 1.
Claim 15 is rejected for at least reasons similar to those described for claim 1 above, because claim 15 recites features having deficiencies similar to those recited in claim 1.
Claim 16 is rejected for at least reasons similar to those described for claim 1 above, because claim 16 recites features having deficiencies similar to those recited in claim 1.
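
For illustration only, the mode-dependent behavior quoted from the specification above can be sketched as follows. This is a minimal sketch under stated assumptions, not the Applicant's implementation: the names EncodedAudio, detect_mode, object_processor, post_processor and decode are hypothetical, the mode indication is assumed to be a simple integer field (1 or 2), and the processing stages are placeholders that only tag their inputs.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Mode(Enum):
    CHANNELS_AND_OBJECTS = 1  # "mode 1" in the quoted passage
    CHANNELS_ONLY = 2         # "mode 2" in the quoted passage


@dataclass
class EncodedAudio:
    # Hypothetical container; the actual bitstream syntax is not given here.
    mode_indication: int
    channels: List[str]
    objects: List[str] = field(default_factory=list)


def detect_mode(data: EncodedAudio) -> Mode:
    """Mode controller: read the mode indication carried in the encoded data."""
    return Mode(data.mode_indication)


def object_processor(channels: List[str], objects: List[str]) -> List[str]:
    # Placeholder: stands in for rendering objects and mixing with channels.
    return channels + [f"rendered({o})" for o in objects]


def post_processor(output_channels: List[str]) -> List[str]:
    # Placeholder: stands in for conversion into the output format.
    return [f"out({c})" for c in output_channels]


def decode(data: EncodedAudio) -> List[str]:
    mode = detect_mode(data)
    if mode is Mode.CHANNELS_ONLY:
        # Mode 2: bypass the object processor entirely.
        return post_processor(data.channels)
    # Mode 1: feed decoded channels and objects into the object processor.
    return post_processor(object_processor(data.channels, data.objects))
```

In this sketch the branch in decode depends entirely on the result of detect_mode, which mirrors the point made above that the “when …” and “bypass …” conditions rely on a mode indication being detected.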

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 4-5 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which applicant regards as the invention.
Claim 4 recites “the audio objects to acquire rendered audio objects”, wherein “the audio objects” has insufficient antecedent basis and causes confusion because it is unclear what “the audio objects” refers to at the audio decoder, and thus renders the claim indefinite.
Claim 5 recites “the plurality of audio objects”, “to decode the audio channels” and “to mix the audio channels”, wherein “the plurality of audio objects” and “the audio channels” have insufficient antecedent basis in claim 5 and cause confusion because it is unclear what “the plurality of audio objects” refers to, how to perform “render the plurality of audio objects using …”, what “the audio channels” refers to, and how “the audio channels” are decoded, and thus render the claim indefinite.

Double Patenting
A rejection based on double patenting of the "same invention" type finds its support in the language of 35 U.S.C. 101 which states that "whoever invents or discovers any new and useful process ... may obtain a patent therefor ..."  (Emphasis added).  Thus, the term "same invention," in this context, means an invention drawn to identical subject matter.  See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957); and In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970).
A statutory type (35 U.S.C. 101) double patenting rejection can be overcome by canceling or amending the conflicting claims so they are no longer coextensive in scope.  The filing of a terminal disclaimer cannot overcome a double patenting rejection based upon 35 U.S.C. 101.

Claims 1-8 and 10-16 are rejected under 35 U.S.C. 101 as claiming the same invention as that of claims 8-16, 18-21, and 24-25 of prior U.S. Patent No. 10,249,311 B2. This is a double patenting rejection. The following is a comparison between claims 1-8 and 10-16 of the current application and the conflicting claims 8-16, 18-21, and 24-25 of prior U.S. Patent No. 10,249,311 B2, for reference:
Claims in the current application (listed first), followed by the conflicting claims in U.S. Patent No. 10,249,311 B2:
1. An audio decoder for decoding encoded audio data, comprising: an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder configured for decoding the plurality of encoded audio channels received by the input interface and the plurality of encoded audio objects received by the input interface to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor configured for decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; an object processor configured for processing the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and a post-processor configured for converting the 
number of output audio channels into an output format, wherein the audio decoder is configured to either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post-processor, when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

2. The audio decoder of claim 1, wherein the post-processor is configured to convert the number of output audio channels to a binaural representation as the output format or to a reproduction format as the output format, the reproduction format comprising a smaller number of audio channels than the number of output audio channels, and wherein the audio decoder is configured to control the post-processor in accordance with a control input derived from a user interface or extracted from the encoded audio data.

3. The audio decoder of claim 1, in which the object processor comprises: an object renderer for rendering the decoded audio objects to acquire rendered audio objects using the decompressed metadata; and a mixer for mixing the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

4. The audio decoder of claim 1, wherein the object processor comprises: a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects, wherein the spatial audio object coding decoder is configured to render the decoded audio objects in accordance with rendering information related to a placement of the audio objects to acquire rendered audio objects and to control the object processor to mix the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

5. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects and encoded audio channels, wherein the spatial audio object coding decoder is configured to decode the encoded audio objects and the encoded audio channels using the one or more transport channels and the parametric side information and wherein the object processor is configured to render the plurality of audio objects using the decompressed metadata to acquire rendered audio objects and to decode the audio channels and to mix the audio channels with the rendered audio objects to acquire the number of output audio channels.

6. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects or encoded audio channels, wherein the spatial audio object coding decoder is configured to transcode the associated parametric information and the decompressed metadata into transcoded parametric side information usable for directly rendering the output format, and wherein the post-processor is configured for calculating audio channels of the output format using the decoded transport channels and the transcoded parametric side information, or wherein the spatial audio object coding decoder is configured to directly upmix and render channel signals for the output format using the decoded transport channels and the parametric side information.

7. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels output by the core decoder and associated parametric data and decompressed metadata to acquire a plurality of rendered audio objects, wherein the object processor comprises an object renderer being configured to render the decoded audio objects output by the core decoder to acquire rendered decoded audio objects; wherein the object processor is furthermore configured to mix the rendered decoded audio objects and the plurality of rendered audio objects with the decoded audio channels, wherein the audio decoder further comprises an output interface for outputting an output of the mixer to loudspeakers, wherein the post-processor furthermore comprises: a binaural renderer for rendering the output audio channels into two binaural channels using head related transfer functions or binaural impulse responses, the two binaural channels representing the binaural representation, and a format converter for converting the output audio channels into the output format comprising a lower number of audio channels than the output audio channels of the mixer using information on a reproduction layout.

8. The audio decoder of claim 1, wherein the plurality of encoded audio channels or the plurality of encoded audio objects are encoded as channel pair elements, single channel elements, low frequency elements or quad channel elements, wherein a quad channel element comprises four original audio channels or audio objects, and wherein the core decoder is configured to decode the channel pair elements, the single channel elements, the low frequency elements or the quad channel elements in accordance with side information comprised by the encoded audio data indicating a channel pair element, a single channel element, a low frequency element or a quad channel element.


10. The audio decoder of claim 1, wherein elements comprising the binaural renderer, the format converter, the mixer, the SAOC decoder and the core decoder and the object renderer operate in a quadrature mirror filterbank (QMF) domain and wherein quadrature mirror filter domain data is transmitted from one of the elements to another of the elements without any synthesis filterbank and subsequent analysis filterbank processing.

11. The audio decoder of claim 1, wherein the post-processor is configured to downmix the number of output audio channels output by the object processor to a format comprising three or more audio channels and comprising less audio channels than the number of output audio channels output by the object processor to acquire channels of an intermediate downmix, and to binaurally render the channels of the intermediate downmix into the binaural representation comprising a two-channel binaural output signal.

12. The audio decoder of claim 1, in which the post-processor comprises: a controlled downmixer for applying a downmix matrix; and a controller for determining a specific downmix matrix using information on a channel configuration of an output of the object processor and information on an intended reproduction layout.

13. The audio decoder of claim 1, in which the core decoder or the object processor are controllable, and in which the post-processor is configured to control the core decoder or the object processor in accordance with information on the output format so that a rendering incurring decorrelation processing of audio objects or audio channels not occurring as separate audio channels in the output format is reduced or eliminated, or so that for audio objects or audio channels not occurring as the separate audio channels in the output format, upmixing or decoding operations are performed as if the audio objects or the audio channels would occur as the separate audio channels in the output format, except that any decorrelation processing for the audio objects or the audio channels not occurring as the separate audio channels in the output format is deactivated.

14. The audio decoder of claim 1, in which the core decoder is configured to perform transform decoding and a spectral band replication decoding for the single channel elements, and to perform the transform decoding, parametric stereo decoding and the spectral band reproduction decoding for the channel pair elements and the quad channel elements.

15. A method of decoding encoded audio data, comprising: receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; core decoding the encoded audio data to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or the plurality of encoded audio channels to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, processing the plurality of decoded audio objects using the decompressed metadata, and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and converting the number of output audio channels into an output format, wherein, in the method of decoding the encoded audio data, either the processing the plurality of decoded audio objects is bypassed and the plurality of decoded audio channels acquired by the core decoding is fed, as the output audio channels, into the converting, when the encoded 
audio data comprises the plurality of encoded audio channels without any audio objects, or the plurality of decoded audio objects and the plurality of decoded audio channels acquired by the core decoding are fed into processing the plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

16. A non-transitory digital storage medium having stored thereon a computer program for performing a method of decoding encoded audio data, comprising: receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; core decoding the encoded audio data to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or the plurality of encoded audio channels to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, processing the plurality of decoded audio objects using the decompressed metadata, and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and converting the number of output audio channels into an output format, wherein, in the method of decoding the encoded audio data, either the processing the plurality of decoded audio objects is bypassed and the plurality of decoded audio channels acquired 
by the core decoding is fed, as the output audio channels, into the converting, when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or the plurality of decoded audio objects and the plurality of decoded audio channels acquired by the core decoding are fed into processing the plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, when said computer program is run by a computer. 

Conflicting claims in U.S. Patent No. 10,249,311 B2:
8. An audio decoder for decoding encoded audio data, comprising: an input interface that receives the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder that decodes either the plurality of encoded audio channels received by the input interface and the plurality of encoded audio objects received by the input interface to obtain a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or that decodes the plurality of encoded audio channels received by the input interface to obtain a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor that decompresses the compressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, an object processor that processes the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and a post processor that converts the number of output audio channels into an output format, wherein the audio 
decoder is configured to either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post processor, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.


9. The audio decoder of claim 8, wherein the post processor is configured to convert the number of output audio channels to a binaural representation or to a reproduction format comprising a smaller number of audio channels than the number of output audio channels, wherein the audio decoder is configured to control the post processor in accordance with control input derived from an user interface or extracted from the encoded audio data received by the input interface.


10. The audio decoder of claim 8, in which the object processor comprises: an object renderer for rendering decoded audio objects using decompressed metadata; and a mixer for mixing rendered audio objects and decoded audio channels to acquire the number of output audio channels.


11. The audio decoder of claim 8, wherein the object processor comprises: a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects, wherein the spatial audio object coding decoder is configured to render the decoded audio objects in accordance with rendering information related to a placement of the audio objects, wherein the object processor is configured to mix the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

12. The audio decoder of claim 8, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects and encoded audio channels, wherein the spatial audio object coding decoder is configured to decode the encoded audio objects and the encoded audio channels using the one or more transport channels and the parametric side information and wherein the object processor is configured to render the plurality of audio objects using the decompressed metadata and to decode the audio channels and mix them with the rendered audio objects to acquire the number of output audio channels.


13. The audio decoder of claim 8, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects or encoded audio channels, wherein the spatial audio object coding decoder is configured to transcode the associated parametric information and the decompressed metadata into transcoded parametric side information usable for directly rendering the output format, and wherein the post processor calculates audio channels of the output format using the decoded transport channels and the transcoded parametric side information, or wherein the spatial audio object coding decoder is configured to directly upmix and render channel signals for the output format using the decoded transport channels and the parametric side information.

14. The audio decoder in accordance with claim 8, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels output by the core decoder and associated parametric data and decompressed metadata to acquire a plurality of rendered audio objects, wherein the object processor is furthermore configured to render decoded audio objects output by the core decoder; wherein the object processor is furthermore configured to mix rendered decoded audio objects with decoded audio channels, wherein the audio decoder further comprises an output interface for outputting an output of a mixer to loudspeakers, wherein the post processor furthermore comprises: a binaural renderer for rending the output audio channels into two binaural channels using head related transfer functions or binaural impulse responses, and a format converter for converting the output audio channels into an output format comprising a lower number of audio channels than the output audio channels of the mixer using information on a reproduction layout.




16. The audio decoder of claim 8, wherein the plurality of encoded audio channel elements or the plurality of encoded audio objects are encoded as channel pair elements, single channel elements, low frequency elements or quad channel elements, wherein a quad channel element comprises four original audio channels or audio objects, and wherein the core decoder is configured to decode the channel pair elements, single channel elements, low frequency elements or quad channel elements in accordance with side information comprised in the encoded audio data indicating a channel pair element, a single channel element, a low frequency element or a quad channel element.

15. The audio decoder of claim 14, wherein certain elements comprising the binaural renderer, the format converter, a mixer, an SAOC decoder, the core decoder, and an object renderer operate in a quadrature mirror filterbank domain and wherein quadrature mirror filter domain data is transmitted from one of the certain elements to another of the certain elements without any synthesis filterbank and subsequent analysis filterbank processing.

18. The audio decoder of claim 8, wherein the post processor is configured to downmix audio channels output by the object processor to a format comprising three or more audio channels and comprising less audio channels than the number of output audio channels of the object processor to acquire an intermediate downmix, and to binaurally render the audio channels of the intermediate downmix into a two-channel binaural output signal.



19. The audio decoder of claim 8, in which the post processor comprises: a controlled downmixer for applying a downmix matrix; and a controller for determining a specific downmix matrix using information on a channel configuration of an output of the object processor and information on an intended reproduction layout.

20. The audio decoder of claim 8, in which the core decoder or the object processor are controllable, and in which the post processor is configured to control the core decoder or the object processor in accordance with information on the output format so that a rendering incurring decorrelation processing of audio objects or audio channels not occurring as separate audio channels in the output format is reduced or eliminated, or so that for audio objects or audio channels not occurring as the separate audio channels in the output format, upmixing or decoding operations are performed as if the audio objects or audio channels would occur as the separate audio channels in the output format, except that any decorrelation processing for the audio objects or the audio channels not occurring as the separate audio channels in the output format is deactivated.

21. The audio decoder of claim 8, in which the core decoder is configured to perform transform decoding and a spectral band replication decoding for a single channel element, and to perform transform decoding, parametric stereo decoding and spectral band reproduction decoding for channel pair elements and quad channel elements. 


24. A method of decoding encoded audio data, comprising: receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; core decoding either the encoded audio data to obtain a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or the plurality of encoded audio channels to obtain a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; decompressing the compressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, processing the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and converting the number of output audio channels into an output format, wherein, in the method of decoding the encoded audio data, either the processing the plurality of decoded audio objects is bypassed and the plurality of decoded audio channels obtained by the core decoding is fed, as the output audio channels, into the converting, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects, or the plurality of decoded audio objects and the plurality of decoded audio channels obtained by the core decoding are fed into the processing the plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

25. A non-transitory digital storage medium having computer-readable code stored thereon to perform, when running on a computer or a processor, the method of claim 24.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. 
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).

Claims 1-4, 6-12, 14-16 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-14 of U.S. Patent No. 11,227,616 B2. Although the conflicting claims are not identical, they are not patentably distinct from each other because the claims of the instant application are a broader version of the claims of U.S. Patent No. 11,227,616 B2. The following is a comparison between claims 1-4, 6-12, 14-16 of the instant application and conflicting claims 1-14 of U.S. Patent No. 11,227,616 B2:
Claim(s) in the current application
Conflicting claims in U.S. Patent No. 11,227,616 B2
1. An audio decoder for decoding encoded audio data, comprising: an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder configured for decoding the plurality of encoded audio channels received by the input interface and the plurality of encoded audio objects received by the input interface to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor configured for decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; an object processor configured for processing the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and a post-processor configured for converting the number of output audio channels into an output format, wherein the audio decoder is configured to either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post-processor, when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

2. The audio decoder of claim 1, wherein the post-processor is configured to convert the number of output audio channels to a binaural representation as the output format or to a reproduction format as the output format, the reproduction format comprising a smaller number of audio channels than the number of output audio channels, and wherein the audio decoder is configured to control the post-processor in accordance with a control input derived from a user interface or extracted from the encoded audio data.


3. The audio decoder of claim 1, in which the object processor comprises: an object renderer for rendering the decoded audio objects to acquire rendered audio objects using the decompressed metadata; and a mixer for mixing the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.


4. The audio decoder of claim 1, wherein the object processor comprises: a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects, wherein the spatial audio object coding decoder is configured to render the decoded audio objects in accordance with rendering information related to a placement of the audio objects to acquire rendered audio objects and to control the object processor to mix the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

6. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects or encoded audio channels, wherein the spatial audio object coding decoder is configured to transcode the associated parametric information and the decompressed metadata into transcoded parametric side information usable for directly rendering the output format, and wherein the post-processor is configured for calculating audio channels of the output format using the decoded transport channels and the transcoded parametric side information, or wherein the spatial audio object coding decoder is configured to directly upmix and render channel signals for the output format using the decoded transport channels and the parametric side information.


7. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels output by the core decoder and associated parametric data and decompressed metadata to acquire a plurality of rendered audio objects, wherein the object processor comprises an object renderer being configured to render the decoded audio objects output by the core decoder to acquire rendered decoded audio objects; wherein the object processor is furthermore configured to mix the rendered decoded audio objects and the plurality of rendered audio objects with the decoded audio channels, wherein the audio decoder further comprises an output interface for outputting an output of the mixer to loudspeakers, wherein the post-processor furthermore comprises: a binaural renderer for rendering the output audio channels into two binaural channels using head related transfer functions or binaural impulse responses, the two binaural channels representing the binaural representation, and a format converter for converting the output audio channels into the output format comprising a lower number of audio channels than the output audio channels of the mixer using information on a reproduction layout.

8. The audio decoder of claim 1, wherein the plurality of encoded audio channels or the plurality of encoded audio objects are encoded as channel pair elements, single channel elements, low frequency elements or quad channel elements, wherein a quad channel element comprises four original audio channels or audio objects, and wherein the core decoder is configured to decode the channel pair elements, the single channel elements, the low frequency elements or the quad channel elements in accordance with side information comprised by the encoded audio data indicating a channel pair element, a single channel element, a low frequency element or a quad channel element.

9. The audio decoder of claim 1, wherein the core decoder is configured to apply full-band decoding operation using a noise filling operation.


10. The audio decoder of claim 1, wherein elements comprising the binaural renderer, the format converter, the mixer, the SAOC decoder and the core decoder and the object renderer operate in a quadrature mirror filterbank (QMF) domain and wherein quadrature mirror filter domain data is transmitted from one of the elements to another of the elements without any synthesis filterbank and subsequent analysis filterbank processing.


11. The audio decoder of claim 1, wherein the post-processor is configured to downmix the number of output audio channels output by the object processor to a format comprising three or more audio channels and comprising less audio channels than the number of output audio channels output by the object processor to acquire channels of an intermediate downmix, and to binaurally render the channels of the intermediate downmix into the binaural representation comprising a two-channel binaural output signal.

12. The audio decoder of claim 1, in which the post-processor comprises: a controlled downmixer for applying a downmix matrix; and a controller for determining a specific downmix matrix using information on a channel configuration of an output of the object processor and information on an intended reproduction layout.

14. The audio decoder of claim 1, in which the core decoder is configured to perform transform decoding and a spectral band replication decoding for the single channel elements, and to perform the transform decoding, parametric stereo decoding and the spectral band reproduction decoding for the channel pair elements and the quad channel elements.

15. A method of decoding encoded audio data, comprising: receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; core decoding the encoded audio data to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or the plurality of encoded audio channels to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, processing the plurality of decoded audio objects using the decompressed metadata, and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and converting the number of output audio channels into an output format, wherein, in the method of decoding the encoded audio data, either the processing the plurality of decoded audio objects is bypassed and the plurality of decoded audio channels acquired by the core decoding is fed, as the output audio channels, into the converting, when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or the plurality of decoded audio objects and the plurality of decoded audio channels acquired by the core decoding are fed into processing the plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

16. A non-transitory digital storage medium having stored thereon a computer program for performing a method of decoding encoded audio data, comprising: receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; core decoding the encoded audio data to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or the plurality of encoded audio channels to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, processing the plurality of decoded audio objects using the decompressed metadata, and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and converting the number of output audio channels into an output format, wherein, in the method of decoding the encoded audio data, either the processing the plurality of decoded audio objects is bypassed and the plurality of decoded audio channels acquired by the core decoding is fed, as the output audio channels, into the converting, when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or the plurality of decoded audio objects and the plurality of decoded audio channels acquired by the core decoding are fed into processing the plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, when said computer program is run by a computer.
Conflicting claims in U.S. Patent No. 11,227,616 B2:
1. An audio decoder for decoding encoded audio data, comprising: an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects; a mode controller configured for analyzing the encoded audio data to determine whether the encoded audio data comprise either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder configured for either decoding the plurality of encoded audio channels received by the input interface to obtain decoded audio channels and decoding the plurality of encoded audio objects received by the input interface to obtain decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to obtain decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor configured for decompressing the compressed metadata to obtain decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; an object processor configured for processing the decoded audio objects using the decompressed metadata and the decoded audio channels to acquire a number of output audio channels comprising audio data from the decoded audio objects and the decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; a post processor configured for post processing the number of output audio channels to obtain an output format, wherein the mode controller is configured for controlling the audio decoder to either bypass the object processor and to feed the decoded audio channels as the output audio channels into the post processor, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects, or to feed the decoded audio objects and the decoded audio channels into the object processor, when the encoded audio data comprise the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

2. The audio decoder of claim 1, wherein the post processor is configured for converting the number of output audio channels to a binaural representation as the output format or to a reproduction format as the output format, the reproduction format comprising a smaller number of reproduction audio channels than the number of output audio channels, and wherein the audio decoder is configured for controlling the post processor in accordance with a control input derived from an user interface or extracted from the encoded audio data received by the input interface.

3. The audio decoder of claim 1, in which the object processor comprises: an object renderer configured for rendering the decoded audio objects using the decompressed metadata to obtain rendered audio objects; and a mixer configured for mixing the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

4. The audio decoder of claim 1, wherein the plurality of encoded objects comprises one or more core encoded transport channels and associated parametric side information, wherein the core decoder is configured to decode the one or more core encoded transport channels to obtain the decoded audio objects comprising one or more core decoded transport channels and the associated parametric side information, wherein the object processor comprises a spatial audio object coding decoder configured for decoding the one or more core decoded transport channels and the associated parametric side information to obtain spatial audio object decoded audio objects, wherein the spatial audio object coding decoder is configured for rendering the spatial audio object decoded audio objects in accordance with rendering information related to a placement of the spatial audio object decoded audio objects to obtain rendered audio objects, and wherein the object processor is configured for mixing the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

5. The audio decoder of claim 1, wherein the plurality of encoded audio objects comprises one or more core encoded transport channels and associated parametric side information representing the plurality of encoded audio objects, wherein the core decoder is configured to decode the one or more core encoded transport channels to obtain the decoded audio objects comprising one or more core decoded transport channels and the associated parametric side information, wherein the spatial audio object coding decoder is configured for transcoding the associated parametric side information and the decompressed metadata into transcoded parametric side information usable for directly rendering the output format, and wherein the post processor is configured for calculating output format audio channels of the output format using the one or more core decoded transport channels and the transcoded parametric side information.

6. The audio decoder of claim 1, wherein the plurality of encoded audio objects comprises one or more core encoded transport channels and associated parametric data, wherein the core decoder is configured to decode the one or more core encoded transport channels to obtain one or more core decoded transport channels, wherein the object processor comprises a spatial audio object coding decoder configured for decoding the core decoded one or more transport channels outputted by the core decoder and the associated parametric data and the decompressed metadata to acquire a plurality of spatial audio object rendered audio objects, wherein the object processor comprises an object renderer configured for rendering the decoded audio objects outputted by the core decoder to obtain rendered decoded audio objects; wherein the object processor comprises a mixer for mixing the rendered decoded audio objects, the spatial audio object rendered audio objects, and the decoded audio channels to obtain mixer output audio channels, wherein the audio decoder further comprises an output interface configured for outputting the mixer output audio channels to loudspeakers, wherein the post processor furthermore comprises: a binaural renderer configured for rendering the mixer output audio channels into two binaural channels as the output format using head related transfer functions or binaural impulse responses, or a format converter configured for converting the mixer output audio channels into an output channel representation, as the output format, the output channel representation comprising a lower number of audio channels than the mixer output audio channels using information on a reproduction layout.

8. The audio decoder of claim 1, wherein the plurality of encoded audio channels are encoded as audio channel pair elements, audio single channel elements, audio low frequency elements or audio quad channel elements, wherein an audio quad channel element comprises four encoded audio channels of the plurality of encode audio channels, or wherein the plurality of encoded audio objects are encoded as audio channel pair elements, audio single channel elements, audio low frequency elements or audio quad channel elements, wherein an audio quad channel element comprises four encoded audio objects of the plurality of encoded objects, and wherein the core decoder is configured for decoding the audio channel pair elements, the audio single channel elements, the audio low frequency elements or the audio quad channel elements in accordance with side information comprised in the encoded audio data indicating the audio channel pair element, the audio single channel element, the audio low frequency element or the audio quad channel element.

9. The audio decoder of claim 1, wherein the core decoder is configured for applying a full-band decoding operation using a noise filling operation without a spectral band replication operation.

7. The audio decoder of claim 6, wherein certain elements comprising the binaural renderer, the format converter, the mixer, the spatial audio object coding decoder, the core decoder, and the object renderer operate in a quadrature mirror filterbank domain, and wherein data in the quadrature mirror filterbank domain are transmitted from one of the certain elements to another one of the certain elements without any synthesis filterbank and subsequent analysis filterbank processing.

10. The audio decoder of claim 1, wherein the post processor is configured for downmixing the number of output audio channels to an intermediate format, the intermediate format comprising intermediate audio channels, a number of the intermediate audio channels being three or more and lower than the number of output audio channels, and for binaurally rendering the intermediate audio channels into a two-channel binaural output signal as the output format.


11. The audio decoder of claim 1, in which the post processor comprises: a controlled downmixer configured for applying a specific downmix matrix to the number of output audio channels; and a controller configured for determining the specific downmix matrix using information on a channel configuration of the number of output audio channels and information on an intended reproduction layout.

12. The audio decoder of claim 1, in which the core decoder is configured for performing a transform decoding and a spectral band replication decoding for a single channel element included in the encoded audio data, the single channel element comprising an encoded audio channel of the plurality of encoded audio channels or comprising an encoded audio object of the plurality of encoded audio objects, and performing the transform decoding, a parametric stereo decoding and the spectral band replication decoding for a channel pair element included in the encoded audio data, the channel pair element comprising a pair of encoded audio channels of the plurality of encoded audio channels or comprising a pair of encoded audio objects of the plurality of encoded audio objects, and performing the transform decoding, the parametric stereo decoding and the spectral band replication decoding for a quad channel elements included in the encoded audio data, the quad channel element comprising four encoded audio channels of the plurality of encoded audio channels or comprising four encoded audio objects of the plurality of encoded audio objects.

13. A method of decoding encoded audio data, comprising: receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; analyzing the encoded audio data to determine whether the encoded audio data comprise either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects core decoding either the encoded audio data comprising the plurality of encoded audio channels and the plurality of encoded audio objects to obtain decoded audio channels and decoded audio objects when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or the plurality of encoded audio channels to obtain decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; decompressing the compressed metadata to obtain decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; processing the decoded audio objects using the decompressed metadata and the decoded audio channels to acquire a number of output audio channels comprising audio data from the decoded audio objects and the decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and post processing the number of output audio channels to obtain an output format, where the method of decoding the encoded audio data is controlled in response to the analyzing the encoded audio data so that either the processing the decoded audio objects is bypassed and the decoded audio channels obtained by the core decoding are fed, as the output audio channels, into the converting, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects, or the decoded audio objects and the decoded audio channels obtained by the core decoding are fed into the processing the decoded audio objects, when the encoded audio data comprise the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

14. A non-transitory digital storage medium having a computer program stored thereon to perform the method of claim 13.


Claims 5 and 13 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claim 1 of U.S. Patent No. 11,227,616 B2 in view of Jot et al (WO 2012125855 A1, also published as US 20140350944 A1, which is applied), Neuendorf et al (US 20110238425 A1), Engdegard et al (US 20100094631 A1), and Oh (US 20090278995 A1). The conflicting claims, including claim 1, do not explicitly teach the features recited in claims 5 and 13. However, the combination of Jot, Neuendorf, Engdegard, and Oh teaches the features of claims 5 and 13 for the benefits set forth in the prior art rejections of claims 5 and 13 below. Therefore, it would have been obvious for one having ordinary skill in the art before the effective filing date of the claimed invention to have applied the features disclosed by the combination of Jot, Neuendorf, Engdegard, and Oh to the audio decoder taught by the conflicting claim 1 of U.S. Patent No. 11,227,616 B2, for the benefits set forth in the prior art rejections of claims 5 and 13 below. The following is the comparison between claims 5 and 13 and the conflicting claim 1 of U.S. Patent No. 11,227,616 B2 for reference:
Claim(s) in the current application
Conflicting claims in U.S. Patent No. 11,227,616 B2
1. An audio decoder for decoding encoded audio data, comprising: an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder configured for decoding the plurality of encoded audio channels received by the input interface and the plurality of encoded audio objects received by the input interface to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor configured for decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; an object processor configured for processing the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and a post-processor configured for converting the 
number of output audio channels into an output format, wherein the audio decoder is configured to either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post-processor, when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

5. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects and encoded audio channels, wherein the spatial audio object coding decoder is configured to decode the encoded audio objects and the encoded audio channels using the one or more transport channels and the parametric side information and wherein the object processor is configured to render the plurality of audio objects using the decompressed metadata to acquire rendered audio objects and to decode the audio channels and to mix the audio channels with the rendered audio objects to acquire the number of output audio channels.

13. The audio decoder of claim 1, in which the core decoder or the object processor are controllable, and in which the post-processor is configured to control the core decoder or the object processor in accordance with information on the output format so that a rendering incurring decorrelation processing of audio objects or audio channels not occurring as separate audio channels in the output format is reduced or eliminated, or so that for audio objects or audio channels not occurring as the separate audio channels in the output format, upmixing or decoding operations are performed as if the audio objects or the audio channels would occur as the separate audio channels in the output format, except that any decorrelation processing for the audio objects or the audio channels not occurring as the separate audio channels in the output format is deactivated.
1. An audio decoder for decoding encoded audio data, comprising: an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects; a mode controller configured for analyzing the encoded audio data to determine whether the encoded audio data comprise either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder configured for either decoding the plurality of encoded audio channels received by the input interface to obtain decoded audio channels and decoding the plurality of encoded audio objects received by the input interface to obtain decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to obtain decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor configured for decompressing the compressed metadata to obtain decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; an object processor configured for processing the decoded audio objects using the decompressed metadata and the decoded audio channels to acquire a number of output audio channels comprising audio data from the decoded audio 
objects and the decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; a post processor configured for post processing the number of output audio channels to obtain an output format, wherein the mode controller is configured for controlling the audio decoder to either bypass the object processor and to feed the decoded audio channels as the output audio channels into the post processor, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects, or to feed the decoded audio objects and the decoded audio channels into the object processor, when the encoded audio data comprise the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.







Claim 9 is rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claim 17 of U.S. Patent No. 10,249,311 B2. Although the conflicting claims are not identical, they are not patentably distinct from each other because the claim of the instant application is a broader version of the claim of U.S. Patent No. 10,249,311 B2. The following is the comparison between claim 9 of the instant application and the conflicting claim 17 of U.S. Patent No. 10,249,311 B2:
Claim in the current application
Conflicting claim in U.S. Patent No. 10,249,311 B2
1. An audio decoder for decoding encoded audio data, comprising: an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder configured for decoding the plurality of encoded audio channels received by the input interface and the plurality of encoded audio objects received by the input interface to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor configured for decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; an object processor configured for processing the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and a post-processor configured for converting the 
number of output audio channels into an output format, wherein the audio decoder is configured to either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post-processor, when the encoded audio data comprises the plurality of encoded audio channels without any audio objects, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.
 
9. The audio decoder of claim 1, wherein the core decoder is configured to apply full-band decoding operation using a noise filling operation.

8. An audio decoder for decoding encoded audio data, comprising: an input interface that receives the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects; a core decoder that decodes either the plurality of encoded audio channels received by the input interface and the plurality of encoded audio objects received by the input interface to obtain a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or that decodes the plurality of encoded audio channels received by the input interface to obtain a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects; a metadata decompressor that decompresses the compressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, an object processor that processes the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects; and a post processor that converts the number of output audio channels into an output format, wherein the audio 
decoder is configured to either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post processor, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects. 


17. The audio decoder of claim 8, wherein the core decoder is configured to apply full-band decoding operation using a noise filling operation without a spectral band replication operation.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 9, 11, 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over Jot et al (WO 2012125855 A1, hereinafter Jot; the equivalent US 20140350944 A1 is applied hereinafter) in view of Neuendorf et al (US 20110238425 A1, hereinafter Neuendorf).
Claim 1: Jot teaches an audio decoder for decoding encoded audio data (title and abstract, ln 1-8 and fig. 4 and peered encoder in fig. 1), comprising:
an input interface (including demux 56 in fig. 4) configured for receiving the encoded audio data (soundtrack data stream 40 from an output of the peer encoder, the encoder in fig. 1 and the decoder in fig. 2), the encoded audio data comprising either a plurality of encoded audio channels (a portion of an output of a downmix audio encode 32 with respect to the BASE MIX 10 as channel signals in the encoder of figs. 1-2, para 55) and a plurality of encoded objects (other portion of the output of the downmix audio encode 32 with respect to the decoded object 1/2 audio signals 12a/12b and outputs from the object audio encode 20a/20b to the MUX 42 in fig. 1, para 55) and compressed metadata (encoded cue 38 included in the soundtrack stream 40 in fig. 1, 38d in fig. 4) related to the plurality of encoded audio objects (the encoded cue 38 including encoded object mix cues and object render cues via cue encode 30, included in the encoded downmix signal 40 in figs. 1, 4, and relevant to each sound source 1, 2, 3, as sound objects, para 14, when the downmix format and the target spatial audio format are not equivalent, para 71), or a plurality of encoded audio channels (including encoded BASE MIX, encoded audio object as channel signals when the downmix format and the target spatial audio format are equivalent, para 70; the object audio signals also represented by channel signals, para 60);
a core decoder (part of decoder, including decoder 58, decode 62a, 62c, etc. in fig. 4) configured for 
decoding the plurality of encoded audio channels received by the input interface (included in the decoded audio signals 60 with respect to the downmix of the BASE MIX of the encoder of fig. 1, as the channel signals in fig. 4) and the plurality of encoded audio objects received by the input interface to acquire a plurality of decoded audio channels and a plurality of decoded audio objects (the signal 60 including decoded channels and decoded audio objects, e.g., signals 26a/26b from the audio object decode 62a/62c in fig. 4) when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects (the encoder in fig. 1 and the decoder in fig. 2 and when the downmix format and the target spatial audio format are not equivalent, para 71), or 
decoding the plurality of encoded audio channels received by the input interface to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio signals (decoded downmix audio signals via the downmix audio decode 58 in fig. 4 when the downmix format and the target spatial audio format are equivalent, para 70; the object audio signals also represented by channel signals, para 60);
a metadata decompressor (other part of the decoder, including cue decode 64 in fig. 4) configured for decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects (decoding the encoded cue stream 38d, related to object audio signals 1 … n in fig. 1 and equivalent to the encoded cue 38 when the downmix format and the target spatial audio format are not equivalent, para 70-71);
an object processor (including audio object renderer 70, audio object removal 66 in fig. 4) configured for processing the plurality of decoded audio objects (including decoded objects 26a/26c in fig. 4 and the decoded object signals included in the decoded downmix signal 60) using the decompressed metadata (using the decoded render cue 18d and mix cues 16d in fig. 4) and the plurality of decoded audio channels (included in the decoded downmix signal 60 in fig. 4) to acquire a number of output channels comprising audio data (including object rendering signal 76 suitable for reproduction in the target spatial audio format and residual downmix signal 68 in figs. 1, 4, para 69) from the plurality of decoded audio objects and the plurality of decoded audio channels (from the sound sources as audio objects and the decoded multi-channel base mix signal included in the decoded downmix signal 60 in fig. 1), when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects (downmix format and the target spatial audio format are not equivalent and p.7, para 70-71); and
a post processor (including convert format 78, output post-processing module 86, etc. in fig. 4) configured for converting the number of output audio channels into an output format (performing additional spatial format conversion by element 86, para 69 and converting the residual downmix signal 80 suitable for reproduction in the target spatial audio format, para 68),
wherein the audio decoder is configured to 
either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post-processor, when the encoded audio data comprises the plurality of encoded audio channels (bypassing elements 66, 70 when the downmix format and the target spatial audio format are equivalent; the decoded downmix signals 60 is directly fed to the convert format 78 in fig. 4, para 70), 
or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects (the convert format 78, audio object removal 66 and audio object renderer 70, etc., as the object processor, cue decode 64, etc. as the metadata decompressor, and downmix audio decode 58, etc., as the core decoder and discussed above, when the downmix format and the target spatial audio format are not equivalent, para 70-71).
However, Jot does not explicitly teach that the encoded audio data comprises no encoded audio objects in the disclosed case where the encoded audio data comprises the plurality of encoded audio channels.
Neuendorf teaches an analogous field of endeavor by disclosing an audio decoder (title and abstract, ln 1-6, and fig. 1B) comprising 
an input interface configured for receiving a plurality of encoded audio channels without any encoded audio objects (including a bit stream demultiplexer 900 in fig. 2B; two or more audio channels are input to an audio encoder, output as a bit stream via a bit stream multiplexer 800 in fig. 2A, and received at an audio decoder via the bit stream demultiplexer 900 in fig. 2B, p.7, para 99);
a core decoder (including elements 431, 440, 600, 701 etc., in fig. 2B) configured for decoding the plurality of encoded audio channels received by the input interface to obtain decoded audio channels when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects (an output from the element 701 via the controllable switch 600 in fig. 2B); 
a post processor (including element 702, etc., in fig. 2B) configured for post processing the number of output audio channels (an output from the element 701 to the element 702 via the switch 600 in fig. 2B) to obtain an output format (output audio signals from the element 702 in fig. 2B),
wherein the audio decoder is configured for bypassing an audio signal processor (including elements 540, 532, etc., in fig. 2B) and to feed the decoded audio channels as the output audio channels into the post-processor when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects (the output from the element 440 into the element 702 via the element 701 in fig. 2B) for benefits of achieving a balance in high quality and low bitrate between music encoding and speech encoding (p.1, para 7-8) and improving the bitrate by removing out zero bits of the decoded data stream (p.16, para 230-235).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the plurality of encoded audio channels without any encoded audio objects and other features above, as taught by Neuendorf, to the plurality of encoded audio channels in the audio decoder, as taught by Jot, for the benefits discussed above.
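For illustration only, the two signal paths recited in claim 1 can be sketched as follows. This is a minimal sketch and is not taken from Jot, Neuendorf, or the instant specification; every name in it (`EncodedAudio`, `core_decode`, `decompress_metadata`, `object_process`, `post_process`, `decode`) is hypothetical, and the string placeholders merely stand in for real signal processing.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class EncodedAudio:
    """Hypothetical container for the received encoded audio data."""
    channels: List[str]
    objects: List[str] = field(default_factory=list)
    compressed_metadata: Optional[str] = None


def core_decode(items: List[str]) -> List[str]:
    # Placeholder for the core decoder: tags each item as decoded.
    return [f"decoded({item})" for item in items]


def decompress_metadata(metadata: str) -> str:
    # Placeholder for the metadata decompressor.
    return f"decompressed({metadata})"


def object_process(objs: List[str], meta: str, chans: List[str]) -> List[str]:
    # Placeholder for the object processor: one output channel per decoded
    # channel, each carrying channel data plus object data.
    return [f"mix({c}+{len(objs)} objects using {meta})" for c in chans]


def post_process(chans: List[str]) -> List[str]:
    # Placeholder for conversion into the output format.
    return [f"out({c})" for c in chans]


def decode(data: EncodedAudio) -> Tuple[List[str], str]:
    """Route the decoded signals along one of the two recited paths."""
    chans = core_decode(data.channels)
    if not data.objects:
        # Channels only: bypass the object processor entirely.
        return post_process(chans), "bypass"
    # Channels, objects, and metadata: feed everything to the object processor.
    objs = core_decode(data.objects)
    meta = decompress_metadata(data.compressed_metadata or "")
    return post_process(object_process(objs, meta, chans)), "object"
```

In this sketch the routing decision depends only on whether any encoded audio objects are present, which mirrors the "either ... or" structure of the final wherein clause.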
Claim 15 has been analyzed and rejected according to claim 1 above.
Claim 16 has been analyzed and rejected according to claims 1 and 15 above and the combination of Jot and Neuendorf further teaches, a non-transitory digital storage medium having stored thereon a computer program for performing the method of decoding encoded audio data in claim 15 (Jot, including RAM and processor readable medium, ROM, flash memory, etc. stored with software and executed by CPU, p.5, para 50-51 and Neuendorf, storage medium with computer program for performing the methods, p.32, para 425-429).
Claim 2: the combination of Jot and Neuendorf further teaches, according to claim 1 above, wherein the post-processor is configured to convert the number of output audio channels to a binaural representation as the output format or to a reproduction format as the output format, the reproduction format comprising a smaller number of audio channels than the number of output audio channels (Jot, converting from 5 channels to binaural channel signals in fig. 7, para 90), and wherein the audio decoder is configured to control the post-processor in accordance with a control input derived from a user interface or extracted from the encoded audio data (Jot, controlled by the user interaction through the signal 72 to the audio object renderer 70 and convert format 78 in fig. 4 and via the GUI, p.5, para 47 and p.8-9, para 85).
Claim 3: the combination of Jot and Neuendorf further teaches, according to claim 1 above, in which the object processor comprises:
an object renderer configured for rendering the decoded objects to acquire rendered audio objects using the decompressed metadata (Jot, part of audio object renderer 70 and controlled by render cues 18d in fig. 4); and
a mixer (Jot, adder 82 in fig. 4) for mixing the rendered objects and the decoded audio channels to acquire the number of output audio channels (Jot, adding the residual downmix signal outputted from the audio object removal 66 and outputted from the audio object renderer 70 in fig. 4).
Claim 9: the combination of Jot and Neuendorf further teaches, according to claim 1 above, wherein the core decoder is configured to apply a full-band decoding operation using a noise filling operation (Neuendorf, filling spectral gaps in the decoded spectra which occur when spectral values are quantized to zero due to a strong restriction on bit demand on the encoder side, p.16, para 230-235, and in a full-band decoding output from bandwidth extension 701 in fig. 2B, which has no SBR operation in fig. 2B, p.8, para 114 and p.7, para 107).
Claim 11: the combination of Jot and Neuendorf teaches all the elements of claim 11, according to claim 1 above, including wherein binaurally rendering the channels of the intermediate audio channels into a two-channel binaural output signal as the output format (Jot, format convert 70 to make binaural output from the five channel audio signals in fig. 7), except wherein the post processor is configured to downmix the number of output audio channels output by the object processor to a format comprising three or more audio channels and comprising less audio channels than the number of output audio channels output by the object processor to acquire channels of an intermediate downmix.
Official Notice is taken that downmixing audio channels to a format comprising three or more audio channels and comprising less audio channels than the number of input audio channels to acquire an intermediate downmix and then binaurally rendering the channels of the intermediate downmix into a two-channel binaural output signal was notoriously well known in the art before the effective filing date of the claimed invention, e.g., multiple stages of Two-to-One TTO or three-to-two TTT (US 20090043591 A1 by Breebaart et al, fig. 5), for the benefits of simplifying downmix algorithms with less duplicated computation and simplifying processing.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied downmixing the number of output audio channels to the format comprising the three or more audio channels and comprising less audio channels than the number of output audio channels to acquire the intermediate downmix having the intermediate audio channels, and binaurally rendering the channels of the intermediate downmix into the two-channel binaural output signal, as was well known in the art, to the post-processor and the audio decoder, as taught by the combination of Jot and Neuendorf, for the benefits discussed above.
Claim 14: the combination of Jot and Neuendorf further teaches, according to claim 1 above, in which the core decoder is configured (Neuendorf, fig. 12B) to perform transform decoding (transition decoding via 532 in fig. 12B) and a spectral band replication decoding for the single channel elements (Jot, single channel decoding 62a/62c in fig. 4 and Neuendorf, enhanced SBR decoding 701 in fig. 12B), and to perform the transform decoding, parametric stereo decoding and spectral band replication decoding for the channel pair elements and the quad channel elements (Jot, multi-channel decoder 58 in fig. 4 and Neuendorf, QMF in figs. 9 and 11B and p.14, para 181).
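For illustration only, the two-stage processing discussed for claim 11 (downmix to an intermediate format of three or more channels, then render to two channels) can be sketched as below. This is a simplified sketch under stated assumptions: the gain values, the 5-to-3 channel layout, and the names `apply_matrix`, `DOWNMIX_5_TO_3`, `RENDER_3_TO_2`, and `two_stage` are all hypothetical, and a plain gain matrix is used only as a stand-in for actual binaural rendering, which in practice involves head-related filtering rather than static gains.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def apply_matrix(matrix: Matrix, frame: List[float]) -> List[float]:
    # Each output sample is a weighted sum of the input samples.
    return [sum(g * s for g, s in zip(row, frame)) for row in matrix]


# Stage 1 (assumed gains): 5 input channels (L, R, C, Ls, Rs) to a
# 3-channel intermediate downmix (L', R', C').
DOWNMIX_5_TO_3: Matrix = [
    [1.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]

# Stage 2 (assumed gains): stand-in for binaural rendering of the
# 3 intermediate channels into a 2-channel output.
RENDER_3_TO_2: Matrix = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
]


def two_stage(frame: List[float]) -> Tuple[List[float], List[float]]:
    """Return (intermediate downmix, two-channel output) for one sample frame."""
    intermediate = apply_matrix(DOWNMIX_5_TO_3, frame)
    return intermediate, apply_matrix(RENDER_3_TO_2, intermediate)
```

The second stage operates on three channels instead of five, which is the computational saving the intermediate downmix is said to provide.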

Claims 4-8, 10 are rejected under 35 U.S.C. 103 as being unpatentable over Jot (above) in view of Neuendorf (above) and Hellmuth et al (US 20090125314 A1, hereinafter Hellmuth).
Claim 4: the combination of Jot and Neuendorf teaches all the elements of claim 4, according to claim 1 above, including wherein the object processor is controlled to mix the rendered audio objects and the decoded audio channels to acquire the number of output audio channels (via the adder 82 by controlling the inputs via target spatial audio format definition signal 74, and user interaction signal 72, and render cues 18d, and mix cues 16d in fig. 4), except wherein the object processor comprises a spatial audio object coding decoder for decoding the one or more transport channels and associated parametric side information representing encoded audio objects, wherein the spatial audio object coding decoder is configured to render the decoded audio objects in accordance with rendering information related to a placement of the audio objects to acquire rendered audio objects.
Hellmuth teaches an analogous field of endeavor by disclosing an audio decoder (title and abstract, ln 1-13 and fig. 14) and wherein the plurality of encoded objects comprises one or more transport channels and associated parametric side information (downmix 112 and MPS  bitstream 114 in fig. 14) and a spatial audio object coding decoder is disclosed (including part of SAOC-MPS transcoder, TTN box, etc. in fig. 14 or SAOC-MPS transcoder in MBO transcoder in fig. 14) for decoding the one or more transport channels (common downmix signal 112 in fig. 14) and associated parametric side information representing encoded audio objects (SAOC side information stream 114 in fig. 14 and output the decoded audio objects 164 in fig. 14), wherein the spatial audio object coding decoder is configured to render the decoded audio objects (decoding for BGOs 154 and FGOs 156 from common downmix signal 112 in fig. 14) in accordance with rendering information related to a placement of the audio objects to acquire rendered audio objects (controlled by SAOC bitstream 114 including prepositioning of all individual FGOs in the downmix signal 112  in fig. 14 and p.13, para 181-182 and including the rendering information 26 in the SAOC decoder, p.3, para 36 and in matrix A, p.4, para 49) and the object processor is controlled for mixing the rendered audio objects (represented by FGOs 156 in fig. 14) and the decoded audio channels (by BGO 154 in fig. 14) to acquire the number of output audio channels (preprocessed downmix signal 164 in fig. 14) for benefits of achieving a better sound effects and quality by separately processing background sounds from processing specific instrument, vocal voices, etc. or fore ground audio objects (p.1, para 6-9).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied wherein the object processor comprises a spatial audio object coding decoder for decoding the one or more transport channels and associated parametric side information representing encoded audio objects, wherein the spatial audio object coding decoder is configured to render the decoded audio objects in accordance with rendering information related to a placement of the audio objects to acquire rendered audio objects, as taught by Hellmuth, to the object processor in the audio decoder, as taught by the combination of Jot and Neuendorf, for the benefits discussed above.
Claim 5: the combination of Jot, Neuendorf, and Hellmuth further teaches, according to claim 1 above, wherein the object processor comprises a spatial audio object coding decoder (Hellmuth, including part of SAOC-MPS transcoder, TTN box, etc. in fig. 14 or SAOC-MPS transcoder in MBO transcoder in fig. 14) for decoding the one or more transport channels (common downmix signal 112 in fig. 14) and associated parametric side information representing encoded audio objects (SAOC side information stream 114 in fig. 14 and output the decoded audio objects 164 in fig. 14),
wherein the spatial audio object coding decoder is configured to decode the encoded audio objects and the encoded audio channels using the one or more transport channels and the parametric side information (Hellmuth, decoding for BGOs 154 and FGOs 156 from common downmix signal 112 in fig. 14) and 
wherein the object processor is configured to render the plurality of audio objects using the decompressed metadata to acquire rendered audio objects and to decode the audio channels (Jot, part of audio object renderer 70 and controlled by render cues 18d in fig. 4) and to mix the audio channels with the rendered audio objects to acquire the number of output audio channels (Jot, adding the residual downmix signal outputted from the audio object removal 66 and outputted from the audio object renderer 70 in fig. 4).
Claim 6: the combination of Jot, Neuendorf, and Hellmuth further teaches, according to claim 1 above, wherein the object processor comprises a spatial audio object coding decoder (including part of SAOC-MPS transcoder, TTN box, etc. in fig. 14 or SAOC-MPS transcoder in MBO transcoder in fig. 14) for decoding one or more transport channels (common downmix signal 112 in fig. 14) and associated parametric side information representing encoded audio objects (SAOC side information stream 114 in fig. 14 and output the decoded audio objects 164 in fig. 14), wherein the spatial audio object coding decoder is configured to transcode the associated side parametric information and the decompressed metadata into transcoded parametric side information (Hellmuth, transcode the SAOC bitstream to MPS bitstream 162 controlled by the MBO operational mode 158 in fig. 14) usable for directly rendering the output format (Hellmuth, used at the MPS decoder 122 in fig. 14), and wherein the post-processor is configured for calculating audio channels of the output format using the decoded transport channels and the transcoded parametric side information (Jot, post processor includes the convert format 78, output post processing 86 in fig. 14 and Hellmuth, fig. 14), or wherein the spatial audio object coding decoder is configured to directly upmix and render channel signals for the output format using the decoded transport channels and the parametric side information (Hellmuth, via preprocessed downmix signal 164 with the transcoded MPS 162 in fig. 14 and discussed in claim 4 above).
Claim 7: the combination of Jot, Neuendorf, and Hellmuth further teaches, according to claim 1 above, wherein the object processor comprises a spatial audio object coding decoder configured for decoding the one or more transport channels output by the core decoder and associated parametric data and the decompressed metadata to acquire a plurality of rendered audio objects (Hellmuth, the discussion in claims 4-6 above), 
wherein the object processor comprises an object renderer being configured to render the decoded audio objects output by the core decoder to acquire rendered decoded audio objects (Jot, through the audio object renderer 70, audio object removal 66 in fig. 4 and the discussion in claim 4 above and Hellmuth, via the transcoded MPS 162, etc. in fig. 14);
wherein the object processor is furthermore configured to mix the rendered decoded audio objects with the decoded audio channels (Jot, via the adder 82 in fig. 4 and Hellmuth, via the mixing 166 in fig. 14),
wherein the audio decoder further comprises an output interface for outputting an output of the mixer to loudspeakers (Jot, via loudspeaker on a headphone in fig. 7 and p.9, para 89 and Hellmuth, via the part of 122 in fig. 14),
wherein the post-processor furthermore comprises: a binaural renderer for rendering the output audio channels into two binaural channels (Jot, including part of format convert 78 in fig. 4 and performing binaural rendering from five channels according to the user interaction in fig. 7 and p.9, para 89 and the discussion in claim 9 above) using head related transfer functions or binaural impulse responses, the two binaural channels representing the binaural representation, (Jot, via the user interface to enter HRTF to the format convert 78 in fig. 4 and p.9, para 89), and
a format converter (Jot, other part of format convert 78 and output post processing 86 in fig. 4, which perform additional spatial format conversion in fig. 4, para 69, according to the input of target spatial audio format definition 74 or render cues 18d in fig. 4, para 82) for converting the output audio channels into the output format, the output comprising a lower number of audio channels than the output audio channels of the mixer using information on a reproduction layout (Jot, the generated audio scene is identical to the object downmix signal 46d, and p.8, para 82).
Claim 8: the combination of Jot, Neuendorf, and Hellmuth further teaches, according to claim 1 above, wherein the plurality of encoded audio channels or the plurality of encoded audio objects are encoded as channel pair elements, single channel elements, low frequency elements or quad channel elements (Jot, single channel elements 14a/14b in fig. 1 and Hellmuth, paired L/R downmix signal 112 and MPS encoder for outputting stereo signals 104 in fig. 6), wherein a quad channel element comprises four original audio channels or audio objects (Jot, single channel elements 14a/14b in fig. 1 and Hellmuth, paired L/R downmix signal 112 and MPS encoder for outputting stereo signals 104 in fig. 6), and 
wherein the core decoder is configured to decode the channel pair elements, the single channel elements, the low frequency elements or the quad channel elements in accordance with side information comprised by the encoded audio data indicating a channel pair element, a single channel element, a low frequency element or a quad channel element (Jot, via the decode 62a/62c for single encoded single downmix signal and Hellmuth, single SAOC downmix signal, etc. and p.7, para 84).
Claim 10: the combination of Jot, Neuendorf, and Hellmuth further teaches, according to claim 1 above, including wherein elements comprising the binaural renderer (Jot, binaural 3D audio rendering based on spatial audio scene coding in fig. 7 and p.9, para 90), the format converter (Jot, convert format 78 in fig. 4), the mixer (Jot, the adder 82 in fig. 4 and Hellmuth, the mixing 166 in fig. 14), the spatial audio object coding SAOC decoder (Hellmuth, including SAOC MPS transcoder etc. in figs. 11, 14 and SAOC decoder in fig. 12), the core decoder (Jot, elements 58, 62a, 62c in fig. 4), and the object renderer (Jot, audio object renderer 70, etc. in fig. 4) operate in a quadrature mirror filterbank QMF domain (Hellmuth, QMF bank being used for downmixer at the encoder side in fig. 1 and within the element 82 in fig. 4, para 58) and wherein quadrature mirror filter domain data is transmitted from one of the elements to another of the elements without any synthesis filterbank and subsequent analysis filterbank processing (Hellmuth, hybrid QMF bank applied to the input audio signal 84 in element 82 at the audio encoder side in fig. 4, para 85 and thus, inherently the processing of the encoded audio data is under QMF format for data integrity from the audio encoder to the audio decoder).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Jot (above) and in view of references Neuendorf (above) and Engdegard et al (US 20100094631 A1, hereinafter Engdegard).
Claim 12: the combination of Jot and Neuendorf teaches all the elements of claim 12, according to claim 1 above, including wherein the post-processor comprises a controlled downmixer (Jot, HRTF for binaural processing in fig. 7 and p.9, para 89), except applying a downmix matrix to; and a controller for determining a specific downmix matrix using information on a channel configuration of an output of the object processor and information on an intended reproduction layout.
Engdegard teaches an analogous field of endeavor by disclosing an audio decoder (title and abstract, ln 1-11 and fig. 1-3A) and wherein a controlled downmixer is disclosed for applying a downmix matrix (303 in fig. 3A and details in fig. 8 and downmix matrix D is applied to downmixer 101a); and a controller is configured for determining a specific downmix matrix using information on a channel configuration of an output of the object processor and information on an intended reproduction layout (combined with renderer matrix A with output layout information in fig. 9 and the downmix matrix Q is selected in fig. 14 and processing by a processor and p.1, para 12 and wherein the input audio objects are distributed into the elements of downmix matrix D in fig. 8) for benefits of achieving an efficient and scalable downmix scheme for high quality of sound image (p.1, para 5).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied applying a downmix matrix to; and a controller for determining a specific downmix matrix using information on a channel configuration of an output of the object processor and information on an intended reproduction layout, as taught by Engdegard, to the post-processor in the audio decoder, as taught by the combination of Jot and Neuendorf, for the benefits discussed above.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Jot (above) and in view of references Neuendorf (above), Engdegard (above), and Oh (US 20090278995 A1).
Claim 13: the combination of Jot, Neuendorf, and Engdegard further teaches, according to claim 1 above, in which the core decoder or the object processor are controllable (Jot, e.g., omitting elements 66, 70 and thus, omitting decode 62a/62c in the decoder in fig. 4 and p.7, para 70 and Engdegard, audio decoder is controlled by matrix calculator 202 in fig. 2A), and a rendering incurring decorrelation processing of audio objects or audio channels not occurring as separate audio channels in the output format is reduced or eliminated (Engdegard, downmix performed prior to performing decorrelation and recovered or upmixed after decorrelation in fig. 4E), or so that for audio objects or audio channels not occurring as the separate audio channels in the output format, upmixing or decoding operations are performed as if the audio objects or audio channels would occur as the separate audio channels in the output format, except that any decorrelation processing for the audio objects or the audio channels not occurring as the separate audio channels in the output format is deactivated (Engdegard, fig. 4E, separated channels are maintained by downmix, decorrelation, and upmix in fig. 4E).
However, the combination of Jot, Neuendorf, and Engdegard does not explicitly teach in which the post-processor is configured to control the core decoder or the object processor in accordance with information on the output format.
Oh teaches an analogous field of endeavor by disclosing an audio decoder (title and abstract, ln 1-7 and audio decoding unit 150, etc. in fig. 1) and wherein a post-processor is disclosed (via the multiplexing unit 850 in fig. 8) configured for controlling a core decoder or the object processor (via the controlling unit 870 to control the converting unit 830 in fig. 8) in accordance with information on the output format (according to the detected quality and rate of the target data, para 138-143) for benefits of achieving an efficient and adaptive coding and decoding by providing multiple coding schemes and adaptive decoding scheme for a low bitrate and high sound quality (p.1, para 4-7).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the post processor is configured for controlling the core decoder or the object processor in accordance with information on the output format, as taught by Oh, to the post-processor in the audio decoder, as taught by the combination of Jot, Neuendorf, and Engdegard, for the benefits discussed above.

The prior art (US 20110202355 A1) by Grill, made of record and not relied upon, is considered pertinent to applicant's disclosure because Grill discloses an audio decoder in which switching between, and bypassing of, different decoding schemes is controlled by flag information transmitted from an audio encoder, which is considered to be consistent with the disclosed switching between and bypassing of the object processor if the received data is channel signals only without any audio objects.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESHUI ZHANG whose telephone number is (571)270-5589.  The examiner can normally be reached on Monday-Friday 6:30am-4:00pm EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin can be reached on 571-272-7848.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LESHUI ZHANG/
Primary Examiner, Art Unit 2654