DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Examiner’s Comments

The non-final rejection filed 9-23-2022 has been vacated and a new non-final rejection with newly discovered prior art has been submitted.

	


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-4, 6, 8-12, 14, 15, 17-19, and 22-24 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Oh et al. (US 20090271015 A1).
As per claim 1, Oh discloses an audio decoder comprising: 
one or more buffers (part of 530) for storing a received audio bitstream (block 530 receives streamed audio SAOC bitstream, where block 530 requires memory/buffering in order to receive and store the streamed audio bitstream in order to be processed by unit 530); and 
a controller (the hardware microprocessor required to implement the functions performed by unit 530) coupled to the one or more buffers (coupled as they are part of the same unit 530) and configured: 
to operate in a decoding mode selected from a plurality of different decoding modes (the preset information used by stage 530 is used to operate in a selected decoding mode as described with the analogous information in the fig. 1 embodiment as per para. 37, preset metadata) for decoding (via 550) the received audio bitstream into one or more audio object positions that can be set via static or dynamic audio objects (static and dynamic audio object positions are decoded based on the static or dynamic preset information), 
a dynamic or static audio object comprising an audio signal associated with either a time-varying or a static spatial position (objects 1-n are made static or dynamic via the preset information at 530), 
the plurality of different decoding modes comprising a first decoding mode (the dynamic preset information being used) and a second decoding mode (the static preset information being used), 
wherein of the first and second decoding modes only the first decoding mode allows full decoding of one or more encoded dynamic audio objects in the bitstream, into reconstructed individual audio objects (only the dynamic preset information/first mode can enable a dynamic audio object because it varies the position of the object); and 
when the selected decoding mode is the second decoding mode: 
to access the received audio bitstream (accessed in order to perform the functions of 530-540-550);
to determine whether the received audio bitstream includes one or more dynamic audio objects (both the static and dynamic objects are decoded/identified via stages 530,540,550 based on the accessed bitstreams from blocks 510,520 during both the first and second mode); and 
responsive at least to determining that the received audio bitstream includes one or more dynamic audio objects (the objects, including static and dynamic, are transmitted to/determined as part of the object based audio at stage 550), to map at least one of the one or more dynamic audio objects to a set of static audio objects (the multi channel audio signal output from decoder 550 can comprise both static and dynamic audio objects, which are mapped to each other as they are output via the same set of multi channel signals), 
the set of static audio objects corresponding to a predefined immersive speaker configuration (the multichannel signals from 550 define an immersive speaker configuration where each channel defines and is an input to a particular speaker).

As per claim 2, the audio decoder of claim 1, wherein when the selected decoding mode is the second decoding mode (the selected mode is based on the static or dynamic preset information), the controller is further configured to render the set of static audio objects to a set of output audio channels (multi channel output signal in fig. 5).

As per claim 3, the audio decoder of claim 2, wherein the audio bitstream comprises a first set of downmix coefficients (the SAOC bitstream associated with the downmix signal as shown in fig. 5), wherein the controller is configured to utilize the first set of downmix coefficients for rendering (via stages 530 and 550) the set of static audio objects (when the preset information is static) to the set of output audio channels (the multi channel audio signal).

As per claim 4, the audio decoder of claim 3, wherein the controller is further configured to receive information pertaining to attenuation applied in at least one of the one or more dynamic audio objects on an encoder side (the gain set at stage 44 as per 101, fig. 1a), wherein the controller is configured to modify the first set of downmix coefficients accordingly when utilizing the first set of downmix coefficients for rendering the set of static audio objects to a set of output audio channels (said gain affects the received bitstream, which comprises the downmix coefficients which will then be modified by said gain).

As per claim 5, the audio decoder of claim 3, wherein the controller is further configured to receive information pertaining to a downmix operation performed on an encoder side (SAOC bitstream is associated with the downmixed/encoded signal), wherein the information defines an original channel configuration of an audio signal (it defines the channel config because it is based on individual object inputs 1-n), wherein the downmix operation results in downmixing the audio signal to the one or more dynamic audio objects (the downmix signal with the dynamic preset information), 
wherein the controller is configured to select a subset of the first set of downmix coefficients based on the information pertaining to the downmix information (the static preset information selects associated audio objects as static audio objects, which are a subset of the audio objects), wherein the utilizing of the first set of downmix coefficients for rendering (rendered via stage 550) the set of static audio objects to a set of output audio channels (multi channel audio signal) comprises utilizing the subset of the first set of downmix coefficients (via stage 530) for rendering the set of static audio objects to a set of output audio channels (multi channel audio signal).

As per claim 6, the audio decoder of claim 2, wherein the controller is configured to perform the mapping of the at least one of the one or more dynamic audio objects and the rendering of the set of static audio objects in a combined calculation using a single matrix (the coefficients used in 550 to map the objects to a particular channel of the multi channel audio signal), or 
wherein the controller is configured to perform the mapping of the at least one of the one or more dynamic audio objects and the rendering of the set of static audio objects in individual calculations using respective matrices (the individual sets of coefficients used in 550 to map the objects to each of the channels of the multi channel audio signal).

7. (Cancelled)

As per claim 8, the audio decoder of claim 1, wherein the received audio bitstream comprises metadata identifying the at least one of the one or more dynamic audio objects (the metadata comprises information specific to particular objects/sources, namely the location of the audio sources via the static or dynamic preset information, and additionally the data type information cited in para. 69).

As per claim 9, the audio decoder of claim 8, wherein the metadata indicates that N of the one or more dynamic audio objects are to be mapped (the data type information in para. 69 indicates when preset information is to be generated, which is also an indication that the static and dynamic audio objects are to be mapped per the preset information) to the set of static audio objects, 
wherein, responsive to the metadata, the controller is configured to map, to the set of static audio objects, N of the one or more dynamic audio objects selected from a predefined location or predefined locations in the received audio bitstream (the dynamic audio objects are mapped to associated static audio objects when they are combined to form the multichannel audio signal, fig. 5, where the predefined location is the location in time where the static and dynamic objects are occurring at the same time or near the same time).

As per claim 10, the audio decoder of claim 9, wherein the one or more dynamic audio objects included in the received audio bitstream comprises more than N dynamic audio objects (there can be more than one object, noting objects 1-n in fig. 5).

As per claim 11, the audio decoder of claim 10, wherein the one or more dynamic audio objects included in the received audio bitstream comprises the N dynamic audio objects and K further dynamic audio objects, wherein the controller is configured to render the set of static audio objects and the K further audio objects to a set of output audio channels (the system renders multiple sources/objects simultaneously via stage 550, including static audio objects, dynamic audio objects and further dynamic audio objects).

As per claim 12, the audio decoder of claim 9, wherein, responsive to the metadata, the controller is configured to map, to the set of static audio objects, the first N of the one or more dynamic audio objects in the received audio bitstream, and/or wherein the set of static audio objects consists of M static audio objects, and M > N > 0 (there can be any number of objects 1-n, where the objects can be static or dynamic based on the preset information).

13. (Cancelled)

As per claim 14, the audio decoder of claim 1, wherein the received audio bitstream further comprises one or more further static audio objects (any number of dynamic and static audio objects can be transmitted and rendered as per objects 1-n), and/or wherein the predefined immersive speaker configuration is defined by the particular number of channels in the multi channel audio signal.


16. (Cancelled)

As per claim 17, the claim 1 rejection discloses a method in a decoder comprising the steps of: receiving an audio bitstream and storing the received audio bitstream in one or more buffers (via the buffers of the claim 1 rejection), 
selecting a decoding mode from a plurality of different decoding modes for decoding the received audio bitstream into one or more dynamic or static audio objects (the decoding mode is selected based on the preset information as per the claim 1 rejection), 
a dynamic or static audio object comprising an audio signal associated with either a time-varying or a static spatial position (per the claim 1 rejection), 
the plurality of different decoding modes comprising a first decoding mode and a second decoding mode, wherein of the first and second decoding modes only the first decoding mode allows full decoding of one or more encoded dynamic audio objects in the bitstream, into reconstructed individual audio objects (per the claim 1 rejection); 
operating a controller coupled to the one or more buffers in the selected decoding mode (per the claim 1 rejection), 
when the selected decoding mode is the second decoding mode, the method further comprises the steps of: 
accessing, by the controller, the received audio bitstream (the controller must access the audio bitstream to perform the processing as per the claim 1 rejection); 
determining, by the controller, whether the received audio bitstream includes one or more dynamic audio objects (the controller determines whether the object is moving or not/is a dynamic audio object via the dynamic preset information); and 
responsive at least to determining that the received audio bitstream includes one or more dynamic audio objects, mapping, by the controller, at least one of the one or more dynamic audio objects to a set of static audio objects, the set of static audio objects corresponding to a predefined immersive speaker configuration (per the claim 1 rejection).

As per claim 18, an audio encoder comprising a receiving component 510 (fig. 5) configured for receiving a set of audio objects; 
a downmixing component (part of 510, and part of 520) configured for downmixing the set of audio objects to one or more downmixed dynamic audio objects (the downmix signal and the SAOC bitstream), wherein at least one of the one or more downmixed dynamic audio objects is intended to, in at least one of a plurality of decoding modes on a decoder side, be mapped to a set of static audio objects (per the claim 1 rejection), 
the static audio objects comprising audio signal associated with static spatial positions, the set of static audio objects corresponding to a predefined immersive speaker configuration (per the claim 1 rejection); 
a downmix coefficients providing component configured for determining a first set of downmix coefficients (the coefficients representing the SAOC information and also the preset information) to be utilized for rendering the set of static audio objects corresponding to the predefined immersive speaker configuration to a set of output audio channels at the decoder side; 
a bitstream multiplexer (part of 520) configured for multiplexing the at least one downmixed dynamic audio object and the first set of downmix coefficients into an audio bitstream (SAOC bitstream).

As per claim 19, the encoder of claim 18, wherein the downmixing component further is configured for providing metadata identifying the at least one of the one or more downmixed dynamic audio objects to the bitstream multiplexer (as per the claim 9 rejection), 
wherein the bitstream multiplexer 520 is further configured for multiplexing the metadata into the audio bitstream (SAOC bitstream) and/or for multiplexing information pertaining to a channel configuration of the audio objects received by the receiving component into the audio bitstream. 

As per claim 22, the encoder of the claim 18 rejection performs a method in an encoder comprising the steps of: receiving a set of audio objects; downmixing the set of audio objects to one or more downmixed dynamic audio objects, wherein at least one of the one or more downmixed dynamic audio objects is intended to, in at least one of a plurality of decoding modes on a decoder side, be mapped to a set of static audio objects, the static audio objects comprising audio signals associated with static spatial positions, the set of static audio objects corresponding to a predefined immersive speaker configuration; determining a first set of downmix coefficients to be utilized for rendering the set of static audio objects corresponding to the predefined immersive speaker configuration to a set of output audio channels at the decoder side; and multiplexing the at least one downmixed dynamic audio object and the first set of downmix coefficients into an audio bitstream (as per the claim 18 rejection).

As per claims 23 and 24, a computer program product comprising a computer-readable storage medium with instructions adapted to carry out the method of claim 17 when executed by a device having processing capability. The encoder and decoder devices of the claim 1, 17, and 22 rejections each require a respective processor with software and memory in order to be implemented.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 15 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Oh et al. (US 20090271015 A1).

As per claim 15, the audio decoder of the claim 2 rejection discloses a multi channel audio signal output from the decoder 550, but does not specify wherein the set of output audio channels is one of: stereo output channels; 5.1 surround sound output channels; 5.1.2 immersive sound output channels; or 5.1.4 immersive sound output channels.
Oh teaches another embodiment in fig. 11 and teaches that the output channels can comprise a stereo channel (para. 130). It would have been obvious to one skilled in the art that the multi channel audio signal of fig. 5 could specify a stereo channel for the purpose of conforming to well known speaker signaling protocols.


As per claim 20, Oh discloses the encoder of claim 18; however, Oh does not specify wherein the encoder is further adapted to determine information pertaining to attenuation applied in at least one of the one or more dynamic audio objects when downmixing the set of audio objects to one or more downmixed dynamic audio objects, wherein the bitstream multiplexer is further configured for multiplexing the information pertaining to attenuation into the audio bitstream.
The examiner takes official notice that it is well known in the art for encoder determined metadata to comprise gain for the purpose of expressing the relative amplitudes of the captured audio objects. As such, it would have been obvious to one skilled in the art at the time of the effective filing date of the invention for the metadata 25 determined by the encoder to comprise gain terms for the purpose of effectively expressing the audio objects.

21. (Cancelled)

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER KRZYSTAN whose telephone number is 571-272-7498, and whose email address is alexander.krzystan@uspto.gov.

The examiner can usually be reached M-F, 7:30-4:00 EST.
If attempts to reach the examiner by telephone or email are unsuccessful, the examiner’s supervisor, Fan Tsang, can be reached at (571) 272-7547.

The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications.
/ALEXANDER KRZYSTAN/ Primary Examiner, Art Unit 2653
Examiner Alexander Krzystan
November 17, 2022



/AHMAD F. MATAR/ Supervisory Patent Examiner, Art Unit 2652