DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.





Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 24 and 25 and their respective dependent claims are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

As per claims 24 and 25, it is not clear how the recitations "in particular encoder" (claim 24) and "in particular decoder or audio renderer" (claim 25) are to be read.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 2, 3, 6, 9, 10, 12, 13, 16, 17, 18, 22, 24, 25, 28, and 29 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Kim et al. (US 20200228780 A1).

As per claim 1, Kim discloses a method for encoding an audio signal into a bitstream, in particular at an encoder (paras. 98, 4: the output of the media transmission device is encoded, as per para. 97, via the encoding format, and carries multimedia, where multimedia comprises audio signals), the method comprising:
encoding or including audio signal data associated with 3DoF audio rendering into one or more first bitstream parts of the bitstream (para. 98: the media transmission apparatus encodes audio for transmission, where the remaining space is transmitted as 2D sphere data/metadata in a 3DoF format; the encoded multimedia signal is in a bitstream as per the bitrate element specified in para. 97 and the OMAF format specified in para. 98, and the 2D sphere data in the bitstream constitutes the first bitstream parts); and
encoding or including metadata associated with 6DoF audio rendering into one or more second bitstream parts of the bitstream (para. 98, where the 3D data comprises the 6DoF data/metadata, which is in a second part of the bitstream within the 3D data), wherein the method further includes:

determining environmental characteristics and parameters relating to distance (the SOI is defined/determined by a coordinate system/parameters based on a distance/environmental characteristic that creates a sphere, para. 108) or attenuation, occlusion, and/or reverberations;
determining a parametrization of a transform function A based on said environmental characteristics and said parameters (para. 108: the SOI_space_structure is used by the encoder/decoder system of figs. 12 and 13 to adapt/transform, i.e., it is applied to, the media data corresponding to all coordinates existing within the boundary surface and all data related to the media) and providing a parametrized transform function A (provided as parameters in the processors 1210 and 1310, as per ‘based on’ in para. 9: playing immersive media based on the partial space-of-interest description information, wherein the space-of-interest description information is used to transform the media data), wherein
A*A^(-1) = A^(-1)*A = 1 (for any real non-zero value of the parameter-based transform function A, i.e., the partial space-of-interest description information, the product A*A^(-1) equals A raised to the 0 power, which is equal to 1);

generating the audio signal data associated with 3DoF audio rendering by transforming the audio signals from the one or more audio sources into 3DoF audio signals using the transform function A (para. 101: the SOI descriptor allows the media transmission apparatus to selectively and adaptively transmit all or a part of the immersive media based on selection of the media; applying the descriptor to the media data generates the audio signal data, which are 3DoF signals as per the x, y, z coordinates in para. 109),
wherein the transform function A maps or projects the audio signals of the one or more audio sources (as per the above step) onto respective audio objects (the above cited objects) positioned on one or more spheres surrounding a default 3DoF listener position (the media/audio signals for objects positioned on the surface of a sphere as per para. 42, which surrounds a fixed location/default 3DoF listener position).
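
For illustration only, the claimed relationship between the transform function A, its inverse, and the sphere surrounding the default 3DoF listener position might be sketched as follows. This is a hypothetical model and is not taken from the application or from Kim: the names Source, transform_a, and inverse_transform_a, the 1/d distance attenuation, and the use of the original source distance as 6DoF metadata are all assumptions made for the sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Source:
    # Position relative to the default 3DoF listener position (assumed origin).
    position: tuple
    gain: float


def transform_a(src, radius=1.0):
    """Hypothetical transform A: project a source onto a sphere of the given
    radius around the listener and apply an assumed 1/d distance attenuation."""
    x, y, z = src.position
    d = math.sqrt(x * x + y * y + z * z)
    if d == 0.0:
        raise ValueError("source coincides with the listener position")
    scale = radius / d
    audio_object = Source((x * scale, y * scale, z * scale), src.gain * scale)
    # Assumed 6DoF metadata: what the inverse needs to undo the projection.
    metadata = {"distance": d, "radius": radius}
    return audio_object, metadata


def inverse_transform_a(audio_object, metadata):
    """Hypothetical inverse A^(-1): restore the original position and gain."""
    scale = metadata["distance"] / metadata["radius"]
    x, y, z = audio_object.position
    return Source((x * scale, y * scale, z * scale), audio_object.gain * scale)


source = Source((3.0, 0.0, 4.0), 1.0)
projected, meta = transform_a(source)
restored = inverse_transform_a(projected, meta)
```

Under these assumptions, applying inverse_transform_a to the output of transform_a returns the original source up to floating-point rounding, which is the sense in which A^(-1)*A = 1 in this sketch.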

As per claim 9, the encoder cited in the claim 1 rejection works with a decoder that performs a method for decoding and/or audio rendering, in particular at a decoder or audio renderer, the method comprising: 
receiving a bitstream which includes audio signal data associated with 3DoF audio rendering in one or more first bitstream parts of the bitstream and further including metadata associated with 6DoF audio rendering in one or more second bitstream parts of the bitstream (as received via the bitstream cited in the claim 1 rejection), and 
performing at least one of 3DoF audio rendering (via the 2d data portion as cited in the claim 1 rejection) and 6DoF audio rendering (via the 3d data in the claim 1 rejection) based on the received bitstream, 
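wherein performing 6DoF audio rendering, being based on the audio signal data associated with 3DoF audio rendering in the one or more first bitstream parts of the bitstream and the metadata associated with 6DoF audio rendering in the one or more second bitstream parts of the bitstream, includes generating audio signal data associated with 6DoF audio rendering based on the audio signal data associated with 3DoF audio rendering and an inverse transform function,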
wherein the inverse transform function is an inverse function of a transform function (where the function is the inverse of the functions performed at the encoder in steps 1020, 1030, and 1040, in order to recreate the captured signals from the capturing device in step 1010 in fig. 10) which maps or projects audio signals of the one or more audio sources onto respective audio objects positioned on one or more spheres surrounding a default 3DoF listener position (mapped onto a sphere as described in the claim 1 rejection).

As per claim 24, an apparatus, in particular encoder (fig. 12), including a processor (the method of the claim 1 rejection requires a processor) configured to:
encode or include audio signal data associated with 3DoF audio rendering into one or more first bitstream parts of the bitstream (encoding as per the claim 1 rejection); 
encode or include metadata associated with 6DoF audio rendering into one or more second bitstream parts of the bitstream (encoding or including as per the claim 1 rejection); and 

wherein the processor is further configured to: 
receive audio signals from one or more audio sources (receiving as per the claim 1 rejection); 
determine environmental characteristics and parameters relating to distance attenuation, occlusion, and/or reverberations (determining as per the claim 1 rejection); 
determine a parametrization of a transform function A based on said environmental characteristics and said parameters and provide a parametrized transform function A, wherein A*A^(-1) = A^(-1)*A = 1 (as per the claim 1 rejection);
generate the audio signal data associated with 3DoF audio rendering by transforming the audio signals from the one or more audio sources into 3DoF audio signals using the transform function A, wherein the transform function A maps or projects the audio signals of the one or more audio sources onto respective audio objects positioned on one or more spheres surrounding a default 3DoF listener position (generating as per the claim 1 rejection).

As per claim 25, an apparatus, in particular decoder or audio renderer, including a processor configured to: 
receive a bitstream which includes audio signal data associated with 3DoF audio rendering in one or more first bitstream parts of the bitstream and further including metadata associated with 6DoF audio rendering in one or more second bitstream parts of the bitstream (as per the claim 9 rejection, receiving), and 
perform at least one of 3DoF audio rendering and 6DoF audio rendering based on the received bitstream (as per the claim 9 rejection, performing),

perform 6DoF audio rendering, being based on the audio signal data associated with 3DoF audio rendering in the one or more first bitstream parts of the bitstream and the metadata associated with 6DoF audio rendering in the one or more second bitstream parts of the bitstream, including generating audio signal data associated with 6DoF audio rendering based on the audio signal data associated with 3DoF audio rendering and an inverse transform function, wherein the inverse transform function is an inverse function of a transform function which maps or projects audio signals of the one or more audio sources onto respective audio objects positioned on one or more spheres surrounding a default 3DoF listener position (as per the claim 9 rejection).

As per claims 28 and 29, the method and system recited in the claim 1 and 9 rejections above require a non-transitory memory/computer program product with associated processors in order to perform the steps recited in the claim rejections.

As per claims 2 and 12, the method according to claim 1, wherein the audio signal data associated with 3DoF audio rendering includes audio signal data of one or more audio objects (para. 110, the num_of_objects parameter), directional data of one or more audio objects, and/or distance data of one or more audio objects.

As per claims 3 and 13, the method according to claim 2, wherein the one or more audio objects are positioned on one or more spheres surrounding a default 3DoF listener position (the media/audio signals for objects positioned on the surface of a sphere as per para. 42, which surrounds a fixed location/default 3DoF listener position).

As per claims 6,16, the method according to claim 1, wherein the metadata associated with 6DoF audio rendering is indicative of one or more default 3DoF listener positions 
includes or is indicative of at least one of:
a description of 6DoF space (the 3D data/metadata cited in the claim 1 rejection describes a 6DoF space),
optionally including object coordinates; audio object directions of one or more audio objects; a virtual reality (VR) environment; and parameters relating to distance attenuation, occlusion, and/or reverberations.

As per claim 10, the method according to claim 9, wherein, 
when performing 3DoF audio rendering, the 3DoF audio rendering is performed based on the audio signal data associated with 3DoF audio rendering in the one or more first bitstream parts of the bitstream, while discarding the metadata associated with 6DoF audio rendering in the one or more second bitstream parts of the bitstream (not mapped, as this limitation is recited in the alternative),
and/or
when performing 6DoF audio rendering, the 6DoF audio rendering is performed based on the audio signal data associated with 3DoF audio rendering in the one or more first bitstream parts of the bitstream and the metadata associated with 6DoF audio rendering in the one or more second bitstream parts of the bitstream (the 6DoF and 3DoF playback/rendering occur together and are based on each other, i.e., on the 2D and 3D data, as they each render different portions of the same space).
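
As a minimal sketch of the two alternatives recited in claim 10, a renderer might select its inputs as shown below. The Bitstream class, its field names, and the select_render_inputs function are hypothetical and appear in neither the application nor Kim; the sketch only shows 3DoF rendering ignoring the second bitstream parts while 6DoF rendering uses both.

```python
from dataclasses import dataclass, field


@dataclass
class Bitstream:
    # First bitstream parts: audio signal data associated with 3DoF rendering.
    first_parts: list
    # Second bitstream parts: metadata associated with 6DoF rendering.
    second_parts: list = field(default_factory=list)


def select_render_inputs(bitstream, mode):
    """Return (audio_data, metadata) for the requested rendering mode."""
    if mode == "3dof":
        # 3DoF rendering: the 6DoF metadata is discarded.
        return bitstream.first_parts, []
    if mode == "6dof":
        # 6DoF rendering: the 3DoF audio data and the 6DoF metadata are both used.
        return bitstream.first_parts, bitstream.second_parts
    raise ValueError("unknown rendering mode: " + repr(mode))


stream = Bitstream(first_parts=[b"audio"], second_parts=[b"meta"])
audio_3dof, meta_3dof = select_render_inputs(stream, "3dof")
audio_6dof, meta_6dof = select_render_inputs(stream, "6dof")
```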

As per claim 17, the method according to claim 9, wherein the audio signal data associated with 3DoF audio rendering are generated based on the audio signals from the one or more audio sources and a transform function (as received from the encoder, as per the claim 1 rejection).

As per claim 18, the method according to claim 17, wherein
the audio signal data associated with 3DoF audio rendering is generated by transforming the audio signals from the one or more audio sources into 3DoF audio signals using the transform function (not mapped, as this limitation is recited in the alternative),
and/or
the transform function maps or projects the audio signals of the one or more audio sources onto respective audio objects positioned on one or more spheres surrounding a default 3DoF listener position (as per the claim 1 rejection).

As per claim 22, the method according to claim 9, wherein the audio signal data associated with 6DoF audio rendering is generated by transforming the audio signal data associated with 3DoF audio rendering (the 2D data) using the inverse transform function and the metadata (as per the claim 1 and 9 rejections) associated with 6DoF audio rendering (the 2D/3DoF and 3D/6DoF data are each rendered together in order to create the entire listening space),
and/or
performing 3DoF audio rendering based on the audio signal data associated with 3DoF audio rendering in the one or more first bitstream parts of the bitstream results in the same generated sound field as performing 6DoF audio rendering, at a default 3DoF listener position, based on the audio signal data associated with 3DoF audio rendering in the one or more first bitstream parts of the bitstream and the metadata associated with 6DoF audio rendering in one or more second bitstream parts of the bitstream.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 7, 8, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 20200228780 A1) as applied to claims 1 and 9 above.

As per claims 7 and 20, Kim discloses a multimedia system with an encoder and decoder but does not specify that the bitstream between the encoder and decoder is an MPEG-H 3D Audio bitstream or a bitstream using MPEG-H 3D Audio syntax.
The examiner takes official notice that it is well known in the art to implement well-known signaling protocols such as MPEG-H 3D Audio to transport bitstreams of audio and video/multimedia signals for the purpose of compatibility with known standards.

As per claims 8,21, the 2d and 3d data are in different portions of the bitstream as per the claim 1 rejection where the first part of the bitstream is the payload and the second part is an extension container because the 3d data describes a different portion of the space (the SOI as recited in para. 98) at a different protocol and it extends the functionality from 3dof to 6dof.
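
The payload/extension-container arrangement discussed for claims 8 and 21 might be pictured with the following sketch. The byte layout (a 4-byte big-endian length prefix before each part) and the function names pack_stream and unpack_stream are purely hypothetical; this is not MPEG-H 3D Audio syntax and is not drawn from Kim or the application.

```python
import struct


def pack_stream(payload: bytes, extension: bytes) -> bytes:
    """Hypothetical layout: [len(payload)][payload][len(extension)][extension]."""
    return (
        struct.pack(">I", len(payload)) + payload
        + struct.pack(">I", len(extension)) + extension
    )


def unpack_stream(data: bytes, read_extension: bool):
    """Read the payload; read the extension container only when requested."""
    (payload_len,) = struct.unpack_from(">I", data, 0)
    payload = data[4:4 + payload_len]
    if not read_extension:
        # A 3DoF-only reader stops here and never parses the extension.
        return payload, None
    (ext_len,) = struct.unpack_from(">I", data, 4 + payload_len)
    ext_start = 8 + payload_len
    return payload, data[ext_start:ext_start + ext_len]


stream = pack_stream(b"3dof-audio", b"6dof-metadata")
payload_only = unpack_stream(stream, read_extension=False)
payload_and_ext = unpack_stream(stream, read_extension=True)
```

In this assumed layout the first part can be decoded without any knowledge of the second, which is one way an extension container could extend functionality from 3DoF to 6DoF without disturbing a 3DoF-only reader.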

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER KRZYSTAN, whose telephone number is 571-272-7498 and whose email address is alexander.krzystan@uspto.gov.


If attempts to reach the examiner by telephone or email are unsuccessful, the examiner’s supervisor, Fan Tsang, can be reached at (571) 272-7547.

The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications.
/ALEXANDER KRZYSTAN/ Primary Examiner, Art Unit 2653
Examiner Alexander Krzystan
September 10, 2021