Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

Claim Rejections - 35 USC § 101


35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 15, 18, and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claims do not fall within at least one of the four categories of patent eligible subject matter because a computer program is not patentable.


	
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


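Claims 1, 5, 8, 9, and 17, and their respective depending claims, are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.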
As per claims 1, 8, 9, and 17, it is not clear how to read ‘one or more effective audio elements’ in view of the specification definition.  Applicant’s specification recites: ‘an audio element is understood to mean one or more audio signals and associated metadata. Audio Elements could be audio objects, channels or HOA signals, for example. An audio object is understood to mean an audio signal with associated static/dynamic metadata.’
As defined, an audio element can be an audio object, or one or more audio signals, where an audio object is an audio signal with metadata.  It is not clear how to read ‘audio element’ relative to an audio object or audio signal when defined as such.  Additionally, it is not clear how to read ‘one or more’ audio elements relative to audio signals, given that a single ‘audio element’ can comprise multiple ‘audio signals’.
The examiner notes the claimed term ‘audio scene’ is read as any combination of ‘audio elements’, where ‘audio elements’ comprise any form of audio signaling (metadata, objects, channels) received or processed in an encoder or decoder/renderer and used to form an acoustic environment for the listener, and where ‘effective audio elements’ are drawn to any signaling at the decoder/renderer used to render virtual sound sources.

As per claims 5 and 8, it is not clear how to differentiate between the effective audio element and the audio element.
As per claim 9, it is not clear how to read acoustic element distinct from effective audio elements.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7 and 9-16 are rejected under 35 U.S.C. 103 as being unpatentable over De Bruijn et al. (US 20160080886 A1) in view of Mahabub et al. (US 20120213375 A1).



As per claim 1, De Bruijn discloses a method of decoding audio scene content from a bitstream by a decoder that includes an audio renderer with one or more rendering tools, the method comprising: 
receiving audio data (fig. 7, the DTA signals into 701); 
obtaining a description of an audio scene from the audio data (the meta DTA and AUD DTA describe the audio scene when used for rendering by 707), the audio scene comprising an acoustic environment (the output of 707 produces a rendered audio scene of an acoustic environment comprising virtual/effective objects/elements that are defined by the inputs to block 705, noting para. 87: ‘The audio components are typically rendered to provide a spatial experience to the user and may for example include audio channels, audio objects and/or audio scene objects’); 

determining effective audio element information indicative of effective audio element positions of the one or more effective audio elements from the description of the audio scene (para. 89: receiver 705 may further be arranged to provide position data to the renderer 707 for the audio components),
 wherein the effective audio element information comprises information indicative of respective sound radiation patterns of the one or more effective audio elements (the audio objects can include ambience sources including reverberation as per para. 146, where reverberation is a defined sound radiation pattern);
obtaining a rendering mode indication from the audio data, wherein the rendering mode indication is indicative of whether the one or more effective audio elements represent a sound field obtained from pre-rendered audio elements and should be rendered using a predetermined rendering mode (the rendering mode is part of AUD DTA in fig. 7 as the rendering algorithm that may be provided as part of the input stream, with the audio data, and is predetermined, and indicates that one or more effective audio elements represent a sound field obtained from pre-rendered audio elements and should be rendered using a predetermined rendering mode, where the algorithm defines the predetermined rendering mode ); and

wherein rendering the one or more effective audio elements using the predetermined rendering mode takes into account the effective audio element information (signals from 705 to 707), and the information indicative of the respective sound radiation patterns of the one or more effective audio elements (the reverberation cited above), and wherein the predetermined rendering mode defines a predetermined configuration (configuration defined by the particular output channels from 707) of the rendering tools (the mechanisms by which the rendering algorithm is combined with the outputs from 705 and converted to individual channels to be output from 707 are the respective rendering tools) for controlling an impact of the acoustic environment of the audio scene on the rendering output (the impact of the acoustic environment is controlled by each of the above cited signals).

However, De Bruijn Fig. 7 does not disclose that the DTA signals come from a received bitstream, or decoding a description of an audio scene and a rendering mode from the bitstream.

De Bruijn Fig. 5 discloses an encoding and decoding system and teaches that the audio is transmitted from the encoder to the decoder via a high-rate bitstream.  It would have been obvious to one skilled in the art to implement a high-rate bitstream for the purpose of providing a well-known high-speed connection to transport audio data between the encoder and decoder.


However, De Bruijn does not disclose or teach the predetermined configuration of the rendering tools corresponding to distance attenuation modeling.
Mahabub discloses a 3D spatial audio system and teaches to apply sound attenuation modelling in accordance with respective distances between a listener position and the effective audio element positions of the one or more effective audio elements (distance processing, para. 124, comprises a function as described in paras. 125-126, which is sound attenuation modelling based on the distance between the listener and a sound source).  It would have been obvious to one skilled in the art that the rendering of sources/objects via the rendering tools, relative to the listener of De Bruijn, could comprise (i.e., ‘be configured to correspond to’) attenuation modelling (i.e., ‘distance attenuation modeling’) for the purpose of providing a more realistic experience for the listener, as the modelling is based on the realistic propagation of waves in 3D space.
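
For illustration only, distance attenuation modelling of the general kind discussed above can be sketched as follows. This is a minimal sketch assuming a simple inverse-distance (1/r) gain law; it is not represented to be the function of Mahabub paras. 125-126 or the claimed method, and the names distance_gain, attenuate and ref_distance are hypothetical.

```python
import math


def distance_gain(listener_pos, source_pos, ref_distance=1.0):
    """Linear gain under an assumed inverse-distance (1/r) law.

    listener_pos and source_pos are (x, y, z) tuples in the same units
    as ref_distance. All names here are hypothetical.
    """
    d = math.dist(listener_pos, source_pos)
    # Clamp so that sources closer than ref_distance are not amplified.
    return ref_distance / max(d, ref_distance)


def attenuate(samples, listener_pos, source_pos, ref_distance=1.0):
    """Scale a list of audio samples by the distance-dependent gain."""
    g = distance_gain(listener_pos, source_pos, ref_distance)
    return [g * s for s in samples]
```

Under these assumptions, a source twice the reference distance from the listener is rendered at half amplitude, and a source inside the reference distance is left unchanged.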


As per claim 9, De Bruijn discloses a method of encoding audio scene content into a bitstream, the method comprising: 
receiving a description of an audio scene (Fig. 5 3D audio encoder receiving scene based input), the audio scene comprising an acoustic environment and one or more audio elements at respective audio element positions (the corresponding objects in the scene as per the object based data); 

the one or more effective audio elements being determined by rendering the one or more audio elements to a reference position in the acoustic environment using a first rendering function (as per the claim 1 rejection, the function of the rendering tools on a particular audio source/object to a position as described in the claim 1 rejection, is the reference position),
 thereby obtaining a reference sound field at the reference position (the rendered source at a particular position creates a reference sound field),
wherein the first rendering function takes into account acoustic elements in the acoustic environment (the sources/objects/elements at a particular position as well as distance attenuation between the audio element positions and the reference positions as per the claim 1 rejection);
and determining, based on the reference sound field at the reference position (a rendered object at a position), the one or more effective audio elements at the respective effective audio element positions in the acoustic environment (the object is created by rendering a particular number of audio channels as shown in De Bruijn Fig. 7, channels 703),
in such manner that rendering the effective audio elements to the reference position using a second rendering function would yield a sound field at the reference position that approximates the reference sound field (the rendering tools of the claim 1 rejection function to render multiple objects/effective audio elements at various positions/reference positions in order to create a sound field at the reference position of each position of each object which is an approximation of the reference sound field specified by the audio data),
wherein the second rendering function takes into account distance attenuation between the effective audio element positions and the reference position (the distance based attenuation in the claim 1 rejection is applied to all rendering functions including the second rendering function, where the reference position is the desired location of each object or the listener position relative to the rendered objects), but does not take into account the acoustic elements in the acoustic environment (the rendering of one particular object/element in the environment is independent of other objects/elements in other positions; as such, the rendering of one particular element does not take into account other acoustic elements in the acoustic environment);
 generating effective audio element information indicative of the effective audio element positions of the one or more effective audio elements, wherein the effective audio element information is generated to comprise information indicative of respective sound radiation patterns of the one or more effective audio elements (the corresponding parameters in the encoder used to produce the information indicative of respective sound radiation patterns cited in the claim 1 rejection) ; 
generating a rendering mode indication that indicates that the one or more effective audio elements represent a sound field obtained from pre-rendered audio elements and should be rendered using a predetermined rendering mode that defines a predetermined configuration of rendering tools of a decoder for controlling an impact of the acoustic environment on the rendering output at the decoder (the corresponding parameters in the encoder used to produce the rendering mode indication of respective sound radiation patterns cited in the claim 1 rejection); and 
encoding the one or more audio elements, the audio element positions, the one or more effective audio elements, the effective audio element information, and the rendering mode indication  


As per claims 2 and 10, the method according to claim 1, further comprising: 

obtaining listener position information indicative of a position of a listener's head in the acoustic environment (para. 90: the rendering is selected based on the loudspeaker positions, where the loudspeaker positions are listener position information as they are arranged by the listener in order to listen at a particular location, the ‘sweet spot’) and/or listener orientation information indicative of an orientation of the listener's head in the acoustic environment, 
wherein rendering the one or more effective audio elements using the predetermined rendering mode further takes into account the listener position information and/or listener orientation information (para. 90: the rendering is selected based on the loudspeaker positions, where the loudspeaker positions are listener position information as they are arranged by the listener in order to listen at a particular location; see also the listener position in para. 94).

As per claim 11, the objects received as part of the AUD DTA cited in the claim 1 rejection require that at least two effective audio elements are generated and encoded into the bitstream; and wherein the rendering mode indication indicates a respective predetermined rendering mode for each of the at least two effective audio elements (the rendering mode can be respective to each element and received from the encoder as per the claim 1 rejection).

As per claim 12, the method according to claim 9, further comprising: obtaining listener position area information indicative of a listener position area for which the predetermined rendering mode shall be used (para. 90: the rendering is selected based on the loudspeaker positions, where the loudspeaker positions are listener position information as they are arranged by the listener in order to listen at a particular location, the ‘sweet spot’); and encoding the listener position area information into the bitstream (it must be encoded into the bitstream by the encoder in order to receive the rendering mode indication at the decoder).

As per claim 13, the method according to claim 12, wherein the predetermined rendering mode indicated by the rendering mode indication depends on the listener position so that the rendering mode indication indicates a respective predetermined rendering mode for each of a plurality of listener positions (para. 90: the rendering is selected based on the loudspeaker positions, where the loudspeaker positions are listener position information as they are arranged by the listener in order to listen at a particular location, the ‘sweet spot’; additionally, there can be multiple different modes for different elements as per the claim 1 and 11 rejections).

As per claim 14, an audio decoder comprising a processor coupled to a memory storing instructions for the processor, wherein the processor is adapted to perform the method according to claim 1 (the method of the claim 1 rejection requires a processor in an audio decoder to perform the cited functions).
As per claim 15, a computer program including instructions for causing a processor that carries out the instructions to perform the method according to claim 1 is required to support the processor as per the claim 14 rejection.
As per claim 16 (cancelled), a computer-readable storage medium storing the computer program according to claim 15 is required in order to implement the processor of the claim 14 rejection.




As per claim 3, De Bruijn discloses the method according to claim 1, but does not specify that rendering the one or more effective audio elements using the predetermined rendering mode applies sound attenuation modelling in accordance with respective distances between a listener position and the effective audio element positions of the one or more effective audio elements (distance based attenuation).
Mahabub discloses a 3D spatial audio system and teaches to apply sound attenuation modelling in accordance with respective distances between a listener position and the effective audio element positions of the one or more effective audio elements (distance processing, para. 124, comprises a function as described in paras. 125-126, which is sound attenuation modelling based on the distance between the listener and a sound source).  It would have been obvious to one skilled in the art that the rendering of sources/objects relative to the listener of De Bruijn could comprise attenuation modelling for the purpose of providing a more realistic experience for the listener, as the modelling is based on the realistic propagation of waves in 3D space.

As per claim 4, the method according to any one of claims 1 to 3, 
wherein at least two effective audio elements are determined from the description of the audio scene (outputs of 705 and outputs of 707 /effective audio elements are based on the description of audio scene described in the DTA inputs to 705); 
(De Bruijn: ‘The render controller 709 may specifically divide the loudspeakers 703 into a number of subsets and independently select the rendering mode for each of these subsets depending on the position of the loudspeakers 703 in the subset’);
wherein the method comprises rendering the at least two effective audio elements using their respective predetermined rendering modes (rendered with the configuration specified as per para. 97); and 
wherein rendering each effective audio element using its respective predetermined rendering mode takes into account the effective audio element information for that effective audio element (para. 96, the locations of the loudspeakers affects the rendering modes), and 
wherein the rendering mode for that effective audio element defines a respective predetermined configuration of the rendering tools (the mechanism for mapping the audio to various loudspeaker channels output from 707, based on the rendering mode determined from the loudspeaker configuration, defines a respective predetermined configuration of the rendering tools used to render the audio to speakers 703) for controlling the impact of the acoustic environment of the audio scene on the rendering output for that effective audio element (the impact of the environment is controlled per the above steps in addition to those specified in the claim 1 rejection).

As per claim 5, the method according to claim 1, further comprising: 
determining one or more audio elements from the description of the audio scene (as per the claim 4 rejection); 

rendering the one or more audio elements using a rendering mode for the one or more audio elements that is different from the predetermined rendering mode used for the one or more effective audio elements (different channels/ elements can have different predetermined rendering modes applied as per the claim 4 rejection), 
wherein rendering the one or more audio elements using the rendering mode for the one or more audio elements takes into account the audio element information (the loudspeaker position is the audio element information that is used to determine the rendering modes as per the claim 4 rejection), 
the rendering mode defining a configuration of the rendering tools, the configuration of the rendering tools corresponding to acoustic rendering (as described in the claim 4 rejection).

As per claim 6, the method according to claim 5, further comprising: obtaining listener position area information indicative of a listener position area for which the predetermined rendering mode shall be used (the listener position information defined with each rendering mode/loudspeaker configuration as per the claim 2 rejection).
As per claim 7, the method according to claim 6, wherein the predetermined rendering mode indicated by the rendering mode indication depends on the listener position (as per the claim 2 and 6 rejections, via the sweet spot designated for a particular rendering mode or loudspeaker configuration); and 
wherein the method comprises rendering the one or more effective audio elements using that predetermined rendering mode that is indicated by the rendering mode indication for the listener position area indicated by the listener position area information (as per the claim 4 and 5 rejections).



Claims 8 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over De Bruijn et al. (US 20160080886 A1) in view of Mahabub et al. (US 20120213375 A1).

As per claim 8, De Bruijn discloses a method of generating audio scene content, the method comprising: 
obtaining by sound capture one or more audio elements representing captured signals from an audio scene, the audio scene comprising an acoustic environment (Fig. 7, input AUD DTA signal which is captured by stage 705); 
obtaining effective audio element information indicative of effective audio element positions of one or more effective audio elements to be generated (as per the claim 1 rejection), 
the effective audio positions being estimated or received as user input (De Bruijn teaches that “In some cases these methods require placing a microphone to the desired listening position in order to capture the reproduced sound field”, para. 125, where it would have been obvious to one skilled in the art that the audio positions, as part of the reproduced sound field, could be captured by user input/placing a microphone for the purpose of obtaining the sound field to be reproduced);

wherein the effective audio element information comprises information indicative of respective sound radiation patterns of the one or more effective audio elements (as per the claim 1 rejection); and 
determining the one or more effective audio elements (output of 705 and/or output of 707) from the one or more audio elements representing the captured signals (inputs to 705).  

However, De Bruijn does not disclose that the effective audio elements are determined by application of sound attenuation modelling according to distances between a position at which the captured signals have been captured and the effective audio element positions of the one or more effective audio elements.

Mahabub discloses a 3D spatial audio system and teaches to apply sound attenuation modelling in accordance with respective distances between a listener position and the effective audio element positions of the one or more effective audio elements (distance processing, para. 124, comprises a function as described in paras. 125-126, which is sound attenuation modelling based on the distance between the listener and a sound source).  It would have been obvious to one skilled in the art that the rendering of sources/objects relative to the listener of De Bruijn could comprise attenuation modelling for the purpose of providing a more realistic experience for the listener, as the modelling is based on the realistic propagation of waves in 3D space.

As per claim 17 (cancelled), the method of the claim 8 rejection requires an audio encoder comprising a processor coupled to a memory storing instructions for the processor, wherein the processor is adapted to perform the method according to claim 8.
As per claim 18, the processor of the claim 1 rejection requires a computer program including instructions for causing a processor that carries out the instructions to perform the method according to claim 8.
As per claim 19 (cancelled), the processor of the claim 1 rejection requires a computer-readable storage medium storing the computer program according to claim 18 in order to be implemented.
As per claim 20, the processor of the claim 1 rejection requires a computer-readable storage medium storing the computer program in order to implement the method of claim 9.

Response to Arguments

The submitted arguments have been considered but are moot in view of the new grounds of rejection.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

	

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER KRZYSTAN, whose telephone number is 571-272-7498 and whose email address is alexander.krzystan@uspto.gov.

The examiner can usually be reached M-F, 7:30-4:00 EST.
If attempts to reach the examiner by telephone or email are unsuccessful, the examiner’s supervisor, Fan Tsang, can be reached at (571) 272-7547.  

The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications.

Examiner Alexander Krzystan
January 20, 2022