DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Examiner’s Comments

‘Assumed’ as recited in claim 23 is read as a function performed by the processor of claim 17.
The examiner notes applicant’s remarks defining three common, well-known input types to MPEG 3D audio codecs.

	

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 14, 15, and 17, and their respective dependent claims, are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

As per claims 1 and 17, it is not clear how to read ‘audio signal’ as used in the claims. The claims recite ‘rendering an audio signal’ and then recite that the audio signal includes first and second elements that can be simultaneously rendered. It is not clear which point in the signal processing chain the ‘audio signal’ refers to. Further, it is not clear how a system would render audio objects/first elements and also audio channels/second elements at the same time, because the objects are created via the channels; as such, the rendered signals are not distinct from one another and should not be claimed as such. The terms ‘audio signal’, ‘first element’ and ‘second element’ should be clearly and consistently recited throughout the claims in a manner that is mappable to the signal processing stages disclosed in applicant’s figures 1 and 2, with different names used for different signals. For example, page 8 of applicant’s specification states that figure 1 shows encoding an audio signal and figure 2 shows decoding an audio signal. It appears that the ‘audio signal’ as described by the specification can be any signal at any point in either the encoder or the decoder. This is further confused by the fact that the audio signal is recited to comprise first and second element signals.

As per claim 14, it is not clear how the processor renders the same second element signal two times.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 12, 16, 17, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Breebaart (US 10278000 B2) in view of Johnson et al. (US 9558785 B2).

As per claim 1, Breebaart discloses an audio signal processing device 140 rendering an audio signal including a first element signal (the channel signals derived in 140 for each loudspeaker shown in figs. 2, 4a, 4b), the device comprising a processor (140 requires a processor to perform the disclosed functions) for obtaining metadata (from 110) including the audio signal (comprised of at least the audio objects) and first element reference distance information indicating the reference distance of the first element signal (the reference speaker layout, where the layout is specified by positions, where each coordinate is a distance along a defined axis as shown in figs. 2, 4a, 4b via the axis; para. DETX 26: cluster positions are determined based on object positions of the audio objects and a reference speaker layout; DETX 28: In some embodiments, the cluster positions and the speaker positions of the reference speaker layout may be represented in the same coordinate system); and
rendering the first element signal on the basis of the first element reference distance information (para. DETX 26: rendering, it is desired to perform audio object clustering first. In some example embodiments, the audio objects have associated metadata describing their spatial information such as the positions; DETX 26: cluster positions are determined based on object positions of the audio objects and a reference speaker layout),
wherein: the audio signal is able to include a second element signal (one of the objects originating from 110 as output by 130)  which is able to be simultaneously rendered with the first element signal (the objects are produced by stage 140 producing driving/channel signals as per para. DETX 16: rendering system 140 used to render the cluster signals to speakers included in an audio playback system, where the channels and objects are rendered simultaneously since the objects require the channels in order to be rendered); 
the metadata is able to include second element distance information indicating the distance of the second element signal (either of the cluster or object positions, where a position in the coordinate system comprises a set of distances/distance information along each axis, noting the coordinate systems as shown in figs. 2,4a,4b, where a particular coordinate for an audio object is second element distance information);   


However, Breebaart does not disclose that:
the number of bits required for representing the first element reference distance information is smaller than the number of bits required for representing the second element distance information; and 
a set of reference distances which is able to be represented by the first element reference distance information is a subset of a set of distances which is able to be represented by the second element distance information.


Johnson discloses an audio coding system and teaches to implement an extension audio layer (345A, fig. 3) that can comprise additional channels (para. 22) with a higher number of bits relative to the core audio layer (para. 22: add resolution detail (e.g., the base layer may be a 16-bit stream and the enhancement layer may include data to result in a 24-bit stream)), and further teaches that this can transform the base layer from a basic or partial audio track into a richer, more detailed audio track (para. 21). It would have been obvious to one skilled in the art to implement an additional layer with additional channels with greater resolution and bit depth for the purpose of transforming the base layer from a basic or partial audio track into a richer, more detailed audio track.
When implemented in the system of Breebaart, the objects/second elements in 110 are processed with the additional channel as taught by Johnson, such that the increased resolution and bit depth of the additional channel, as applied to the objects in block 110 of fig. 1 and/or the objects in 70 of fig. 7 of Breebaart, extends to the object coordinates/distances, which are likewise represented at increased resolution. The subsequent clustering and rendering stages 610, 620, 630 of fig. 7 and system 140 of fig. 1 operate at the base layer, including the reference speaker layout and associated speaker positions/distances.
As such:
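the number of bits required for representing the first element reference distance information (reference speaker layout in the XY coordinate system at the base layer) is smaller than the number of bits required for representing the second element distance information (the object position/distance in the XY coordinate system); and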
a set of reference distances which is able to be represented by the first element reference distance information is a subset of a set of distances which is able to be represented by the second element distance information (the second element/object position/distance information is at a higher resolution and bit depth than the reference speaker positions/distances, where the reduced bit depth/resolution of the first element reference distance provides for a subset of the possible reference distance values provided for by the increased bit depth of the second element object positions/distances).
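
The relationship between bit depth and the set of representable distances relied on above can be illustrated with a minimal sketch. It is not taken from Breebaart or Johnson: the function name, the 16-unit range, and the uniform quantization grid are assumptions chosen for illustration, and the subset relation shown holds under that assumed grid.

```python
def representable_distances(num_bits, max_distance=16.0):
    """Return the set of distances an unsigned num_bits code can represent.

    Assumes (hypothetically) a uniform grid over [0, max_distance); neither
    cited reference specifies a particular quantization scheme.
    """
    levels = 2 ** num_bits            # n bits give 2**n distinct code values
    step = max_distance / levels      # spacing between adjacent distances
    return {code * step for code in range(levels)}


# Lower bit depth, e.g. the first element reference distance field.
coarse = representable_distances(4)
# Higher bit depth, e.g. the second element distance field.
fine = representable_distances(8)

print(len(coarse), len(fine))   # number of representable values per field
print(coarse <= fine)           # True when every coarse value is also a fine value
```

Under these assumptions the 4-bit field represents 16 distances and the 8-bit field represents 256, and every distance on the coarse grid also lies on the fine grid.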


Further, it would have been obvious to one skilled in the art that the teachings throughout Breebaart figs. 4-7 could be applied to the prior art system of fig. 1 for the purpose of completing and enhancing the audio system with the inventive elements taught throughout Breebaart.

As per claim 17, the claim 1 rejection discloses an audio coding system where the system comprises:
an audio signal processing device encoding an audio signal including a first element signal (para. 58: the cluster signals may be stored for future use, or may be input to an encoder or translation process. In some other examples, the (encoded/translated) cluster signals may be transmitted to rendering systems. The cluster positions may be used as part of metadata of the cluster signals, so as to facilitate the subsequent rendering),


 wherein: the audio signal is able to include a second element signal and the metadata is able to include second element distance information indicating the distance of the second element signal, (second element and associated distance information is part of the cluster signals and metadata encoded by the encoder in order to be received and processed as the second element and associated distance information at the decoder/block 140 as per the claim 1 rejection); 

the number of bits used for indicating the first element reference distance information is smaller than the number of bits used for indicating the second element distance information (as per the claim 1 rejection),
 and a set of reference distances which is able to be represented by the first element reference distance information is a subset of a set of distances which is able to be represented by the second element distance information (as per the claim 1 rejection).


As per claims 12 and 25, the audio signal processing device of claim 1, wherein the first element signal is a channel signal, and the second element signal is an object signal, as per the claim 1 rejection.
As per claim 16, the audio signal processing device of claim 1, wherein the processor renders the second element signal on the basis of the first element reference distance information (the channels and objects are rendered together based on both the first and second element distance/reference distance information since they are rendered together via channel signals to individual loudspeakers).


Claims 2, 3, 4, 8, 9, 10, 11, 13, 14, 15, 18, 19, 20, 24, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Breebaart (US 10278000 B2) in view of Johnson et al. (US 9558785 B2) as applied to claim 1, and further in view of Squires et al. (US 20150230040 A1).


As per claims 2,18, Breebaart and Johnson disclose the audio signal processing device of claim 1, but do not specify wherein the first element reference distance information indicates the reference distance of the first element signal using an exponential function for each loudspeaker location.
Squires discloses an audio coding system and teaches that virtual loudspeakers can be implemented via exponential interpolation, which is an exponential function (para. 116). It would have been obvious to one skilled in the art that the loudspeaker location/distance, i.e., the first element reference distance, could be determined via an exponential function/interpolation for the purpose of representing virtual loudspeakers.

	As per claims 3,19, all parameters of the functions of the exponential interpolation need to be determined noting the functions in Squires para. 116 and 117, where the first element reference distance/position information is the means to determine all the parameters in the exponential interpolation to determine the first element distances/positions.

As per claims 4 and 20, it would have been obvious to one skilled in the art that the bit depths could be set as a matter of design choice in order to vary the audio quality (increased bit depth) based on the available processing resources (greater bit depth requires more processing resources).

As per claims 8 and 24, the audio signal processing device of claim 1, wherein the minimum reference distance which is able to be indicated by the first element reference distance information is a predetermined positive number greater than 0 (an interpolation cannot indicate 0, and all of the loudspeakers are on the positive axes as shown in fig. 4a of Breebaart).

As per claim 9, the audio signal processing device of claim 1, wherein: the audio signal including the first element signal includes the second element signal (the channel signals comprise the objects when rendered); and the processor renders the first element signal and the second element signal, simultaneously (the channel signals are used to produce the object signals, as such they must be rendered simultaneously).
As per claim 10, the audio signal processing device of claim 9, wherein the processor adjusts, on the basis of the first element reference distance information, the loudness of a sound output in which the first element signal is rendered, and adjusts, on the basis of the ... (the method also includes determining object-to-cluster gains based on the determined cluster positions, the object positions and the reference speaker layout, per the abstract of Breebaart, where the object-to-cluster gains directly affect the loudness of the output to each speaker in the reference speaker layout).

As per claim 11, the audio signal processing device of claim 9, wherein the processor applies a delay to the first element signal on the basis of the first element reference distance information, and applies a delay to the second element signal on the basis of the second element distance information (Squires teaches to include a delay to each loudspeaker feed/channel as per para. 40, where a delay added to a channel is also a delay added to an object associated with said channel, and where the delay is based on the relative locations/distances of the virtual sources/objects and the loudspeakers (para. 41); it would have been obvious to implement the delays for the purpose of rendering spatial information included in the metadata).
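
A distance-based delay of the kind discussed for claim 11 can be sketched as follows. This is a generic illustration, not the method of Squires or of the application: the function name, the 343 m/s speed of sound, and the 48 kHz sample rate are assumptions made for the example.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: approximate speed of sound in air


def propagation_delay_samples(distance_m, sample_rate_hz=48000):
    """Convert a source distance in meters to a delay in whole samples.

    Hypothetical helper: delay = distance / speed of sound, scaled by the
    sample rate and rounded to the nearest sample.
    """
    delay_seconds = distance_m / SPEED_OF_SOUND_M_PER_S
    return round(delay_seconds * sample_rate_hz)


# A channel at its reference distance and an object at its own distance
# each receive a delay derived from their respective distance values.
channel_delay = propagation_delay_samples(3.43)
object_delay = propagation_delay_samples(6.86)

print(channel_delay, object_delay)
```

In this sketch a signal at twice the distance receives twice the delay, so the delay applied to each element depends only on that element's own distance information.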

As per claims 13,26, Breebaart and Johnson disclose the audio signal processing device of claim 1 where the second element signal is an object signal, but do not specify wherein the first element signal is an ambisonics signal.
Squires teaches the well-known format of ambisonics to generate a 3D sound field (para. 63). It would have been obvious to one skilled in the art to implement the channels/first element in ambisonics format for the purpose of creating a 3D sound field using a well-known signaling protocol.

As per claim 14, the audio signal processing device of claim 1, wherein: the first element signal is a channel signal as per the claim 12 rejection;
the audio signal further includes an ambisonics signal/second element as per the claim 13 rejection; 
and the processor renders the channel signal and the ambisonics signal on the basis of the reference distance of the first element signal (rendered as per the claim 1 and 13 rejections, where the ambisonics signals are rendered via the loudspeaker feeds cited in the claim 11 rejection, where the first element and second element signals are rendered based on both the reference distance of the first element signal and the reference distance of the second element signal, defined as per the claim 1 rejection).

	
As per claim 15, the audio signal processing device of claim 1, wherein: 
the first element signal is a channel signal (per the claim 1 rejection); 
the audio signal further includes an ambisonics/second element signal (per the claim 13 and 14 rejections);
the metadata includes channel reference distance information indicating the reference distance of the channel signal and ambisonics reference distance information indicating the reference distance of the ambisonics signal/second element signal (the reference loudspeaker layout and loudspeaker positions/distances of the claim 1 rejection are the channel reference distance information and ambisonics reference distance information in the metadata as the ambisonics is implemented via the channels); 
and the processor renders the channel signal on the basis of the channel reference distance information and renders the ambisonics signal on the basis of the ambisonics reference distance information.

Allowable Subject Matter

Claims 5, 6, 7, 21, 22, and 23 are objected to as being dependent upon a rejected base claim, but would be allowable over the prior art of record if rewritten in independent form including all of the limitations of the base claim and any intervening claims, assuming the 112 rejection is overcome and the signaling is recited in a manner clearly mappable to the signals defined in the system shown in applicant’s figures 1 and 2.

Response to Arguments

The submitted arguments have been considered but are moot in view of the new grounds of rejection.
As per applicant’s argument that Johnson only teaches to provide the same element signal as the base layer, the examiner notes the explanation analyzing the combination of the primary reference with the teachings of Johnson. The following is recited in the above claim 1 rejection and includes mappings to the first and second elements and their respective numbers of bits/bit resolution:
the number of bits required for representing the first element reference distance information (reference speaker layout in the XY coordinate system at the base layer) is smaller than the number of bits required for representing the second element distance information (the object position/distance in the xy coordinate system); and 


As per applicant’s additional arguments about Johnson, applicant does not appear to be considering the combination of references as described in the claim rejection.

As per applicant’s argument that Johnson does not disclose a set or subset of reference distances, the bit resolution representing the reference distance information is by definition able to represent a range of distances (a set, or subset) for a given number of bits, because the number of bits corresponds to the number of values that can be represented.


Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
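A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.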

	


Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER KRZYSTAN, whose telephone number is 571-272-7498 and whose email address is alexander.krzystan@uspto.gov.

The examiner can usually be reached Monday through Friday, 7:30-4:00 EST.
If attempts to reach the examiner by telephone or email are unsuccessful, the examiner’s supervisor, Fan Tsang, can be reached at (571) 272-7547.

The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications.
/ALEXANDER KRZYSTAN/ Primary Examiner, Art Unit 2653
Examiner Alexander Krzystan
January 21, 2022