DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 26-45 are rejected under 35 U.S.C. 103 as being unpatentable over Metcalf (US 7636448 B2) in view of Arteaga et al. (US 20180061072 A1).

As per claim 26, Metcalf discloses a method comprising: 
defining for at least one time period (the system of fig. 1 is a digital system based on parameters; as such, each indication and function is relative to a defined period of time relative to a master/system clock) at least one contextual grouping (the group of audio objects as macro objects to be rendered by the farfield rendering engines, Col 3 line 57 to Col 4 line 10) comprising at least two of a plurality of audio objects (multiple audio objects are disclosed) and at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (the micro objects to be rendered by the nearfield rendering engine), the plurality of audio objects within at least one audio scene (the reproduced 3D sound events, Col 3 lines 20-25); and 
defining with respect to the at least one contextual grouping at least one first parameter and/or parameter rule type (Col 10 lines 15-25, the object volume for a macro object in the macro domain is a parameter rule type) which is configured to be applied with respect to a common element (the audio sources/micro objects used in the macro object) associated with the at least two of the plurality of audio objects (the macro object comprises a combination of all source attributes from the micro objects, which are a plurality of audio objects), and 
wherein the at least one first parameter and/or parameter rule type is configured to be applied with respect to individual element associated with the at least one further audio object outside of the at least one contextual grouping (the respective object volume of a micro object is applied to each micro object as the micro object volume, where the individual element is a micro object in a micro domain, which is outside the contextual grouping of the macro object since it is in a micro domain and hence in a different contextual grouping), 
the at least one first parameter and/or parameter rule type being applied in audio rendering of both the at least two of the plurality of audio objects (Col 10 lines 10-50, the rendering of the macro object in the macro domain using object volume) and the at least one further audio object (the object volume parameter rule type applied in rendering the micro object in the micro domain which is the individual element).

While Metcalf discloses that virtual audio objects can be rendered (para. 21), Metcalf does not disclose the audio rendering is a six degrees of freedom free-viewpoint audio rendering.
Arteaga discloses an audio rendering system and teaches that the system can render virtual audio objects in an audio scene (virtual model of the spatialized audio sources, para. 23) using 6DOF rendering (para. 65, the model is based on 6DOF, which is a free-viewpoint system) in order to create a VR-based audio scene (para. 23, the virtual model of spatialized audio sources). It would have been obvious to one skilled in the art that the virtual objects of Metcalf could be rendered in a well-known format, including a 6DOF format, for the purpose of implementing a VR-based audio scene for the listener.

As per claim 34, the claim 26 rejection discloses a method for rendering audio signals associated with a plurality of audio objects within at least one audio scene (as per the claim 26 rejection), the method comprising: 
determining (the defined parameters are determined by the processor 120, 130, 140, 150 in fig. 1 when rendering the audio as per the claim 26 rejection) for at least one time period at least one contextual grouping comprising at least two of the plurality of audio objects and at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (as per the claim 26 rejection); and 
determining with respect to the at least one contextual grouping at least one first parameter and/or parameter rule type (the parameter rule type object volume as per the claim 26 rejection as determined by the processor 120,130,140,150 in fig. 1);
 determining at least one common element with respect to the at least one contextual grouping (the micro objects used in a given macro object as per the claim 26 rejection as determined by the processor 120,130,140,150 in fig. 1); 
determining an individual element with respect to the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (the individual micro objects rendered in the micro domain as per the claim 26 rejection as determined by the processor 120,130,140,150 in fig. 1); 
rendering audio signals associated with the at least two of the plurality of audio objects by applying the at least one first parameter and/or parameter rule type with respect to the common element to audio signals associated with the at least two of the plurality of audio objects (as per the claim 26 rejection); 
rendering audio signals associated with the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping by applying the at least one first parameter and/or parameter rule type with respect to the individual element to audio signals associated with the at least one further audio object (as per the claim 26 rejection); and 
combining the rendering of audio signals associated with the at least two of the plurality of audio objects with the rendering of audio signals associated with the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (the examiner notes that ‘rendering’ as used here is not being used in its traditional meaning, where rendering usually means that the processed sounds are output to speakers; in this case the examiner reads rendering as processing the input audio to produce signals that are then to be translated into a set of signals to drive groups of speakers; as such, the separately rendered models are combined as per step 50 of fig. 1, as multiple models can be combined to output from a common speaker cluster).
Where the audio rendering is a six degrees of freedom free-viewpoint audio rendering (as per the claim 26 rejection).
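
For illustration only, the claim 34 steps as mapped above can be sketched in code. This is a minimal sketch under stated assumptions and is not taken from the application, from Metcalf, or from Arteaga: the names AudioObject, distance_gain and render are hypothetical, the parameter rule is assumed to be a simple inverse-distance gain, the common element is assumed to be a single shared position, and listener orientation (part of a full six degrees of freedom rendering) is omitted.

```python
from dataclasses import dataclass
from math import dist

@dataclass
class AudioObject:
    name: str
    position: tuple  # (x, y, z) in scene coordinates
    signal: list     # mono samples

def distance_gain(source_pos, listener_pos):
    # Hypothetical parameter rule: inverse-distance attenuation, clamped to 1.0.
    return 1.0 / max(1.0, dist(source_pos, listener_pos))

def render(objects, group_names, common_position, listener_pos):
    # Grouped objects use the group's common position as the reference;
    # objects outside the group use their own individual position.
    mix = [0.0] * len(objects[0].signal)
    for obj in objects:
        ref = common_position if obj.name in group_names else obj.position
        gain = distance_gain(ref, listener_pos)
        for i, sample in enumerate(obj.signal):
            mix[i] += gain * sample  # combine the separate renderings
    return mix

objects = [
    AudioObject("a", (1.0, 0.0, 0.0), [1.0, 1.0]),
    AudioObject("b", (3.0, 0.0, 0.0), [1.0, 1.0]),
    AudioObject("c", (4.0, 0.0, 0.0), [1.0, 1.0]),
]
# "a" and "b" form the contextual grouping; "c" is the further audio object.
mix = render(objects, {"a", "b"}, (2.0, 0.0, 0.0), (0.0, 0.0, 0.0))
```

In this sketch the same rule is applied to every object; only the element it is evaluated against differs between grouped and ungrouped objects.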

As per claim 38, the method of the claim 26 and 34 rejections is performed by: an apparatus for audio signal processing audio objects within at least one audio scene, the apparatus comprising at least one processor (Fig. 1, 120, 130, 140, 150) which requires a memory storing program code configured to, upon execution, perform the recited functions including: 
define for at least one time period at least one contextual grouping comprising at least two of a plurality of audio objects and at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping, the plurality of audio objects within at least one audio scene (as per the claim 26 and 34 rejections); and 
define with respect to the at least one contextual grouping at least one first parameter and/or parameter rule type which is configured to be applied with respect to a common element associated with the at least two of the plurality of audio objects and wherein the at least one first parameter and/or parameter rule type is configured to be applied with respect to individual element associated with the at least one further audio object outside of the at least one contextual grouping, the at least one first parameter and/or parameter rule type being applied in audio rendering of both the at least two of the plurality of audio objects and the at least one further audio object  (as per the claim 26 and 34 rejections).
Where the audio rendering is a six degrees of freedom free-viewpoint audio rendering (as per the claim 26 rejection).


As per claim 42, the claim 26, 34 and 38 rejections disclose: an apparatus for rendering audio signals associated with a plurality of audio objects within at least one audio scene, the apparatus comprising at least one processor and a memory storing program code, the at least one processor is configured, upon execution of the program code, to: 

determine for at least one time period at least one contextual grouping comprising at least two of the plurality of audio objects and at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (as per the claim 34 rejection); 
and determine with respect to the at least one contextual grouping at least one first parameter and/or parameter rule type (as per the claim 34 rejection); 
determine at least one common element with respect to the at least one contextual grouping (as per the claim 34 rejection); 
determine an individual element with respect to the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (as per the claim 34 rejection); 
render audio signals associated with the at least two of the plurality of audio objects by applying the at least one first parameter and/or parameter rule type with respect to the common element to audio signals associated with the at least two of the plurality of audio objects (as per the claim 34 rejection); 
render audio signals associated with the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping by being configured to apply the at least one first parameter and/or parameter rule type with respect to the individual element to audio signals associated with the at least one further audio object (as per the claim 34 rejection); and 
combine the rendered audio signals associated with the at least two of the plurality of audio objects with the rendered audio signals associated with the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (as per the claim 34 rejection).
Where the audio rendering is a six degrees of freedom free-viewpoint audio rendering (as per the claim 26 rejection).


As per claim 27, the method as claimed in claim 26, further comprising: defining with respect to the at least one contextual grouping at least one second parameter and/or parameter rule type configured to be applied with respect to individual elements associated with the at least two of the plurality of audio objects in audio rendering of the at least two of the plurality of audio objects; and
 defining the at least one second parameter and/or parameter rule type is configured to be applied with respect to individual elements associated with the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping.
(in addition to the Object Volume/first parameter type, the system/method further comprises as per Col 4 lines 55-65, additional parameters that can be applied to both micro and macro objects where any one of the additional parameters is the second parameter rule type).

As per claim 28, the method as claimed in claim 26, further comprising: 
defining for at least one further time period at least one further contextual grouping comprising a further at least two of the plurality of audio objects (there can be multiple macro objects as per the claim 26 rejection, where a secondary macro object would comprise a further plurality of audio objects, where each macro object is defined during the periods of time, including a distinct further time period, that the processor contains the audio objects relative to the master clock required by the processor); and 
defining with respect to the at least one further contextual grouping at least one further first parameter and/or parameter rule type which is configured to be applied with respect to a further common element associated with the further at least two of the plurality of audio objects in audio rendering of the further at least two of the plurality of audio objects (each respective macro object has its own set of respective object volumes/ further first parameters based on the micro objects contained as part of a particular macro object).

As per claim 29, the at least two of the plurality of audio objects and the further at least two of the plurality of audio objects comprises at least one audio object in common (Col 5 lines 1-20, a particular audio event/object can be represented in both a nearfield and a farfield perspective, i.e., in common, where the nearfield perspective comprises a further plurality of audio objects and the farfield perspective comprises a plurality of audio objects).

As per claim 30, The method as claimed in claim 29, further comprising selecting for the at least one object in common one of:
 the at least one parameter and/or parameter rule type, to be applied with respect to the common element associated with the at least two of the plurality of audio objects; or the at least one further parameter and/or parameter rule type, to be applied with respect to the further common element associated with the further at least two of the plurality of audio objects (the nearfield rendering comprises selecting the further parameter rule type while the farfield rendering comprises selecting the at least one parameter rule type as per the perspectives discussed in the claim 29 rejection), 
based on at least one of: a volume determination (the parameter is based on a volume determination for a particular object, as per the cited object volume in the claim 26 rejection); and a prior contextual grouping of the at least one additional contextual grouping and the at least one contextual grouping.

As per claim 31, The method as claimed in claim 26, further comprising defining with respect to the at least one contextual grouping the common element as at least one common position or area (Col 8 lines 20-30, the objects in the contextual grouping are defined relative to common positions/volume focal points serving as a common origin).

As per claim 32, the method as claimed in claim 26, further comprising encoding a downmix of audio signals associated with the at least one contextual grouping (Col 6 lines 1-15 transferred objects may be mixed/encoded prior to articulation) based on at least one of: 
a distance within an audio scene relative to a rendering location (the micro objects in the macro object are defined as a farfield as per the claim 26 rejection, where farfield defines a distance relative to a rendering location); and
 an orientation of the at least one contextual grouping relative to a rendering location.

As per claim 33, the method as claimed in claim 26, further comprising: defining the common element with respect to the at least one contextual grouping (as per the common position in the claim 31 rejection); 
and transmitting and/or storing the defined common element and audio signals associated with the at least two of the plurality of audio objects (the processor must store the common element and all audio signals for the audio objects in order to perform the processing cited in the claim 26 rejection).

As per claim 35, the method as claimed in claim 34, further comprising: determining at least one second parameter and/or parameter rule type (as per the claim 27 rejection); 
rendering audio signals associated with the at least two of the plurality of audio objects and the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping by applying the at least one second parameter and/or parameter rule type with respect to individual elements associated with each audio object to audio signals associated with each audio object (as per the claim 27 rejection); 
and combining rendering audio signals associated with the at least two of the plurality of audio objects and the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping with the combined rendering (as per the claim 34 and 42 rejections).

As per claim 36, the method as claimed in claim 34, further comprising determining the common element as at least one common position or area (as per the claim 31 rejection).

As per claim 37, the method as claimed in claim 34, further comprising determining a downmix of audio signals associated with the at least one contextual grouping (as per the claim 32 rejection), 
wherein applying the at least one first parameter and/or parameter rule type with respect to the common element to audio signals associated with the at least two of the plurality of audio objects comprises applying the at least one first parameter and/or parameter rule type with respect to the common element to the downmix of audio signals associated with the at least two of the plurality of audio objects (the combined/encoded/downmixed signal can be applied to the rendering engines/flow through the processing chain as part of a macro object Col 6 lines 10-20 which applies the first parameter rule type to the audio object).

As per claim 39, the apparatus as claimed in claim 38, the at least one processor further configured to: 
define with respect to the at least one contextual grouping at least one second parameter and/or parameter rule type configured to be applied with respect to individual elements associated with the at least two of the plurality of audio objects in audio rendering of the at least two of the plurality of audio objects (as per the claim 27 rejection); and 

define the at least one second parameter and/or parameter rule type is configured to be applied with respect to individual elements associated with the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping (as per the claim 27 rejection).

As per claim 40, the at least one processor is further configured to: 
define for at least one further time period at least one further contextual grouping comprising a further at least two of the plurality of audio objects; and define with respect to the at least one further contextual grouping at least one further first parameter and/or parameter rule type which is configured to be applied with respect to a further common element associated with the further at least two of the plurality of audio objects in audio rendering of the further at least two of the plurality of audio objects (as per the claim 28 rejection).

As per claim 41, the at least two of the plurality of audio objects and the further at least two of the plurality of audio objects comprises at least one audio object in common (as per the claim 29 rejection).

As per claim 43, the apparatus as claimed in claim 42, wherein the processor is further configured to: 
determine at least one second parameter and/or parameter rule type (as per the claim 27 rejection); 
render audio signals associated with the at least two of the plurality of audio objects and the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping by being configured to apply the at least one second parameter and/or parameter rule type with respect to individual elements associated with each audio object to audio signals associated with each audio object (as per the claim 27 rejection; the second parameter types are used in the rendering functions); and 
combine rendered audio signals associated with the at least two of the plurality of audio objects and the at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping with the combined rendering (as per the claim 34 rejection).

As per claim 44, the apparatus as claimed in claim 42, wherein the processor is further configured to determine the common element as at least one common position or area (as per the claim 36 rejection).

As per claim 45, the apparatus as claimed in claim 42, wherein the processor is further configured to determine a downmix (as per the claim 37 rejection) of audio signals associated with the at least one contextual grouping, wherein the processor configured to apply the at least one first parameter and/or parameter rule type with respect to the common element to audio signals associated with the at least two of the plurality of audio objects is further configured to apply the at least one first parameter and/or parameter rule type with respect to the common element to the downmix of audio signals associated with the at least two of the plurality of audio objects (as per the claim 37 rejection).


Response to Arguments
Applicant's arguments have been fully considered but they are not persuasive. 

As per applicant’s argument that the cited prior art Arteaga does not disclose audio objects being placed based on motion detected by the IMU, the examiner disagrees and notes that the model cited in the outstanding claim 26 rejection, and the virtual model cited in para. 26, is based on the 6DOF motion detected by the IMU, where the virtual model is used to locate spatialized audio sources as per para. 19. Since the audio sources are located based on the virtual model, which is adapted based on a 6DOF detection system, the rendering of the audio objects is a 6DOF rendering.

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
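
For illustration only, the timing rule stated above can be sketched as a small calculation. This is a minimal sketch under stated assumptions, not an official computation: the function names add_months and reply_period_end are hypothetical, the dates are made-up examples, and the sketch ignores weekend and holiday adjustments and any extension of time under 37 CFR 1.136(a).

```python
import calendar
from datetime import date

def add_months(d, months):
    # Move forward by whole calendar months, clamping the day to the month's end.
    idx = d.month - 1 + months
    year, month = d.year + idx // 12, idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def reply_period_end(mailing, first_reply=None, advisory_mailed=None):
    # Default: the shortened statutory period ends three months after mailing.
    three_month = add_months(mailing, 3)
    end = three_month
    # If a first reply was filed within two months and the advisory action is
    # mailed after the three-month date, the period ends on the advisory date.
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailing, 2)
            and advisory_mailed > three_month):
        end = advisory_mailed
    # In no event later than six months from the mailing date.
    return min(end, add_months(mailing, 6))

mailing = date(2022, 6, 1)  # hypothetical mailing date
no_reply = reply_period_end(mailing)
early_reply = reply_period_end(mailing, date(2022, 7, 15), date(2022, 9, 10))
late_advisory = reply_period_end(mailing, date(2022, 7, 15), date(2022, 12, 20))
```

The six-month cap is applied last so that a late advisory action cannot push the result past the statutory limit.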

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER KRZYSTAN, whose telephone number is 571-272-7498 and whose email address is alexander.krzystan@uspto.gov.

The examiner can usually be reached Monday through Friday, 7:30-4:00 EST.
If attempts to reach the examiner by telephone or email are unsuccessful, the examiner’s supervisor, Fan Tsang, can be reached on (571) 272-7547.

The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications.
/ALEXANDER KRZYSTAN/Primary Examiner, Art Unit 2653
Examiner Alexander Krzystan
May 25, 2022