DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
	This Office Action is in response to the Arguments/Remarks filed 1/31/22. Claims 1-6, 8-19 and 21-22 are pending.

Response to Arguments
Applicant's arguments filed 1/31/22 have been fully considered but they are not persuasive. 
Applicant submits that Tsingos does not teach any metadata which is indicative of whether the first part of the audio element is associated with a listener pose dependent position or with a listener pose non-dependent position. The examiner respectfully disagrees.
Tsingos (¶0053, all paragraph numbers from provisional) discloses that, “Some implementations may involve monitoring player locations and head orientations in order to provide audio to the near-field speakers in which sounds are accurately rendered according to intended sound source locations, at least with respect to direct arrival sounds.” Tsingos (¶0065) further discloses, “The audio objects may include audio data and associated metadata. The metadata may, for example, include data indicating the position, size and/or trajectory of an audio object.” Tsingos (¶0067) continues, “In some implementations, audio objects may include metadata indicating whether an audio object is a near-field audio object, a far-field audio object or in a transitional zone between the near field and the far field.”
Based on the above portions of Tsingos, the metadata indicates the position, size and trajectory of an audio object, as well as whether it is near-field, far-field, or transitional. Further, monitoring the player (user) location and head orientation (i.e., pose) is done for the purpose of near-field audio reproduction. The metadata, which includes the position of the object and whether it is near-field, therefore indicates whether the object depends on a player’s position and head orientation (i.e., “is indicative of whether the first part of the audio element is associated with a listener pose dependent position or with a listener pose non-dependent position”).
	The objection to the Title as well as the 35 USC 112(b) rejections have been withdrawn.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 13-16, 18-19 and 21-22 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Tsingos (US 2019/0116450 A1, references made to provisional 62/574076).
As to claim 13, Tsingos discloses a method of audio processing (Fig. 4) comprising: receiving data describing an audio scene, wherein the data comprises audio data for a set of audio elements, wherein the set of audio elements correspond to audio sources in the scene and metadata, wherein the metadata comprises at least a first audio rendering property indicator, wherein the first audio rendering property indicator is for a first audio element of the set of audio elements (¶0065, Fig. 4. “Block 405 involves receiving audio reproduction data. In this example, the audio reproduction data includes audio objects. The audio objects may include audio data and associated metadata. The metadata may, for example, include data indicating the position, size and/or trajectory of an audio object in a three-dimensional space, etc.”); 
rendering audio elements by generating a first set of audio signals for a set of loudspeakers (¶0068, Fig. 4. “Block 415 involves rendering the far-field audio objects into a first plurality of speaker feed signals for room speakers of a reproduction environment.”); 
rendering audio elements by generating a second set of audio signals for a headphone (¶0071, Fig. 4. “Block 420 involves rendering the near-field audio objects into speaker feed signals for at least one of near-field speakers or headphone speakers of the reproduction environment. As noted above, headphone speakers may, in this disclosure, be referred to as a particular category of near-field speakers.”); and
selecting between rendering of at least a first part of the first audio element for the set of loudspeakers and for the headphone in response to the first audio rendering property indicator (¶0066, Fig. 4. “Block 410 may, for example, involve differentiating the near-field audio objects and the far-field audio objects according to a distance between a location at which an audio object is to be rendered and a location of the reproduction environment.”),
wherein the first audio rendering property indicator is indicative of whether the first part of the first audio element is associated with a listener pose dependent position or with a listener pose non-dependent position (¶0053 and ¶0071. “Some implementations may involve monitoring player locations and head orientations in order to provide audio to the near-field speakers in which sounds are accurately rendered according to intended sound source locations.” Near-field audio associated with listener pose.).
As to claim 14, it is rejected using the same rationale as claim 13 above with further reference made to ¶0036 and ¶0064, which disclose the method implemented by software on a non-transitory media.
As to claim 15, Tsingos discloses receiving a listener pose, wherein the listener pose is indicative of a pose of a listener (¶0053 and ¶0071. Head orientation monitored.); 
generating the first set of audio signals independently of the listener pose (¶0053 and ¶0071. Head orientation used for near-field, not far-field.); and 
generating the second set of audio signals in response to the listener pose (¶0053 and ¶0071. Head orientation used for near-field, not far-field.).
As to claim 16, Tsingos discloses generating audio signals for a plurality of listeners (¶0038, Fig. 1a. Players 110a and 110b.); 
generating the first set of audio signals as a common set of audio signals for the plurality of listeners (¶0039, Fig. 1a. “The car 130a is outside the reproduction environment, so the audio corresponding to the car 130a may be presented to the players 110a and 110b via room speakers 105. This is true in part because "far-field" sounds, such as the direct sounds 135a from the car 130a, seem to be coming from a similar direction from the perspective of the players 110a and 110b.”); 
generating the second set of audio signals for headphones for a first listener of the plurality of listeners (¶0040, Figs. 1 and 4. “However, "near-field" sounds, such as the direct sounds 135b from the car 130b, cannot always be reproduced realistically by the room speakers 105. In this example, the direct sounds 135b from the car 130b appear to be coming from different directions, from the perspective of each player. Therefore, such near-field sounds may be more accurately and consistently reproduced by headphone speakers or other types of near-field speakers, such as those that may be provided on some VR headsets.” Near-field sounds at headphones 115a.); and 
generating a third set of audio signals for headphones for a second listener of the plurality of listeners (¶0040, Figs. 1 and 4. “However, "near-field" sounds, such as the direct sounds 135b from the car 130b, cannot always be reproduced realistically by the room speakers 105. In this example, the direct sounds 135b from the car 130b appear to be coming from different directions, from the perspective of each player. Therefore, such near-field sounds may be more accurately and consistently reproduced by headphone speakers or other types of near-field speakers, such as those that may be provided on some VR headsets.”  Near-field sounds at headphones 115b.).
As to claim 18, Tsingos discloses selecting different rendering for the first part of the first audio element and for a second part of the first audio element (¶0066-0067, Fig 4. Rendered in near-field, far-field or transitional zone.).
As to claim 19, Tsingos discloses wherein the first audio rendering property indicator is indicative of an audio format of the first audio element (¶0065. “Alternatively, or additionally, the audio reproduction data may include channel-based audio data.”).
As to claim 21, Tsingos discloses wherein the first audio rendering property indicator is indicative of whether the first part of the first audio item is intended for rendering over loudspeakers or headphones (¶0039-0040, ¶0065 and ¶0067. Metadata indicating near-field, far-field or transitional. Near-field intended for headphones, far-field for loudspeakers.).
As to claim 22, Tsingos discloses rendering the audio scene by generating a hybrid set of output signals, wherein the hybrid set of output signals includes at least a first set of output signals and a second set of output signals, wherein the first output signals are generated to be rendered by the set of loudspeakers and wherein the second set of output signals are generated to be rendered by the headphone, wherein the first set of output signals are a set of surround sound signals for reproduction by the set of loudspeakers, wherein the second set of audio signals form a binaural stereo signal for reproduction by the headphone, wherein the first set of output signals are produced in response to a first set of the audio sources, wherein the second set of output signals are produced in response to a second set of the audio sources, wherein the first set of audio sources have a property that is listener pose non-dependent, and the second set of audio sources have a property that is listener pose dependent (¶0039-0040, ¶0053, ¶0071 and ¶0103. Figs. 1a/b-2a/b. Far-field sounds output by loudspeakers 105 and near-field sounds output by headphones 115. Near-field rendered based on head orientation. Far-field not based on head orientation. Multi-channel (i.e. surround) output to loudspeakers and binaural to headphones disclosed.).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3 and 5-6 and 8-12 are rejected under 35 U.S.C. 103 as being unpatentable over Jackson et al. (GB 2550877 A), hereinafter “Jackson,” in view of Tsingos.
As to claim 1, Jackson discloses an audio apparatus (Fig. 1) comprising: 
a receiver circuit, wherein the receiver circuit is arranged to receive data, wherein the data describes an audio scene, wherein the data comprises audio data and metadata, wherein the audio data is for a set of audio elements corresponding to audio sources in the scene, wherein the metadata comprises at least a first audio rendering property indicator, wherein the first audio rendering property indicator is for a first audio element of the set of audio elements (p. 5 line 30 – p. 6 line 20 and Fig. 1. “The apparatus 100 is configured to receive a plurality of audio objects 110, each of which comprises at least one audio signal 111 and associated metadata 112. The apparatus 100 is further configured to render the objects 110 into loudspeaker signals comprising rendered audio data. The rendered audio data is outputted to various speakers distributed in the environment 120 in which the audio is to be reproduced.” “As described above, in known object-based audio rendering systems the object metadata defines parameters such as the object type, location and trajectory. In some embodiments of the present invention, the metadata 112 for an audio object 110 may also include editorial information defined by the producer.”);
a first renderer circuit, wherein the first renderer circuit is arranged to render audio elements by generating a first set of audio signals for a set of loudspeakers (p. 11 lines 10-21, Fig. 1. “For background speech, for example in a cocktail party scene, the object refiner 302 could decide to use yet another rendering method such as ambisonic rendering, or a rendering algorithm which is configured to distribute voices over multiple discrete loudspeakers.” See bank of Renderers 103.); 
a second renderer circuit, wherein the second renderer circuit is arranged to render audio elements by generating a second set of audio signals for a headphone (p. 7 lines 4-6 and p. 11 lines 10-21, Fig. 1. “For a narrator, the object refiner 302 could decide to use a Renderer that is configured to output a loudspeaker signal to a speaker that is close to the listener, such as a wireless loudspeaker or an integrated speaker within a second-screen device such as a mobile phone or tablet.” “The apparatus 100 can send audio signals to wearable devices such as headphones.” See bank of Renderers 103.); and 
a selector circuit, wherein the selector circuit is arranged to select between the first renderer circuit and the second renderer circuit such that the rendering of at least a first part of the first audio element is in response to the first audio rendering property indicator (p. 8 lines 29-31 and p. 11 lines 10-21, Fig. 1. “In the present embodiment, the object refiner 102 is configured to select one or more suitable rendering algorithms for converting the adapted audio objects 110 into speaker signals, according to contextual information provided by the context unit 105.” “As an example, the object refiner 302 may decide to render two ‘dialogue’ type objects using different renderers when one of the objects corresponds to a character in the foreground of the scene, and the other object corresponds to a character in the background of the scene.”).
Jackson does not expressly disclose wherein the first audio rendering property indicator is indicative of whether the first part of the first audio element is associated with a listener pose dependent position or with a listener pose non-dependent position.
Jackson in view of Tsingos discloses wherein the audio rendering property indicator is indicative of whether the first part of the first audio element is associated with a listener pose dependent position or with a listener pose non-dependent position (Tsingos, ¶0053 and ¶0071. “Some implementations may involve monitoring player locations and head orientations in order to provide audio to the near-field speakers in which sounds are accurately rendered according to intended sound source locations.” Near-field audio associated with listener pose while far-field is not.).
Jackson and Tsingos are analogous art because they are from the same field of endeavor with respect to audio object rendering.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to render based on listener pose for near-field, as taught by Tsingos. The motivation would have been for realistic sound reproduction (Tsingos, ¶0039-0040).
As to claim 2, Jackson does not expressly disclose a listener pose receiver circuit wherein the listener pose receiver circuit is arranged to receive a listener pose, wherein the listener pose is indicative of a pose of a listener, wherein the first renderer circuit is arranged to generate the first set of audio signals independently of the listener pose, wherein the second renderer circuit is arranged to generate the second set of audio signals in response to the listener pose.
Jackson in view of Tsingos discloses a listener pose receiver circuit wherein the listener pose receiver circuit is arranged to receive a listener pose, wherein the listener pose is indicative of a pose of a listener, wherein the first renderer circuit is arranged to generate the first set of audio signals independently of the listener pose, wherein the second renderer circuit is arranged to generate the second set of audio signals in response to the listener pose (Tsingos, ¶0053 and ¶0071. Head orientation monitored and used for near-field but not far-field.).
The motivation is the same as claim 1 above.
As to claim 3, Jackson in view of Tsingos discloses wherein the apparatus is arranged to generate audio signals for a plurality of listeners (Jackson, p. 13 lines 8-12. Multiple listeners.),
wherein the first renderer circuit is arranged to generate the first set of audio signals as a common set of audio signals for the plurality of listeners, wherein the second renderer circuit is arranged to generate the second set of audio signals for headphones for a first listener of the plurality of listeners, wherein the second renderer circuit is arranged to generate a third set of audio signals for headphones for a second listener of the plurality of listeners (Jackson, p. 7 lines 4-6, p. 11 lines 10-21, p. 13 line 29 – p. 14 line 2 and Fig. 1. Audio can be rendered to multiple loudspeakers or just to one closest to a specific listener of a plurality of listeners, such as one who favors a certain team. Headphones disclosed as types of loudspeakers used.).
As to claim 5, Jackson in view of Tsingos discloses wherein the selector circuit is arranged to select different renderers of the first renderer circuit and the second renderer circuit for the first part of the first audio element and for a second part of the first audio element (Jackson, p. 8 lines 29-31, p. 11 lines 10-21 and Fig. 1. Different renderers selected based on audio object properties.).
As to claim 6, Jackson does not expressly disclose wherein the first audio rendering property indicator is indicative of an audio format of the first audio element.
Jackson in view of Tsingos discloses wherein the first audio rendering property indicator is indicative of an audio format of the first audio element (Tsingos, ¶0065. “Alternatively, or additionally, the audio reproduction data may include channel-based audio data.”).
	The motivation is the same as claim 1 above.
As to claim 8, Jackson in view of Tsingos discloses wherein the first audio rendering property indicator is indicative of a guidance rendering property for the rendering of the first audio element (Jackson, p. 6 lines 4-20. Editorial information from producer in metadata.).
As to claim 9, Jackson in view of Tsingos discloses wherein the first audio rendering property indicator is indicative of whether the first part of the first audio item is intended for rendering over loudspeakers or headphones (Jackson, p. 16 lines 5-9. “Examples of properties that can be defined in the advanced object metadata 502 include: …target device; ...”).
As to claim 10, Jackson in view of Tsingos discloses wherein the circuit is arranged to receive visual data, wherein the visual data is indicative of a virtual scene corresponding to the audio scene, wherein the first audio rendering property indicator is indicative of whether the first audio element represents an audio source corresponding to a visual scene object (Jackson, p. 11 lines 16-18 and p. 16 lines 5-9. “For an actor on-screen, the object refiner 302 could decide to use a VBAP Renderer to pan to the location of the actor in the scene.” “Examples of properties that can be defined in the advanced object metadata 502 include: …onscreen/offscreen; ….”).
As to claim 11, Jackson in view of Tsingos discloses a user input circuit, wherein the user input circuit is arranged to receive a user input, and wherein the selector circuit is arranged to select between the first renderer circuit and the second renderer circuit for rendering of at least the first part of the first audio element in response to the user input (Jackson, p. 13 lines 8-22. User interface for user input. “By acquiring information about the number of listeners and their locations, the apparatus 100 can optimize the rendered audio data for each user.”).
As to claim 12, Jackson in view of Tsingos discloses wherein the selector circuit is arranged to determine an audio property of the first audio element, wherein the selector circuit is arranged to select between the first renderer circuit and the second renderer circuit for rendering of at least the first part of the first audio element in response to the audio property (Jackson, p. 7 lines 8-30 and p. 8 lines 29-31. Object refiner receives objects from scene adapter and refines individual objects prior to selecting which renderer to send them to based on metadata and context information.).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Jackson in view of Tsingos, as applied to claim 1 above, in view of Metcalf (US 2006/0109988 A1).
As to claim 4, Jackson in view of Tsingos does not expressly disclose wherein the first part is a frequency subrange of the first audio element.
Jackson in view of Tsingos as modified by Metcalf discloses wherein the first part is a frequency subrange of the first audio element (Metcalf, ¶0074. Sound object metadata contains frequency range.).
Jackson, Tsingos and Metcalf are analogous art because they are from the same field of endeavor with respect to virtual sound events.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to have frequency range in the metadata, as taught by Metcalf. The motivation would have been to provide more information about the sound object to improve rendering.

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Tsingos, as applied to claim 13 above, in view of Metcalf.
	As to claim 17, Tsingos does not expressly disclose wherein the first part is a frequency subrange of the first audio element.
Tsingos in view of Metcalf discloses wherein the first part is a frequency subrange of the first audio element (Metcalf, ¶0074. Sound object metadata contains frequency range.).
Tsingos and Metcalf are analogous art because they are from the same field of endeavor with respect to virtual sound events.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to have frequency range in the metadata, as taught by Metcalf. The motivation would have been to provide more information about the sound object to improve rendering.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES K MOONEY whose telephone number is (571)272-2412. The examiner can normally be reached Monday-Thursday, 8:30-6:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin, can be reached at (571) 272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/JAMES K MOONEY/Primary Examiner, Art Unit 2654