DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.
				      Patent Board Decision
1.	The final rejection was AFFIRMED-IN-PART: affirmed as to claims 1, 3-12 and 14-24, and reversed as to claims 2 and 13.
	Applicant amended the independent claims to incorporate the limitations of reversed claims 2 and 13.
		

Allowable Subject Matter
2.	Claims 1, 3-12, 14-24 are allowed.

The following is an examiner’s statement of reasons for allowance: 

Regarding claim 1, in combination with other limitations of the claims the prior art of record fails to disclose or specifically suggest a device configured to support unified audio rendering, the device comprising: an audio decoder configured to decode, from a bitstream, first audio data for a time frame and second audio data for the time frame; a memory configured to store the first audio data and the second audio data; and one or more processors configured to: determine, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; render the first audio data into first spatial domain audio data for playback by virtual speakers at a set of virtual speaker locations; render the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; mix the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and convert the mixed spatial domain audio data to scene-based audio data, when taking the claim as a whole.

The following is an examiner’s statement of reasons for allowance: 
The Examiner’s Answer to the Appeal Brief (10/27/2020) addressed the previous set of claims with regard to the combined teachings of Mundt (US 2011/0211702), McCary (US 2013/0322514) and Najaf-Zadeh (US 2016/0241980). Those references describe, teach and suggest providing a computer system that uses computer programs running on a computer and storage for recording. The encoded bitstream is decoded to generate separate data representing the original data. The decoded data received and processed by the system includes the spatial characteristics and virtual sound sources, which are processed at different directions for reproduction at headphone speakers. The decoded data also includes the spatial data and virtual sound sources that are mixed for speaker output. The spatial data includes the HRTF for location and orientation, from which the mixing of the audio signals is determined. However, Applicant’s claims have been distinguished from the combination of Mundt, Najaf-Zadeh and McCary. 
Those references do not describe, teach or suggest the concepts of an audio decoder configured to decode, from a bitstream, first audio data for a time frame and second audio data for the time frame; a memory configured to store the first audio data and the second audio data; and one or more processors configured to: determine, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; render the first audio data into first spatial domain audio data for playback by virtual speakers at a set of virtual speaker locations; render the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; mix the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and convert the mixed spatial domain audio data to scene-based audio data. Thus, the prior art of record, considered against the claim language, does not disclose the claim as a whole. For the foregoing reason, the claims are allowable over the cited prior art.
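
For illustration only, the order of operations recited above (determine the virtual speaker locations from headset capability data, render each of the two decoded signals to those locations, mix, then convert to scene-based audio) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or the cited references: the capability field `max_virtual_speakers`, the nearest-speaker panning, the horizontal-only layout, and the first-order ambisonic encoding with unit W gain are all hypothetical choices made to keep the example short.

```python
import math

def choose_speaker_azimuths(headset_caps):
    # Hypothetical capability field: how many virtual speakers the headset supports.
    n = headset_caps.get("max_virtual_speakers", 4)
    # Assumed layout: speakers evenly spaced on a horizontal circle (radians).
    return [2 * math.pi * i / n for i in range(n)]

def angular_distance(a, b):
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def render_to_speakers(samples, source_azimuth, speaker_azimuths):
    # Simplest possible renderer: send the whole signal to the nearest virtual speaker.
    nearest = min(
        range(len(speaker_azimuths)),
        key=lambda i: angular_distance(source_azimuth, speaker_azimuths[i]),
    )
    return [
        list(samples) if i == nearest else [0.0] * len(samples)
        for i in range(len(speaker_azimuths))
    ]

def mix(feeds_a, feeds_b):
    # Both inputs share one speaker layout, so mixing is a per-speaker sample sum.
    return [[x + y for x, y in zip(a, b)] for a, b in zip(feeds_a, feeds_b)]

def to_first_order_ambisonics(feeds, speaker_azimuths):
    # Encode each speaker feed as a plane wave from that speaker's direction
    # (horizontal first order only; W gain of 1 is an assumed normalization).
    n = len(feeds[0])
    w, x, y = [0.0] * n, [0.0] * n, [0.0] * n
    for feed, az in zip(feeds, speaker_azimuths):
        for t, s in enumerate(feed):
            w[t] += s
            x[t] += s * math.cos(az)
            y[t] += s * math.sin(az)
    return {"W": w, "X": x, "Y": y}

# Speaker locations are fixed before any rendering takes place.
azimuths = choose_speaker_azimuths({"max_virtual_speakers": 4})
first = render_to_speakers([1.0, 0.5], 0.0, azimuths)             # first audio data
second = render_to_speakers([0.25, 0.25], math.pi / 2, azimuths)  # second audio data
scene = to_first_order_ambisonics(mix(first, second), azimuths)
```

Because both signals are rendered to the same set of virtual speaker locations, the mixing step reduces to a per-speaker sum, and a single conversion to the scene-based representation covers both inputs.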


Regarding claim 12, in combination with other limitations of the claims the prior art of record fails to disclose or specifically suggest a method of supporting unified audio rendering, the method comprising: decoding, by a computing device and from a bitstream, first audio data for a time frame and second audio data for the time frame; 
determining, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; rendering, by the computing device, the first audio data into first spatial domain audio data for playback by virtual speakers at a set of virtual speaker locations; rendering, by the computing device, the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; mixing, by the computing device, the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and converting, by the computing device, the mixed spatial domain audio data to scene-based audio data, when taking the claim as a whole.

The following is an examiner’s statement of reasons for allowance: 
The Examiner’s Answer to the Appeal Brief (10/27/2020) addressed the previous set of claims with regard to the combined teachings of Mundt (US 2011/0211702), McCary (US 2013/0322514) and Najaf-Zadeh (US 2016/0241980). Those references describe, teach and suggest providing a computer system that uses computer programs running on a computer and storage for recording. The encoded bitstream is decoded to generate separate data representing the original data. The decoded data received and processed by the system includes the spatial characteristics and virtual sound sources, which are processed at different directions for reproduction at headphone speakers. The decoded data also includes the spatial data and virtual sound sources that are mixed for speaker output. The spatial data includes the HRTF for location and orientation, from which the mixing of the audio signals is determined. However, Applicant’s claims have been distinguished from the combination of Mundt, Najaf-Zadeh and McCary. Those references do not describe, teach or suggest the concepts of determining, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; rendering, by the computing device, the first audio data into first spatial domain audio data for playback by virtual speakers at a set of virtual speaker locations; rendering, by the computing device, the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; mixing, by the computing device, the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and converting, by the computing device, the mixed spatial domain audio data to scene-based audio data. 
Thus, the prior art of record, considered against the claim language, does not disclose the claim as a whole. For the foregoing reason, the claims are allowable over the cited prior art.

Regarding claim 23, in combination with other limitations of the claims the prior art of record fails to disclose or specifically suggest a device configured to support unified audio rendering, the device comprising: means for decoding from a bitstream, first audio data for a time frame and second audio data for the time frame; means  for determining, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; means for rendering the first audio data into first spatial domain audio data for playback by virtual speakers at the set of virtual speaker locations; means for rendering the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; means for mixing the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and means for converting the mixed spatial domain audio data to scene-based audio data, when taking the claim as a whole.

The following is an examiner’s statement of reasons for allowance: 
The Examiner’s Answer to the Appeal Brief (10/27/2020) addressed the previous set of claims with regard to the combined teachings of Mundt (US 2011/0211702), McCary (US 2013/0322514) and Najaf-Zadeh (US 2016/0241980). Those references describe, teach and suggest providing a computer system that uses computer programs running on a computer and storage for recording. The encoded bitstream is decoded to generate separate data representing the original data. The decoded data received and processed by the system includes the spatial characteristics and virtual sound sources, which are processed at different directions for reproduction at headphone speakers. The decoded data also includes the spatial data and virtual sound sources that are mixed for speaker output. The spatial data includes the HRTF for location and orientation, from which the mixing of the audio signals is determined. However, Applicant’s claims have been distinguished from the combination of Mundt, Najaf-Zadeh and McCary. 
Those references do not describe, teach or suggest the concepts of means for decoding from a bitstream, first audio data for a time frame and second audio data for the time frame; means for determining, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; means for rendering the first audio data into first spatial domain audio data for playback by virtual speakers at the set of virtual speaker locations; means for rendering the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; means for mixing the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and means for converting the mixed spatial domain audio data to scene-based audio data. Thus, the prior art of record, considered against the claim language, does not disclose the claim as a whole. For the foregoing reason, the claims are allowable over the cited prior art.


Regarding claim 24, in combination with other limitations of the claims the prior art of record fails to disclose or specifically suggest a non-transitory computer-readable storage medium having stored thereon instructions, that, when executed, cause one or more processors to: decode, from a bitstream, first audio data for a time frame and second audio data for the time frame; determine, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; render the first audio data into first spatial domain audio data for playback by virtual speakers at the set of virtual speaker locations; render the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; mix the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and convert the mixed spatial domain audio data to scene-based audio data, when taking the claim as a whole.

The following is an examiner’s statement of reasons for allowance: 
The Examiner’s Answer to the Appeal Brief (10/27/2020) addressed the previous set of claims with regard to the combined teachings of Mundt (US 2011/0211702), McCary (US 2013/0322514) and Najaf-Zadeh (US 2016/0241980). Those references describe, teach and suggest providing a computer system that uses computer programs running on a computer and storage for recording. The encoded bitstream is decoded to generate separate data representing the original data. The decoded data received and processed by the system includes the spatial characteristics and virtual sound sources, which are processed at different directions for reproduction at headphone speakers. The decoded data also includes the spatial data and virtual sound sources that are mixed for speaker output. The spatial data includes the HRTF for location and orientation, from which the mixing of the audio signals is determined. However, Applicant’s claims have been distinguished from the combination of Mundt, Najaf-Zadeh and McCary. 
Those references do not describe, teach or suggest the concepts of instructions that, when executed, cause one or more processors to: decode, from a bitstream, first audio data for a time frame and second audio data for the time frame; determine, based on headset capability data representative of one or more capabilities of a headset and prior to rendering the first audio data and the second audio data, the set of virtual speaker locations at which the virtual speakers are located; render the first audio data into first spatial domain audio data for playback by virtual speakers at the set of virtual speaker locations; render the second audio data into second spatial domain audio data for playback by the virtual speakers at the set of virtual speaker locations; mix the first spatial domain audio data and the second spatial domain audio data to obtain mixed spatial domain audio data; and convert the mixed spatial domain audio data to scene-based audio data. Thus, the prior art of record, considered against the claim language, does not disclose the claim as a whole. For the foregoing reason, the claims are allowable over the cited prior art.

      Citation of Prior Art
3.	Najaf-Zadeh (US 2016/0241980) discloses equipment having a memory element for storing a set of head-related transfer functions. A processor receives an audio signal, where the audio signal comprises a set of ambisonic signals. The processor identifies the orientation of the equipment based on physical properties of the equipment. The processor rotates the set of ambisonic signals based on the orientation of the equipment. The processor filters the set of ambisonic signals using the set of head-related transfer functions to form speaker signals. The processor outputs the speaker signals. The equipment utilizes a set of head-related transfer functions to reduce computation overhead, if binaural signals are generated directly from high order ambisonic (HOA) signals without mapping HOA signals to virtual loudspeakers. The equipment reduces common distortions such as coloration, and loudness preservation to improve audio quality of downmixed audio. The equipment performs auditory masking in an effective manner as sounds come from the same direction in a downmixed sound field (see ¶ 0067, 0069, 0077-0080). 

McCary (US 2013/0322514) discloses a system that involves producing a digital radio transmission containing an audio/song and simultaneously displaying video/lyrics of a song plus additional song information or images, with an enhanced lyric background. An EISL is produced by a producer and the EISL is distributed to a digital radio broadcaster. The songs with lyrics, images, and information are encoded in a proprietary bit stream format along with program associated data (PAD) and other associated song data added by the producer, and are distributed to the broadcasters as a data stream. An enhanced process is provided for producing, broadcasting and receiving digital radio transmissions with cost effective radios (see claim 10). 

Mundt (US 2011/0211702) discloses a similarity reduction unit that reduces the similarity between the left and right channels, the front and rear channels, and the center and non-center channels of several channels of a multi-channel signal, to obtain an inter-similarity reduced channel set. Several directional filters perform modeling of acoustic transmission of the respective inter-similarity reduced channel set from a virtual sound source position, according to the hearing capacity of a listener. The adders mix the outputs of the filters to obtain a channel of the binaural signal. A device for generating a binaural signal that yields a more stable and pleasant headphone reproduction is provided, so the listening results can be improved. The voice in movie dialogs and music can be perceived clearly while rendering multi-channel signals for headphone reproduction. By reducing the inter-similarity of channels of the multi-channel input signal, the spatial width of the binaural output signal can be increased and the externalization can be improved (see fig. 7, 10-11, ¶ 0038-0046, 0062, 0071-0074). 

The references cited above, neither alone nor in combination, disclose the functionality of the claimed limitations when taking the claimed invention as a whole.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

	Conclusion

4.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to ASSAD MOHAMMED whose telephone number is (571)270-7253.  The examiner can normally be reached on 9:00AM-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached at 571-272-7503.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ASSAD MOHAMMED/Examiner, Art Unit 2651 

/DUC NGUYEN/Supervisory Patent Examiner, Art Unit 2651