Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This is in response to applicant's amendment which was filed on 8/25/2021 and has been entered. Claims 1, 6-7, 9-10, 17, 25, 29-31, 33, and 49 have been amended. Claims 2-4, 8, 24, 26-28, 32, 34-41, and 48 have been cancelled. Claims 50-56 have been added. Claims 1, 5-7, 9-23, 25, 29-31, 33, 42-47, 49-56 are still pending in this application, with claims 1, 10, 25, and 49 being independent.
 
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.
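The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.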

Claim 47 is rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. The limitation of claim 47 is already recited in claim 25, so claim 47 does not further limit the subject matter of claim 25. Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-7, 9-13, 15-16, 18, 21, 23, 25, 29-31, 33, 44, 46-47, 49-52 and 54-55 are rejected under 35 U.S.C. 103 as being unpatentable over Glaser (US 20180088900) in view of Murtaza (WO 2019/072984).
Regarding claim 1, Glaser teaches A content consumer device configured to play one or more of a plurality of audio streams (Glaser figure 1 and ¶0061, “The audio generator 140” and ¶0064), the content consumer device comprising:
a memory (Glaser ¶0102, “memory”) configured to store the plurality of audio streams (Glaser figure 4, and ¶046, “map of audio sources”) and audio location information associated with the plurality of audio streams (Glaser ¶0035, “detect and/or predict locations of microphones and their relative positioning”) and representative of audio stream coordinates (It is old and known in the art that determining a position of a device can be done using coordinates. See pertinent art Beaty Col 4 lines 52-67) in an acoustical space where an audio stream was captured (Glaser figure 1 and ¶0031, “microphones 110 functions to acquire multiple audio inputs from a set of distinct location”) or audio stream coordinates in a virtual acoustical space where an audio stream was synthesized or both, each of the audio streams representative of a soundfield (Glaser figure 4); and one or more processors coupled to the memory (Glaser figure 1 and ¶0102), and configured to: 
determine device location information representative of device coordinates of the content consumer device in the acoustical space (Glaser ¶0091, “Collecting agent environment orientation can include sensing position using GPS … smart direction of the glasses.” GPS inherently uses coordinates. See also pertinent art Pong Col 4 lines 21-26 teaching a headset including orientation sensors); 
determine an audio source distance based on the audio location information as a distance between an audio source in the acoustical space (Glaser figure 13, positions of sound sources P1-P3) and the device coordinates (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest”);
select, a single audio stream of the plurality of audio streams as a subset of the plurality of audio streams, the single audio stream having a shortest audio source distance (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest” and figure 13, P1 and P2 are outputted but P3 is muted. It is obvious to one with ordinary skill in the art that various scenarios with different audio source position and user position will adjust the audio source attenuations. One scenario may be where an audio source of interest is a single source and the rest of the audio sources are behind the user. For example in figure 13, if both sources P3 and P1 were behind the user, they will both be muted and only audio source P2 is reproduced. See also ¶0084); and
output, based on the subset of the plurality of audio streams, one or more speaker feeds (Glaser figure 8, S140). However, Glaser does not explicitly teach coordinates of the content consumer device; compare the audio source distance to an audio source distance threshold; select, when the audio source distance is greater than the audio source distance threshold, a single audio stream of the plurality of audio streams as a subset of the plurality of audio streams, the single audio stream having a shortest audio source distance.
Murtaza teaches coordinates of the content consumer device (Murtaza Page 24 ¶2, “The media consumption device is also responsible for collection information about user location and/or orientation and/or direction of movement” and page 3, Terminology and Definitions: “User position information: location information (e.g., x, y, z coordinates”), compare the audio source distance (Page 22 last ¶, “rendering of each audio source is adapted to the user position…level of the audio source is higher when the user is closer to the position of the audio source, and lower when the user is more distant from the audio source,”) to an audio source distance threshold (Murtaza figure 3 and Page 30, The coordinates that mark the transition between scenes can be considered thresholds. The coordinates of the transition door that completely phase out audio element 152A and 152B); select, when the audio source distance is greater than the audio source distance threshold, a single audio stream of the plurality of audio streams (Murtaza Page 23 ¶2, “creating one or more streams for each available scene 150 associated with one sound scene part of one viewpoint,” in other words, there are plurality of scenes wherein each scene may comprise only one stream) as a subset of the plurality of audio streams (Murtaza Page 22 last ¶, “Each audio element (audio source) is notwithstanding encoded in audio streams that are provided to the decoder,” “rendering of each audio source is adapted to the user position…level of the audio source is higher when the user is closer to the position of the audio source, and lower when the user is more distant from the audio source,” See also Page 23 last 2¶, “just individual audio objects”), the single audio stream having a shortest audio source distance (Murtaza figure 3, once the distance between user and source 152A exceeds the transition door area and reaches scene B, the audio stream for element 152B is selected which would have the shortest audio source distance to the user).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Murtaza to improve the known device of Glaser to achieve the predictable result of accurately determining the position of a user to provide optimal directionality.

Regarding claims 5 and 29, Glaser in view of Murtaza teaches wherein the one or more processors are further configured to: obtain a new audio stream and corresponding new audio location information; and update the subset of the plurality of audio streams to include the new audio stream (Glaser ¶0080 “automatic detection audio source locations and setting of an audio control inputs,” “The location property can additionally be updated”).

Regarding claims 6 and 30, Glaser in view of Murtaza teaches wherein the one or more processors are further configured to: determine, based on the plurality of audio streams, an energy map representative of an energy of a common soundfield represented by the plurality of audio streams (Glaser figure 4, Audio Map); wherein the selection of the single audio stream is further based on the energy map (Glaser figure 11, selection of which audio source to be mute is based on the Audio spatial analyzer).

Regarding claims 7 and 31, Glaser in view of Murtaza teaches wherein the one or more processors are further configured to: analyze the energy map to determine the audio source location information (Glaser ¶0022, “detection of audio sources may be achieved through analysis of the audio map”).

Regarding claims 9 and 33, Glaser in view of Murtaza teaches wherein the one or more processors are configured to: select, when the audio source distance is less than or equal to the audio source distance threshold, multiple audio streams of the plurality of audio streams as the subset of the plurality of audio streams, the multiple audio streams being the subset of the plurality of audio streams with the audio stream coordinates surrounding the device coordinates (Murtaza Page 19, last 3 ¶ with 3 hyphens).
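
For illustration only, the distance-threshold selection recited in claims 1 and 9 can be sketched as follows. This is a minimal sketch and is not taken from Glaser, Murtaza, or the application: the function name select_streams, the parameter k_surrounding, the use of Euclidean distance, the choice to compare the nearest source's distance against the threshold, and the reading of "surrounding" as the k nearest streams are all assumptions made for the example.

```python
import math

def select_streams(device_xyz, stream_coords, threshold, k_surrounding=3):
    """Hypothetical sketch of the claimed selection logic.

    device_xyz     -- (x, y, z) device coordinates in the acoustical space
    stream_coords  -- dict mapping stream name -> (x, y, z) audio stream coordinates
    threshold      -- audio source distance threshold
    k_surrounding  -- assumed stand-in for "streams surrounding the device"
    """
    # Audio source distance: Euclidean distance from the device to each stream.
    distances = {name: math.dist(device_xyz, xyz)
                 for name, xyz in stream_coords.items()}
    # Rank streams from shortest to longest audio source distance.
    ranked = sorted(distances, key=distances.get)
    nearest = ranked[0]
    if distances[nearest] > threshold:
        # Claim 1 branch: distance exceeds the threshold, so select the
        # single stream having the shortest audio source distance.
        return [nearest]
    # Claim 9 branch: distance is within the threshold, so select multiple
    # streams (approximated here as the k nearest ones).
    return ranked[:k_surrounding]

streams = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (0, 10, 0)}
far_selection = select_streams((4, 0, 0), streams, threshold=2.0)
near_selection = select_streams((1, 0, 0), streams, threshold=2.0, k_surrounding=2)
```

In this sketch the device at (4, 0, 0) is farther than the threshold from every source, so only the nearest stream is returned; at (1, 0, 0) the nearest source is within the threshold, so the two nearest streams are returned.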

Regarding claim 10, Glaser teaches A content consumer device configured to play one or more of a plurality of audio streams (Glaser figure 1 and ¶0061, “The audio generator 140” and ¶0064), the content consumer device comprising: a memory (Glaser ¶0102, “memory”) configured to store the plurality of audio streams (Glaser figure 4, and ¶046, “map of audio sources”) and audio location information associated with the plurality of audio streams (Glaser ¶0035, “detect and/or predict locations of microphones and their relative positioning”) and representative of audio stream coordinates (It is old and known in the art that determining a position of a device can be done using coordinates. See pertinent art Beaty Col 4 lines 52-67) in an acoustical space where an audio stream was captured (Glaser figure 1 and ¶0031, “microphones 110 functions to acquire multiple audio inputs from a set of distinct location”) or audio stream coordinates in a virtual acoustical space where an audio stream was synthesized or both, each of the audio streams representative of a soundfield (Glaser figure 4); and one or more processors coupled to the memory (Glaser figure 1 and ¶0102), and configured to: determine device location information representative of device coordinates of the content consumer device in the acoustical space (Glaser ¶0091, “Collecting agent environment orientation can include sensing position using GPS … smart direction of the glasses.” GPS inherently uses coordinates.
See also pertinent art Pong Col 4 lines 21-26 teaching a headset including orientation sensors), determine first audio stream coordinates for a first audio stream based on the audio location information (Glaser figure 4 and ¶0037, “Audio feature detection…used in calculating location…deep learning model trained on locating of audio sources,” See also ¶0038); determine a first audio source distance as a distance between the first audio stream coordinates (Glaser figure 13, positions of sound sources P1-P3) and the device coordinates (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest”); select, the first audio stream of the plurality of audio streams (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest” and figure 13, P1 and P2 are outputted but P3 is muted. It is obvious to one with ordinary skills in the art that various scenarios with different audio source position and user position will adjust the audio source attenuations. One scenario may be where an audio source of interest is a single source and the rest of the audio sources are behind the user. For example in figure 13, If both sources P3 and P1 were behind the user, they will both be muted and only audio source P2 is reproduced. 
See also ¶0084); and output, based on the first audio stream, one or more speaker feeds (Glaser figure 8, S140). However, Glaser does not explicitly teach coordinates of the content consumer device; compare the first audio source distance to a first audio source distance threshold; select, when the first audio source distance is less than or equal to the first audio source distance threshold, the first audio stream of the plurality of audio streams, wherein the first audio stream is an only audio stream selected.

Murtaza teaches coordinates of the content consumer device (Murtaza Page 24 ¶2, “The media consumption device is also responsible for collection information about user location and/or orientation and/or direction of movement” and page 3, Terminology and Definitions: “User position information: location information (e.g., x, y, z coordinates”), compare the first audio source distance (Page 22 last ¶, “rendering of each audio source is adapted to the user position…level of the audio source is higher when the user is closer to the position of the audio source, and lower when the user is more distant from the audio source,”) to a first audio source distance threshold (Murtaza figure 3 and Page 30, The coordinates that mark the transition between scenes can be considered thresholds. The coordinates of the transition door that completely phase out audio element 152A and 152B); select, when the first audio source distance is less than or equal to the first audio source distance threshold, the first audio stream of the plurality of audio streams (Murtaza figure 3, when user is within the zone of first scene A only, wherein entering the transition door would be greater than the threshold), wherein the first audio stream is an only audio stream selected (Murtaza figure 3, and Page 23 ¶2, “creating one or more streams for each available scene 150 associated with one sound scene part of one viewpoint,” in other words, there are plurality of scenes wherein each scene may comprise only one stream).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Murtaza to improve the known device of Glaser to achieve the predictable result of accurately determining the position of a user to provide optimal directionality.



Regarding claims 12 and 51, Glaser in view of Murtaza teaches wherein the one or more processors are configured to combine the first audio stream and the second audio stream by at least one of adaptive mixing the first audio stream and the second audio stream or interpolating a third audio stream based on the first audio stream and the second audio stream (Murtaza figure 3, “streams of both scenes”).

Regarding claims 13 and 52, Glaser in view of Murtaza teaches wherein the one or more processors are configured to combine the first audio stream and the second 

Regarding claims 15 and 54, Glaser in view of Murtaza teaches wherein the one or more processors are further configured to: select, when the second audio source distance is less than or equal to the second audio source threshold, the second audio stream of the plurality of audio streams; and output, based on the second audio stream, one or more speaker feeds, wherein the second audio stream is an only audio stream selected (Murtaza figure 3, when user does not enter transition scene, the second audio source distance is less than the second audio source threshold with BRI).

Regarding claims 16 and 55, Glaser in view of Murtaza teaches wherein the one or more processors are further configured to select a different audio stream based the device coordinates changing (Murtaza Page 20 ¶2 discloses a transitional position would provide two streams of audio versus reaching a boundary where one of the audio sources is no longer relevant. The single stream can be the audio from the new audio scene when the device is in a position where the old audio scene completely phases out. The new audio scene would have the shortest distance to the device coordinates).

Regarding claim 18, Glaser in view of Murtaza teaches wherein the audio stream coordinates in the acoustical space or the audio stream coordinates in the virtual acoustical space are coordinates in a displayed world in relation to which the corresponding audio stream was captured or synthesized (Murtaza Page 26 last ¶, 

Regarding claims 21 and 44, Glaser in view of Murtaza teaches wherein the content consumer device comprises a mobile handset (Glaser ¶0102, “mobile device”).

Regarding claims 23 and 46, Glaser in view of Murtaza teaches wherein the one or more processors are further configured to only decode the subset of the plurality of audio streams, in response to the selection (Glaser figure 13, P3 is muted).

Regarding claim 25, Glaser teaches A method of playing one or more of a plurality of audio streams (Glaser figure 1 and ¶0061, “The audio generator 140” and ¶0064), the method comprising: storing, by a memory of a content consumer device (Glaser ¶0102, “memory”), the plurality of audio streams (Glaser figure 4, and ¶046, “map of audio sources”) and audio location information associated with the plurality of audio streams (Glaser ¶0035, “detect and/or predict locations of microphones and their relative positioning”) and representative of audio stream coordinates (It is old and known in the art that determining a position of a device can be done using coordinates. See pertinent art Beaty Col 4 lines 52-67) in an acoustical space where an audio stream was captured (Glaser figure 1 and ¶0031, “microphones 110 functions to acquire multiple audio inputs from a set of distinct location”) or audio stream coordinates in a virtual acoustical space where an audio stream was synthesized or both, each of the audio streams representative of a soundfield (Glaser figure 4); and determining, by one or more processors of the content consumer device, device location information representative of device coordinates of the content consumer device in the acoustical space (Glaser ¶0091, “Collecting agent environment orientation can include sensing position using GPS … smart direction of the glasses.” GPS inherently uses coordinates. See also pertinent art Pong Col 4 lines 21-26 teaching a headset including orientation sensors);

determining, by the one or more processors, an audio source distance based on the audio location information as a distance between an audio source in the acoustical space (Glaser figure 13, positions of sound sources P1-P3) and the device coordinates (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest”);

selecting, a single audio stream of the plurality of audio streams as a subset of the plurality of audio streams, the single audio stream having a shortest audio source distance (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest” and figure 13, P1 and P2 are outputted but P3 is muted. It is obvious to one with ordinary skill in the art that various scenarios with different audio source position and user position will adjust the audio source attenuations. One scenario may be where an audio source of interest is a single source and the rest of the audio sources are behind the user. For example in figure 13, if both sources P3 and P1 were behind the user, they will both be muted and only audio source P2 is reproduced. See also ¶0084); and outputting, by the one or more processors and based on the subset of the plurality of audio streams, one or more speaker feeds (Glaser figure 8, S140). However, Glaser does not explicitly teach coordinates of the content consumer device; comparing, by the one or more processors, the audio source distance to an audio source distance threshold; selecting, when the audio source distance is greater than the audio source distance threshold, a single audio stream of the plurality of audio streams as a subset of the plurality of audio streams, the single audio stream having a shortest audio source distance.
Murtaza teaches coordinates of the content consumer device (Murtaza Page 24 ¶2, “The media consumption device is also responsible for collection information about user location and/or orientation and/or direction of movement” and page 3, Terminology and Definitions: “User position information: location information (e.g., x, y, z coordinates”), comparing, by the one or more processors, the audio source distance (Page 22 last ¶, “rendering of each audio source is adapted to the user position…level of the audio source is higher when the user is closer to the position of the audio source, and lower when the user is more distant from the audio source,”) to an audio source distance threshold (Murtaza figure 3 and Page 30, The coordinates that mark the transition between scenes can be considered thresholds. The coordinates of the transition door that completely phase out audio element 152A and 152B); selecting, when the audio source distance is greater than the audio source distance threshold, a single audio stream of the plurality of audio streams (Murtaza Page 23 ¶2, “creating one or more streams for each available scene 150 associated with one sound scene part of one viewpoint,” in other words, there are plurality of scenes wherein each scene may comprise only one stream) as a subset of the plurality of audio streams (Murtaza Page 22 last ¶, “Each audio element (audio source) is notwithstanding encoded in audio streams that are provided to the decoder,” “rendering of each audio source is adapted to the user position…level of the audio source is higher when the user is closer to the position of the audio source, and lower when the user is more distant from the audio source,” See also Page 23 last 2¶, “just individual audio objects”), the single audio stream having a shortest audio source distance (Murtaza figure 3, once the distance between user and source 152A exceeds the transition door area and reaches scene B, the audio stream for element 152B is selected which 
would have the shortest audio source distance to the user).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Murtaza to improve the known device of Glaser to achieve the predictable result of accurately determining the position of a user to provide optimal directionality.

Regarding claim 47, Glaser in view of Murtaza teaches determining, by the one or more processors, an audio source distance as a distance between an audio source in the acoustical space and the device coordinates; comparing, by the one or more 

Regarding claim 49, Glaser teaches A non-transitory computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors of a content consumer device (Glaser figure 1 and ¶0061, “The audio generator 140” and ¶0064) to: store a plurality of audio streams (Glaser figure 4, and ¶046, “map of audio sources”) and audio location information associated with the plurality of audio streams (Glaser ¶0035, “detect and/or predict locations of microphones and their relative positioning”) and representative of audio stream coordinates (It is old and known in the art that determining a position of a device can be done using coordinates. See pertinent art Beaty Col 4 lines 52-67) in an acoustical space where an audio stream was captured (Glaser figure 1 and ¶0031, “microphones 110 functions to acquire multiple audio inputs from a set of distinct location”) or audio stream coordinates in a virtual acoustical space where an audio stream was synthesized or both, each of the audio streams representative of a soundfield (Glaser figure 4); and

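determine device location information representative of device coordinates of the content consumer device in the acoustical space (Glaser ¶0091, “Collecting agent environment orientation can include sensing position using GPS … smart direction of the glasses.” GPS inherently uses coordinates. See also pertinent art Pong Col 4 lines 21-26 teaching a headset including orientation sensors);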

determine first audio stream coordinates for a first audio stream based on the audio location information (Glaser figure 4 and ¶0037, “Audio feature detection…used in calculating location…deep learning model trained on locating of audio sources,” See also ¶0038);

determine a first audio source distance as a distance between the first audio stream coordinates (Glaser figure 13, positions of sound sources P1-P3) and the device coordinates (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest”);

select, the first audio stream of the plurality of audio streams (Glaser ¶0091, “based on the environment orientation, the method may determine the audio sources in near proximity and/or in the direction of attention of a person, and appropriately set the audio processing properties for audio sources of interest” and figure 13, P1 and P2 are outputted but P3 is muted. It is obvious to one with ordinary skill in the art that various scenarios with different audio source position and user position will adjust the audio source attenuations. One scenario may be where an audio source of interest is a single source and the rest of the audio sources are behind the user. For example in figure 13, if both sources P3 and P1 were behind the user, they will both be muted and only audio source P2 is reproduced. See also ¶0084); and

output, based on the first audio stream, one or more speaker feeds (Glaser figure 8, S140). However, Glaser does not explicitly teach coordinates of the content consumer device; compare the first audio source distance to a first audio source distance threshold; select, when the first audio source distance is less than or equal to the first audio source distance threshold, the first audio stream of the plurality of audio streams; wherein the first audio stream is an only audio stream selected.

Murtaza teaches coordinates of the content consumer device (Murtaza Page 24 ¶2, “The media consumption device is also responsible for collection information about user location and/or orientation and/or direction of movement” and page 3, Terminology and Definitions: “User position information: location information (e.g., x, y, z coordinates”); compare the first audio source distance (Page 22 last ¶, “rendering of each audio source is adapted to the user position…level of the audio source is higher when the user is closer to the position of the audio source, and lower when the user is more distant from the audio source,”) to a first audio source distance threshold (Murtaza figure 3 and Page 30, The coordinates that mark the transition between scenes can be considered thresholds. The coordinates of the transition door that completely phase out audio element 152A and 152B); select, when the first audio source distance is less than or equal to the first audio source distance threshold, the first audio stream of the plurality of audio streams (Murtaza figure 3, when user is within the zone of first scene A only, wherein entering the transition door would be greater than the threshold); wherein the first audio stream is an only audio stream selected (Murtaza figure 3, and Page 23 ¶2, “creating one or more streams for each available scene 150 associated with one sound scene part of one viewpoint,” in other words, there are plurality of scenes wherein each scene may comprise only one stream).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Murtaza to improve the known device of Glaser to achieve the predictable result of accurately determining the position of a user to provide optimal directionality.

Claims 14, 38 and 53 are rejected under 35 U.S.C. 103 as being unpatentable over Glaser (US 20180088900) in view of Murtaza (WO 2019/072984) in further view of Mentz (US 2016/0295341).

Regarding claims 14, 38, and 53, Glaser in view of Murtaza does not explicitly teach wherein the one or more processors are further configured to: determine whether the device coordinates have been steady relative to the first audio source distance threshold and the second audio source distance threshold for a predetermined period of time; and based on the device coordinates being steady relative to the first audio source distance threshold and the second audio source distance threshold for a predetermined period of time, select the first audio stream, the first audio stream and the second audio stream, or the second audio stream.

Mentz teaches wherein the one or more processors are further configured to: determine whether the device coordinates have been steady relative to the first audio source distance threshold and the second audio source distance threshold for a predetermined period of time (Mentz ¶0015 and table, “Hold the sound stage still while the user rotates head such that source sounds as if at stationary point while the user rotates head with respect to same stationary point”); and based on the device coordinates being steady relative to the first audio source distance threshold and the second audio source distance threshold for a predetermined period of time, select the first audio stream, the first audio stream and the second audio stream, or the second audio stream (Mentz ¶0015 and table, “Place multiple sources anywhere in a disc or hemisphere, such that each of the multiple source sounds as if originating at a certain angular positon”).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Mentz to improve the known device of Glaser in view of Murtaza to achieve the predictable result of a more realistic surround sound 3D audio experience in every spatial position.
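
For illustration only, the "steady for a predetermined period of time" limitation of claims 14 and 53 can be sketched as a dwell-time check. This is a minimal sketch under stated assumptions and is not drawn from Mentz, Glaser, Murtaza, or the application: the class name SteadySelector, its method names, the rule used to map distances to a candidate selection, and the use of caller-supplied timestamps are all hypothetical.

```python
class SteadySelector:
    """Hypothetical dwell-time selector: a new selection is committed only
    after the device has stayed in the same region, relative to both
    thresholds, for at least hold_seconds."""

    def __init__(self, first_threshold, second_threshold, hold_seconds):
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.hold_seconds = hold_seconds
        self._candidate = None   # region the device is currently in
        self._since = None       # time at which that region was entered
        self.selection = None    # last committed selection

    def _classify(self, d1, d2):
        # Assumed rule: within only the first threshold -> first stream,
        # within only the second -> second stream, otherwise both.
        in_first = d1 <= self.first_threshold
        in_second = d2 <= self.second_threshold
        if in_first and not in_second:
            return ("first",)
        if in_second and not in_first:
            return ("second",)
        return ("first", "second")

    def update(self, d1, d2, now):
        """d1, d2: first and second audio source distances; now: seconds."""
        candidate = self._classify(d1, d2)
        if candidate != self._candidate:
            # Region changed: restart the steadiness timer.
            self._candidate = candidate
            self._since = now
        elif now - self._since >= self.hold_seconds:
            # Steady for the predetermined period: commit the selection.
            self.selection = candidate
        return self.selection

selector = SteadySelector(first_threshold=2.0, second_threshold=2.0, hold_seconds=1.0)
history = [
    selector.update(1.0, 5.0, now=0.0),
    selector.update(1.0, 5.0, now=0.5),
    selector.update(1.0, 5.0, now=1.0),
    selector.update(5.0, 1.0, now=1.2),
    selector.update(5.0, 1.0, now=2.3),
]
```

The design choice here is that a brief crossing of a threshold does not change the output; the previously committed selection is held until the new region has persisted for the full hold period.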

Claims 17 and 56 are rejected under 35 U.S.C. 103 as being unpatentable over Glaser (US 20180088900) in view of Murtaza (WO 2019/072984) in further view of Galera (US 2019/0147655).

Regarding claims 17 and 56, Glaser in view of Murtaza does not explicitly teach wherein the one or more processors are further configured to provide an alert to a user based on the first audio source distance equaling the first audio source distance threshold, wherein the alert is at least one of a visual alert, an auditory alert other than a change from selecting the first audio source to selecting a different audio source, or a haptic alert.

Galera teaches wherein the one or more processors are further configured to provide an alert to a user based on the first audio source distance equaling the first audio source distance threshold (Galera ¶0005, “safety zone configuration data from a safety sensor device, the safety zone configuration data defining dimensions of a safety field defined for the safety sensor device; and in response to determining that the user location data and the user orientation data indicate that a portion of the safety field is within a field of view of the wearable appliance, rendering, by the system on the augmented reality presentation based on the safety zone configuration data, a graphical representation of the portion of the safety field”), wherein the alert is at least one of a visual alert (Galera ¶0056, “employ color or position animations based on state”), an 

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Galera to improve the known device of Glaser in view of Murtaza to achieve the predictable result of informing the user of their current environment for improved awareness.

Claims 19-20, 22, 42-43, and 45 are rejected under 35 U.S.C. 103 as being unpatentable over Glaser (US 20180088900) in view of Murtaza (WO 2019/072984) in further view of Mindlin (US 10484811).

Regarding claims 19 and 42, Glaser in view of Murtaza does not explicitly teach wherein the content consumer device comprises an extended reality headset, and wherein the displayed world comprises a scene represented by video data captured by a camera.

Mindlin teaches wherein the content consumer device comprises an extended reality headset, and wherein the displayed world comprises a scene represented by video data captured by a camera (Mindlin figure 2A and Col 6 lines 49-67, “For example, an extended reality world may be a virtual reality world in which the entire real-world environment in which the user is located is replaced by a virtual world (e.g., a computer-

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Mindlin to improve the known device of Glaser in view of Murtaza to achieve the predictable result of a more accurate and immersive simulation of audio (Mindlin Col 3 lines 10-26).

Regarding claims 20 and 43, Glaser in view of Murtaza in further view of Mindlin teaches wherein the content consumer device comprises an extended reality headset, and wherein the displayed world comprises a virtual world (Mindlin figure 2A and Col 6 lines 49-67, “For example, an extended reality world may be a virtual reality world in which the entire real-world environment in which the user is located is replaced by a virtual world (e.g., a computer-generated virtual world, a virtual world based on a real-world scene that has been captured or is presently being captured with video footage from real world video cameras, or the like”).

Regarding claims 22 and 45, Glaser in view of Murtaza in further view of Mindlin teaches a transceiver configured to wirelessly receive the plurality of audio streams, wherein the transceiver is configured to wirelessly receive the plurality of audio streams .

Response to Arguments
Applicant's arguments filed 8/25/2021 have been fully considered but they are not persuasive. Applicant argues on pages 15-16 of the Remarks that the cited references, Glaser in view of Murtaza, do not teach the amended claims because the example of Murtaza discusses delivering all 10 audio objects without mixing, not a single audio stream. Examiner respectfully disagrees. Although Examiner agrees that said particular example of Murtaza does not teach the amended limitations, Murtaza does teach on Page 23 that each sound scene (Murtaza figure 1.2) may contain only one stream (Murtaza Page 23 third ¶, “one or more streams 106 for each available scene 150 associated with one sound scene part of one viewpoint”), and each stream may contain only one audio object (Murtaza Page 23 last 2¶, “the boundaries of each audio scene…just individual audio objects,” see also figure 3, Scene A and B have only one audio source 152A or 152B). Applicant also argues, regarding claim 11, that an audio scene of Murtaza is not equivalent to an audio stream and that each scene contemplates a plurality of streams. Examiner respectfully disagrees. Although Murtaza does teach that each scene can comprise multiple streams, Murtaza also teaches that each scene can comprise only one stream, as previously cited. As shown in figure 3 of Murtaza, when the user is in the transition scene, the distance between the user and source 152A has increased and exceeded a threshold compared to when the user is in Scene A. Likewise, when the user is in the transition scene, the distance between the user and .

Applicant argues on page 17 of the Remarks that claim 15 is not taught because Murtaza page 19 with the 3 hyphens does not teach the limitations. Examiner respectfully disagrees. Figure 3 of Murtaza clearly teaches the limitations as cited in the current rejection. Therefore, the arguments are not persuasive and the claims stand rejected.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Beaty (US 9042563), Pong (US 9445172).
 
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NORMAN YU whose telephone number is (571)270-7436.  The examiner can normally be reached on Mon - Fri 11am-7pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at 571-272-7488.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Any response to this action should be mailed to:
                        Commissioner of Patents and Trademarks
                        P.O. Box 1450
                        Alexandria, Va.  22313-1450
Or faxed to:
                        (571) 273-8300, for formal communications intended for entry and for
                        informal or draft communications, please label “PROPOSED” or “DRAFT”.
Hand-delivered responses should be brought to:

                         Customer Service Window 
                         Randolph Building 
                         401 Dulany Street 
                         Arlington, VA 22314




/NORMAN YU/Primary Examiner, Art Unit 2652