DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. See 35 U.S.C. § 100 (note).
Continued Examination
A request for continued examination under 37 C.F.R. § 1.114, including the fee set forth in 37 C.F.R. § 1.17(e), was filed in this application on 24 February 2022 after final rejection.  Since this application is eligible for continued examination under 37 C.F.R. § 1.114, and the fee set forth in 37 C.F.R. § 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 C.F.R. § 1.114.  Applicant's submission filed on 24 February 2022 has been entered.
Art Rejections
Obviousness
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 16, 26–29 and 35 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of US Patent Application Publication 2020/0221248 (effectively filed 29 September 2017) (“Eubank”); US Patent Application Publication 2018/0349406 (filed 25 September 2017) (“Shortlidge”); US Patent Application Publication 2005/0138540 (published 23 June 2005) (“Baltus”) and CN 106162378 (published 23 November 2016) (“Liu”).
Claims 17–23 and 30–34 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Eubank; Shortlidge; Baltus; Liu and US Patent 8,531,602 (patented 10 September 2013) (“Tudor”).
Claim 24 is rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Eubank; Shortlidge; Baltus; Liu and US Patent Application Publication 2009/0282335 (published 12 November 2009) (“Alexandersson”).
Claim 16 is drawn to “an apparatus.” The following table illustrates the correspondence between the claimed apparatus and the Eubank reference.
Claim 16
The Eubank Reference
“16. An apparatus comprising:
Similarly, the Eubank reference describes a computerized data processing system for spatial audio processing. Eubank at Abs., ¶¶ 2, 5.
“at least one processor; and
“at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
Eubank’s system includes a processor and a storage medium that stores a sequence of instructions that are executed by the processor. Id. at ¶¶ 43, 44.
“select spatial audio content in dependence upon a position of a user;
“render, for consumption by the user, the selected spatial audio content comprising a first spatial audio content;
Eubank’s system similarly selects and renders spatial audio content based on a user’s position in at least two manners. Eubank’s system implements both a simulated reality (SR) experience mode 106 and a preview mode 108 in which an end user, or sound designer, may preview an SR audio object (spatial audio content) for use in SR experience mode 106.
An end user, or designer, may select an audio object for previewing, or rendering, by moving throughout a virtual environment and then peeking into rooms associated with different sound objects 206. Id. This allows a user to preview (select and render) different objects based on the user’s location 410 and orientation 412. Id. The user may then enter the room to trigger the full SR experience mode 106.
Further, in preview mode 108, the user, or designer, uses a GUI to manipulate a 3D-representation of a spatial audio object 206 and to position himself with respect to object 206. Id. at ¶¶ 37, 40–42, FIG.4. This process allows the user, or designer, to virtually select a spatialized sound source 11, 13 contained in object 206 based on the position of the object and the position of the user. Id. Eubank’s system effects the selection by weighting the sound sources based on their proximity to the user, or designer, or by triggering playback based on proximity. Id. at ¶¶ 37, 40–42, FIG.4. Eubank’s system then reproduces/renders the weighted sound sources for the user, or designer. Id.
“responsive to user consumption of the first spatial audio content, update recorded data related to the first spatial audio content with spatial audio metadata…
“use the spatial audio metadata within the recorded data to detect another [[new]] event relating to the first spatial audio content; and
Eubank’s system is computerized, id. at ¶¶ 43, 44, and accordingly makes continuous recordings of data relating to object 206, including the position of the object, to update the system’s representation of the object, see id. at ¶ 37, FIG.3. Eubank describes a use scenario where the designer makes changes to the object, such as a virtual turning of the object. Id. This would generate updated metadata relating to the object’s position, which is recorded and used to detect a new position that is used to update the object’s displayed position and the audio preview. Id.
“wherein the user consumption of the first spatial audio content is determined based on the position of the user in the virtual space correlating with a position of the first spatial audio content in the virtual space for at least a predetermined amount of time, and
Eubank describes a user consumption process that involves previewing SR audio objects when a user peeks at the object—for example, peeking into a room in order to select the audio object for a preview. The reference, however, does not describe how a user peeks. Accordingly, Eubank does not anticipate the claimed determination of consumption based on the position of the user in a virtual space correlating with the position of spatial audio content.
“wherein the spatial audio metadata comprises data identifying the first spatial audio content, a version identifier of the first spatial audio content, and at least one of an indication of when the user consumed the first spatial audio content, an indication of the user who consumed the first spatial audio content, an indication of a user device associated with rendering the first spatial audio content, an indication of a position of the user when the first spatial audio content was consumed, or a starting point of consumption or an ending point of consumption within the first audio spatial content;
The Eubank reference describes responding to a user’s consumption of object 206 by updating metadata associated with object 206, including relative positioning between a user and audio sources within the object. Id.
Eubank, however, does not describe responding to user consumption of object 206 by updating identifying data for object 206, a version identifier of object 206 and one of when, who, what device, starting point or ending point of consumption.
“provide a user-selectable option to enable rendering, for consumption by the user, of the first spatial audio content by rendering a simplified sound object representative of the first spatial audio content.”
Eubank’s system similarly provides a designer with the ability to preview object 206 as a simplified, mono object. Id. at ¶¶ 39–42, FIG.4. For example, a user may position/orient himself in a virtual environment to peek inside a room. Id. at ¶ 41. Further, the user may position object 206 in such a way that only one sound source is presented on a single output channel or so that multiple sound sources are presented on a single output channel. Id. at ¶ 38.

Table 1
The foregoing table shows that the Eubank reference describes a data processing system that corresponds closely to the claimed apparatus. The Eubank reference describes updating metadata in response to a user’s consumption of first spatial audio content (e.g., Eubank’s audio object 206). For example, Eubank describes updating position metadata, such as a user’s position relative to spatial sound object 206. See Eubank at ¶¶ 37, 38, FIGs.2, 3. Claim 16, on the other hand, requires updating a combination of metadata that is not updated in Eubank. Claim 16 specifically requires updating identifying data for object 206, a version identifier of object 206 and one of when, who, what device, user position, starting point or ending point of consumption. Eubank only updates user position. Moreover, Eubank does not determine user consumption based on a user’s virtual position correlating with the virtual position of an audio object (i.e., spatial audio content) for a predetermined period of time.
The differences between Eubank and the claimed invention are such that the invention as a whole would have been obvious to one of ordinary skill in the art at the time of filing. Eubank’s system enables a user to navigate a virtual environment in order to select, preview and experience audio objects distributed throughout the environment. Notably, Eubank’s system is configured so an end user may simply navigate, preview and experience the audio objects and so a designer may navigate, preview and experience the audio objects as part of an audio design process.
The Shortlidge reference teaches and suggests a refinement to Eubank’s audio design process that enables collaboration between parties. Shortlidge at ¶ 31. Shortlidge describes a system that allows two users to share media and to collaborate in the editing of the media. Id. For example, two musicians may be working on a piece of music. Id. Shortlidge’s system records and updates metadata concerning the music based on the accessing, consumption and changes made to the music by the collaborating users. Id. A first of the collaborating users shares updated music files with a second one of the users. Id. Shortlidge’s system performs a database comparison between the updated music file and the original music file stored on the second user’s computer. Id. at ¶¶ 31, 39, 41, 108–112, FIG.6. Any inconsistencies are resolved through application of a set of rules, which may include highlighting the differences. Id. at ¶ 39. Shortlidge describes tracking and updating numerous pieces of metadata including version number (e.g., time stamp of a modification), when the data was created, the device used to create the data, the person that created the data, etc. Id. at ¶¶ 48, 50, 55.
In a similar vein, the Baltus reference teaches and suggests tracking metadata concerning a document to determine whether a document currently being accessed by a user has changed in any meaningful way from the last time the user accessed the document. Baltus at ¶¶ 4–15. According to Baltus, this tracking allows a system to highlight the changes to the user, quickly drawing the user’s attention to changes made since the user last viewed the document. Id.
Accordingly, the Eubank, the Shortlidge and the Baltus references reasonably teach and suggest modifying Eubank’s editing system to include an ability to collaborate in the editing of the audio object. The references further teach and suggest updating metadata for audio objects to reflect file name changes; version data, such as time of modification; who accessed, consumed and modified the audio object; what device was used to access, consume and modify the audio object and others. The metadata updates would allow for the detection of audio object changes through a database comparison between file versions (i.e., comparing recorded data of first spatial audio content with equivalent data for new first spatial audio content). Inconsistencies would be resolved through rules. The user of Eubank’s system would then be able to view highlights of changes since the last time he accessed and consumed the audio object, specifically by invoking a preview mechanism that would allow the user to see differences in versions of the same audio object.
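For illustration only, the version comparison described in the preceding paragraph can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from the Eubank, Shortlidge or Baltus references: the record layout (ContentRecord), its field names, the function name detect_changes and the sample values are all hypothetical and were chosen only to mirror the kinds of metadata discussed above (identifier, version, who modified, what device).

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ContentRecord:
    """Hypothetical metadata record for one audio object (illustrative fields only)."""
    content_id: str
    version: str       # e.g., a time stamp of the last modification
    modified_by: str   # who last modified the object
    device: str        # device used for the last modification


def detect_changes(recorded: ContentRecord, current: ContentRecord) -> dict:
    """Return {field: (old, new)} for every field that differs between the records."""
    if recorded.content_id != current.content_id:
        raise ValueError("records describe different audio objects")
    old, new = asdict(recorded), asdict(current)
    return {name: (old[name], new[name]) for name in old if old[name] != new[name]}


# Hypothetical sample data: the record saved at the user's last consumption
# versus the record of the version currently on offer.
recorded = ContentRecord("object-206", "2022-01-10T09:00", "designer-a", "workstation-1")
current = ContentRecord("object-206", "2022-02-01T14:30", "designer-b", "workstation-1")
changes = detect_changes(recorded, current)
```

In this sketch, a non-empty result stands in for the detected event that would prompt the system to offer a preview of the changed audio object; an empty result means nothing has changed since the last consumption.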
Eubank’s system involves the navigation and selection of media from a library of media items, or objects. Eubank at ¶ 41, FIG.4. Eubank’s end user, or designer, navigates through a virtual environment in order to select audio objects. Id. Audio objects are selected by peeking inside a room. Id. Eubank does not describe any mechanism for detecting and instigating a peeking operation. The prior art includes several references directed towards solving the same problem, particularly in connection with a user navigating a virtual space. The Liu reference, for instance, teaches and suggests presenting a list of media items to a user and monitoring the user’s focus on the list. Liu at 3–4, FIGs.2, 3 (describing steps S1, S211, S212, S213, S3). When the user’s focus is fixed on a media object in the list, like a video, for a predetermined amount of time (i.e., for more than a threshold amount of time), Liu describes interpreting that focus as a desire to preview the object. Id. Liu also describes recording metadata concerning a preview operation or a previous consumption event, such as the ID of the user and the stop time at which a user previously stopped consuming the media in order to determine the start point at which to begin previewing the content. Id. Accordingly, Eubank and Liu would have reasonably suggested to one of ordinary skill in the art at the time of filing modifying Eubank’s system to include a mechanism to detect the user’s focus on a particular room in a virtual environment. Given Eubank’s description of position and orientation inputs, see Eubank at ¶ 41, it would have been obvious for one of ordinary skill to have used a user’s position proximate to a virtual room and/or orientation towards the room to derive a user’s focus on the room. When the focus exceeds a predetermined amount of time, the system would peek into the room, playing a preview of the audio object. 
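For illustration only, the proposed dwell-based trigger can be sketched as follows. This is a minimal sketch under stated assumptions, not code from the Eubank or Liu references: the function name dwell_trigger_time, the sample format (time stamp plus a two-dimensional virtual position), the circular proximity test and all numeric values are hypothetical. The sketch treats the user as focused on a room while the user’s virtual position stays within a radius of the room, and reports the time at which that continuous focus first reaches the threshold.

```python
import math
from typing import Iterable, Optional, Tuple

Sample = Tuple[float, float, float]  # (time in seconds, x, y) of the virtual user


def dwell_trigger_time(samples: Iterable[Sample], room_xy: Tuple[float, float],
                       radius: float, threshold_s: float) -> Optional[float]:
    """Return the time at which continuous proximity first lasts threshold_s, else None."""
    entered_at = None
    for t, x, y in samples:
        near = math.hypot(x - room_xy[0], y - room_xy[1]) <= radius
        if not near:
            entered_at = None  # focus lost: restart the dwell timer
            continue
        if entered_at is None:
            entered_at = t     # focus begins with this sample
        if t - entered_at >= threshold_s:
            return t           # dwell threshold reached: trigger the preview
    return None


# Hypothetical track: the user approaches the room and then lingers near it.
lingering = [(0.0, 5.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.5), (3.0, 0.5, 0.5)]
# Hypothetical track: the user steps away before the threshold is reached.
passing = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (1.5, 5.0, 0.0), (2.0, 1.0, 0.0), (3.0, 1.0, 0.0)]

trigger = dwell_trigger_time(lingering, (0.0, 0.0), radius=1.5, threshold_s=2.0)
no_trigger = dwell_trigger_time(passing, (0.0, 0.0), radius=1.5, threshold_s=2.0)
```

An orientation test could be added to the proximity test in the same manner; it is omitted here to keep the sketch short.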
For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Liu references makes obvious all limitations of the claim.
Claim 17 depends on claim 16 and further requires the following:
“wherein using the recorded data to detect another [[new]] event comprises detecting that the first spatial audio content has been adapted to create new first spatial audio content; and
“wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises providing a user-selectable option for the user to enable rendering, for consumption by the user, of the new first spatial audio content.”
The obviousness rejection of claim 16 shows that the Eubank reference describes detecting new events related to spatial audio content, such as position changes to a spatial sound object 206. See Eubank at ¶¶ 37, 38, FIGs.2, 3. This new event detection differs from the type recited in this claim. This claim instead requires detecting when first spatial audio content has been adapted to create new first spatial audio content. This is not a patentable difference.
Eubank’s system enables a user/designer to preview spatial sounds while editing and designing a sound object. Id. at ¶ 37. The Tudor reference expands on Eubank’s editing system by teaching a mechanism for adding audio enhancements to a sound object. Tudor at Abs., col. 1. For example, Tudor recognizes that some sound objects may include low-quality audio portions. Id. Tudor describes a system that recognizes these low-quality audio portions and replaces them with enhanced audio portions. Id. Tudor’s system also includes the ability to record and detect these enhancements, and to present the user with a choice to preview the enhancements alongside the original audio. Id. at cols. 3–4, col. 7 l. 39 to col. 8 l. 28, FIG.5. Accordingly, it would have been obvious for one of ordinary skill in the art at the time of filing to configure Eubank’s system with an audio enhancement mechanism that is similar to the one taught and suggested by the Tudor reference. Eubank’s system would then record and detect changes, or enhancements, made to a sound object. See id. The user/designer would then choose whether to hear the original sound object or a preview of the new sound object together with its enhancements. See id. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Tudor references makes obvious all limitations of the claim.
Claim 18 depends on claim 17 and further requires the following:
wherein using the recorded data to detect another [[new]] event comprises comparing recorded data for the first spatial audio content with equivalent data for the new first spatial audio content.
The combination of Eubank and Tudor does not suggest the claimed comparison between equivalent data present in first spatial audio content and new first spatial audio content.
The Shortlidge reference teaches and suggests a further refinement to the Eubank-Tudor audio object editing system that enables collaboration between parties. Shortlidge at ¶ 31. Shortlidge describes a system that allows two users to share media and to collaborate in the editing of the media. Id. For example, two musicians may be working on a piece of music. Id. Shortlidge’s system records changes made to the music by a first one of the users. Id. The first user then shares the updated music file with the second one of the users. Id. Shortlidge’s system performs a database comparison between the updated music file and the original music file stored on the second user’s computer. Id. at ¶¶ 31, 39, 41, 108–112, FIG.6. Any inconsistencies are resolved through application of a set of rules, which may include highlighting the differences. Id. at ¶ 39. Shortlidge describes a visual highlight as an example. Id. The Tudor reference, as discussed in the obviousness rejection of claim 17, alternatively teaches and suggests an audio highlight mechanism, where differences/enhancements are provided through mixing. Tudor at col. 7 l. 39 to col. 8 l. 28, FIG.5.
Accordingly, the Eubank, the Tudor and the Shortlidge references reasonably teach and suggest modifying Eubank’s editing mechanism to include an ability to enhance an audio object and to collaborate in the editing of the audio object. Collaborative updates would be detected through a database comparison between file versions (i.e., comparing recorded data of first spatial audio content with equivalent data for new first spatial audio content). Inconsistencies would be resolved through rules, including an audible preview mechanism that allows a designer to selectively present mixed versions of original and updated audio objects. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Tudor references makes obvious all limitations of the claim.
Claim 19 depends on claim 17 and further requires the following:
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises causing rendering of a simplified sound object representative of the first spatial audio content before or after adaptation.
The obviousness rejection of claim 17, incorporated herein, shows the obviousness of selectively rendering either an original audio object or an enhanced version of the original audio object. The Eubank reference also describes presenting a simplified sound object, such as a downmix of an audio object, based on the object’s position relative to the virtual user. Eubank at ¶¶ 39–42, FIG.4. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Tudor references makes obvious all limitations of the claim.
Claim 20 depends on claim 17 and further requires the following:
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises rendering a limited preview of the new first spatial audio content.
The obviousness rejection of claim 17, incorporated herein, shows the obviousness of allowing a sound designer to selectively render either an original audio object or an enhanced version of the original audio object. The Eubank reference also describes presenting a simplified sound object, such as a downmix of an audio object, based on the object’s position relative to the virtual user. Eubank at ¶¶ 39–42, FIG.4. Accordingly, it would have been obvious for one of ordinary skill in the art at the time of filing to simply combine these concepts so that the designer may choose to render either an original audio object or an enhanced audio object as a limited preview/downmix. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Tudor references makes obvious all limitations of the claim.
Claim 21 depends on claim 20 and further requires the following:
wherein the limited preview depends upon how the new first spatial audio content for consumption differs from the user-consumed first spatial audio content.
Likewise, the Tudor reference teaches and suggests presenting enhanced audio to highlight a difference between an original audio object and an enhanced audio object. See Tudor at col. 7 l. 39 to col. 8 l. 28, FIG.5. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Tudor references makes obvious all limitations of the claim.
Claim 22 depends on claim 17 and further requires the following:
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises causing rendering of a simplified sound object dependent upon a selected subset of a group of one or more sound objects of the new first spatial audio content, at a selected position dependent upon a volume associated with the group of one or more sound objects and with an extent dependent upon the volume associated with the group of one or more sound objects.
Similarly, the Eubank reference describes presenting a preview of a volume/room based on the user’s position relative to a virtual extent of the room. Eubank at ¶ 41. In particular, when the user is outside the room, a limited preview of the room is rendered, including only those sounds oriented towards the user. Id. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Tudor references makes obvious all limitations of the claim.
Claim 23 depends on claim 17 and further requires the following:
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises highlighting the new first spatial audio by rendering the new first spatial audio in preference to other spatial audio content.
The obviousness rejection of claim 17, incorporated herein, shows the obviousness of allowing a sound designer to selectively render either an original audio object or an enhanced version of the original audio object that replaces part of the original audio object. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Tudor references makes obvious all limitations of the claim.
Claim 24 depends on claim 16 and further requires the following:
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises causing rendering of a simplified sound object that extends in a vertical plane.
Eubank describes a simulated reality environment that includes multiple rooms/volumes. Eubank at ¶¶ 26, 27, 41. A user navigates the environment and is presented with a preview of the rooms, where the previews include a downmix of the spatial audio content contained in each room. Id. Alexandersson expands on Eubank’s SR environment concept. Alexandersson describes navigating a three-dimensional space (e.g., a hallway or an elevator) with multiple rooms/volumes. Alexandersson at ¶¶ 65, 83, FIG.3. Previews from each room are presented using spatial audio, such that each room preview is presented as a simplified object located with respect to the user’s position in the 3D space. Id. at ¶¶ 40, 60. This would have reasonably suggested using spatial audio techniques to present room previews in Eubank’s system. Room previews would then be presented with a horizontal extent, a vertical extent or a combination of horizontal and vertical extents based on their proximity to the user in a 3D SR environment. See id. at ¶ 83, FIG.5. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus, the Liu and the Alexandersson references makes obvious all limitations of the claim.
Claim 26 depends on claim 16 and further requires the following:
further caused to: divide a sound space into different non-overlapping groups of one or more sound objects associated with different non-overlapping volumes of the sound space; and
provide a user-selectable option for the user to enable rendering, for consumption by the user, of any one of the respective groups of one or more sound objects by interacting with the associated volume,
wherein providing a user-selectable option for a first group comprises rendering a simplified sound object dependent upon a selected subset of the sound objects of the first group.
Claim 27 depends on claim 26 and further requires the following:
wherein interacting with the associated volume occurs by a virtual user approaching, staring at or entering the volume, wherein a position of the virtual user changes with a position of the user.
Eubank describes a similar simulated reality preview operation. More particularly, Eubank tracks a user’s movement (e.g., walking) and maps that into virtual movement in a simulated reality (SR) environment. Eubank at ¶ 26. The SR environment may include a number of mutually exclusive volumes, or rooms. Id. at ¶ 41. When near to, but outside, one of the rooms, Eubank’s system presents a simplified preview of the sounds in the room, for example, a downmix of the sources in the room that are currently located near the user. Id. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Liu references makes obvious all limitations of the claims.
Claim 28 depends on claim 16 and further requires the following:
further caused to: change a position of a virtual user when a position of the user changes;
cause, when the virtual user is outside a first volume associated with the first group, rendering of a simplified sound object dependent upon a selected first subset of the sound objects of the first group;
cause, when the virtual user is inside the first volume associated with the first group, rendering of the sound objects of the first group; and
cause, when the virtual user is moving from outside first volume to inside the first volume, rendering of a selected second subset of the sound objects of the first group.
Eubank describes a similar simulated reality preview operation. Eubank at ¶¶ 26, 27. More particularly, Eubank tracks a user’s movement (e.g., walking) and maps that into virtual movement in a simulated reality (SR) environment. Id. The SR environment may include a number of mutually exclusive volumes, or rooms. Id. at ¶ 41. When near to, but outside, one of the rooms, Eubank’s system presents a preview of the sounds in the room. Id. When inside the room, Eubank’s system presents the full sound object as an immersive SR experience. Id. Eubank does not address the situation when a user moves from outside the room to inside the room. However, the Eubank reference describes the idea of presenting subsets of sound sources based on the user’s position and the position of the sources belonging to an object. Id. at ¶¶ 40–42. Accordingly, in a particular use case, Eubank contemplates presenting a first subset of sounds when a user walks by a first side of a room and presenting a second subset of sounds when a user enters the room from a second, opposite side of the room. See id. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Liu references makes obvious all limitations of the claim.
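For illustration only, the position-dependent selection of sound sources discussed above can be sketched as follows. This is a minimal sketch under stated assumptions, not code from the Eubank reference: the function name sources_to_render, the preview_range parameter, the source names and all coordinates are hypothetical. Outside the room, only sources within range of the virtual user are selected, so the subset depends on the side from which the user approaches; inside the room, every source of the object is selected.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def sources_to_render(user: Point, inside_room: bool,
                      sources: Dict[str, Point], preview_range: float) -> List[str]:
    """Select all sources inside the room; otherwise only those near the user."""
    if inside_room:
        return sorted(sources)
    return sorted(
        name for name, (sx, sy) in sources.items()
        if math.hypot(user[0] - sx, user[1] - sy) <= preview_range
    )


# Hypothetical layout: two sources at opposite sides of one room.
room_sources = {"source-11": (0.0, 0.0), "source-13": (10.0, 0.0)}

first_side = sources_to_render((-2.0, 0.0), False, room_sources, preview_range=5.0)
opposite_side = sources_to_render((12.0, 0.0), False, room_sources, preview_range=5.0)
inside = sources_to_render((5.0, 0.0), True, room_sources, preview_range=5.0)
```

Under these assumptions, a user passing the first side hears one subset, a user arriving at the opposite side hears a different subset, and a user inside the room hears the full group.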
Claim 29 is drawn to “a method.” The following table illustrates the correspondence between the claim and the Eubank reference.
Claim 29
The Eubank Reference
“29. A method comprising:
Similarly, the Eubank reference describes a computerized data processing system that performs a spatial audio processing method. Eubank at Abs., ¶¶ 2, 5.
“causing selection of spatial audio content in dependence upon a position of a user in a virtual space;
“causing rendering, for consumption by the user, of the selected spatial audio content comprising first spatial audio content;
Eubank’s system similarly selects and renders spatial audio content based on a user’s position. For example, Eubank’s system implements both a simulated reality (SR) mode 106 and a preview mode 108 in which a simulated reality (SR) sound designer may preview an audio object. In preview mode 108, the designer uses a GUI to manipulate a 3D-representation of a spatial audio object 206 and to position himself with respect to object 206. This process allows the designer to virtually select a spatialized sound source 11, 13 contained in object 206 based on the position of the object and the position of the user. Id. at ¶¶ 37, 40–42, FIG.4. The system effects the selection by weighting the sound sources based on their proximity to the user or by triggering playback based on proximity. Id. Eubank’s system then reproduces the weighted sound sources for the designer. Id.
“causing, responsive to user consumption of the first spatial audio content, 
“using the spatial audio metadata within the recorded data to detect another [[new]] event relating to the first spatial audio content…
Eubank’s system is computerized, id. at ¶¶ 43, 44, and accordingly makes continuous recordings of data relating to object 206, including the position of the object, to update the system’s representation of the object, see id. at ¶ 37, FIG.3. Eubank describes a use scenario where the designer makes changes to the object, such as a virtual turning of the object. Id. This would generate updated data relating to the object’s position, which is recorded and used to detect a new position that is used to update the object’s displayed position and the audio preview. Id.
“wherein the user consumption of the first spatial audio content is determined based on the position of the user in the virtual space correlating with a position of the first spatial audio content in the virtual space for at least a predetermined amount of time, and
Eubank describes a user consumption process that involves previewing SR audio objects when a user peeks at the object—for example, peeking into a room in order to select the audio object for a preview. The reference, however, does not describe how a user peeks. Accordingly, Eubank does not anticipate the claimed determination of consumption based on the position of the user in a virtual space correlating with the position of spatial audio content.
“wherein the spatial audio metadata comprises data identifying the first spatial audio content, a version identifier of the first spatial audio content, and at least one of an indication of when the user consumed the first spatial audio content, an indication of the user who consumed the first spatial audio content, an indication of a user device associated with rendering the first spatial audio content, an indication of a position of the user when the first spatial audio content was consumed, or a starting point of consumption or an ending point of consumption within the first audio spatial content;
The Eubank reference describes responding to a user’s consumption of object 206 by updating metadata associated with object 206, including relative positioning between a user and audio sources within the object. Id.
Eubank, however, does not describe responding to user consumption of object 206 by updating identifying data for object 206, a version identifier of object 206 and one of when, who, what device, starting point or ending point of consumption.
“providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content by rendering a simplified sound object representative of the first spatial audio content.”
Eubank’s system similarly provides a designer with the ability to preview object 206 as a simplified, mono object. Id. at ¶¶ 39–42, FIG.4. For example, the user may position object 206 in such a way that only one sound source is presented on a single output channel or so that multiple sound sources are presented on a single output channel. Id. at ¶ 38.

Table 2
The foregoing table shows that the Eubank reference describes a data processing system that corresponds closely to the claimed apparatus. The Eubank reference describes updating metadata in response to a user’s consumption of first spatial audio content (e.g., Eubank’s audio object 206). For example, Eubank describes updating position metadata, such as a user’s position relative to spatial sound object 206. See Eubank at ¶¶ 37, 38, FIGs.2, 3. Claim 29, on the other hand, requires updating a combination of metadata that is not updated in Eubank. Claim 29 specifically requires updating identifying data for object 206, a version identifier of object 206 and one of when, who, what device, user position, starting point or ending point of consumption. Eubank only updates user position. Moreover, Eubank does not determine user consumption based on a user’s virtual position correlating with the virtual position of an audio object (i.e., spatial audio content) for a predetermined period of time.
The differences between Eubank and the claimed invention are such that the invention as a whole would have been obvious to one of ordinary skill in the art at the time of filing. Eubank’s system enables a user to navigate a virtual environment in order to select, preview and experience audio objects distributed throughout the environment. Notably, Eubank’s system is configured so an end user may simply navigate, preview and experience the audio objects and so a designer may navigate, preview and experience the audio objects as part of an audio design process.
The Shortlidge reference teaches and suggests a refinement to Eubank’s audio design process that enables collaboration between parties. Shortlidge at ¶ 31. Shortlidge describes a system that allows two users to share media and to collaborate in the editing of the media. Id. For example, two musicians may be working on a piece of music. Id. Shortlidge’s system records and updates metadata concerning the music based on the accessing, consumption and changes made to the music by the collaborating users. Id. A first of the collaborating users shares updated music files with a second one of the users. Id. Shortlidge’s system performs a database comparison between the updated music file and the original music file stored on the second user’s computer. Id. at ¶¶ 31, 39, 41, 108–112, FIG.6. Any inconsistencies are resolved through application of a set of rules, which may include highlighting the differences. Id. at ¶ 39. Shortlidge describes tracking and updating numerous pieces of metadata including version number (e.g., time stamp of a modification), when the data was created, the device used to create the data, the person that created the data, etc. Id. at ¶¶ 48, 50, 55.
In a similar vein, the Baltus reference teaches and suggests tracking metadata concerning a document to determine whether a document currently being accessed by a user has changed in any meaningful way since the last time the user accessed the document. Baltus at ¶¶ 4–15. According to Baltus, this tracking allows a system to highlight the changes to the user, quickly drawing the user’s attention to changes made since the user last viewed the document. Id.
Accordingly, the Eubank, the Shortlidge and the Baltus references reasonably teach and suggest modifying Eubank’s editing system to include an ability to collaborate in the editing of the audio object. The references further teach and suggest updating metadata for audio objects to reflect file name changes; version data, such as time of modification; who accessed, consumed and modified the audio object; what device was used to access, consume and modify the audio object and others. The metadata updates would allow for the detection of audio object changes through a database comparison between file versions (i.e., comparing recorded data of first spatial audio content with equivalent data for new first spatial audio content). Inconsistencies would be resolved through rules. The user of Eubank’s system would then be able to view highlights of changes since the last time he accessed and consumed the audio object, specifically by invoking a preview mechanism that would allow the user to see differences in versions of the same audio object.
Eubank’s system involves the navigation and selection of media from a library of media items, or objects. Eubank at ¶ 41, FIG.4. Eubank’s end user, or designer, navigates through a virtual environment in order to select audio objects. Id. Audio objects are selected by peeking inside a room. Id. Eubank does not describe any mechanism for detecting and instigating a peeking operation. The prior art includes several references directed towards solving the same problem, particularly in connection with a user navigating a virtual space. The Liu reference, for instance, teaches and suggests presenting a list of media items to a user and monitoring the user’s focus on the list. Liu at 3–4, FIGs.2, 3 (describing steps S1, S211, S212, S213, S3). When the user’s focus is fixed on a media object in the list, like a video, for a predetermined amount of time (i.e., for more than a threshold amount of time), Liu describes interpreting that focus as a desire to preview the object. Id. Liu also describes recording metadata concerning a preview operation or a previous consumption event, such as the ID of the user and the stop time at which a user previously stopped consuming the media in order to determine the start point at which to begin previewing the content. Id. Accordingly, Eubank and Liu would have reasonably suggested to one of ordinary skill in the art at the time of filing modifying Eubank’s system to include a mechanism to detect the user’s focus on a particular room in a virtual environment. Given Eubank’s description of position and orientation inputs, see Eubank at ¶ 41, it would have been obvious for one of ordinary skill to have used a user’s position proximate to a virtual room and/or orientation towards the room to derive a user’s focus on the room. When the focus exceeds a predetermined amount of time, the system would peek into the room, playing a preview of the audio object. 
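For illustration only, the proposed dwell-based trigger may be sketched as follows. The sketch is not taken from Eubank or Liu. The names DwellDetector, radius and threshold_s, and the use of straight-line proximity as the measure of a user’s focus on a room, are assumptions chosen to show one way a position held near a room for a predetermined amount of time could trigger a preview.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float, float]


@dataclass
class DwellDetector:
    """Hypothetical helper: reports when a user has stayed near a room long enough."""

    room_position: Point
    radius: float        # assumed proximity distance that counts as "focus"
    threshold_s: float   # assumed predetermined dwell time, in seconds
    _entered_at: Optional[float] = None

    def update(self, user_position: Point, now_s: float) -> bool:
        """Return True once the user has remained within radius for threshold_s."""
        if math.dist(user_position, self.room_position) > self.radius:
            # The user moved away, so the dwell timer is reset.
            self._entered_at = None
            return False
        if self._entered_at is None:
            # First sample inside the radius starts the dwell timer.
            self._entered_at = now_s
        return now_s - self._entered_at >= self.threshold_s
```

In this sketch, a True result would correspond to the point at which the modified system peeks into the room and plays the preview. Orientation toward the room could be tested in the same way in place of, or in addition to, proximity.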
For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Liu references makes obvious all limitations of the claim.
Claim 30 depends on claim 29 and further requires the following:
wherein using the recorded data to detect another [[new]] event comprises detecting that the first spatial audio content has been adapted to create new first spatial audio content; and
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises providing a user-selectable option for the user to enable rendering, for consumption by the user, of the new first spatial audio content.
The anticipation rejection of claim 1 shows that the Eubank reference describes detecting new events related to spatial audio content, such as position changes to a spatial sound object 206. See Eubank at ¶¶ 37, 38, FIGs.2, 3. This new event detection differs from the type recited in this claim. This claim instead requires detecting when first spatial audio content has been adapted to create new first spatial audio content. This is not a patentable difference.
Eubank’s system enables a user/designer to preview spatial sounds while editing and designing a sound object. Id. at ¶ 37. The Tudor reference expands on Eubank’s editing system by teaching a mechanism for adding audio enhancements to a sound object. Tudor at Abs., col. 1. For example, Tudor recognizes that some sound objects may include low-quality audio portions. Id. Tudor describes a system that recognizes these low-quality audio portions and replaces them with enhanced audio portions. Id. Tudor’s system also includes the ability to record and detect these enhancements, and to present the user with a choice to preview the enhancements alongside the original audio. Id. at cols. 3–4, col. 7 l. 39 to col. 8 l. 28, FIG.5. Accordingly, it would have been obvious for one of ordinary skill in the art at the time of filing to configure Eubank’s system with an audio enhancement mechanism that is similar to the one taught and suggested by the Tudor reference. Eubank’s system would then record and detect changes, or enhancements, made to a sound object. See id. The user/designer would then choose whether to hear the original sound object or a preview of the new sound object together with its enhancements. See id. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Tudor references makes obvious all limitations of the claim.
Claim 31 depends on claim 30 and further requires the following:
wherein using the recorded data to detect another [[new]] event comprises comparing recorded data for the first spatial audio content with equivalent data for the new first spatial audio content.
The combination of Eubank and Tudor does not suggest the claimed comparison between equivalent data present in first spatial audio content and new first spatial audio content.
The Shortlidge reference teaches and suggests a further refinement to the Eubank-Tudor audio object editing system that enables collaboration between parties. Shortlidge at ¶ 31. Shortlidge describes a system that allows two users to share media and to collaborate in the editing of the media. Id. For example, two musicians may be working on a piece of music. Id. Shortlidge’s system records changes made to the music by a first one of the users. Id. The first user then shares the updated music file with the second one of the users. Id. Shortlidge’s system performs a database comparison between the updated music file and the original music file stored on the second user’s computer. Id. at ¶¶ 31, 39, 41, 108–112, FIG.6. Any inconsistencies are resolved through application of a set of rules, which may include highlighting the differences. Id. at ¶ 39. Shortlidge describes a visual highlight as an example. Id. The Tudor reference, as discussed in the obviousness rejection of claim 17, alternatively teaches and suggests an audio highlight mechanism, where differences/enhancements are provided through mixing. Tudor at col. 7 l. 39 to col. 8 l. 28, FIG.5.
Accordingly, the Eubank, the Tudor and the Shortlidge references reasonably teach and suggest modifying Eubank’s editing mechanism to include an ability to enhance an audio object and to collaborate in the editing of the audio object. Collaborative updates would be detected through a database comparison between file versions (i.e., comparing recorded data of first spatial audio content with equivalent data for new first spatial audio content). Inconsistencies would be resolved through rules, including an audible preview mechanism that allows a designer to selectively present mixed versions of original and updated audio objects. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Tudor references makes obvious all limitations of the claim.
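For illustration only, the comparison between recorded data and equivalent data for a new version may be sketched as follows. The sketch is not taken from Shortlidge, Tudor or Eubank. The record type ContentRecord, its field names and the function diff_records are hypothetical and merely show how a field-by-field comparison of two metadata records for the same audio object could reveal that the object has been adapted.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ContentRecord:
    """Hypothetical metadata record for one version of an audio object."""

    content_id: str   # identifies the audio object
    version: str      # version identifier, e.g. a modification time stamp
    modified_by: str  # person who made the modification
    device: str       # device used to make the modification


def diff_records(recorded: ContentRecord,
                 current: ContentRecord) -> Dict[str, Tuple[str, str]]:
    """Return each field whose value differs, mapped to (recorded, current)."""
    if recorded.content_id != current.content_id:
        raise ValueError("records describe different content")
    old, new = asdict(recorded), asdict(current)
    return {name: (old[name], new[name]) for name in old if old[name] != new[name]}
```

A non-empty result in this sketch would correspond to a detected change in the audio object, which could then be offered to the designer as a preview.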
Claim 32 depends on claim 30 and further requires the following:
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises causing rendering of a simplified sound object representative of the first spatial audio content before or after adaptation.
The obviousness rejection of claim 17, incorporated herein, shows the obviousness of selectively rendering either an original audio object or an enhanced version of the original audio object. The Eubank reference also describes presenting a simplified sound object, such as a downmix of an audio object, based on the object’s position relative to the virtual user. Eubank at ¶¶ 39–42, FIG.4. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Tudor references makes obvious all limitations of the claim.
Claim 33 depends on claim 30 and further requires the following:
wherein providing a user-selectable option for the user to enable rendering, for consumption by the user, of the first spatial audio content comprises rendering a limited preview of the new first spatial audio content.
The obviousness rejection of claim 17, incorporated herein, shows the obviousness of allowing a sound designer to selectively render either an original audio object or an enhanced version of the original audio object. The Eubank reference also describes presenting a simplified sound object, such as a downmix of an audio object, based on the object’s position relative to the virtual user. Eubank at ¶¶ 39–42, FIG.4. Accordingly, it would have been obvious for one of ordinary skill in the art at the time of filing to simply combine these concepts so that the designer may choose to render either an original audio object or an enhanced audio object as a limited preview/downmix. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Tudor references makes obvious all limitations of the claim.
Claim 34 depends on claim 33 and further requires the following:
wherein the limited preview depends upon how the new first spatial audio content for consumption differs from the user-consumed first spatial audio content.
Likewise, the Tudor reference teaches and suggests presenting enhanced audio to highlight a difference between an original audio object and an enhanced audio object. See Tudor at col. 7 l. 39 to col. 8 l. 28, FIG.5. For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Tudor references makes obvious all limitations of the claim.
Claim 35 is drawn to “a non-transitory computer readable medium comprising program instructions stored thereon for performing” a sequence of actions. The following table illustrates the correspondence between the claim and the Eubank reference.
Claim 35
The Eubank Reference
“35. A non-transitory computer readable medium comprising program instructions stored thereon for performing at least the following:
Similarly, the Eubank reference describes a computerized data processing system for spatial audio processing. Eubank at Abs., ¶¶ 2, 5. Eubank’s system includes a processor and a storage medium that stores a sequence of instructions that are executed by the processor. Id. at ¶¶ 43, 44.
“select spatial audio content in dependence upon a position of a user in a virtual space;
“render, for consumption by the user, the selected spatial audio content comprising a first spatial audio content;
Eubank’s system similarly selects and renders spatial audio content based on a user’s position. For example, Eubank’s system implements both a simulated reality (SR) mode 106 and a preview mode 108 in which a simulated reality (SR) sound designer may preview an audio object. In preview mode 108, the designer uses a GUI to manipulate a 3D-representation of a spatial audio object 206 and to position himself with respect to object 206. This process allows the designer to virtually select a spatialized sound source 11, 13 contained in object 206 based on the position of the object and the position of the user. Id. at ¶¶ 37, 40–42, FIG.4. The system effects the selection by weighting the sound sources based on their proximity to the user or by triggering playback based on proximity. Id. Eubank’s system then reproduces the weighted sound sources for the designer. Id.
“responsive to user consumption of the first spatial audio content, update recorded data related to the first spatial audio content with spatial audio metadata…
“use the spatial audio metadata within the recorded data to detect another [[new]] event relating to the first spatial audio content; and
Eubank’s system is computerized, id. at ¶¶ 43, 44, and accordingly makes continuous recordings of data relating to object 206, including the position of the object, to update the system’s representation of the object, see id. at ¶ 37, FIG.3. Eubank describes a use scenario where the designer makes changes to the object, such as a virtual turning of the object. Id. This would generate updated data relating to the object’s position, which is recorded and used to detect a new position that is used to update the object’s displayed position and the audio preview. Id.
“wherein the user consumption of the first spatial audio content is determined based on the position of the user in the virtual space correlating with a position of the first spatial audio content in the virtual space for at least a predetermined amount of time, and
Eubank describes a user consumption process that involves previewing SR audio objects when a user peeks at the object—for example, peeking into a room in order to select the audio object for a preview. The reference, however, does not describe how a user peeks. Accordingly, Eubank does not anticipate the claimed determination of consumption based on the position of the user in a virtual space correlating with the position of spatial audio content.
“wherein the spatial audio metadata comprises data identifying the first spatial audio content, a version identifier of the first spatial audio content, and at least one of an indication of when the user consumed the first spatial audio content, an indication of the user who consumed the first spatial audio content, an indication of a user device associated with rendering the first spatial audio content, an indication of a position of the user when the first spatial audio content was consumed, or a starting point of consumption or an ending point of consumption within the first audio spatial content;
The Eubank reference describes responding to a user’s consumption of object 206 by updating metadata associated with object 206, including relative positioning between a user and audio sources within the object. Id.
Eubank, however, does not describe responding to user consumption of object 206 by updating identifying data for object 206, a version identifier of object 206 and one of when, who, what device, starting point or ending point of consumption.
“provide a user-selectable option to enable rendering, for consumption by the user, of the first spatial audio content by rendering a simplified sound object representative of the first spatial audio content.”
Eubank’s system similarly provides a designer with the ability to preview object 206 as a simplified, mono object. Id. at ¶¶ 39–42, FIG.4. For example, the user may position object 206 in such a way that only one sound source is presented on a single output channel or so that multiple sound sources are presented on a single output channel. Id. at ¶ 38.

Table 3
The foregoing table shows that the Eubank reference describes a data processing system that corresponds closely to the claimed apparatus. The Eubank reference describes updating metadata in response to a user’s consumption of first spatial audio content (e.g., Eubank’s audio object 206). For example, Eubank describes updating position metadata, such as a user’s position relative to spatial sound object 206. See Eubank at ¶¶ 37, 38, FIGs.2, 3. Claim 35, on the other hand, requires updating a combination of metadata that is not updated in Eubank. Claim 35 specifically requires updating identifying data for object 206, a version identifier of object 206 and one of when, who, what device, user position, starting point or ending point of consumption. Eubank only updates user position. Moreover, Eubank does not determine user consumption based on a user’s virtual position correlating with the virtual position of an audio object (i.e., spatial audio content) for a predetermined period of time.
The differences between Eubank and the claimed invention are such that the invention as a whole would have been obvious to one of ordinary skill in the art at the time of filing. Eubank’s system enables a user to navigate a virtual environment in order to select, preview and experience audio objects distributed throughout the environment. Notably, Eubank’s system is configured so an end user may simply navigate, preview and experience the audio objects and so a designer may navigate, preview and experience the audio objects as part of an audio design process.
The Shortlidge reference teaches and suggests a refinement to Eubank’s audio design process that enables collaboration between parties. Shortlidge at ¶ 31. Shortlidge describes a system that allows two users to share media and to collaborate in the editing of the media. Id. For example, two musicians may be working on a piece of music. Id. Shortlidge’s system records and updates metadata concerning the music based on the accessing, consumption and changes made to the music by the collaborating users. Id. A first of the collaborating users shares updated music files with a second one of the users. Id. Shortlidge’s system performs a database comparison between the updated music file and the original music file stored on the second user’s computer. Id. at ¶¶ 31, 39, 41, 108–112, FIG.6. Any inconsistencies are resolved through application of a set of rules, which may include highlighting the differences. Id. at ¶ 39. Shortlidge describes tracking and updating numerous pieces of metadata including version number (e.g., time stamp of a modification), when the data was created, the device used to create the data, the person that created the data, etc. Id. at ¶¶ 48, 50, 55.
In a similar vein, the Baltus reference teaches and suggests tracking metadata concerning a document to determine whether a document currently being accessed by a user has changed in any meaningful way since the last time the user accessed the document. Baltus at ¶¶ 4–15. According to Baltus, this tracking allows a system to highlight the changes to the user, quickly drawing the user’s attention to changes made since the user last viewed the document. Id.
Accordingly, the Eubank, the Shortlidge and the Baltus references reasonably teach and suggest modifying Eubank’s editing system to include an ability to collaborate in the editing of the audio object. The references further teach and suggest updating metadata for audio objects to reflect file name changes; version data, such as time of modification; who accessed, consumed and modified the audio object; what device was used to access, consume and modify the audio object and others. The metadata updates would allow for the detection of audio object changes through a database comparison between file versions (i.e., comparing recorded data of first spatial audio content with equivalent data for new first spatial audio content). Inconsistencies would be resolved through rules. The user of Eubank’s system would then be able to view highlights of changes since the last time he accessed and consumed the audio object, specifically by invoking a preview mechanism that would allow the user to see differences in versions of the same audio object.
Eubank’s system involves the navigation and selection of media from a library of media items, or objects. Eubank at ¶ 41, FIG.4. Eubank’s end user, or designer, navigates through a virtual environment in order to select audio objects. Id. Audio objects are selected by peeking inside a room. Id. Eubank does not describe any mechanism for detecting and instigating a peeking operation. The prior art includes several references directed towards solving the same problem, particularly in connection with a user navigating a virtual space. The Liu reference, for instance, teaches and suggests presenting a list of media items to a user and monitoring the user’s focus on the list. Liu at 3–4, FIGs.2, 3 (describing steps S1, S211, S212, S213, S3). When the user’s focus is fixed on a media object in the list, like a video, for a predetermined amount of time (i.e., for more than a threshold amount of time), Liu describes interpreting that focus as a desire to preview the object. Id. Liu also describes recording metadata concerning a preview operation or a previous consumption event, such as the ID of the user and the stop time at which a user previously stopped consuming the media in order to determine the start point at which to begin previewing the content. Id. Accordingly, Eubank and Liu would have reasonably suggested to one of ordinary skill in the art at the time of filing modifying Eubank’s system to include a mechanism to detect the user’s focus on a particular room in a virtual environment. Given Eubank’s description of position and orientation inputs, see Eubank at ¶ 41, it would have been obvious for one of ordinary skill to have used a user’s position proximate to a virtual room and/or orientation towards the room to derive a user’s focus on the room. When the focus exceeds a predetermined amount of time, the system would peek into the room, playing a preview of the audio object. 
For the foregoing reasons, the combination of the Eubank, the Shortlidge, the Baltus and the Liu references makes obvious all limitations of the claim.
Claim 36 depends on claim 16 and further requires the following:
“responsive to detecting the another event relating to the first spatial audio content, rotate a perspective of the virtual space such that the first spatial audio content is spatially closer to the user.”
The obviousness rejection of claim 16, incorporated herein, shows that Eubank’s system includes a preview function to preview spatial audio objects. The rejection shows the obviousness of modifying Eubank’s system to enable collaboration in audio object production, including detecting changes in an audio object and highlighting those changes to a user. The user would then be able to preview those changes by rotating the user’s virtual position/orientation relative to the audio object, such that an audio source contained in the object would be positioned closer to the user. See Eubank at ¶¶ 38–41, FIGs.3A–3C. For the foregoing reasons, the combination of the Eubank, the Shortlidge and the Baltus references makes obvious all limitations of the claim.
Summary
Claims 16–24 and 26–35 are rejected under at least one of 35 U.S.C. §§ 102 and 103 as being unpatentable over the cited prior art. In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 C.F.R. § 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.
Response to Applicant’s Arguments
Applicant’s Reply at 8–11 (24 February 2022) includes comments concerning the rejections presented in the Final Rejection (08 November 2021). The Examiner has considered these comments, but they are moot in light of the new grounds of rejection included in this Office action.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WALTER F BRINEY III whose telephone number is (571)272-7513. The examiner can normally be reached on M-F 8 am-4:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen can be reached on (571)272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Walter F Briney III/

Walter F Briney III
Primary Examiner
Art Unit 2651

5/19/2022