DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
The Amendment filed on 01/06/2021 has been entered. Claims 1, 5, 8, 10, and 11 have been amended. Claims 1-12 are pending in the application. 

Response to Arguments
Applicant's arguments filed 01/06/2021 have been fully considered and are persuasive. However, the amendments to the claims have changed the scope of the claims. Examiner will now rely on Wang (US 20170347163), hereafter Wang, in view of Hannuksela (US 20170347026), hereafter Hannuksela, and further in view of new reference Reznik et al. (US 20130195204), hereafter Reznik.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-11:

Claims 1-11 are rejected under AIA 35 U.S.C. 103 as being unpatentable over Wang (US 20170347163), hereafter Wang, in view of Hannuksela (US 20170347026), hereafter Hannuksela, and further in view of Reznik et al. (US 20130195204), hereafter Reznik.
Regarding claim 1, Wang discloses a method ([0007], see method) for establishing a manifest for ([0044], encapsulation unit 30 may form a manifest file, such as a media presentation descriptor (MPD) that describes characteristics of the representations, where the manifest file/MPD correspond to the manifest) a requesting terminal configured to receive a multimedia content divided into segments ([0050], Client device 40 may retrieve the MPD of a media presentation to determine how to access segments of representations 68, where the client device corresponds to the requesting terminal, where the media presentation corresponds to the multimedia content, where the plurality of segments correspond to the segments), each segment being available in one or more representations ([0042] Encapsulation unit 30 is responsible for assembling elementary streams into video files (e.g., segments) of various representations), said manifest listing available representations for the multimedia content ([0050], multimedia content 64 includes manifest file 66, which may correspond to a media presentation description (MPD). Manifest file 66 may contain descriptions of different alternative representations 68 (e.g., video services with different qualities) and the description may include, e.g., codec information, a profile value, a level value, a bitrate, and other descriptive characteristics of representations 68, where the different alternative representations correspond to the available representations) and specifying a plurality of adaptation sets ([0046] Manifest file 66 may include data indicative of the subsets of representations 68 corresponding to particular adaptation sets, as well as common characteristics for the adaptation sets. Manifest file 66 may also include data representative of individual characteristics, such as bitrates, for individual representations of adaptation sets), each adaptation set defining a spatial object from a plurality of spatial objects of the multimedia content ([0118] A device, such as client device 40 or server device 60, may determine, based on first data signaled at an adaptation set level of a media presentation description for a media presentation, whether a motion-constrained tiles based viewport dependent VR video coding scheme is in use in the media presentation, and the device may retrieve segments of the media presentation, where the viewport corresponds to a spatial object, and the media presentation/VR video corresponds with the multimedia content), the plurality of spatial objects of the adaptation sets defining a whole spatial object ([0119] (2) a base layer may be fully sent, such that at any time for any viewport at least the lowest resolution video is available for rendering, (3) enhancement layers (ELs) are coded using motion-constrained tiles such that each potential region covering a viewport can be independently decoded from other regions across time, with inter-layer prediction enabled, where each viewport corresponds with a spatial object, where the entire low-resolution video corresponds to the whole spatial object), comprising:
defining, in the manifest ([0118], see media presentation description, where the data, at an adaptation set level of the media presentation description (i.e., manifest) includes which VR video coding scheme is in use (where the different schemes are described in para. 0119)), a point of reference defining a center point or origin of said whole spatial object ([0094] see center area of the original face shape), in an adaptation set of reference amongst said plurality of adaptation sets ([0119] where in each of the schemes, a motion-constrained tile is encoded/decoded for a region, and the tile is encoded/decoded independently across a period of time, where the tile corresponds to a point of reference for the adaptation set, and [0135] When a motion-constrained tiles based viewport dependent VR video coding scheme is in use, and tile tracks as specified in clause 10 of 14496-15 are used, e.g., each motion-constrained tile or tile region is exclusively carried in a track or DASH representation, an adaptation-set-level element is used to signal the mapping between each motion-constrained tile or tile region and the representation carrying it).
However, Wang does not explicitly disclose defining, in the manifest, a type of mapping of the multimedia content to said whole spatial object, associating, in the manifest, depth information indicating a position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value; and whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose defining, in the manifest, a type of mapping of the multimedia content to said whole spatial object ([0280] In an embodiment, a VR projection format (e.g. an equirectangular panorama or a cube map) is indicated in a manifest or parsed from a manifest. The VR projection format may be indicated in the manifest or parsed from the manifest or inferred to be specific to a viewport or a spatial region or an entire picture, where the entire picture corresponds to the whole spatial object, where the VR projection format corresponds to the type of mapping).
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]). 

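However, Wang-Hannuksela do not explicitly disclose associating, in the manifest, depth information indicating a position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value; and whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content.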
However, Reznik, which is analogous to Wang-Hannuksela because each reference discloses manipulating content/media images for rendering on a user device, does disclose associating, in the manifest ([0224] see manifest file), depth information indicating a position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value ([0218] see presence of and range of depth of 3D content of the multimedia content, where the 3D indicates a background and spatial object, [0136-0137] see perceived depth and distance from user’s eye to screen, [0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0224] see viewing parameter which corresponds to single value, see viewing angle); and 
whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content ([0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0218] where the viewing parameter includes a range of depth of 3D content of multimedia content).
Wang-Hannuksela and Reznik (hereafter Wang-Hannuksela-Reznik) are analogous art because each reference discloses manipulating content/media images for rendering on a user device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang-Hannuksela system of preparing and rendering virtual reality video segment data with the feature of using a viewer’s distance to the screen, as disclosed in the Reznik system. The motivation to combine would be to compensate depth distortion, as taught by Reznik ([0137]). 

Regarding claim 2, the combination of Wang-Hannuksela-Reznik discloses the features of claim 1, as discussed above. Wang further discloses wherein the type of mapping includes at least one of: 
cube mapping ([0094] In one example of a sub-sampled cube-map, one of the faces can be kept unchanged, while the face on the opposite side can be sub-sampled or down-scaled to a smaller size);
pyramidal mapping ([0094] The extreme is to down-scale the face on the opposite side to be a single point, and thus the cube becomes a pyramid).
However, Wang does not explicitly disclose spherical mapping; and cylindrical mapping.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose spherical mapping ([0287], the viewport is indicated in a manifest or parsed from a manifest with reference to spherical coordinates indicating the position and, in some embodiments, the orientation, of the viewport on a sphere); and cylindrical mapping ([0055], Pseudo-cylindrical projections result into non-rectangular contiguous 2D images representing the projected sphere).
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]).

Regarding claim 3, the combination of Wang-Hannuksela-Reznik discloses the features of claim 1, as discussed above. However, Wang does not explicitly disclose wherein the point of reference corresponds to a center of the spatial object associated with the adaptation set of reference.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose wherein the point of reference corresponds to the center of a spatial object associated with the adaptation set of reference ([0288], the position of the viewport on a sphere is indicated using two angles of a spherical coordinate system indicating a specific point of the viewport, such as the center point or a particular corner point of the viewport, where the viewport corresponds to the spatial object, and the angles of the spherical coordinate system correspond to the point of reference).
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]).

Regarding claim 4, the combination of Wang-Hannuksela-Reznik discloses the features of claim 1, as discussed above. However, Wang does not explicitly disclose further comprising defining coordinates associated with one or several adaptation sets specified in the manifest.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose further comprising defining coordinates associated with one or several adaptation sets specified in the manifest ([0288] The specific point may be pre-defined, e.g. in a manifest format specification, or may be indicated in a manifest and/or parsed from a manifest, where the specific point corresponds to the defining coordinates). 
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]).

Regarding claim 5, Wang discloses a non-transitory processor readable medium for storing a manifest for transmission ([0044], where the manifest file/MPD correspond to the manifest) to a client terminal configured to receive a multimedia content divided into segments ([0050], where the client device corresponds to the requesting terminal, where the media presentation corresponds to the multimedia content, where the plurality of segments correspond to the segments), each segment being available in one or more representations ([0042] Encapsulation unit 30 is responsible for assembling elementary streams into video files (e.g., segments) of various representations), said manifest listing available representations for the multimedia content ([0050], where the different alternative representations correspond to the available representations) and specifying a plurality of adaptation sets ([0046] Manifest file 66 may include data indicative of the subsets of representations 68 corresponding to particular adaptation sets, as well as common characteristics for the adaptation sets. Manifest file 66 may also include data representative of individual characteristics, such as bitrates, for individual representations of adaptation sets), each adaptation set defining a spatial object from a plurality of spatial objects of the multimedia content ([0118], where the viewport corresponds to a spatial object, and the media presentation/VR video corresponds with the multimedia content), the plurality of spatial objects of the adaptation sets defining a whole spatial object ([0119], where the entire low-resolution video corresponds to the whole spatial object), the manifest comprising: 
a point of reference defining a center point or origin of said whole spatial object ([0094] see center area of the original face shape) in an adaptation set of reference amongst said adaptation sets ([0119] When a motion-constrained tiles based viewport dependent VR video coding scheme is in use, and tile tracks as specified in clause 10 of 14496-15 are used, e.g., each motion-constrained tile or tile region is exclusively carried in a track or DASH representation, an adaptation-set-level element is used to signal the mapping between each motion-constrained tile or tile region and the representation carrying it, where the tile corresponds to the point of reference).
However, Wang does not explicitly disclose a type of mapping of the multimedia content associated with said whole spatial object, depth information indicating position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value, whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose a type of mapping of the multimedia content associated with said whole spatial object ([0280] In an embodiment, a VR projection format (e.g. an equirectangular panorama or a cube map) is indicated in a manifest or parsed from a manifest. The VR projection format may be indicated in the manifest or parsed from the manifest or inferred to be specific to a viewport or a spatial region or an entire picture, where the entire picture corresponds to the whole spatial object, where the VR projection format corresponds to the type of mapping).
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]).

However, Wang-Hannuksela do not explicitly disclose depth information indicating position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value, whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content.
However, Reznik, which is analogous to Wang-Hannuksela because each reference discloses manipulating content/media images for rendering on a user device, does disclose depth information indicating position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value ([0218] see presence of and range of depth of 3D content of the multimedia content, where the 3D indicates a background and spatial object, [0136-0137] see perceived depth and distance from user’s eye to screen, [0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0224] see viewing parameter which corresponds to single value, see viewing angle), whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content ([0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0218] where the viewing parameter includes a range of depth of 3D content of multimedia content).
Wang-Hannuksela and Reznik (hereafter Wang-Hannuksela-Reznik) are analogous art because each reference discloses manipulating content/media images for rendering on a user device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang-Hannuksela system of preparing and rendering virtual reality video segment data with the feature of using a viewer’s distance to the screen, as disclosed in the Reznik system. The motivation to combine would be to compensate depth distortion, as taught by Reznik ([0137]). 



Regarding claim 6, the combination of Wang-Hannuksela-Reznik discloses the features of claim 5, as discussed above. However, Wang does not explicitly disclose wherein the point of reference corresponds to the center of the spatial object associated with the adaptation set of reference.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose wherein the point of reference corresponds to the center of the spatial object associated with the adaptation set of reference ([0288], the position of the viewport on a sphere is indicated using two angles of a spherical coordinate system indicating a specific point of the viewport, such as the center point or a particular corner point of the viewport, where the viewport corresponds to the spatial object, and the angles of the spherical coordinate system correspond to the point of reference).
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]).

Regarding claim 7, the combination of Wang-Hannuksela-Reznik discloses the features of claim 5, as discussed above. However, Wang does not explicitly disclose comprising coordinates associated with one or several adaptation sets specified in the manifest.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose comprising coordinates associated with one or several adaptation sets specified in the manifest ([0288] The specific point may be pre-defined, e.g. in a manifest format specification, or may be indicated in a manifest and/or parsed from a manifest, where the specific point corresponds to the defining coordinates). 
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]).

Regarding claim 8, Wang discloses a network equipment ([0044], encapsulation unit 30 may form a manifest file, where the encapsulation unit belongs to the content preparation device 20 (see Fig. 1)) for establishing a manifest ([0044], where the manifest file/MPD correspond to the manifest) for a requesting terminal configured to receive a multimedia content divided into segments ([0050], where the client device corresponds to the requesting terminal, where the media presentation corresponds to the multimedia content, where the plurality of segments correspond to the segments) from a network equipment ([0048], Request processing unit 70 is configured to receive network requests from client devices, such as client device 40, where the request processing unit 70 is part of the server device 60 (see Fig. 1), which corresponds to the network equipment), each segment being available in one or more representations ([0042] Encapsulation unit 30 is responsible for assembling elementary streams into video files (e.g., segments) of various representations), said manifest listing available representations for the multimedia content ([0050], multimedia content 64 includes manifest file 66, which may correspond to a media presentation description (MPD). Manifest file 66 may contain descriptions of different alternative representations 68 (e.g., video services with different qualities) and the description may include, e.g., codec information, a profile value, a level value, a bitrate, and other descriptive characteristics of representations 68, where the different alternative representations correspond to the available representations) and specifying a plurality of adaptation sets ([0046] Manifest file 66 may include data indicative of the subsets of representations 68 corresponding to particular adaptation sets, as well as common characteristics for the adaptation sets. Manifest file 66 may also include data representative of individual characteristics, such as bitrates, for individual representations of adaptation sets), each adaptation set defining a spatial object from a plurality of spatial objects of the multimedia content ([0118] A device, such as client device 40 or server device 60, may determine, based on first data signaled at an adaptation set level of a media presentation description for a media presentation, whether a motion-constrained tiles based viewport dependent VR video coding scheme is in use in the media presentation, and the device may retrieve segments of the media presentation, where the viewport corresponds to a spatial object, and the media presentation/VR video corresponds with the multimedia content), the plurality of spatial objects of the adaptation sets defining a whole spatial object ([0119] (2) a base layer may be fully sent, such that at any time for any viewport at least the lowest resolution video is available for rendering, (3) enhancement layers (ELs) are coded using motion-constrained tiles such that each potential region covering a viewport can be independently decoded from other regions across time, with inter-layer prediction enabled, where each viewport corresponds with a spatial object, where the entire low-resolution video corresponds to the whole spatial object), comprising at least one memory and at least one processing circuitry ([0169] If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol) configured to:
define, in the manifest ([0118], see media presentation description), a point of reference defining a center point or origin of said whole spatial object ([0094] see center area of the original face shape), in an adaptation set of reference amongst said adaptation sets ([0119] When a motion-constrained tiles based viewport dependent VR video coding scheme is in use, and tile tracks as specified in clause 10 of 14496-15 are used, e.g., each motion-constrained tile or tile region is exclusively carried in a track or DASH representation, an adaptation-set-level element is used to signal the mapping between each motion-constrained tile or tile region and the representation carrying it, where the tile corresponds to the point of reference).
However, Wang does not explicitly disclose define, in the manifest, a type of mapping of the multimedia content to said whole spatial object, and associating depth information indicating position of at least one spatial object between an eye of a user and a background of multimedia content as a single value; and associate the depth information with each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose define, in the manifest, a type of mapping of the multimedia content to said whole spatial object ([0280] In an embodiment, a VR projection format (e.g. an equirectangular panorama or a cube map) is indicated in a manifest or parsed from a manifest. The VR projection format may be indicated in the manifest or parsed from the manifest or inferred to be specific to a viewport or a spatial region or an entire picture, where the entire picture corresponds to the whole spatial object, where the VR projection format corresponds to the type of mapping).
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([0239]).

However, Wang-Hannuksela do not explicitly disclose associating depth information indicating position of at least one spatial object between an eye of a user and a background of multimedia content as a single value; and associate the depth information is provided for with each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content.
However, Reznik, which is analogous to Wang-Hannuksela because each reference discloses manipulating content/media images for rendering on a user device, does disclose associating depth information indicating position of at least one spatial object between an eye of a user and a background of multimedia content as a single value ([0218] see presence of and range of depth of 3D content of the multimedia content, where the 3D indicates a background and spatial object, [0136-0137] see perceived depth and distance from user’s eye to screen, [0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0224] see viewing parameter which corresponds to single value, see viewing angle); and associate the depth information is provided for with each respective adaptation set of the plurality of adaptation sets, for rendering the spatial object in front of the background of the multimedia content ([0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0218] where the viewing parameter includes a range of depth of 3D content of multimedia content).
Wang-Hannuksela and Reznik (hereafter Wang-Hannuksela-Reznik) are analogous art because each reference discloses manipulating content/media images for rendering on a user device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang-Hannuksela system of preparing and rendering virtual reality video segment data with the feature of using a viewer’s distance to the screen, as disclosed in the Reznik system. The motivation to combine would be to compensate depth distortion, as taught by Reznik ([0137]). 

Regarding claim 9, the combination of Wang-Hannuksela-Reznik discloses the features of claim 8, as discussed above. However, Wang does not explicitly disclose wherein said one processing circuitry is further configured to define coordinates associated with one or several adaptation sets specified in the manifest.
However, Hannuksela, which is analogous to Wang because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device, does disclose wherein said one processing circuitry is further configured to define coordinates associated with one or several adaptation sets specified in the manifest ([0288] The specific point may be pre-defined, e.g. in a manifest format specification, or may be indicated in a manifest and/or parsed from a manifest, where the specific point corresponds to the defining coordinates).
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([00239]). 

Regarding claim 10, Wang discloses a method ([0007], see method) for receiving a manifest by a requesting terminal configured to receive a multimedia content divided into segments ([0050], Client device 40 may retrieve the MPD of a media presentation to determine how to access segments of representations 68, where the client device corresponds to the requesting terminal, where the MPD corresponds to the received manifest, where the media presentation corresponds to the multimedia content, where the plurality of segments correspond to the segments), each segment being available in one or more representations ([0042] Encapsulation unit 30 is responsible for assembling elementary streams into video files (e.g., segments) of various representations), said manifest listing available representations for the multimedia content ([0050], multimedia content 64 includes manifest file 66, which may correspond to a media presentation description (MPD). Manifest file 66 may contain descriptions of different alternative representations 68 (e.g., video services with different qualities) and the description may include, e.g., codec information, a profile value, a level value, a bitrate, and other descriptive characteristics of representations 68, where the different alternative representations correspond to the available representations) and specifying a plurality of adaptation sets ([0046] Manifest file 66 may include data indicative of the subsets of representations 68 corresponding to particular adaptation sets, as well as common characteristics for the adaptation sets. 
Manifest file 66 may also include data representative of individual characteristics, such as bitrates, for individual representations of adaptation sets), each adaptation set defining a spatial object from a plurality of spatial objects of the multimedia content ([0118] A device, such as client device 40 or server device , the plurality of spatial objects of the adaptation sets defining a whole spatial object ([0119] (2) a base layer may be fully sent, such that at any time for any viewport at least the lowest resolution video is available for rendering, (3) enhancement layers (ELs) are coded using motion-constrained tiles such that each potential region covering a viewport can be independently decoded from other regions across time, with inter-layer prediction enabled, where each viewport corresponds with a spatial object, where the entire low-resolution video corresponds to the whole spatial object), 
defines a point of reference defining a center point or origin of said whole spatial object ([0094] see center area of the original face shape), in an adaptation set of reference amongst said plurality of adaptation sets ([0118], see media presentation description, and [0119] When a motion-constrained tiles based viewport dependent VR video coding scheme is in use, and tile tracks as specified in clause 10 of 14496-15 are used, e.g., each motion-constrained tile or tile region is exclusively carried in a track or DASH representation, an adaptation-set-level element is used to signal the mapping between each motion-constrained tile or tile region and the representation carrying it, where the tile corresponds to the point of reference).
However, Wang does not explicitly disclose wherein the manifest further defines a type of mapping of the multimedia content to said whole spatial object and associates depth information indicating a position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value, whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets for rendering of the at least one spatial object in front of the background of the multimedia content.
However, in an analogous art, Hannuksela does disclose wherein the manifest further defines a type of mapping of the multimedia content to said whole spatial object ([0280] In an embodiment, a VR projection format (e.g. an equirectangular panorama or a cube map) is indicated in a manifest or .
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([00239]). 

However, Wang-Hannuksela do not explicitly disclose associates depth information indicating a position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value, whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets for rendering of the at least one spatial object in front of the background of the multimedia content.
However, Reznik, which is analogous to Wang-Hannuksela because each reference discloses manipulating content/media images for rendering on a user device, does disclose associates depth information indicating a position of at least one spatial object between an eye of a user and a background of the multimedia content as a single value ([0218] see presence of and range of depth of 3D content of the multimedia content, where the 3D indicates a background and spatial object, [0136-0137] see perceived depth and distance from user’s eye to screen, [0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0224] see viewing parameter which corresponds to single value, see viewing angle), whereby the depth information is provided for each respective adaptation set of the plurality of adaptation sets for rendering of the at least one spatial object in front of the background of the multimedia content ([0214] see selection of adaptation set, .
Wang-Hannuksela and Reznik (hereafter Wang-Hannuksela-Reznik) are analogous art because each reference discloses manipulating content/media images for rendering on a user device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang-Hannuksela system of preparing and rendering virtual reality video segment data with the feature of using a viewer’s distance to the screen, as disclosed in the Reznik system. The motivation to combine would be to compensate depth distortion, as taught by Reznik ([0137]). 

Regarding claim 11, Wang discloses a client terminal configured to receive a multimedia content divided into segments ([0050], where the client device corresponds to the requesting terminal, where the media presentation corresponds to the multimedia content, where the plurality of segments correspond to the segments), each segment being available in one or more representations ([0042] Encapsulation unit 30 is responsible for assembling elementary streams into video files (e.g., segments) of various representations), said client terminal being further configured to receive a manifest ([0050], Client device 40 may retrieve the MPD of a media presentation, where the MPD corresponds to the manifest) listing available representations for the multimedia content ([0050], multimedia content 64 includes manifest file 66, which may correspond to a media presentation description (MPD). Manifest file 66 may contain descriptions of different alternative representations 68 (e.g., video services with different qualities) and the description may include, e.g., codec information, a profile value, a level value, a bitrate, and other descriptive characteristics of representations 68, where the different alternative representations correspond to the available representations) and specifying a plurality of adaptation sets ([0046] Manifest file 66 may include data indicative of the subsets of representations 68 corresponding to particular adaptation sets, as well as common characteristics for the adaptation sets. 
Manifest file 66 may also include data representative of individual characteristics, such as bitrates, for individual representations of adaptation sets), each adaptation set defining a spatial object from a plurality of spatial objects of the multimedia content ([0118] A device, such as client device 40 or , the plurality of spatial objects of the adaptation sets defining a whole spatial object ([0119] (2) a base layer may be fully sent, such that at any time for any viewport at least the lowest resolution video is available for rendering, (3) enhancement layers (ELs) are coded using motion-constrained tiles such that each potential region covering a viewport can be independently decoded from other regions across time, with inter-layer prediction enabled, where each viewport corresponds with a spatial object, where the entire low-resolution video corresponds to the whole spatial object), 
defines a point of reference defining a center point or origin of said whole spatial object ([0094] see center area of the original face shape), in an adaptation set of reference amongst said adaptation sets ([0118], see media presentation description, and [0119] When a motion-constrained tiles based viewport dependent VR video coding scheme is in use, and tile tracks as specified in clause 10 of 14496-15 are used, e.g., each motion-constrained tile or tile region is exclusively carried in a track or DASH representation, an adaptation-set-level element is used to signal the mapping between each motion-constrained tile or tile region and the representation carrying it, where the tile corresponds to the point of reference).
However, Wang does not explicitly disclose wherein the manifest further defines a type of mapping of the multimedia content to said whole spatial object and associates depth information indicating a position of at least one spatial object between the background of the multimedia content as a single value, whereby the depth information is provided for with each respective adaptation set of the plurality of adaptation sets, for rendering of the at least one spatial object in front of the background of the multimedia content.
However, in an analogous art, Hannuksela does disclose wherein the manifest further defines a type of mapping of the multimedia content to said whole spatial object ([0280] In an embodiment, a VR projection format (e.g. an equirectangular panorama or a cube map) is indicated in a manifest or .
Wang and Hannuksela (hereafter Wang-Hannuksela) are analogous art because each reference discloses transmitting and parsing a manifest for a video file that is used to list available versions and adaptation sets of multimedia data to be rendered on a receiving client device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang system of preparing and rendering virtual reality video segment data with the feature of spatial relationships between spatial objects, as disclosed in the Hannuksela system. The motivation to combine would be to use the spatial relationship of an adaptation set within a full-frame video, as taught by Hannuksela ([00239]). 

However, Wang-Hannuksela do not explicitly disclose associates depth information indicating a position of at least one spatial object between the background of the multimedia content as a single value, whereby the depth information is provided for with each respective adaptation set of the plurality of adaptation sets, for rendering of the at least one spatial object in front of the background of the multimedia content.
However, Reznik, which is analogous to Wang-Hannuksela because each reference discloses manipulating content/media images for rendering on a user device, does disclose associates depth information indicating a position of at least one spatial object between the background of the multimedia content as a single value ([0218] see presence of and range of depth of 3D content of the multimedia content, where the 3D indicates a background and spatial object, [0136-0137] see perceived depth and distance from user’s eye to screen, [0214] see selection of adaptation set, where each set uses a viewing distance and/or viewing angle/value, [0224] see viewing parameter which corresponds to single value, see viewing angle), whereby the depth information is provided for with each respective adaptation set of the plurality of adaptation sets, for rendering of the at least one spatial object in front of the background of the multimedia content ([0214] see selection of adaptation set, where each .
Wang-Hannuksela and Reznik (hereafter Wang-Hannuksela-Reznik) are analogous art because each reference discloses manipulating content/media images for rendering on a user device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang-Hannuksela system of preparing and rendering virtual reality video segment data with the feature of using a viewer’s distance to the screen, as disclosed in the Reznik system. The motivation to combine would be to compensate depth distortion, as taught by Reznik ([0137]). 

Claims: 12:

Claim 12 is rejected under AIA 35 U.S.C. 103 as being unpatentable over Wang (US 20170347163), hereafter Wang, and further in view of Hannuksela (US 20170347026), hereafter Hannuksela, and Reznik et al. (US 20130195204), hereafter Reznik, and further in view of Krishna et al. (US 20150100702), hereafter Krishna.
Regarding claim 12, the combination of Wang-Hannuksela-Reznik discloses the features of claim 1, as discussed above. However, Wang-Hannuksela-Reznik do not explicitly disclose further comprising: associating angle information to a minimum number of adaptation sets from the plurality of adaptation sets; computing angle information for each of the plurality of adaptation sets based on the angle information associated to the minimum number of adaptation sets.
However, Krishna, which is analogous to Wang-Hannuksela-Reznik because each reference discloses manipulating content/media images for rendering on a user device, does disclose further comprising: associating angle information to a minimum number of adaptation sets from the plurality of adaptation sets ([0018] where each adaptation set is formed for each camera angle/perspective); 
computing angle information for each of the plurality of adaptation sets based on the angle information associated to the minimum number of adaptation sets ([0029] where the change .
Wang-Hannuksela-Reznik and Krishna (hereafter Wang-Hannuksela-Reznik-Krishna) are analogous art because each reference discloses manipulating content/media images for rendering on a user device. Therefore, it would have been prima facie obvious to one of ordinary skill in the art at the time the invention was effectively filed to modify the Wang-Hannuksela-Reznik system of preparing and rendering virtual reality video segment data with the feature of forming adaptation sets for each camera angle of a particular scene, as disclosed in the Krishna system. The motivation to combine would be to allow a user to choose an adaptation set based on user preference, as taught by Krishna ([0018]). 


Interview Practice

USPTO Automated Interview Request (AIR)
The USPTO AIR is a new optional online interview scheduling tool that allows Applicants to request an interview with an Examiner for their pending patent application.
The USPTO AIR form is available on our website at: http://www.uspto.gov/patent/laws-and-regulations/interview-practice.
By submitting this type of interview request, the pending patent application will be in compliance with the written authorization requirement for Internet communication in accordance with MPEP §502.03. This authorization will be in effect until the Applicant provides a written withdrawal of authorization to the Examiner of record.
If you have questions or need assistance with the USPTO AIR form or with interview practice at the USPTO, please contact an Interview Specialist at http://www.uspto.gov/patent/laws-and-regulations/interview-practice/interview-specialist or send an email to ExaminerInterviewPractice@USPTO.GOV.

Examiner Notes: 
A) Prior to conducting any interview (whether using AIR or not), Applicant(s) must submit an agenda including the proposed date and time, all arguments in writing, and proposed claim amendments (if applicable). Any proposed amendments or arguments not presented in the agenda will only be heard by the Examiner, but because the Examiner will not have heard them in advance and been given an equitable opportunity to consider them, no decision will be rendered, nor agreement made. ALL AGENDAS MUST BE RECEIVED BY THE EXAMINER AT LEAST 24 HOURS PRIOR TO THE START OF THE INTERVIEW, OR THE PREVIOUS BUSINESS DAY, WHICHEVER IS LONGER, or the interview may have to be rescheduled. 
B) After-final interviews may be granted, but the agenda must be in compliance with MPEP 713.09 which limits the interview only to discussions of proposed amendments, or clarification for appeal. After-final interviews are not to be conducted for the purpose of rehashing previously made arguments. After seeing the agenda, Examiner will decide whether to grant or deny the interview.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RUTH S. SOLOMON whose telephone number is (571)270-0418.  The examiner can normally be reached from 9:30 AM to 5:45 PM EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Philip J. Chea, can be reached on 571-272-3951.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/RS/ Examiner, Art Unit 2456
/PHILIP J CHEA/ Supervisory Patent Examiner, Art Unit 2456