Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on March 1, 2021 has been entered.

Response to Amendment
This action is in response to the remarks entered on 03/01/2021.
Claims 1, 3-13 & 16-24 are pending in the instant application.
Claims 1, 3, 12 & 24 are amended.
Claims 2 & 14-15 are cancelled.

Response to Arguments
Applicant's remarks filed 02/17/2021, pages 8-12, regarding the rejection of claim 1, and similarly claims 12 & 24, under 35 U.S.C. § 103 have been fully considered but are moot because, upon further consideration, a new ground of rejection is made under 35 U.S.C. § 103 as being unpatentable over Le Floch et al. (US 2016/0029091 A1).
In response to Applicant’s remark that Examiner’s recited references do not teach, suggest, or disclose Applicant's newly-recited limitations, the Examiner directs Applicant’s attention to the rejection of independent claim 1, and similarly claims 12 & 24, below, where Applicant’s newly recited claim limitations are addressed by Floch, Denoual, and Thomas and are rejected for the reasons outlined below.

Applicant's remarks filed 02/17/2021, page 12, regarding the rejection of claims 3-11, 13 & 16-23 under 35 U.S.C. § 103 have been fully considered but they are not persuasive. 
Applicant relies on the patentability of the claims from which these claims depend to traverse the rejection without prejudice to any further basis for patentability of these claims based on the additional elements recited. 
Examiner respectfully disagrees because the combination of Floch, Denoual, and Thomas teaches or suggests independent claims 1, 12 & 24 as outlined below. Thus, claims 3-11, 13 & 16-23 are also rejected for similar reasons as outlined below.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-9, 12-13, 16-21 & 24 are rejected under 35 U.S.C. 103 as being unpatentable over Le Floch et al. (US 2016/0029091 A1) (hereinafter Floch) in view of Denoual et al. (US 2016/0165321 A1) (hereinafter Denoual), and further in view of Thomas et al. (WO 2015/197815 A1) (hereinafter Thomas).

Regarding claim 1, Floch discloses a method implemented over a transmission medium [Paragraph [0218], Communication network], the method comprising:
receiving media content for a reference media presentation [Paragraph [0084], Video stream as media content received by encoding device 302];
encoding the received media content as a plurality of component tracks, each component track comprising video media or metadata samples for a component of the reference media presentation [Paragraphs [0085]-[0087], [0101]-[0102] & [0110], video stream is subdivided into plurality of video streams, and are each encoded into an encapsulation file that comprises the encoded video components into video tracks];
encoding a derived track that (i) references at least one of the plurality of component tracks and (ii) specifies a set of operations for constructing video media and metadata samples of a sub-region of the reference media presentation based on the video media or metadata samples from the at least one referenced component tracks [Paragraphs [0096]-[0097], [0113]-[0119], [0124]-[0127], [0148]-[0152] & [0181]-[0186], Figs. 7A & 7B, Creation of corresponding video tracks as derived tracks containing multiple-extractors in NAL units, which contain instructions (ii) on how to replace data, from NAL units to media frames, from the current track with data from other tracks (i), and featuring only elementary video streams relating to ROI], wherein
the sub-region corresponds to a viewport or a region of interest (ROI), and each of the at least one referenced component tracks is a sub-region track that provides media samples for an associated portion of the sub-region [Paragraphs [0121]-[0127], Figs. 7A & 7B, Tiles a, b, c, d, as elementary video streams/sub-region tracks that correspond to the region of interest];
video media content of the derived track is constructed by referencing the at least one referenced component tracks and performing the set of operations [Paragraphs [0096]-[0097], [0113]-[0119], [0124]-[0127], [0148]-[0152] & [0181]-[0186], Figs. 7A & 7B, multiple-extractors in NAL units, which contain instructions (ii) on how to replace data, from NAL units to media frames, from the current track with video data from other tracks]; and
providing a streaming media file for the media presentation, having the encoded derived track, for retrieval [Paragraphs [0087]-[0088], [0092], [0096]-[0097], [0101]-[0102], [0127] & [0189], Encoded video streams are encapsulated into an encapsulation file 305, as streaming media file, having as many video tracks as encoded video streams and transmitted to server], wherein the providing comprises transmitting the streaming media file over the transmission medium to a receiver, for:
decoding the encoded derived track for the reference media presentation [Paragraphs [0096]-[0098], [0127] & [0189], Client device receives tracks within encapsulation file and obtains elementary stream that is decoded]; and
receipt of a selection of the sub-region of the reference media presentation from a user through a user interface associated with the receiver [Paragraphs [0091]-[0099] & [0210], Figs. 7A & 7B, User identifies region of interest 406 and requests display with high quality, in accordance with user interface 1305].
However, Floch does not explicitly disclose encoding a derived track that (i) references at least two of the plurality of component tracks and (ii) specifies a set of operations for constructing video media and metadata samples of a sub-region of the reference media presentation based on the video media or metadata samples from the at least two referenced component tracks, wherein
the sub-region corresponds to a viewport or a region of interest (ROI), and each of the at least two referenced component tracks is a sub-region track that provides media samples for an associated portion of the sub-region;
video media content of the derived track is constructed by referencing the at least two referenced component tracks and performing the set of operations; 
the derived track comprises no video media content before performing the set of operations; and

constructing, at the receiver, the video media content of the derived track by
referencing the at least two referenced component tracks; and
dynamically constructing, at the receiver, the viewport or ROI of the video media content, by performing the set of operations specified in the derived track, for display to the user.
Denoual teaches encoding a derived track that (i) references at least two of the plurality of component tracks and (ii) specifies a set of operations for constructing video media and metadata samples of a sub-region of the reference media presentation based on the video media or metadata samples from the at least two referenced component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks, wherein an extended extractor type is defined for referencing timed media data and tile tracks from a composite track, being a media track representing a complete tiled frame (i.e., the composition of all tiles) using extractor objects to refer to NAL units in their respective tile tracks within an ROI], wherein
the sub-region corresponds to a viewport or a region of interest (ROI), and each of the at least two referenced component tracks is a sub-region track that provides media samples for an associated portion of the sub-region [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Region of interest comprising tiles 3 and 7, wherein each tile is of a tile track, as sub-region track, and comprises one spatial subsample of several timed samples];
video media content of the derived track is constructed by referencing the at least two referenced component tracks and performing the set of operations [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks, wherein an extended extractor type is defined for referencing timed media data and tile tracks from a composite track and to refer to NAL units in their respective tile tracks within an ROI]; 
the derived track comprises no video media content before performing the set of operations [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks, as not being video media content, wherein an extended extractor type is defined for referencing timed media data and tile tracks from a composite track and used to refer to NAL units in their respective tile tracks]; and
wherein the providing comprises transmitting the streaming media file over the transmission medium to a receiver [Paragraphs [0336]-[0339], Server serves media segment files to client device], for:
decoding the encoded derived track for the reference media presentation [Paragraphs [0343]-[0355], Extractor replaced by data it is referencing and the bit-stream is sent to a video decoder to be decoded]; 

referencing the at least two referenced component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks]; and
dynamically constructing, at the receiver, the viewport or ROI of the video media content, by performing the set of operations specified in the derived track, for display to the user [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0355], [0384]-[0385], [0432], Figs. 4, 7, 10-14 & 17-18, Composite track, as derived track, references timed media data tracks and tile tracks, and to refer to NAL units in their respective tile tracks within a selected ROI (step 608, dynamic), then displayed].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).
However, Floch and Denoual do not explicitly disclose upon receipt of a selection of the sub-region of the reference media presentation from a user through a user interface associated with the receiver:
constructing, at the receiver, the video media content of the derived track;
for the selected sub-region of the reference media presentation, for display to the user.
Thomas teaches upon receipt of a selection of the sub-region of the reference media presentation from a user through a user interface associated with the receiver [Pgs. 15, ll. 20-31, pgs. 27-28 ll. 36-9, Fig. 11B, pgs. 35-36, ll. 4-3, Figs. 19-20, User-generated ROI, through user interface, received at client in step 1134]:
constructing, at the receiver, the video media content of the derived track [Pgs. 15, ll. 20-31, pgs. 27-28 ll. 36-9, Fig. 11B, pgs. 35-36, ll. 4-3, Figs. 19-20, Step 1134, client switching from rendering of ROI to user-generated ROI, requesting temporal segments of HEVC tile streams from spatial buffer, Fig. 19A-B];
dynamically constructing, at the receiver, the viewport or ROI of the video media content for the selected sub-region of the reference media presentation, for display to the user [Pgs. 15, ll. 20-31, pgs. 27-28 ll. 36-9, Fig. 11B, pgs. 35-36, ll. 4-3, Figs. 19-20, Step 1134, client dynamically switching from rendering of ROI to user-generated ROI, requesting temporal segments of HEVC tile streams from spatial buffer, Fig. 19A-B, and displaying image region and optionally cropped image regions constructed from HEVC tile streams].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement ROI rendering modes of Thomas as above, to enable efficient streaming of an ROI of a wide field-of-view image area to a client, enabling smooth and seamless switching between single ROI stream to separate tile streams (Thomas, Pg. 2 ll. 26-30).
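
For illustration only, and not as a characterization of any cited reference's actual syntax, the extractor-based construction discussed in the claim 1 mapping can be sketched as follows. The sketch assumes a derived (composite) track that stores only references to tile-track samples and that the receiver resolves those references to build the ROI sample. The names Extractor, DerivedTrack, resolve_sample, and derived_track_for_roi are hypothetical and do not come from Floch, Denoual, Thomas, or ISO BMFF; the tile numbers 3 and 7 follow the ROI example cited from Denoual.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Extractor:
    """Hypothetical reference into another track: (track id, sample index)."""
    track_id: int
    sample_index: int


@dataclass
class DerivedTrack:
    """Hypothetical derived track: holds references only, no media bytes."""
    samples: List[List[Extractor]]  # one list of extractors per output sample


def derived_track_for_roi(roi_track_ids: List[int], num_samples: int) -> DerivedTrack:
    """Build a derived track that references every tile track covering the ROI."""
    return DerivedTrack(
        samples=[[Extractor(t, i) for t in roi_track_ids] for i in range(num_samples)]
    )


def resolve_sample(derived: DerivedTrack,
                   tile_tracks: Dict[int, List[bytes]],
                   sample_index: int) -> bytes:
    """At the receiver, replace each extractor with the data it references."""
    parts = []
    for ex in derived.samples[sample_index]:
        parts.append(tile_tracks[ex.track_id][ex.sample_index])
    return b"".join(parts)


# Assumed toy data: two tile tracks (tiles 3 and 7), two timed samples each.
tile_tracks = {3: [b"t3s0", b"t3s1"], 7: [b"t7s0", b"t7s1"]}
derived = derived_track_for_roi([3, 7], num_samples=2)
roi_sample = resolve_sample(derived, tile_tracks, 1)
```

In this sketch the derived track carries no media bytes until resolve_sample is run, which is the property the mapping above attributes to the composite track.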

Regarding claim 3, Floch, Denoual, and Thomas disclose the method of claim 1, and are analyzed as previously discussed with respect to the claim. 
Furthermore, Floch discloses further comprising: receiving a request over the transmission medium from the receiver for a subset of the plurality of component tracks; and transmitting over the transmission medium to the receiver only the requested component tracks [Paragraphs [0091]-[0096] & [0210], Server receives user request for region of interest, accesses the encapsulation file to determine image portions corresponding to the region of interest, and transmits said image portions to client device, all over network/wireless interfaces as transmission medium].

Regarding claim 4, Floch, Denoual, and Thomas disclose the method of claim 1, and are analyzed as previously discussed with respect to the claim. 
Furthermore, Floch discloses wherein the set of operations comprise an instruction to construct video media samples of a sub-region track for the derived track according to metadata samples of one or more of the at least one referenced component tracks [Paragraphs [0152] & [0181]-[0187], Multiple-extractors contain instructions on how to replace data from current track with data from other tracks that includes pointing to NALU units of other tracks as metadata samples and then the multiple-extractors themselves being replaced with image tile samples from the other tracks].
However, Floch does not explicitly disclose one or more of the at least two referenced component tracks.
Denoual teaches one or more of the at least two referenced component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).

Regarding claim 5, Floch, Denoual, and Thomas disclose the method of claim 4, and are analyzed as previously discussed with respect to the claim. 
Furthermore, Floch discloses wherein the instruction comprises track references to at least one component tracks that stores video media samples and one or more component tracks that store metadata samples [Paragraphs [0152]-[0159] & [0181]-[0187], Multiple-extractors contain instructions and data that reference the index of tiles located in external tracks that contain samples (frames) as media samples, and other NALUs as metadata samples, also track reference box, as track references that points to an index that provides a track identifier that comprises samples].
However, Floch does not explicitly disclose track references to at least two component tracks.
Denoual teaches track references to at least two component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).

Regarding claim 6, Floch, Denoual, and Thomas disclose the method of claim 4, and are analyzed as previously discussed with respect to the claim. 
Furthermore, Floch discloses wherein the instruction comprises a constructor that invokes the track references to at least one component tracks that stores video media samples and one or more component tracks that store metadata samples [Paragraphs [0152]-[0159] & [0181]-[0187], Multiple-extractors, as constructors, contain instructions and data that reference or invoke the index of tiles located in external tracks that contain samples (frames) as media samples, and other NALUs as metadata samples].
However, Floch does not explicitly disclose track references to at least two component tracks.
Denoual teaches track references to at least two component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).

Regarding claim 7, Floch, Denoual, and Thomas disclose the method of claim 4, and are analyzed as previously discussed with respect to the claim. 
Furthermore, Floch discloses wherein the instruction comprises track references to at least one component tracks that stores video media samples [Paragraphs [0158]-[0159] & [0202], Track reference box, as track references that points to an index that provides a track identifier that comprises samples] and an indicator for [Paragraphs [0153]-[0155], NAL (metadata sample) Unit Header containing parameter nal_unit_type set to 48 and 63 indicates that derived track contains metadata samples].
However, Floch does not explicitly disclose track references to at least two component tracks.
Denoual teaches track references to at least two component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).

Regarding claim 8, Floch, Denoual, and Thomas disclose the method of claim 1, and are analyzed as previously discussed with respect to the claim. 
Furthermore, Floch discloses wherein the set of operations specified by the derived track comprises an ordered list of operations to be performed on an ordered list of input images or samples from the plurality of component tracks for the reference media presentation [Paragraphs [0126]-[0173], [0198]-[0209], Fig. 12, During steps 1210-1211, Multiple-Extractors run the ordered list of operations detailed in Annexes A-E’s syntax to replace data in the derived track with samples referenced via track indices, wherein the ordered list of input images or samples seen in Figs. 9 & 11, are referenced].

Regarding claim 9, Floch, Denoual, and Thomas disclose the method of claim 8, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses performing the ordered list of operations [Paragraphs [0126]-[0173], [0198]-[0209], Fig. 12, During steps 1210-1211, Multiple-Extractors run the ordered list of operations detailed in Annexes A-E’s syntax].
However, Floch and Denoual do not explicitly disclose wherein one or more of the referenced component tracks are metadata tracks storing metadata samples, wherein each metadata sample of a referenced metadata track specifies a dimension of a sub-region and a position of the sub-region in the reference media presentation.
Thomas teaches wherein one or more of the referenced component tracks are metadata tracks storing metadata samples, wherein each metadata sample of a referenced metadata track specifies a dimension of a sub-region and a position of the sub-region in the reference media presentation [Pg. 10 ll. 14-15, Pg. 22, ll. 8-30, Metadata track comprising ROI coordinates, wherein the position and size of the ROI may be defined on the basis of ROI coordinates].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the metadata track of Thomas as above, to locate delivery nodes in the (Thomas, Pg. 22 ll. 12-16).
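
For illustration only, the role of a metadata sample that specifies the dimension and position of a sub-region, as discussed in the claim 9 mapping, can be sketched as follows. The sketch assumes a uniform tile grid whose tile size divides the frame evenly and a row-major tile numbering starting at 0. The names RoiMetadataSample and tiles_covering, and the frame and grid sizes, are hypothetical and are not drawn from Thomas or any other cited reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RoiMetadataSample:
    """Hypothetical metadata sample: position (x, y) and size of the ROI in pixels."""
    x: int
    y: int
    width: int
    height: int


def tiles_covering(roi: RoiMetadataSample,
                   frame_w: int, frame_h: int,
                   cols: int, rows: int) -> List[int]:
    """Return row-major indices of the grid tiles that overlap the ROI.

    Assumes frame_w is divisible by cols and frame_h by rows.
    """
    tile_w = frame_w // cols
    tile_h = frame_h // rows
    first_col = roi.x // tile_w
    last_col = min(cols - 1, (roi.x + roi.width - 1) // tile_w)
    first_row = roi.y // tile_h
    last_row = min(rows - 1, (roi.y + roi.height - 1) // tile_h)
    return [r * cols + c
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# Assumed example: a 1920x1080 frame split into a 4x2 grid of 480x540 tiles.
roi = RoiMetadataSample(x=500, y=100, width=600, height=300)
needed_tiles = tiles_covering(roi, 1920, 1080, cols=4, rows=2)
```

Under these assumptions the position and size carried by the metadata sample are enough to decide which sub-region tracks must be requested for the selected ROI.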

Regarding claim 12, Floch discloses a method implemented over a transmission medium, the method comprising:
receiving from a sender over the transmission medium a streaming media file for a reference media presentation that comprises a plurality of components [Paragraphs [0097]-[0098] & [0210], Client receives encoded video streams (reference media presentation) in the form of segment files, as plurality of components, all over wireless network], each of the components having a corresponding component track [Paragraphs [0111]-[0112], ISO BMFF and DASH extension make it possible to put each track in different segment files], each component track comprising video media or metadata samples [Paragraphs [0085]-[0087], [0101]-[0102], [0110] & [0196], video stream is subdivided into plurality of video streams (image frames as media) and segment files containing metadata corresponding to elementary streams];
receiving a selection from a user through a user interface, the user selecting a sub-region in the reference media presentation [Paragraphs [0091]-[0099] & [0210], Figs. 7A & 7B, User identifies region of interest and requests display with high quality, in accordance with user interface 1305]; and
retrieving a derived track from the streaming media file [Paragraphs [0098]-[0102], [0121]-[0127] & [0140]-[0187], Encapsulated file is received containing video tracks, with Tracks 704 and 705 as derived tracks], wherein the derived track (i) references at least one of the plurality of component tracks and (ii) specifies a set of operations for constructing video media samples based on the video media samples of the at least one referenced component tracks [Paragraphs [0096]-[0097], [0113]-[0119], [0124]-[0127], [0148]-[0152] & [0181]-[0186], Figs. 7A & 7B, Creation of corresponding video tracks as derived tracks containing multiple-extractors in NAL units, which contain instructions (ii) on how to replace data, from NAL units to media frames, from the current track with data from other tracks (i), and featuring only elementary video streams relating to ROI], wherein
the selected sub-region corresponds to a viewport or a region of interest (ROI), and each of the at least one referenced component tracks is a sub-region track that provides media samples for an associated portion of the selected sub-region [Paragraphs [0121]-[0127], Figs. 7A & 7B, Tiles a, b, c, d, as elementary video streams/sub-region tracks that correspond to the region of interest];
video media content of the derived track is constructed by referencing the at least one referenced component tracks and performing the set of operations [Paragraphs [0096]-[0097], [0113]-[0119], [0124]-[0127], [0148]-[0152] & [0181]-[0186], Figs. 7A & 7B, Creation of corresponding video tracks as derived tracks containing multiple-extractors in NAL units, which contain instructions (ii) on how to replace data, from NAL units to media frames, from the current track with data from other tracks].
However, Floch does not explicitly disclose wherein the derived track (i) references at least two of the plurality of component tracks and (ii) specifies a set of operations for constructing video media samples based on the video media samples of the at least two referenced component tracks, wherein
the sub-region corresponds to a viewport or a region of interest (ROI), and each of the at least two referenced component tracks is a sub-region track that provides media samples for an associated portion of the sub-region; video media content of the derived track is constructed by referencing the at least two referenced component tracks and performing the set of operations; and the derived track comprises no video media content before performing the set of operations;
wherein the viewport or ROI of the video media content for the reference media presentation is dynamically constructed, for display to the user.
Denoual teaches wherein the derived track (i) references at least two of the plurality of component tracks and (ii) specifies a set of operations for constructing video media samples based on the video media samples of the at least two referenced component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks, wherein an extended extractor type is defined for referencing timed media data and tile tracks from a composite track, being a media track representing a complete tiled frame (i.e., the composition of all tiles) using extractor objects to refer to NAL units in their respective tile tracks within an ROI], wherein
the sub-region corresponds to a viewport or a region of interest (ROI), and each of the at least two referenced component tracks is a sub-region track that provides media samples for an associated portion of the sub-region [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Region of interest comprising tiles 3 and 7, wherein each tile is of a tile track, as sub-region track, and comprises one spatial subsample of several timed samples];
video media content of the derived track is constructed by referencing the at least two referenced component tracks and performing the set of operations [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks, wherein an extended extractor type is defined for referencing timed media data and tile tracks from a composite track and to refer to NAL units in their respective tile tracks within an ROI]; and
the derived track comprises no video media content before performing the set of operations [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks, as not being video media content, wherein an extended extractor type is defined for referencing timed media data and tile tracks from a composite track and used to refer to NAL units in their respective tile tracks];
wherein the viewport or ROI of the video media content for the reference media presentation is dynamically constructed, for display to the user [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0355], [0384]-[0385], [0432], Figs. 4, 7, 10-14 & 17-18, Composite track, as derived track, references timed media data tracks and tile tracks, and to refer to NAL units in their respective tile tracks within a selected ROI (step 608, dynamic), then displayed].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).
However, Floch and Denoual do not explicitly disclose:
retrieving a derived track from the received streaming media file that corresponds to the sub-region selection; 
wherein the viewport or ROI of the video media content for the selected sub-region of the reference media presentation is dynamically constructed, for display to the user.
Thomas teaches retrieving a derived track from the received streaming media file that corresponds to the sub-region selection [Pgs. 15, ll. 20-31, pgs. 27-28 ll. 36-9, Fig. 11B, pgs. 35-36, ll. 4-3, Figs. 19-20, Step 1134, client switching from rendering of ROI to user-generated ROI, requesting temporal segments of HEVC tile streams from spatial buffer, Fig. 19A-B]; 
wherein the viewport or ROI of the video media content for the selected sub-region of the reference media presentation is dynamically constructed, for display to the user [Pgs. 15, ll. 20-31, pgs. 27-28 ll. 36-9, Fig. 11B, pgs. 35-36, ll. 4-3, Figs. 19-20, Step 1134, client dynamically switching from rendering of ROI to user-generated ROI, requesting temporal segments of HEVC tile streams from spatial buffer, Fig. 19A-B, and displaying image region and optionally cropped image regions constructed from HEVC tile streams].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement ROI rendering modes of Thomas as above, to enable efficient streaming of an ROI of a wide field-of-view image area to a client, enabling smooth and seamless switching between single ROI stream to separate tile streams (Thomas, Pg. 2 ll. 26-30).

Regarding claim 13, Floch, Denoual, and Thomas disclose the method of claim 12, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses wherein providing the video media samples for the selected sub-region comprises selecting a subset of the at least one referenced component tracks based on the specification of the sub-region [Paragraphs [0091]-[0096], [0179]-[0193] & [0209], Only the media segments that contain the high quality version of the tiles composing the ROI need to be sent from server to client, wherein user selects ROI and sends ROI request to server] and performing the set of operations based on the video media samples that are in the selected subset of component tracks but not on the video media samples that are not in the selected subset of component tracks [Paragraphs [0091]-[0096], [0140]-[0193] & [0209], Extractors and multiple-extractors, comprising instructions on how to extract data (tiles and samples) from referenced tracks that are encompassed under the user-selected ROI sent from the server].
However, Floch does not explicitly disclose of at least two referenced component tracks.
Denoual teaches of at least two referenced component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).

Regarding claim 16, Floch, Denoual, and Thomas disclose the method of claim 12, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses wherein the set of operations comprise an instruction to construct video media samples of a sub-region track for the derived track according to metadata samples of one or more of the referenced component tracks [Paragraphs [0152] & [0181]-[0187], Multiple-extractors contain instructions on how to replace data from current track with data from other tracks that includes pointing to NALU units of other tracks as metadata samples and then the multiple-extractors themselves being replaced with image tile samples from the other tracks].

Regarding claim 17, Floch, Denoual, and Thomas disclose the method of claim 16, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses wherein the instruction comprises track references to at least one of component tracks that stores video media samples and one or more component tracks that store metadata samples [Paragraphs [0152]-[0159] & [0181]-[0187], Multiple-extractors contain instructions and data that reference the index of tiles located in external tracks that contain samples (frames) as media samples, and other NALUs as metadata samples, also track reference box, as track references that points to an index that provides a track identifier that comprises samples].
However, Floch does not explicitly disclose track references to at least two of component tracks.
Denoual teaches track references to at least two of component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
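It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).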

Regarding claim 18, Floch, Denoual, and Thomas disclose the method of claim 16, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses wherein the instruction comprises a constructor that invokes the track references to at least one component tracks that stores video media samples and one or more component tracks that store metadata samples [Paragraphs [0152]-[0159] & [0181]-[0187], Multiple-extractors, as constructors, contain instructions and data that reference or invoke the index of tiles located in external tracks that contain samples (frames) as media samples, and other NALUs as metadata samples].
However, Floch does not explicitly disclose track references to at least two component tracks.
Denoual teaches of track references to at least two component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
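It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).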

Regarding claim 19, Floch, Denoual, and Thomas disclose the method of claim 16, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses wherein the instruction comprises track references to at least one component tracks that stores video media samples [Paragraphs [0158]-[0159] & [0202], Track reference box, as track references that points to an index that provides a track identifier that comprises samples] and an indicator for indicating that metadata samples are stored in the derived track [Paragraphs [0153]-[0155], NAL (metadata sample) Unit Header containing parameter nal_unit_type set to 48 and 63 indicates that derived track contains metadata samples].
However, Floch does not explicitly disclose track references to at least two component tracks.
Denoual teaches of track references to at least two component tracks [Paragraphs [0208], [0236], [0269], [0283]-[0287], [0294]-[0297], [0332]-[0354], [0384]-[0385], [0432], Figs. 4, 10-14 & 17-18, Composite track, as derived track, comprises references to timed media data tracks and tile tracks].
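It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the composite tracks of Denoual as above, to provide an efficient data organization and track description scheme suitable for spatial tiles, which ensures, whatever track combination is selected by a client application, that the result of the ISO BMFF parsing always leads to a valid video elementary bit-stream for the video decoder (Denoual, Paragraphs [0017]).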

Regarding claim 20, Floch, Denoual, and Thomas disclose the method of claim 12, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses wherein the set of operations specified by the derived track comprises an ordered list of operations to be performed on an ordered list of input images or samples from the plurality of component tracks for the reference media presentation [Paragraphs [0126]-[0173], [0198]-[0209], Fig. 12, During steps 1210-1211, Multiple-Extractors run the ordered list of operations detailed in Annexes A-E’s syntax to replace data in the derived track with samples referenced via track indices, wherein the ordered list of input images or samples seen in Figs. 9 & 11, are referenced].

Regarding claim 21, Floch, Denoual, and Thomas disclose the method of claim 20, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses performing the ordered list of operations [Paragraphs [0126]-[0173], [0198]-[0209], Fig. 12, During steps 1210-1211, Multiple-Extractors run the ordered list of operations detailed in Annexes A-E’s syntax].
However, Floch and Denoual do not explicitly disclose wherein one or more of the referenced component tracks are metadata tracks storing metadata samples, wherein each metadata sample of a referenced metadata track specifies a dimension of a sub-region and a position of the sub-region in the reference media presentation.
Thomas teaches wherein one or more of the referenced component tracks are metadata tracks storing metadata samples, wherein each metadata sample of a referenced metadata track specifies a dimension of a sub-region and a position of the sub-region in the reference media presentation [Pg. 10 ll. 14-15, Pg. 22, ll. 8-30, Metadata track comprising ROI coordinates, wherein the position and size of the ROI may be defined on the basis of ROI coordinates].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the metadata track of Thomas as above, to locate delivery nodes in the network node and deliver desired video data by using the metadata in the ROI streams to configure the decoder before the video stream data is received by the client (Thomas, Pg. 22 ll. 12-16).

Regarding claim 24, apparatus claim 24 is drawn to the apparatus using/performing substantially the same method as claimed in claim 12. Therefore, apparatus claim 24 corresponds to method claim 12, and is rejected for the same rationale as used above.
Furthermore, Floch discloses a user interface circuit capable of receiving a selection of a sub-region in the reference media presentation [Paragraphs [0099], [0104] & [0210], User interface circuit receives user input to select and define ROI in video stream]; and
[Paragraphs [0200]-[0210], Client device comprises decoder].

Claims 10 & 22 are rejected under 35 U.S.C. 103 as being unpatentable over Le Floch et al. (US 2016/0029091 A1) (hereinafter Floch), Denoual et al. (US 2016/0165321 A1) (hereinafter Denoual), and Thomas et al. (WO 2015/197815 A1) (hereinafter Thomas) in view of Abbas et al. (US 2017/0223368 A1) (hereinafter Abbas).

Regarding claim 10, Floch, Denoual, and Thomas disclose the method of claim 9, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses performing the ordered list of operations [Paragraphs [0126]-[0173], [0198]-[0209], Fig. 12, During steps 1210-1211, Multiple-Extractors run the ordered list of operations detailed in Annexes A-E’s syntax].
However, Floch, Denoual, and Thomas do not disclose wherein the sub-region corresponds to a viewport and the reference media presentation is a 360-degree virtual reality (360VR) video presentation, and wherein each metadata sample further specifies a set of angles of the viewport relative to the reference media presentation.
Abbas teaches wherein the sub-region corresponds to a viewport and the reference media presentation is a 360-degree virtual reality (360VR) video presentation [Paragraphs [0054] & [0090], Within a 360-degree virtual reality environment, the area at which the user is looking is a viewport], and wherein each metadata sample further specifies a set of angles of the viewport relative to the reference media presentation [Paragraphs [0065], [0081], [0090], [0126], [0217]-[0218], Metadata modules, including a gyroscope or orientation sensor, provide orientation information (angles) describing orientation of display device to update viewport within 360-degree environment].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the angles of a viewpoint as metadata in Abbas as above, to provide gyroscope and orientation information such that any change in motion, orientation, or change in location experienced by the display device is reflected in the video content being displayed (Abbas, Paragraph [0081]).

Regarding claim 22, Floch, Denoual, and Thomas disclose the method of claim 21, and are analyzed as previously discussed with respect to the claim.
Furthermore, Floch discloses performing the ordered list of operations [Paragraphs [0126]-[0173], [0198]-[0209], Fig. 12, During steps 1210-1211, Multiple-Extractors run the ordered list of operations detailed in Annexes A-E’s syntax].
However, Floch, Denoual, and Thomas do not disclose wherein the sub-region corresponds to a viewport and the reference media presentation is a 360-degree virtual reality (360VR) video presentation, and wherein each metadata sample further specifies a set of angles of the viewport relative to the reference media presentation.
Abbas teaches wherein the sub-region corresponds to a viewport and the reference media presentation is a 360-degree virtual reality (360VR) video presentation [Paragraphs [0054] & [0090], Within a 360-degree virtual reality environment, the area at which the user is looking is a viewport], and wherein each metadata sample further specifies a set of angles of the viewport relative to the reference media presentation for performing the ordered list of operations [Paragraphs [0065], [0081], [0090], [0126], [0217]-[0218], Metadata modules, including a gyroscope or orientation sensor, provide orientation information (angles) describing orientation of display device to update viewport within 360-degree environment].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement the angles of a viewpoint as metadata in Abbas as above, to provide gyroscope and orientation information such that any change in motion, orientation, or change in location experienced by the display device is reflected in the video content being displayed (Abbas, Paragraph [0081]).

Claims 11 & 23 are rejected under 35 U.S.C. 103 as being unpatentable over Le Floch et al. (US 2016/0029091 A1) (hereinafter Floch), Denoual et al. (US 2016/0165321 A1) (hereinafter Denoual), and Thomas et al. (WO 2015/197815 A1) (hereinafter Thomas) in view of Shaw et al. (US 2016/0021406 A1) (hereinafter Shaw).

Regarding claim 11, Floch, Denoual, and Thomas disclose the method of claim 9, and are analyzed as previously discussed with respect to the claim.
However, Floch, Denoual, and Thomas do not disclose the particulars of claim 11.
Shaw teaches wherein each metadata sample further specifies a shape of the sub-region [Paragraphs [0044] & [0054], Metadata indicating preferred shape for area of attention, as viewport].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement shape metadata as taught in Shaw as above, to properly display an area of attention through adjustment based upon display device characteristics and user preferences (Shaw, Paragraph [0044]).

Regarding claim 23, Floch, Denoual, and Thomas disclose the method of claim 21, and are analyzed as previously discussed with respect to the claim.
However, Floch, Denoual, and Thomas do not disclose the particulars of claim 23.
Shaw teaches wherein each metadata sample further specifies a shape of the sub-region [Paragraphs [0044] & [0054], Metadata indicating preferred shape for area of attention, as viewport].
It would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Floch to integrate and implement shape metadata as taught in Shaw as above, to properly display an area of attention through adjustment based upon display device characteristics and user preferences (Shaw, Paragraph [0044]).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL CHANG whose telephone number is (571)272-5707.  The examiner can normally be reached on M-Sa, 12PM - 10 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Czekaj, can be reached on 571-272-7327.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DANIEL CHANG/Examiner, Art Unit 2487