DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 5/12/2021 has been entered.
 
Examiner's Note  
The instant application has a lengthy prosecution history and the examiner encourages the applicant to have a telephonic interview with the examiner prior to filing a response to the instant office action. Also, prior to the interview the examiner encourages the applicant to present multiple possible claim amendments, so as to enable the examiner to identify claim amendments that will advance prosecution in a meaningful manner.

Acknowledgment 
Claims 1, 15, and 21-22, amended on 5/12/2021, are acknowledged by the examiner. 
Claims 7 and 20, canceled on 5/12/2021, are acknowledged by the examiner.    

Response to Arguments 
Applicant's arguments with respect to claims 1, 15, 21, 22, and their dependent claims have been fully considered, but some are rendered moot in view of the new ground of rejection necessitated by the applicant's amendments.
Claim Rejection – 35 U.S.C. § 112
The following is a quotation of 35 U.S.C. 112(a): 
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention. 
The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112: 
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-2, 5, 12, 15-16, 19, and 21-26 are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, as containing new matter. Claims 1, 15, and 21-22 include the following claim limitations: “wherein the information representing a quality order for the sphere regions comprises a first value for a quality of a first sphere region of the sphere regions, and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region” and “information representing a quality order for the sphere regions”. Nowhere does the application specification describe “information representing a quality order for the sphere regions”. First, it is noted that paragraphs [0136] and [183] of the application specification discuss quality at a picture level, not at a sphere level. Second, several parameters in Fig. 8B of the specification are used to indicate a quality for a region (“sphere regions”). Moreover, the related parameters in Fig. 8B describe only a quality type and a quality level, not a “quality order”. As a result, the limitations “wherein the information representing a quality order for the sphere regions comprises a first value for a quality of a first sphere region of the sphere regions, and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region” and “information representing a quality order for the sphere regions” are new matter, which is not described in the application as originally filed. The new matter is required to be canceled from the claims (see MPEP 608.04). In this Office action, the limitation “a quality order for the sphere regions” is interpreted as a parameter that indicates quality for a video frame.
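For illustration only, the limitation quoted above can be pictured as one quality value per sphere region, with the two values compared to decide which region has the higher priority. The sketch below is hypothetical: the names SphereRegionQuality, region_id, quality_value, and has_higher_priority do not appear in the application or in the cited references, and the convention that a smaller value means a higher priority is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SphereRegionQuality:
    """Hypothetical record holding one quality value for one sphere region."""
    region_id: int      # hypothetical identifier of a sphere region
    quality_value: int  # the "first value" or "second value" of the limitation


def has_higher_priority(a: SphereRegionQuality, b: SphereRegionQuality) -> bool:
    """Return True if region a has a higher priority than region b.

    Assumption (not taken from the record): a smaller value means a
    higher priority.
    """
    return a.quality_value < b.quality_value


# Two different sphere regions with different, made-up quality values.
first = SphereRegionQuality(region_id=0, quality_value=1)
second = SphereRegionQuality(region_id=1, quality_value=2)
first_outranks_second = has_higher_priority(first, second)
```

Under the stated assumption, the first region would be treated as having the higher priority; reversing the convention would reverse the result.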

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.   
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA  35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
            This application currently names joint inventors. In considering patentability of the claims under pre-AIA  35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA  35 U.S.C. 103(c) and potential pre-AIA  35 U.S.C. 102(e), (f) or (g) prior art under pre-AIA  35 U.S.C. 103(a).
Claims 1, 15, and 21-22 are rejected under 35 U.S.C. 103 as being unpatentable over Sasaki et al.  (US Patent Application Publication 2012/0106921 A1), (“Sasaki”), in view of Chang et al. (US Patent Application Publication 2017/0118475 A1), (“Chang”), in view of Gao et al. (US Patent 9,042,458 B2), (“Gao”), in view of Campbell et al. (US Patent Application Publication 2016/0277772 A1), (“Campbell”).
Regarding claim 1, Sasaki meets the claim limitations as follows.
A method (i.e. an encoding method) [Sasaki: para. 0002] for processing 360-degree video data (i.e. 3D video) [Sasaki: para. 0011], the method (i.e. an encoding method) [Sasaki: para. 0002] performed by a 360-degree video transmission apparatus ((i.e. a video encoder) [Sasaki: para. 0184]; (i.e. 3D mode can be performed) [Sasaki: para. 0007]), the method (i.e. an encoding method) [Sasaki: para. 0002] comprising: obtaining (i.e. obtained) [Sasaki: para. 0090] the 360-degree video data ((i.e. a transport stream is obtained by multiplexing a video stream) [Sasaki: para. 0090; Fig. 6]; (i.e. a transport stream received into a video stream and other streams) [Sasaki: para. 0295]; (i.e. receives transport streams from external sources) [Sasaki: para. 292]; (i.e. composition of 3D video) [Sasaki: para. 0009]) captured by at least one camera ((i.e. The value "3" set to the "camera_assignment_type" identifier indicates that the transport stream is composed of video streams of a center camera perspective (C), a left camera perspective (L), and a right camera perspective (R). The value "4" set to the "camera_assignment_type" identifier indicates that the transport stream is composed of video streams of a left camera perspective (L), a first right camera perspective (R1), and a second right camera perspective (R2)) [Sasaki: para. 0256]; ((i.e. In the context of parallax images, images viewed by the left eye are called left-view images (L-images) and images viewed by the right eye are called right-view images (R-images). Furthermore, a motion picture in which each picture is an L-image is called the left-view video and a motion picture in which each picture is an R-image is called the right-view video) [Sasaki: para. 0110; Fig. 8]; (i.e. 
pixels forming the left-view picture form an image only for the left eye and the pixels forming the right-view picture form an image only for the right eye, with the result being a parallax picture shown to both eyes, which perceive the picture in 3D) [Sasaki: para. 0084; Fig. 23]), the 360-degree video data ((i.e. 3D video) [Sasaki: para. 0011]; (i.e. the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view. video exists in the transport stream) [Sasaki: para. 0013]; ; (i.e. applying the depth map) [Sasaki: para. 0076; Fig. 3]) covering sphere regions; projecting (i.e. applying) [Sasaki: para. 0076] the 360-degree video data ((i.e. 3D video) [Sasaki: para. 0011]; (i.e. the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view. video exists in the transport stream) [Sasaki: para. 0013]; ; (i.e. applying the depth map) [Sasaki: para. 0076; Fig. 3]) into a picture ((i.e. indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth) [Sasaki: para. 0076]; (i.e. the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view. video exists in the transport stream. Thus, specification is made of video streams to be extracted for 2D playback) [Sasaki: para. 0013]); generating ((i.e. generating) [Sasaki: para. 0009]; (i.e. encoded) [Sasaki: para. 0093]) metadata (i.e. encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 
0093] for the 360-degree video data ((i.e. In the following, description is provided on a data creation device and a data creation method pertaining to the present embodiment with reference to FIG. 23. The data creation device includes: a video encoder 2301, a multiplexer 2302, and a data containment method determining unit 2303. The data containment method determining unit 2303 specifies the data format of a transport stream to be created. For instance, when creating a transport stream having a video format as illustrated in FIG. 14, the section from PTS180000 to PTS5580000 is specified as the Side-by-Side playback section, the section from PTS5580000 to PTSl 0980000 is specified as the 2D playback section, and the section following PTS10980000 is specified as the Top-and-Bottom playback section. The data containment method determining unit 2303 further transmits a specification of time information and frame-packing information storage type to the video encoder 2301 in addition to the information regarding such playback methods.) [Sasaki: para. 0184-0186; Fig. 23]; (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. 
The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; (i.e. generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043]; (i.e. The 3D video specification information, which indicates the combination of video streams required for 3D playback, exists in the transport stream. The display apparatus, when first performing 2D playback and then switching to 3D playback, refers to the 3D video specification information indicating the correlation between video streams contained in the transport stream and thereby identifies which of the video streams are necessary for 3D playback) [Sasaki: para. 0010]; (i.e. information indicating stream identifiers each corresponding to the 2D video stream, the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view. video exists in the transport stream. Thus, specification is made of video streams to be extracted for 2D playback) [Sasaki: para. 0013]); encoding (i.e. encoded) [Sasaki: para. 0093] the picture ((i.e. encoded picture data) [Sasaki: para. 0093]; (i.e. a given picture is encoded) [Sasaki: para. 0091]; (i.e. an encoding step of compression-coding images) [Sasaki: para. 0009]; (i.e. shrinking each of the pictures corresponding to the left-view video and the right-view video so as to combine the pictures into one, and is performed using ordinary motion picture compression-coding methods) [Sasaki: para. 0112]); encapsulating (i.e. packed) [Sasaki: claim 6] the picture and the metadata into a file ((i.e. 
the 3D video specification information specifies a single video stream that constitutes the 3D video, the single video stream constitutes L/R packed video, the L/R packed video being video where each frame thereof contains a left-view image and a right-view image, and the contents table includes L/R packing information, the L/R packing information indicating a packing method according to which the left-view image and the right view image are contained in each frame constituting the L/R packed video) [Sasaki: para. 0093]; (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093]; ((i.e. specify whether or not the extended video stream is a Side-by-Side video stream and thereby perform 3D playback in accordance with the storage format applied by referring to the "frame_packing_arrangement_type" identifier) [Sasaki: para. 0232]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]), wherein the metadata (i.e. 
A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] includes information for the sphere regions (i.e. Under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] including: information for a type for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information for a number of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information representing a quality type for a difference in a quality on the picture ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 
18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094] – Note: The depth map represents a different quality of the region), information representing a quality order for the sphere regions ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. 
The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]), information representing a width for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099], and information representing a height for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]; and
transmitting the file ((i.e. The data containment method determining unit 2303 further transmits a specification of time information and frame-packing information storage type to the video encoder 2301 in addition to the information regarding such playback methods.) [Sasaki: para. 0186; Figs. 1, 23]; (i.e. the description on the structure of a typical stream transmitted by digital television broadcasts and the like) [Sasaki: para. 0107]), wherein the metadata (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] further includes: flag information for representing whether (i.e. a flag indicating whether) [Sasaki: para. 0294] a boundary for a region in the picture is processed (i.e. the "frame_cropping" information indicates the upper, lower, left, and right boundaries of the cropping area such that the differences thereof from the upper, lower, left, and right boundaries of the encoded frame indicate the area to be cropped out. More precisely, to designate a cropping area, a flag ("frame_cropping_flag") is set to 1, and the upper, lower, left, and right areas to be cropped out are respectively indicated as the fields "frame_crop_top_offset", "frame_crop_bottom_offset", "frame_crop_left_offset", and "frame_crop_right_offset".) [Sasaki: para. 0099; Figs. 11-12], and type information for representing a type related to the boundary,
wherein the information representing a quality order ((i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. two types of "processing priority" are prepared and provided to the frame-packing information descriptors, one being "descriptor prioritized" and the other being "video prioritized") [Sasaki: para. 0164; Fig. 16 – Note: More details are described in paras. 0164-0166]) for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] comprises a first value for a quality of a first sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig.
24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], and a second value for a quality of a second sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig. 24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] different from the first sphere region (i.e. The depth map contains depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24], and 
the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24].  
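The cropping and scaling arithmetic quoted from Sasaki para. 0099 in the mapping above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names cropped_size and display_width are hypothetical, the offsets are treated as plain pixel counts, and the 1440x1088 encoded frame size is a made-up input chosen so that the cropping area comes out at the 1440x1080 figure given in the quoted passage.

```python
from fractions import Fraction


def cropped_size(frame_w, frame_h, top, bottom, left, right):
    """Size of the cropping area left after removing the four offsets.

    Assumption: offsets are plain pixel counts, mirroring the
    frame_crop_top/bottom/left/right_offset fields named in the quote.
    """
    return frame_w - left - right, frame_h - top - bottom


def display_width(crop_w, horizontal_scale):
    """Width after horizontal upconversion by an exact rational factor."""
    return int(crop_w * horizontal_scale)


# Hypothetical encoded frame of 1440x1088 with 8 bottom rows cropped out.
w, h = cropped_size(1440, 1088, top=0, bottom=8, left=0, right=0)

# Upconversion by a factor of 4/3 in the horizontal direction: 1440 x 4/3.
out_w = display_width(w, Fraction(4, 3))
```

With these inputs the cropping area is 1440x1080 and the upconverted width is 1920, matching the 1920x1080 display resolution stated in the quoted paragraph.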
Sasaki does not explicitly disclose the following claim limitations (Emphasis added). 
A method for processing 360-degree video data, the method performed by a 360-degree video transmission apparatus, the method comprising: obtaining the 360-degree video data captured by at least one camera, the 360-degree video data covering sphere regions; projecting the 360-degree video data into a picture; generating metadata for the 360-degree video data; encoding the picture; encapsulating the picture and the metadata into a file, wherein the metadata includes information for the sphere regions including: information for a type for the sphere regions, information for a number of the sphere regions, information representing a quality type for a difference in a quality on the picture, information representing a quality order for the sphere regions, information representing a width for the sphere regions, and information representing a height for the sphere regions; and  transmitting the file, wherein the metadata further includes: flag information for representing whether a boundary for a region in the picture is processed, and type information for representing a type related to the boundary, 
wherein the information representing a quality order for the sphere regions comprises a first value for a quality of a first sphere region of the sphere regions, and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region, and the first and second values indicate that the first sphere region has a higher priority than the second sphere region. 
However, in the same field of endeavor, Chang further discloses the claim limitations, including the deficient claim limitations, as follows:
the 360-degree video data covering sphere regions (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18]; ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The picture regions corresponding to overlapped areas are indicated as dashed boxes (211-218)) [Chang: para. 0059]) (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092], information for a number of the sphere regions (i.e. the cubic projection is a type of projection for mapping the surface of a sphere onto six faces of a cube. The images are arranged like the faces of a cube. FIG. 25B illustrates an example of a 360-degree picture based on the cubic projection. In order to properly use the 360-degree video, it requires to include 360-degree video metadata associated with the 360-degree video) [Chang: para.
0092], information representing a quality type for a difference in a quality on the picture ((i.e. FIG. 27 illustrates an example of cloud-based processing of 360-degree video according to one embodiment of the present invention, where the video data captured by a 360-degree video capture camera 2710 is uploaded to the cloud 2720. The cloud environment has more computational resources and can provide processed video with different quality according to the network bandwidth. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a quality order for the sphere regions (i.e. The set of cameras have to be calibrated to avoid possible misalignment. Calibration is a process of correcting lens distortion and describing the transformation between world coordinate and camera coordinate. The calibration process is necessary to allow correct stitching of videos. 
Individual video recordings have to be stitched in order to create one 360-degree video) [Chang: para. 0005]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a width for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. 
According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]), and information representing a height for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics (e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]);
type information for representing a type related to the boundary, 
(i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] different from the first sphere region (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. 
The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18].  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki with Chang to render video data into a 360-degree video.  
Therefore, the combination of Sasaki with Chang will enable users to view captured videos with a panoramic view of a 360-degree field of view [Chang: para. 0003]. 
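For illustration only, the equirectangular projection described in the cited passage (horizontal coordinate is longitude, vertical coordinate is latitude) [Chang: para. 0092] can be sketched as follows. The function name, the longitude range of -180 to 180 degrees, the latitude range of -90 to 90 degrees, and the top-left pixel origin are assumptions made for this sketch and are not taken from Chang.

```python
def sphere_to_equirect(lon_deg, lat_deg, width, height):
    """Map a point on the sphere to equirectangular pixel coordinates.

    Assumptions (not from the cited reference): longitude in [-180, 180],
    latitude in [-90, 90], pixel origin at the top-left corner.
    """
    # Horizontal coordinate grows linearly with longitude.
    x = (lon_deg + 180.0) / 360.0 * width
    # Vertical coordinate grows linearly as latitude decreases from +90.
    y = (90.0 - lat_deg) / 180.0 * height
    return x, y


if __name__ == "__main__":
    # The point at longitude 0, latitude 0 lands at the picture center.
    print(sphere_to_equirect(0.0, 0.0, 3840, 1920))
```

Under these assumptions, the mapping is linear in both axes, which is why the cited passage describes the coordinates as simply longitude and latitude.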
Sasaki and Chang do not explicitly disclose the following claim limitations (Emphasis added). 
type information for representing a type related to the boundary. 
However, in the same field of endeavor Gao further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the metadata further includes (i.e. parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 18-20]: flag information for representing whether a boundary for a region in the picture is processed (i.e. the boundary between a block and its neighbor block is adaptively filtered depending on metadata signaled in the bitstream, smoothness of pixel values across the boundary, and quantization parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 15-20], and type information for representing a type related to the boundary ((i.e. a video encoder or decoder determines the picture coding type (e.g., I, P, B or BI) of a video picture. The encoder/decoder partitions the video picture into multiple segments for deblock filtering. Based at least in part on the picture coding type, the encoder/decoder selects between multiple different patterns for splitting operations of the deblock filtering into multiple passes. The selection of the pattern can also be based at least in part on the frame coding mode (e.g., progressive, interlaced field, or interlaced frame) of the picture. The encoder/decoder organizes the deblock filtering for the video picture as multiple tasks, where a given task includes the operations of one of the multiple passes for one of the multiple segments, then performs the multiple tasks with multiple threads.) [Gao: col. 3, line 7-21]; (i.e. type of a picture, potentially filtering block boundaries and/or subblock boundaries in any of several different processing orders. For additional details, see sections 8.6 and 10.10 of the VC-1 standard) [Gao: col. 11, line 15-19]; (i.e. According to another aspect of the techniques and tools described herein, a video encoder or decoder partitions a video picture into multiple segments for deblock filtering whose operations are split into three or more passes. 
For example, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal block boundaries, a third pass for filtering of horizontal sub-block boundaries, and a fourth pass for filtering of vertical boundaries. Or, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal boundaries for a top field, a third pass for filtering of horizontal boundaries for a bottom field, and a fourth pass for filtering of vertical boundaries for the top field and the bottom field. The encoder/decoder organizes the deblock filtering for the video picture as multiple tasks, then performs the multiple tasks with multiple threads) [Gao: col. 3, line 22-37]);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki and Chang with Gao to program the system to encode parameters related to block boundaries.  
Therefore, the combination of Sasaki and Chang with Gao will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46]. 
In the same field of endeavor Campbell further discloses the claim limitations as follows:
wherein the information representing a quality order for the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]) comprises a first value for a quality of a first sphere region of the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 
2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]), and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region ((i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources.) [Campbell: para. 0061]; (i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. where segments are available at different quality levels, the segments shown in different areas in FIGS. 5 and 6 are retrieved at different quality levels. 
That is the primary segments in the diagonally shaded regions 540a, 540b, 640a, and 640b are retrieved in a relatively high quality, whereas the auxiliary segments in cross hatched regions 542a, 542b, 642a, and 642b are retrieved at a relatively lower quality. Where the secondary auxiliary segments in area 644 are downloaded, lower still quality versions of these are retrieved.) [Campbell: para. 0067; Figs. 5-6]), and the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in much the same way as adaptive video streaming, such as adaptive bitrate streaming) [Campbell: para. 0061-0062]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki, Chang, and Gao with Campbell to program the system to encode parameters related to quality for each region.  
Therefore, the combination of Sasaki, Chang, and Gao with Campbell will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46] and display the best view of regions of interest [Campbell: para. 0061]. 
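For illustration only, the claimed quality order, in which a first value and a second value indicate that a first sphere region has a higher priority than a second sphere region, can be sketched as per-region metadata. The class name, the field names, and the convention that a lower value means a higher priority are hypothetical assumptions for this sketch; they are not taken from the claims or from any cited reference.

```python
from dataclasses import dataclass


@dataclass
class SphereRegion:
    # All field names are hypothetical and chosen only for this sketch.
    region_id: int
    quality_order: int  # assumption: a lower value means a higher priority
    width_deg: float    # width of the region on the sphere, in degrees
    height_deg: float   # height of the region on the sphere, in degrees


def order_by_priority(regions):
    """Return the regions sorted from highest to lowest priority."""
    return sorted(regions, key=lambda r: r.quality_order)


if __name__ == "__main__":
    regions = [
        SphereRegion(region_id=1, quality_order=2, width_deg=90.0, height_deg=60.0),
        SphereRegion(region_id=0, quality_order=1, width_deg=90.0, height_deg=60.0),
    ]
    # Region 0 carries the lower value, so it is listed first.
    print([r.region_id for r in order_by_priority(regions)])
```

A receiver following this sketch could, for example, retrieve or render the regions in the returned order, in the manner Campbell describes for segments retrieved at different quality levels [Campbell: para. 0067].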

Regarding claim 15, Sasaki meets the claim limitations as follows.
A method for processing (i.e. a decoding method) [Sasaki: claim 19] 360-degree video data (i.e. 3D video) [Sasaki: para. 0011], the method performed (i.e. a decoding method) [Sasaki: claim 19] by a 360 video receiving apparatus ((i.e. a video decoder) [Sasaki: para. 0105]; (i.e. 3D mode can be performed) [Sasaki: para. 0007]), comprising: 
receiving media data including a picture (i.e. a receiving step of receiving input of a transport stream from external sources, the transport stream including a plurality of video streams) [Sasaki: claim 19] in which the 360-degree video data ((i.e. a transport stream is obtained by multiplexing a video stream) [Sasaki: para. 0090; Fig. 6]; (i.e. a transport stream received into a video stream and other streams) [Sasaki: para. 0295]; (i.e. receives transport streams from external sources) [Sasaki: para. 292]; (i.e. composition of 3D video) [Sasaki: para. 0009]) is projected (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor) [Sasaki: para. 0183 – Please see Fig. 18 and the projecting description in paragraphs 0179-0182] and metadata for the 360-degree video data (i.e. the transport stream includes 3D video specification information specifying video streams constituting 3D video) [Sasaki: claim 19], the 360-degree video data ((i.e. 3D video) [Sasaki: para. 0011]; (i.e. the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view video exists in the transport stream) [Sasaki: para. 0013]; (i.e. applying the depth map) [Sasaki: para. 0076; Fig. 3]) covering sphere regions; 
decoding the picture based on the metadata ((i.e. When receiving a video stream from the demultiplexer 1503, the video decoding unit 1504 decodes the received video stream and further, extracts "frame-packing information" from the received video stream. The decoding of video in units of frames is performed by the video decoding unit 1504. Here, when the "frame-packing information storage type" of the frame-packing information descriptor notified from the demultiplexer 1503 indicates "in units of GOPs", the video decoding unit 1504 performs the extraction of "frame-packing information" with respect to only the video access units at the head of the GOPs and skips the rest of the video access units.) [Sasaki: para. 0147]; (i.e. performs decoding of right-view images and left-view images) [Sasaki: para. 0004]); and 
rendering the picture ((i.e. The video decoding unit 1504 writes decoded frames to the frame buffer (1) 1508 and outputs the "frame-packing information" to the display judging unit 1506.) [Sasaki: para. 0148]; (i.e. the 3D digital television 4200 performs 3D playback by decoding the 2D/L video stream and the R video stream so extracted with use of the video decoder 4207 and by outputting video signals to the display unit 4213) [Sasaki: para. 0341]; (i.e. Further, the 3D digital television 4200 performs 3D playback by decoding the right-view video stream and the left-view video stream so extracted with use of the video decoder 4207 and by outputting video signals to the display unit 4213) [Sasaki: para. 0354]; (i.e. Further, the 3D digital television 4200 performs 3D playback by decoding the base view stream and the dependent view stream so extracted with use of the video decoder 4207 and by outputting video signals to the display unit 4213) [Sasaki: para. 0360]; (i.e. the playback unit plays back the 3D video by using the video streams constituting the 3D video when the current mode is the 3D mode) [Sasaki: claim 19]),
wherein the metadata (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] includes information for the sphere regions (i.e. Under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] including: information for a type for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information for a number of the sphere regions (i.e. Under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information representing a quality type for a difference in a quality on the picture ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 
18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094] – Note: The depth map represents a different quality of the region), information representing a quality order for the sphere regions ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. 
The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]), information representing a width for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099], and information representing a height for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]; and
transmitting the file (i.e. The data containment method determining unit 2303 further transmits a specification of time information and frame-packing information storage type to the video encoder 2301 in addition to the information regarding such playback methods.) [Sasaki: para. 0186; Figs. 1, 23]; (i.e. the description on the structure of a typical stream transmitted by digital television broadcasts and the like) [Sasaki: para. 0107]), wherein the metadata (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] further includes: flag information for representing whether (i.e. a flag indicating whether) [Sasaki: para. 0294] a boundary for a region in the picture is processed (i.e. the "frame_cropping" information indicates the upper, lower, left, and right boundaries of the cropping area such that the differences thereof from the upper, lower, left, and right boundaries of the encoded frame indicate the area to be cropped out. More precisely, to designate a cropping area, a flag ("frame_cropping_flag") is set to 1, and the upper, lower, left, and right areas to be cropped out are respectively indicated as the fields "frame_crop_top_offset", "frame_crop_bottom_offset", "frame_crop_left_offset", and "frame_crop_right_offset".) [Sasaki: para. 0099; Figs. 11-12], and type information for representing a type related to the boundary, 
wherein the information representing a quality order  ((i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. two types of "processing priority" are prepared and provided to the frame-packing information descriptors, one being "descriptor prioritized" and the other being "video prioritized") [Sasaki: para. 0164; Fig. 16 – Note: More details is described in para. 0164-0166]) for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] comprises a first value for a quality of a first sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig. 
24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], and a second value for a quality of a second sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig. 24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] different from the first sphere region (i.e. The depth map contains depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24], and 
the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24].    
Sasaki does not explicitly disclose the following claim limitations (Emphasis added). 
A method for processing 360-degree video data, the method performed by a 360 video receiving apparatus, comprising: receiving media data including a picture in which the 360-degree video data is projected and metadata for the 360-degree video data, the 360-degree video data covering sphere regions; 
decoding the picture based on the metadata; and rendering the picture, wherein the metadata includes information for the sphere regions including:  information for a type for the sphere regions, information for a number of the sphere regions, information representing a quality type for a difference in a quality on the picture, information representing a quality order for the sphere regions, information representing a width for the sphere regions, and information representing a height for the sphere regions, wherein the metadata further includes: flag information for representing whether a boundary for a region in the picture is processed, and type information for representing a type related to the boundary, 
wherein the information representing a quality order for the sphere regions comprises a first value for a quality of a first sphere region of the sphere regions, and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region, and the first and second values indicate that the first sphere region has a higher priority than the second sphere region. 
However, in the same field of endeavor Chang further discloses the claim limitations and the deficient claim limitations, as follows:
the 360-degree video data covering sphere regions (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18]; 
((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The picture regions corresponding to overlapped areas are indicated as dashed boxes (211-218)) [Chang: para. 0059]) (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092], information for a number of the sphere regions (i.e. the cubic projection is a type of projection for mapping the surface of a sphere onto six faces of a cube. The images are arranged like the faces of a cube. FIG. 25B illustrates an example of a 360-degree picture based on the cubic projection. In order to properly use the 360-degree video, it requires to include 360-degree video metadata associated with the 360-degree video) [Chang: para. 0092], information representing a quality type for a difference in a quality on the picture ((i.e. FIG. 27 illustrates an example of cloud-based processing of 360-degree video according to one embodiment of the present invention, where the video data captured by a 360-degree video capture camera 2710 is uploaded to the cloud 2720. 
The cloud environment has more computational resources and can provide processed video with different quality according to the network bandwidth. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a quality order for the sphere regions (i.e. The set of cameras have to be calibrated to avoid possible misalignment. Calibration is a process of correcting lens distortion and describing the transformation between world coordinate and camera coordinate. The calibration process is necessary to allow correct stitching of videos. Individual video recordings have to be stitched in order to create one 360-degree video) [Chang: para. 0005]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. 
Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a width for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 
0097]), and information representing a height for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics (e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]);
type information for representing a type related to the boundary, 
(i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] different from the first sphere region (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360- degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. 
The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki with Chang to render video data into a 360-degree video.  
Therefore, the combination of Sasaki with Chang will enable users to view captured videos with a panoramic view of a 360-degree field of view [Chang: para. 0003]. 
Sasaki and Chang do not explicitly disclose the following claim limitations (Emphasis added). 
type information for representing a type related to the boundary. 
However, in the same field of endeavor Gao further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the metadata further includes (i.e. parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 18-20]: flag information for representing whether a boundary for a region in the picture (i.e. the boundary between a block and its neighbor block is adaptively filtered depending on metadata signaled in the bitstream, smoothness of pixel values across the boundary, and quantization parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 15-20] is processed (i.e. ) [Gao: col. 17, line 15-20], and type information for representing a type related to the boundary  ((i.e. a video encoder or decoder determines the picture coding type ( e.g., I, P, B or BI) of a video picture. The encoder/decoder partitions the video picture into multiple segments for deblock filtering. Based at least in part on the picture coding type, the encoder/decoder selects between multiple different patterns for splitting operations of the deblock filtering into multiple passes. The selection of the pattern can also be based at least in part on the frame coding mode (e.g., progressive, interlaced field, or interlaced frame) of the picture. The encoder / decoder organizes the deblock filtering for the video picture as multiple tasks, where a given task includes the operations of one of the multiple passes for one of the multiple segments, then performs the multiple tasks with multiple threads.) [Gao: col. 3, line 7-21]; (i.e. type of a picture, potentially filtering block boundaries and/or subblock boundaries in any of several different processing orders. For additional details, see sections 8. 6 and 10 .10 of the VC-1 standard) [Gao: col. 11, line 15-19]; (i.e. 
According to another aspect of the techniques and tools described herein, a video encoder or decoder partitions a video picture into multiple segments for deblock filtering whose operations are split into three or more passes. For example, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal block boundaries, a third pass for filtering of horizontal sub-block boundaries, and a fourth pass for filtering of vertical boundaries. Or, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal boundaries for a top field, a third pass for filtering of horizontal boundaries for a bottom field, and a fourth pass for filtering of vertical boundaries for the top field and the bottom field. The encoder/decoder organizes the deblock filtering for the video picture as multiple tasks, then performs the multiple tasks with multiple threads) [Gao: col. 3, line 22-37]);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki and Chang with Gao to program the system to encode parameters related to block boundaries.  
Therefore, the combination of Sasaki and Chang with Gao will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46].
In the same field of endeavor Campbell further discloses the claim limitations as follows:
wherein the information representing a quality order for the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]) comprises a first value for a quality of a first sphere region of the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 
2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]), and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region ((i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources.) [Campbell: para. 0061]; (i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. where segments are available at different quality levels, the segments shown in different areas in FIGS. 5 and 6 are retrieved at different quality levels. 
That is the primary segments in the diagonally shaded regions 540a, 540b, 640a, and 640b are retrieved in a relatively high quality, whereas the auxiliary segments in cross hatched regions 542a, 542b, 642a, and 642b are retrieved at a relatively lower quality. Where the secondary auxiliary segments in area 644 are downloaded, lower still quality versions of these are retrieved.) [Campbell: para. 0067; Figs. 5-6]), and the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in much the same way as adaptive video streaming, such as adaptive bitrate streaming) [Campbell: para. 0061-0062]; (i.e. ) [Campbell: para. 000#]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki, Chang, and Gao with Campbell to program the system to encode parameters related to quality for each region.  
Therefore, the combination of Sasaki, Chang, and Gao with Campbell will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46] and display the best view of the regions of interest [Campbell: para. 0061].

Regarding claim 21, Sasaki meets the claim limitations as follows.
A 360-degree video transmission apparatus ((i.e. a video encoder) [Sasaki: para. 0394];  (i.e. an LSI) [Sasaki: para. 0365]; (i.e. a computer) [Sasaki: para. 0072]), the apparatus comprising: an obtainer (i.e. functional blocks) [Sasaki: para. 0365] configured to obtain (i.e. obtained) [Sasaki: para. 0090] 360-degree video data ((i.e. a transport stream is obtained by multiplexing a video stream) [Sasaki: para. 0090; Fig. 6]; (i.e. a transport stream received into a video stream and other streams) [Sasaki: para. 0295]; (i.e. receives transport streams from external sources) [Sasaki: para. 0292]; (i.e. composition of 3D video) [Sasaki: para. 0009]) captured by at least one camera ((i.e. The value "3" set to the "camera_assignment_type" identifier indicates that the transport stream is composed of video streams of a center camera perspective (C), a left camera perspective (L), and a right camera perspective (R). The value "4" set to the "camera_assignment_type" identifier indicates that the transport stream is composed of video streams of a left camera perspective (L), a first right camera perspective (R1), and a second right camera perspective (R2)) [Sasaki: para. 0256]; (i.e. In the context of parallax images, images viewed by the left eye are called left-view images (L-images) and images viewed by the right eye are called right-view images (R-images). Furthermore, a motion picture in which each picture is an L-image is called the left-view video and a motion picture in which each picture is an R-image is called the right-view video) [Sasaki: para. 0110; Fig. 8]; (i.e. pixels forming the left-view picture form an image only for the left eye and the pixels forming the right-view picture form an image only for the right eye, with the result being a parallax picture shown to both eyes, which perceive the picture in 3D) [Sasaki: para. 0084; Fig. 23]); a projector (i.e. functional blocks) [Sasaki: para. 0365] configured to project (i.e. applying) [Sasaki: para. 
0076] the 360-degree video data ((i.e. applying the depth map) [Sasaki: para. 0076; Fig. 3]; (i.e. the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view. video exists in the transport stream) [Sasaki: para. 0013]; (i.e. 3D video) [Sasaki: para. 0011]) into a picture ((i.e. indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth) [Sasaki: para. 0076]; (i.e. the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view. video exists in the transport stream. Thus, specification is made of video streams to be extracted for 2D playback) [Sasaki: para. 0013]); a generator (i.e. functional blocks) [Sasaki: para. 0365] configured to generate ((i.e. generating) [Sasaki: para. 0009]; (i.e. encoded) [Sasaki: para. 0093]) metadata (i.e. encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] for the 360-degree video data ((i.e. In the following, description is provided on a data creation device and a data creation method pertaining to the present embodiment with reference to FIG. 23. The data creation device includes: a video encoder 2301, a multiplexer 2302, and a data containment method determining unit 2303. The data containment method determining unit 2303 specifies the data format of a transport stream to be created. For instance, when creating a transport stream having a video format as illustrated in FIG. 
14, the section from PTS180000 to PTS5580000 is specified as the Side-by-Side playback section, the section from PTS5580000 to PTS10980000 is specified as the 2D playback section, and the section following PTS10980000 is specified as the Top-and-Bottom playback section. The data containment method determining unit 2303 further transmits a specification of time information and frame-packing information storage type to the video encoder 2301 in addition to the information regarding such playback methods.) [Sasaki: para. 0184-0186; Fig. 23]; (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; (i.e. generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043]; (i.e. The 3D video specification information, which indicates the combination of video streams required for 3D playback, exists in the transport stream. 
The display apparatus, when first performing 2D playback and then switching to 3D playback, refers to the 3D video specification information indicating the correlation between video streams contained in the transport stream and thereby identifies which of the video streams are necessary for 3D playback) [Sasaki: para. 0010]; (i.e. information indicating stream identifiers each corresponding to the 2D video stream, the left-view video stream constituting the left-view video, and the right-view video stream constituting the right-view. video exists in the transport stream. Thus, specification is made of video streams to be extracted for 2D playback) [Sasaki: para. 0013]); an encoder (i.e. functional blocks) [Sasaki: para. 0365] configured to encode (i.e. encoded) [Sasaki: para. 0093] the picture ((i.e. encoded picture data) [Sasaki: para. 0093]; (i.e. a given picture is encoded) [Sasaki: para. 0091]; (i.e. an encoding step of compression-coding images) [Sasaki: para. 0009]; (i.e. shrinking each of the pictures corresponding to the left-view video and the right-view video so as to combine the pictures into one, and is performed using ordinary motion picture compression-coding methods) [Sasaki: para. 0112]); an encapsulator (i.e. functional blocks) [Sasaki: para. 0365] configured to encapsulate (i.e. packed) [Sasaki: claim 6] the picture and the metadata into a file ((i.e. the 3D video specification information specifies a single video stream that constitutes the 3D video, the single video stream constitutes L/R packed video, the L/R packed video being video where each frame thereof contains a left-view image and a right-view image, and the contents table includes L/R packing information, the L/R packing information indicating a packing method according to which the left-view image and the right view image are contained in each frame constituting the L/R packed video) [Sasaki: para. 0093]; (i.e. 
A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093]; ((i.e. specify whether or not the extended video stream is a Side-by-Side video stream and thereby perform 3D playback in accordance with the storage format applied by referring to the "frame_packing_arrangement_type" identifier) [Sasaki: para. 0232]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]), wherein the metadata (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] includes information for the sphere regions (i.e. 
For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] including: information for a type for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information for a number of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information representing a quality type for a difference in a quality on the picture ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. 
The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094] – Note: The depth map represents a different quality of the region), information representing a quality order for the sphere regions ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 
0094]), information representing a width for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099], and information representing a height for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]; and
a transmitter (i.e. the structure of a typical stream transmitted by digital television broadcasts and the like) [Sasaki: para. 0107] configured to transmit the file ((i.e. The data containment method determining unit 2303 further transmits a specification of time information and frame-packing information storage type to the video encoder 2301 in addition to the information regarding such playback methods.) [Sasaki: para. 0186; Figs. 1, 23]; (i.e. the description on the structure of a typical stream transmitted by digital television broadcasts and the like) [Sasaki: para. 0107]), wherein the metadata (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] further includes: flag information for representing whether (i.e. a flag indicating whether) [Sasaki: para. 0294] a boundary for a region in the picture is processed (i.e. the "frame_cropping" information indicates the upper, lower, left, and right boundaries of the cropping area such that the differences thereof from the upper, lower, left, and right boundaries of the encoded frame indicate the area to be cropped out. More precisely, to designate a cropping area, a flag ("frame_cropping_flag") is set to 1, and the upper, lower, left, and right areas to be cropped out are respectively indicated as the fields "frame_crop_top_offset", "frame_crop_bottom_offset", "frame_crop_left_offset", and
"frame_crop_right_offset".) [Sasaki: para. 0099; Figs. 11-12], and type information for representing a type related to the boundary, 
wherein the information representing a quality order ((i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. two types of "processing priority" are prepared and provided to the frame-packing information descriptors, one being "descriptor prioritized" and the other being "video prioritized") [Sasaki: para. 0164; Fig. 16 – Note: More details are described in paras. 0164-0166]) for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] comprises a first value for a quality of a first sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig.
24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], and a second value for a quality of a second sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig. 24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] different from the first sphere region (i.e. The depth map contains depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24], and 
the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24].   
Sasaki does not explicitly disclose the following claim limitations (Emphasis added). 
A 360-degree video transmission apparatus, the apparatus comprising: an obtainer configured to obtain 360-degree video data captured by at least one camera, the 360-degree video data covering sphere regions; a projector configured to project the 360-degree video data into a picture; a generator configured to generate metadata for the 360-degree video data; an encoder configured to encode the picture; and an encapsulator configured to encapsulate the picture and the metadata into a file, wherein the metadata includes information for the sphere regions including: information for a type for the sphere regions, information for a number of the sphere regions, information representing a quality type for a difference in a quality on the picture, information representing a quality order for the sphere regions, information representing a width for the sphere regions, and information representing a height for the sphere regions; and a transmitter configured to transmit the file, wherein the metadata further includes: flag information for representing whether a boundary for a region in the picture is processed, and type information for representing a type related to the boundary,
wherein the information representing a quality order for the sphere regions comprises a first value for a quality of a first sphere region of the sphere regions, and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region, and the first and second values indicate that the first sphere region has a higher priority than the second sphere region. 
However, in the same field of endeavor Chang further discloses the claim limitations and the deficient claim limitations, as follows:
the 360-degree video data covering sphere regions (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18];


((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The picture regions corresponding to overlapped areas are indicated as dashed boxes (211-218)) [Chang: para. 0059]) (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092], information for a number of the sphere regions (i.e. the cubic projection is a type of projection for mapping the surface of a sphere onto six faces of a cube. The images are arranged like the faces of a cube. FIG. 25B illustrates an example of a 360-degree picture based on the cubic projection. In order to properly use the 360-degree video, it requires to include 360-degree video metadata associated with the 360-degree video) [Chang: para. 0092], information representing a quality type for a difference in a quality on the picture ((i.e. FIG. 27 illustrates an example of cloud-based processing of 360-degree video according to one embodiment of the present invention, where the video data captured by a 360-degree video capture camera 2710 is uploaded to the cloud 2720. 
The cloud environment has more computational resources and can provide processed video with different quality according to the network bandwidth. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a quality order for the sphere regions (i.e. The set of cameras have to be calibrated to avoid possible misalignment. Calibration is a process of correcting lens distortion and describing the transformation between world coordinate and camera coordinate. The calibration process is necessary to allow correct stitching of videos. Individual video recordings have to be stitched in order to create one 360-degree video) [Chang: para. 0005]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. 
Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a width for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 
0097]), and information representing a height for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics (e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]);
type information for representing a type related to the boundary, 
(i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] different from the first sphere region (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360- degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. 
The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18].   
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki with those of Chang to render video data as a 360-degree video.
Therefore, the combination of Sasaki with Chang will enable users to view captured videos with a panoramic view of a 360-degree field of view [Chang: para. 0003]. 
Sasaki and Chang do not explicitly disclose the following claim limitations (Emphasis added).
type information for representing a type related to the boundary. 
However, in the same field of endeavor Gao further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the metadata further includes (i.e. parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 18-20]: flag information for representing whether a boundary for a region in the picture (i.e. the boundary between a block and its neighbor block is adaptively filtered depending on metadata signaled in the bitstream, smoothness of pixel values across the boundary, and quantization parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 15-20] is processed (i.e. ) [Gao: col. 17, line 15-20], and type information for representing a type related to the boundary ((i.e. a video encoder or decoder determines the picture coding type (e.g., I, P, B or BI) of a video picture. The encoder/decoder partitions the video picture into multiple segments for deblock filtering. Based at least in part on the picture coding type, the encoder/decoder selects between multiple different patterns for splitting operations of the deblock filtering into multiple passes. The selection of the pattern can also be based at least in part on the frame coding mode (e.g., progressive, interlaced field, or interlaced frame) of the picture. The encoder/decoder organizes the deblock filtering for the video picture as multiple tasks, where a given task includes the operations of one of the multiple passes for one of the multiple segments, then performs the multiple tasks with multiple threads.) [Gao: col. 3, line 7-21]; (i.e. type of a picture, potentially filtering block boundaries and/or subblock boundaries in any of several different processing orders. For additional details, see sections 8.6 and 10.10 of the VC-1 standard) [Gao: col. 11, line 15-19]; (i.e.
According to another aspect of the techniques and tools described herein, a video encoder or decoder partitions a video picture into multiple segments for deblock filtering whose operations are split into three or more passes. For example, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal block boundaries, a third pass for filtering of horizontal sub-block boundaries, and a fourth pass for filtering of vertical boundaries. Or, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal boundaries for a top field, a third pass for filtering of horizontal boundaries for a bottom field, and a fourth pass for filtering of vertical boundaries for the top field and the bottom field. The encoder/decoder organizes the deblock filtering for the video picture as multiple tasks, then performs the multiple tasks with multiple threads) [Gao: col. 3, line 22-37]);

Therefore, the combination of Sasaki and Chang with Gao will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46].
In the same field of endeavor Campbell further discloses the claim limitations as follows:
wherein the information representing a quality order for the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]) comprises a first value for a quality of a first sphere region of the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 
2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]), and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region ((i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources.) [Campbell: para. 0061]; (i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. where segments are available at different quality levels, the segments shown in different areas in FIGS. 5 and 6 are retrieved at different quality levels. 
That is the primary segments in the diagonally shaded regions 540a, 540b, 640a, and 640b are retrieved in a relatively high quality, whereas the auxiliary segments in cross hatched regions 542a, 542b, 642a, and 642b are retrieved at a relatively lower quality. Where the secondary auxiliary segments in area 644 are downloaded, lower still quality versions of these are retrieved.) [Campbell: para. 0067; Figs. 5-6]), and the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in much the same way as adaptive video streaming, such as adaptive bitrate streaming) [Campbell: para. 0061-0062]; (i.e. ) [Campbell: para. 000#]. 

Therefore, the combination of Sasaki, Chang, and Gao with Campbell will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46] and to display the best view of the regions of interest [Campbell: para. 0061].
 
Regarding claim 22, Sasaki meets the claim limitations as follows.
A 360-degree video receiving apparatus ((i.e. a video decoder) [Sasaki: para. 0105]; (i.e. 3D mode can be performed) [Sasaki: para. 0007]), the apparatus (i.e. a video decoder) [Sasaki: para. 0105] comprising:
a receiver (i.e. a reception unit) [Sasaki: claim 13] configured to receive media data including a picture (i.e. a receiving step of receiving input of a transport stream from external sources, the transport stream including a plurality of video streams) [Sasaki: claim 19] in which 360-degree video data ((i.e. a transport stream is obtained by multiplexing a video stream) [Sasaki: para. 0090; Fig. 6]; (i.e. a transport stream received into a video stream and other streams) [Sasaki: para. 0295]; (i.e. receives transport streams from external sources) [Sasaki: para. 0292]; (i.e. composition of 3D video) [Sasaki: para. 0009]) is projected (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor) [Sasaki: para. 0183 – Please see Fig. 18 and the projecting description in paragraphs 0179-0182] and metadata for the 360-degree video data (i.e. the transport stream includes 3D video specification information specifying video streams constituting 3D video) [Sasaki: claim 19];
a decoder (i.e. a video decoder) [Sasaki: para. 0105] configured to decode the picture based on the metadata ((i.e. When receiving a video stream from the demultiplexer 1503, the video decoding unit 1504 decodes the received video stream and further, extracts "frame-packing information" from the received video stream. The decoding of video in units of frames is performed by the video decoding unit 1504. Here, when the "frame-packing information storage type" of the frame-packing information descriptor notified from the demultiplexer 1503 indicates "in units of GOPs", the video decoding unit 1504 performs the extraction of "frame-packing information" with respect to only the video access units at the head of the GOPs and skips the rest of the video access units.) [Sasaki: para. 0147]; (i.e. performs decoding of right-view images and left-view images) [Sasaki: para. 0004]); and 
a renderer (i.e. a playback unit) [Sasaki: claim 13] configured to render the picture ((i.e. The video decoding unit 1504 writes decoded frames to the frame buffer (1) 1508 and outputs the "frame-packing information" to the display judging unit 1506.) [Sasaki: para. 0148]; (i.e. the 3D digital television 4200 performs 3D playback by decoding the 2D/L video stream and the R video stream so extracted with use of the video decoder 4207 and by outputting video signals to the display unit 4213) [Sasaki: para. 0341]; (i.e. Further, the 3D digital television 4200 performs 3D playback by decoding the right-view video stream and the left-view video stream so extracted with use of the video decoder 4207 and by outputting video signals to the display unit 4213) [Sasaki: para. 0354]; (i.e. Further, the 3D digital television 4200 performs 3D playback by decoding the base view stream and the dependent view stream so extracted with use of the video decoder 4207 and by outputting video signals to the display unit 4213) [Sasaki: para. 0360]; (i.e. the playback unit plays back the 3D video by using the video streams constituting the 3D video when the current mode is the 3D mode) [Sasaki: claim 19]),
wherein the metadata (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] includes information for the sphere regions including: information for a type for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information for a number of the sphere regions (i.e. Under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], information representing a quality type for a difference in a quality on the picture ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 
18]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094] – Note: The depth map represents a different quality of the region), information representing a quality order for the sphere regions ((i.e. a method is applied of transferring 3D video by using two video streams as illustrated in FIG. 18, restrictions may be imposed such that attributes such as frame rate, resolution, and aspect ratio, are common between the two video streams.) [Sasaki: para. 0181; Fig. 18]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. 
The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]), information representing a width for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099], and information representing a height for the sphere regions (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]; and
transmitting the file (i.e. The data containment method determining unit 2303 further transmits a specification of time information and frame-packing information storage type to the video encoder 2301 in addition to the information regarding such playback methods.) [Sasaki: para. 0186; Figs. 1, 23]; (i.e. the description on the structure of a typical stream transmitted by digital television broadcasts and the like) [Sasaki: para. 0107]), wherein the metadata (i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093] further includes: flag information for representing whether (i.e. a flag indicating whether) [Sasaki: para. 0294] a boundary for a region in the picture is processed (i.e. the "frame_cropping" information indicates the upper, lower, left, and right boundaries of the cropping area such that the differences thereof from the upper, lower, left, and right boundaries of the encoded frame indicate the area to be cropped out. More precisely, to designate a cropping area, a flag ("frame_cropping_flag") is set to 1, and the upper, lower, left, and right areas to be cropped out are respectively indicated as the fields "frame_crop_top_offset", "frame_crop_bottom_offset", "frame_crop_left_offset", and "frame_crop_right_offset") [Sasaki: para. 0099; Figs. 11-12], and type information for representing a type related to the boundary, 
wherein the information representing a quality order  ((i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor.) [Sasaki: para. 0183; Fig. 18]; (i.e. two types of "processing priority" are prepared and provided to the frame-packing information descriptors, one being "descriptor prioritized" and the other being "video prioritized") [Sasaki: para. 0164; Fig. 16 – Note: More details is described in para. 0164-0166]) for the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] comprises a first value for a quality of a first sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig. 
24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096], and a second value for a quality of a second sphere region (i.e. FIG. 24 illustrates an example of generating parallax images of left-view video and right-view video according to a 2D video image and a depth map) [Sasaki: para. 0043; Fig. 24] of the sphere regions (i.e. For example, under MPEG-4 AVC, the AU identification code is an AU delimiter (Access Unit Delimiter), the sequence header is an SPS (Sequence Parameter Set), the picture header is a PPS (Picture Parameter Set), the compressed picture data consist of several slices) [Sasaki: para. 0096] different from the first sphere region (i.e. The depth map contains depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24], and 
the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. depth values corresponding to each pixel in the 2D image. In the example illustrated in FIG. 24, information indicating high depth is assigned to the round object in the 2D image according to the depth map, while other areas are assigned information indicating low depth. This information may be contained as a bit sequence for each pixel, and may also be contained as a picture image (such as an image where black indicates low depth and white indicates high-depth)) [Sasaki: para. 0076; Fig. 24].   
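For illustration only, the cropping and scaling arithmetic quoted above from Sasaki para. 0099 can be sketched as follows. The function name, its parameters, and the 1088-line coded height used in the usage note are hypothetical and do not appear in the reference. The sketch also assumes the crop offsets are plain pixel counts, which is a simplification of the actual MPEG-4 AVC syntax.

```python
from fractions import Fraction


def display_size(coded_w, coded_h, crop_left, crop_right,
                 crop_top, crop_bottom, horizontal_scale):
    """Return ((cropped_w, cropped_h), (display_w, display_h)).

    Assumption: offsets are pixel counts and scaling is applied only in
    the horizontal direction, as in the 1440 -> 1920 example of para. 0099.
    """
    # Remove the areas indicated by the frame_crop_*_offset fields.
    cropped_w = coded_w - crop_left - crop_right
    cropped_h = coded_h - crop_top - crop_bottom
    # Upconvert horizontally by the designated factor (e.g. 4/3).
    display_w = int(cropped_w * horizontal_scale)
    return (cropped_w, cropped_h), (display_w, cropped_h)
```

Under these assumptions, `display_size(1440, 1088, 0, 0, 0, 8, Fraction(4, 3))` yields a 1440x1080 cropping area that is displayed at 1920x1080, matching the 1440x4/3 = 1920 computation in the quoted passage.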
Sasaki does not explicitly disclose the following claim limitations (Emphasis added). 
A 360-degree video receiving apparatus, the apparatus comprising: a receiver configured to receive media data including a picture in which 360-degree video data is projected and metadata for the 360-degree video data, the 360-degree video data covering sphere regions; a decoder configured to decode the picture based on the metadata; and a renderer configured to render the picture, wherein the metadata includes information for the sphere regions including: information for a type for the sphere regions, information for a number of the sphere regions, information representing a quality type for a difference in a quality on the picture, information representing a quality order for the sphere regions, information representing a width for the sphere regions, and information representing a height for the sphere regions, wherein the metadata further includes: flag information for representing whether a boundary for a region in the picture is processed, and type information for representing a type related to the boundary, 
wherein the information representing a quality order for the sphere regions comprises a first value for a quality of a first sphere region of the sphere regions, and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region, and the first and second values indicate that the first sphere region has a higher priority than the second sphere region. 
However, in the same field of endeavor Chang further discloses the claim limitations and the deficient claim limitations, as follows:
the 360-degree video data covering sphere regions (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18]; 
((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The picture regions corresponding to overlapped areas are indicated as dashed boxes (211-218)) [Chang: para. 0059]) (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092], information for a number of the sphere regions (i.e. the cubic projection is a type of projection for mapping the surface of a sphere onto six faces of a cube. The images are arranged like the faces of a cube. FIG. 25B illustrates an example of a 360-degree picture based on the cubic projection. In order to properly use the 360-degree video, it requires to include 360-degree video metadata associated with the 360-degree video) [Chang: para. 0092], information representing a quality type for a difference in a quality on the picture ((i.e. FIG. 27 illustrates an example of cloud-based processing of 360-degree video according to one embodiment of the present invention, where the video data captured by a 360-degree video capture camera 2710 is uploaded to the cloud 2720. 
The cloud environment has more computational resources and can provide processed video with different quality according to the network bandwidth. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a quality order for the sphere regions (i.e. The set of cameras have to be calibrated to avoid possible misalignment. Calibration is a process of correcting lens distortion and describing the transformation between world coordinate and camera coordinate. The calibration process is necessary to allow correct stitching of videos. Individual video recordings have to be stitched in order to create one 360-degree video) [Chang: para. 0005]; (i.e. Techniques related to image stitching has been well studied in the field of panoramic image processing. However, the stitching techniques often still result in stitched image with imperfection or artefacts such as visible seams. 
Therefore, blending is always used to improve the visual quality of the stitched picture. According to the present invention, the 360-degree video metadata may also include information regarding the blending methods, such as GIST, Pyramid, and Alpha blending, that users can select. GIST stitching corresponds to GIST: Gradient-domain Image STitching. All these blending methods are well known in the field and the details are not repeated in this disclosure. The 360-degree video metadata may also include information related to stitching positions, which is defined as the seam between the images captured by different cameras) [Chang: para. 0101]), information representing a width for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics ( e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 
0097]), and information representing a height for the sphere regions ((i.e. The 360-degree video metadata typically include information such as projection type, stitching software, capture software, pose degrees, view degrees, source photo count, cropped width, cropped height, full width, full height, etc. There are two types of 360-degree video metadata needed to represent various characteristics of a spherical video: Global and Local metadata. Global metadata is usually stored in an XML (Extensible Markup Language) format. These are two types of local metadata including the strictly per-frame metadata and arbitrary local metadata (e.g. information sampled at certain intervals)) [Chang: para. 0093]; (i.e. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image. According to the equirectangular projection, the horizontal coordinate is simply longitude, and the vertical coordinate is simply latitude) [Chang: para. 0092]; (i.e. Depending on the available network bandwidth and the specific characteristics (e.g. display resolution) of end receiving devices (e.g. mobile phone 2732, tablet 2734 and computer 2736)) [Chang: para. 0097]);
type information for representing a type related to the boundary, 
(i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18] different from the first sphere region (i.e. As mentioned before, a 360-degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18], (i.e. As mentioned before, a 360- degree video may be created with a spherical camera system that simultaneously records 360 degrees FOV of a scene. The image types of 360-degree video include equirectangular and cubic projections. 
The equirectangular projection is a type of projection for mapping a portion of the surface of a sphere to a flat image) [Chang: para. 0092; Figs 1-2; 5A-B; 10B, 11, 13, 18].   
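For illustration only, the equirectangular relationship quoted above from Chang para. 0092 (horizontal coordinate is longitude, vertical coordinate is latitude) can be sketched as follows. The function name, the degree ranges, and the choice of placing the top-left pixel at longitude -180 and latitude +90 are assumptions made for this sketch and are not taken from the reference.

```python
def lonlat_to_pixel(lon_deg, lat_deg, width, height):
    """Map a point on the sphere to equirectangular picture coordinates.

    Assumptions: longitude runs from -180 (left edge) to +180 (right edge)
    and latitude runs from +90 (top edge) to -90 (bottom edge).
    """
    # Horizontal position grows linearly with longitude.
    x = (lon_deg + 180.0) / 360.0 * width
    # Vertical position grows linearly as latitude decreases from the pole.
    y = (90.0 - lat_deg) / 180.0 * height
    return x, y
```

Under these assumptions, the centre of the sphere's front face (longitude 0, latitude 0) lands at the centre of a 3840x1920 picture, and a sphere region bounded by a longitude range and a latitude range maps to an axis-aligned rectangle whose width and height follow from the same two linear expressions.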
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki with Chang to render video data into a 360-degree video.  
Therefore, the combination of Sasaki with Chang will enable users to view captured videos with a panoramic view of a 360-degree field of view [Chang: para. 0003].
Sasaki and Chang do not explicitly disclose the following claim limitations (Emphasis added). 
type information for representing a type related to the boundary. 
However, in the same field of endeavor Gao further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the metadata further includes (i.e. parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 18-20]: flag information for representing whether a boundary for a region in the picture (i.e. the boundary between a block and its neighbor block is adaptively filtered depending on metadata signaled in the bitstream, smoothness of pixel values across the boundary, and quantization parameters (passed as metadata to the post-processing deblock filter) applicable for the block) [Gao: col. 17, line 15-20] is processed (i.e. ) [Gao: col. 17, line 15-20], and type information for representing a type related to the boundary  ((i.e. a video encoder or decoder determines the picture coding type ( e.g., I, P, B or BI) of a video picture. The encoder/decoder partitions the video picture into multiple segments for deblock filtering. Based at least in part on the picture coding type, the encoder/decoder selects between multiple different patterns for splitting operations of the deblock filtering into multiple passes. The selection of the pattern can also be based at least in part on the frame coding mode (e.g., progressive, interlaced field, or interlaced frame) of the picture. The encoder / decoder organizes the deblock filtering for the video picture as multiple tasks, where a given task includes the operations of one of the multiple passes for one of the multiple segments, then performs the multiple tasks with multiple threads.) [Gao: col. 3, line 7-21]; (i.e. type of a picture, potentially filtering block boundaries and/or subblock boundaries in any of several different processing orders. For additional details, see sections 8. 6 and 10 .10 of the VC-1 standard) [Gao: col. 11, line 15-19]; (i.e. 
According to another aspect of the techniques and tools described herein, a video encoder or decoder partitions a video picture into multiple segments for deblock filtering whose operations are split into three or more passes. For example, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal block boundaries, a third pass for filtering of horizontal sub-block boundaries, and a fourth pass for filtering of vertical boundaries. Or, the passes include a first pass for making filtering decisions, a second pass for filtering of horizontal boundaries for a top field, a third pass for filtering of horizontal boundaries for a bottom field, and a fourth pass for filtering of vertical boundaries for the top field and the bottom field. The encoder/decoder organizes the de block filtering for the video picture as multiple tasks, then performs the multiple tasks with multiple threads) [Gao: col. 3, line 22-37]);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki and Chang with Gao to program the system to encode parameters related to block boundaries.  
Therefore, the combination of Sasaki and Chang with Gao will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46].
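For illustration only, the organization of deblock filtering into tasks (one pass applied to one segment) performed by multiple threads, as quoted above from Gao col. 3, lines 22-37, can be sketched as follows. The pass labels, the function names, and the placeholder task body are hypothetical. Completing every segment of one pass before starting the next pass is a simplifying assumption of this sketch, not a statement of the scheduling Gao uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Four passes named after the first example in the quoted passage.
PASSES = ["decide", "horizontal_block", "horizontal_subblock", "vertical"]


def run_task(pass_name, segment):
    """Placeholder for the real filtering work of one pass on one segment."""
    return (pass_name, segment)


def deblock(num_segments, num_threads=4):
    """Run every (pass, segment) task and return them in completion order."""
    completed = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for pass_name in PASSES:
            # Segments of the same pass run in parallel; consuming the
            # results here acts as a barrier before the next pass starts.
            results = pool.map(lambda s, p=pass_name: run_task(p, s),
                               range(num_segments))
            completed.extend(results)
    return completed
```

With three segments the sketch produces twelve tasks, four passes for each segment.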
In the same field of endeavor Campbell further discloses the claim limitations as follows:
wherein the information representing a quality order for the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]) comprises a first value for a quality of a first sphere region of the sphere regions ((i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. The segments 281 are illustrated in FIG. 
2 as tiles of a sphere, representing the total area of the video 280 that is available for display by smartphone 200 as the user changes the orientation of this user terminal. The displayed area 285 of video 280 spans six segments or tiles 281) [Campbell: para. 0052; Fig. 2]), and a second value for a quality of a second sphere region of the sphere regions different from the first sphere region ((i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources.) [Campbell: para. 0061]; (i.e. The plurality of video segments relating to the total available field of view may be encoded at different quality levels, and the user terminal may further select a quality level of each selected video segment that is retrieved. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in the same way as adaptive video streaming, such as HTTP adaptive streaming) [Campbell: para. 0012-0013]; (i.e. where segments are available at different quality levels, the segments shown in different areas in FIGS. 5 and 6 are retrieved at different quality levels. 
That is the primary segments in the diagonally shaded regions 540a, 540b, 640a, and 640b are retrieved in a relatively high quality, whereas the auxiliary segments in cross hatched regions 542a, 542b, 642a, and 642b are retrieved at a relatively lower quality. Where the secondary auxiliary segments in area 644 are downloaded, lower still quality versions of these are retrieved.) [Campbell: para. 0067; Figs. 5-6]), and the first and second values indicate that the first sphere region has a higher priority than the second sphere region (i.e. The plurality of video segments relating to the total available field of view, or total video area may each be encoded at different quality levels. In that case, the user terminal not only selects which video segments to retrieve, but also at which quality level each segment should be retrieved. This allows the immersive video to be delivered with adaptive bitrate streaming. External factors such as the available bandwidth and available user terminal processing capacity are measured and the quality of the video stream is adjusted accordingly. The user terminal selects which quality level of a segment to stream depending on available resources. The quality level of an encoded video segment may be determined by the bit rate, the quantization parameter, or the pixel resolution. A lower quality segment should require fewer resources for transmission and processing. By making segments available at different quality levels, a user terminal can adapt the amount of network and processing resources it uses in much the same way as adaptive video streaming, such as adaptive bitrate streaming) [Campbell: para. 0061-0062]; (i.e. ) [Campbell: para. 000#]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki, Chang, and Gao with Campbell to program the system to encode parameters related to quality for each region.  
Therefore, the combination of Sasaki, Chang, and Gao with Campbell will enable the system to improve video coding efficiency [Gao: col. 1, line 24-46] and display the best view of regions of interest [Campbell: Para. 0061].
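For illustration only, the tiered retrieval quoted above from Campbell para. 0067 (primary segments at relatively high quality, auxiliary segments at lower quality, secondary auxiliary segments at lower quality still) can be sketched as a per-region quality value. The class name, the field names, and the convention that a smaller value denotes a higher priority are assumptions of this sketch and are taken from neither the claims nor the reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionQuality:
    region_id: str
    quality_rank: int  # assumption: smaller value = higher priority


def has_higher_priority(first, second):
    """True when the first region outranks the second region."""
    return first.quality_rank < second.quality_rank


def retrieval_order(regions):
    """List region ids from highest to lowest priority."""
    ordered = sorted(regions, key=lambda r: r.quality_rank)
    return [r.region_id for r in ordered]
```

Under these assumptions, a region labelled with rank 1 is retrieved ahead of regions labelled 2 and 3, which mirrors the high, lower, and lower still tiers described in the quoted passage.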

Claims 2, 5, 16, 19 and 23-26 are rejected under 35 U.S.C. 103 as being unpatentable over Sasaki et al.  (US Patent Application Publication 2012/0106921 A1), (“Sasaki”), in view of Chang et al. (US Patent Application Publication 2017/0118475 A1), (“Chang”), in view of Gao et al. (US Patent 9,042,458 B2), (“Gao”), in view of Campbell et al. (US Patent Application Publication 2016/0277772 A1), (“Campbell”), in view of .
Regarding claim 2, Sasaki meets the claim limitations as set forth in claim 1. Sasaki further meets the claim limitations as follows.
The method of claim 1 (i.e. an encoding method) [Sasaki: para. 0002], wherein the metadata ((i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]) includes information related to a horizontal direction or a vertical direction for indicating a horizontal down scaling or a vertical down scaling (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]. 
Sasaki and Chang do not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 1, wherein the metadata includes information related to a horizontal direction or a vertical direction for indicating a horizontal down scaling or a vertical down scaling.   
However, in the same field of endeavor Nakano further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the metadata includes information related to a horizontal direction or a vertical direction for indicating a horizontal down scaling or a vertical down scaling (i.e. In this scaling processing, the character graphic plane generated in the character graphic plane generator 48f4 is reduced to a character graphic plane having a smaller number of display pixels, and the reduction ratio does not reach even 1/2 (50%). This allows high quality characters and graphics to be displayed by the character graphic plane after scaling processing) [Nakano: para. 0060].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki and Chang with Nakano to encode a down-scaling factor for the vertical and horizontal directions in the metadata, such as an SPS (Sequence Parameter Set) or a picture header such as a PPS (Picture Parameter Set).
Therefore, the combination of Sasaki and Chang with Nakano will enable the decoder to down-sample video frames at an appropriate ratio to fit a small display and to generate high visual video quality [Nakano: para. 0066].
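For illustration only, the horizontal scaling arithmetic in the quoted Sasaki passage (para. 0099) can be sketched as follows. The function name scaled_width and the 3/4 down-scaling factor are assumptions introduced for this sketch; they do not appear in Sasaki, Nakano, or the claims, and the sketch is not a representation of either reference's implementation.

```python
from fractions import Fraction


def scaled_width(width: int, horizontal_factor: Fraction) -> int:
    """Apply a horizontal scaling factor to a pixel width (hypothetical helper)."""
    result = width * horizontal_factor
    # Reject factors that do not yield a whole number of pixels.
    if result.denominator != 1:
        raise ValueError("scaling does not yield a whole number of pixels")
    return int(result)


# Values from the quoted Sasaki passage: a 1440x1080 cropping area is
# expanded by a factor of 4/3 in the horizontal direction.
print(scaled_width(1440, Fraction(4, 3)))  # 1920

# Assumed factor below 1, chosen only to show horizontal down scaling.
print(scaled_width(1920, Fraction(3, 4)))  # 1440
```

A factor greater than 1 corresponds to the upconversion described in Sasaki, while a factor less than 1 corresponds to a reduction in the number of display pixels of the kind described in Nakano.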

Regarding claim 5, Sasaki meets the claim limitations as set forth in claim 2. Sasaki and Nakano further meet the claim limitations as follows.
The method of claim 2 (i.e. an encoding method) [Sasaki: para. 0002], wherein the information related to the horizontal direction or the vertical direction is information about scaling ((i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]; (i.e. In this scaling processing, the character graphic plane generated in the character graphic plane generator  48f4 is reduced to a character graphic plane having a smaller number of display pixels, and the reduction ratio does not reach even 1/2 (50%). This allows high quality characters and graphics to be displayed by the character graphic plane after scaling processing) [Nakano: para. 0060]) a region in the projected picture.  
Sasaki and Nakano do not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 2, wherein the information related to the horizontal direction or the vertical direction is information about scaling a region in the projected picture.  
However, in the same field of endeavor Chang further discloses the claim limitations and the deficient claim limitations as follows:
(i.e. A color scaling process can be used with the RIBC mode to compensate the color/brightness discrepancy between images from different cameras. A projection-based Inter prediction method is also disclosed. The projection-based Inter prediction method takes into account different perspectives between two images captured from different cameras. Transform matrix is applied to a block candidate to project the block candidate to a position of a target block. The projected block candidate is used as a predictor for the target block) [Chang: Abstract]; (i.e. The color-scaled block 1330 is then used as a predictor for the target block 1312) [Chang: para. 0076]; (i.e. The block 1924 is projected using camera parameters to a projected block 1930 and the projected block 1930 is used to predict the target block 1914) [Chang: para. 0084]).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki and Nakano with Chang to encode scaling information for the vertical and horizontal directions in the metadata.
Therefore, the combination of Sasaki and Nakano with Chang will enable the decoder to compensate for the color/brightness discrepancy between images from different cameras and to generate a high visual video quality output [Chang: Abstract].

Regarding claim 16, Sasaki meets the claim limitations as set forth in claim 15. Sasaki further meets the claim limitations as follows.
The method of claim 15 (i.e. a decoding method) [Sasaki: claim 19], wherein the metadata ((i.e. A video access unit is a unit containing encoded picture data, specifically a single frame in a frame structure and a single field in a field structure. Each video access unit is composed of an AU identification code, a sequence header, a picture header, supplementary data, compressed picture data, padding data, a sequence end code, a stream end code and the like. Under MPEG-4AVC, all data is contained in units called NAL units.) [Sasaki: para. 0093]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]) includes information related to a horizontal direction or a vertical direction is information indicating at least one of horizontal down scaling or vertical down scaling (i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]. 
Sasaki and Chang do not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 15, wherein the metadata includes information related to a horizontal direction or a vertical direction is information indicating at least one of horizontal down scaling or vertical down scaling.   
However, in the same field of endeavor Nakano further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the metadata includes information related to a horizontal direction or a vertical direction is information indicating at least one of horizontal down scaling or vertical down scaling (i.e. In this scaling processing, the character graphic plane generated in the character graphic plane generator 48f4 is reduced to a character graphic plane having a smaller number of display pixels, and the reduction ratio does not reach even 1/2 (50%). This allows high quality characters and graphics to be displayed by the character graphic plane after scaling processing) [Nakano: para. 0060].

Therefore, the combination of Sasaki and Chang with Nakano will enable the decoder to down-sample video frames at an appropriate ratio to fit a small display and to generate high visual video quality [Nakano: para. 0066].

Regarding claim 19, Sasaki and Nakano meet the claim limitations as set forth in claim 16. Sasaki and Nakano further meet the claim limitations as follows.
The method of claim 16 (i.e. a decoding method) [Sasaki: claim 19], wherein the information related to the horizontal direction or the vertical direction is information about scaling ((i.e. under MPEG-4 AVC, the SPS contains aspect ratio information ("aspect_ratio_idc") as scaling information. Under MPEG-4 AVC, to expand a 1440x1080 pixel cropping area to a 1920x1080 pixel resolution for displaying, a 4:3 aspect ratio is designated. In this case, upconversion by a factor of 4/3 takes place in the horizontal direction (1440x4/3= 1920) for an expanded 1920x1080 pixel resolution display) [Sasaki: para. 0099]; (i.e. In this scaling processing, the character graphic plane generated in the character graphic plane generator  48f4 is reduced to a character graphic plane having a smaller number of display pixels, and the reduction ratio does not reach even 1/2 (50%). This allows high quality characters and graphics to be displayed by the character graphic plane after scaling processing) [Nakano: para. 0060]) a region in the projected picture.  
Sasaki and Nakano do not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 16, wherein the information related to the horizontal direction or the vertical direction is information about scaling a region in the projected picture.  
However, in the same field of endeavor Chang further discloses the claim limitations and the deficient claim limitations as follows:

(i.e. A color scaling process can be used with the RIBC mode to compensate the color/brightness discrepancy between images from different cameras. A projection-based Inter prediction method is also disclosed. The projection-based Inter prediction method takes into account different perspectives between two images captured from different cameras. Transform matrix is applied to a block candidate to project the block candidate to a position of a target block. The projected block candidate is used as a predictor for the target block) [Chang: Abstract]; (i.e. The color-scaled block 1330 is then used as a predictor for the target block 1312) [Chang: para. 0076]; (i.e. The block 1924 is projected using camera parameters to a projected block 1930 and the projected block 1930 is used to predict the target block 1914) [Chang: para. 0084]).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki and Nakano with Chang to encode scaling information for the vertical and horizontal directions in the metadata.
Therefore, the combination of Sasaki and Nakano with Chang will enable the decoder to compensate for the color/brightness discrepancy between images from different cameras and to generate a high visual video quality output [Chang: Abstract].

Regarding claim 23, Sasaki meets the claim limitations as set forth in claim 1. Sasaki further meets the claim limitations as follows.
The method of claim 1 (i.e. an encoding method) [Sasaki: para. 0002], wherein the quality type comprises at least one of: a spatial scaling quality type ((i.e. scaling information of a video.) [Sasaki: para. 0030; Fig. 11]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]); a compression quality type ((i.e. the extended video stream is to be configured to contain a low bit-rate image) [Sasaki: para. 0176]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data) [Sasaki: para. 0094]); a bit depth quality type; a color quality type; a dynamic range quality type (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor) [Sasaki: para. 0183; Fig. 18]; a frame rate quality type (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; or a detail quality type.

Regarding claim 24, Sasaki meets the claim limitations as set forth in claim 15. Sasaki further meets the claim limitations as follows.
The method of claim 15 (i.e. a decoding method) [Sasaki: claim 19], wherein the quality type comprises at least one of: a spatial scaling quality type ((i.e. scaling information of a video.) [Sasaki: para. 0030; Fig. 11]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]); a compression quality type ((i.e. the extended video stream is to be configured to contain a low bit-rate image) [Sasaki: para. 0176]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data) [Sasaki: para. 0094]); a bit depth quality type; a color quality type; a dynamic range quality type (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor) [Sasaki: para. 0183; Fig. 18]; a frame rate quality type (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; or a detail quality type.

Regarding claim 25, Sasaki meets the claim limitations as set forth in claim 21. Sasaki further meets the claim limitations as follows.
The 360-degree video transmission apparatus ((i.e. a video encoder) [Sasaki: para. 0394]; (i.e. an LSI) [Sasaki: para. 0365]; (i.e. a computer) [Sasaki: para. 0072]) of claim 21 (i.e. an encoding method) [Sasaki: para. 0002], wherein the quality type comprises at least one of: a spatial scaling quality type ((i.e. scaling information of a video.) [Sasaki: para. 0030; Fig. 11]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]); a compression quality type ((i.e. the extended video stream is to be configured to contain a low bit-rate image) [Sasaki: para. 0176]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data) [Sasaki: para. 0094]); a bit depth quality type; a color quality type; a dynamic range quality type (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor) [Sasaki: para. 0183; Fig. 18]; a frame rate quality type (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; or a detail quality type.

Regarding claim 26, Sasaki meets the claim limitations as set forth in claim 22. Sasaki further meets the claim limitations as follows.
The 360-degree video receiving apparatus ((i.e. a video decoder) [Sasaki: para. 0105]; (i.e. 3D mode can be performed) [Sasaki: para. 0007]) of claim 22 (i.e. an encoding method) [Sasaki: para. 0002], wherein the quality type comprises at least one of: a spatial scaling quality type ((i.e. scaling information of a video.) [Sasaki: para. 0030; Fig. 11]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]); a compression quality type ((i.e. the extended video stream is to be configured to contain a low bit-rate image) [Sasaki: para. 0176]; (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data) [Sasaki: para. 0094]); a bit depth quality type; a color quality type; a dynamic range quality type (i.e. Further, although description has been made in the above with reference to FIG. 18 that the extended video stream is either the left-view video or the right-view video, the extended video stream may also be a depth map visualizing a depth of the 2D video. In addition, when the extended video stream is a depth map, a specification of a 3D playback method may be made with the use of a descriptor) [Sasaki: para. 0183; Fig. 18]; a frame rate quality type (i.e. The sequence header is a header containing information common to all of the video access units that make up the playback sequence, such as the resolution, frame rate, aspect ratio, bitrate and the like. The picture header is a header containing information indicating an encoding format applied to the entire picture and the like. The supplementary data are additional data not required to decode the compressed data, such as closed-caption text information that can be displayed on a television simultaneously with the video and information about the structure of the GOP. The compressed picture data includes compression-coded picture data.) [Sasaki: para. 0094]; or a detail quality type.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Sasaki et al. (US Patent Application Publication 2012/0106921 A1), (“Sasaki”), in view of Chang et al. (US Patent Application Publication 2017/0118475 A1), (“Chang”), in view of Gao et al. (US Patent 9,042,458 B2), (“Gao”), in view of Campbell et al. (US Patent Application Publication 2016/0277772 A1), (“Campbell”), in view of Nakano (US Patent Application Publication 2008/0063355 A1), (“Nakano”), in view of Yin et al. (US Patent 10,542,289 B2), (“Yin”).
Regarding claim 12, Sasaki meets the claim limitations as set forth in claim 5. Sasaki further meets the claim limitations as follows.
The method of claim 5 (i.e. an encoding method) [Sasaki: para. 0002], wherein: the metadata comprises a flag indicating whether (i.e. The mode storing unit 4204 stores a flag indicating whether) [Sasaki: para. 0294] information on an area in which post-processing is performed in the region is forwarded, and when a value of the flag is 1, the metadata comprises information indicating the area in which post-processing is performed in the region.
Sasaki, Gao, Nakano, and Chang do not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 5, wherein the metadata comprises a flag indicating whether information on an area in which post-processing is performed in the region is forwarded, and when a value of the flag is 1, the metadata comprises information indicating the area in which post-processing is performed in the region.
However, in the same field of endeavor Yin further discloses the claim limitations and the deficient claim limitations, as follows:
the metadata comprises a flag indicating whether information (i.e. In a method to improve the coding efficiency of high-dynamic range (HDR) images, a decoder parses sequence processing set (SPS) data from an input coded bitstream to detect that an HDR extension syntax structure is present in the parsed SPS data. It extracts from the HDR extension syntax structure post-processing information that includes one or more of a color space enabled flag, a color enhancement enabled flag, an adaptive reshaping enabled flag, a dynamic range conversion flag, a color correction enabled flag, or an SDR viewable flag) [Yin: col. 3, line 33-42] on an area in which post-processing is performed in the region is forwarded (i.e. Post-production editing (115) may include adjusting or modifying colors or brightness in particular areas of an image to enhance the image quality or achieve a particular appearance for the image in accordance with the video creator's creative intent. This is sometimes called "color timing" or "color grading.") [Yin: col. 4, line 62-67], and when a value of the flag is 1 ((i.e. wherein the adaptive reshaping enabled flag indicates that information related to adaptive reshaping is present) [Yin: col. 26, line 3-4]; (i.e. signal_reshape model_present_flag = 1) [Yin: col. 15, line 59]; (i.e.   ) [Yin: col. 28, line 16-17]; (i.e. colour_space_enabled_flag equal to 1 specifies that color space information is present) [Yin: col. 8, line 36-37]; (i.e. colour_enhancement_enabled_flag equal to 1 specifies that a colour enhancement process for the decoded pictures may be used in the coded video sequence (CVS)) [Yin: col. 8, line 44-46]), the metadata comprises information ((i.e. wherein the information related to adaptive reshaping comprises reshaping function parameters to determine a reshaping function based on one or more polynomial functions, and wherein the reshaping function parameters comprise: a first parameter based on a total number of polynomial functions used to define the reshaping function, and for each (p,) polynomial function in the reshaping function, further comprising: a starting pivot point for the polynomial function; a second parameter based on an order of the polynomial function, wherein the order of the polynomial function can't exceed the value of two; and one or more non-zero coefficients for the polynomial function) [Yin: col. 28, line 16-17]; (i.e. In an embodiment, when colour_enhancement_enabled_flag=1, the bitstream (e.g., the sps_hdrwcg_extension( ) structure or pps_hdrwcg_extension() structure) may include additional information (e.g., filter coefficients) for post-processing to reduce quantization and down-sampling errors for chroma components to improve color performance) [Yin: col. 8, line 51-56]) indicating the area in which post-processing is performed in the region ((i.e. In an embodiment, when colour_enhancement_enabled_flag=1, the bitstream (e.g., the sps_hdrwcg_extension( ) structure or pps_hdrwcg_extension() structure) may include additional information (e.g., filter coefficients) for post-processing to reduce quantization and down-sampling errors for chroma components to improve color performance) [Yin: col. 8, line 51-56]; (i.e. The video data of production stream (112) is then provided to a processor at block (115) for post-production editing. Post-production editing (115) may include adjusting or modifying colors or brightness in particular areas of an image to enhance the image quality or achieve a particular appearance for the image in accordance with the video creator's creative intent. This is sometimes called "color timing" or "color grading.") [Yin: col. 4, line 60-67]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sasaki, Gao, Nakano, and Chang with Yin to encode post-processing flags in the metadata.
Therefore, the combination of Sasaki, Nakano, and Chang with Yin will enable the decoder to apply the post processing processes to improve visual video quality [Yin: col. 4, line 60-67].   
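For illustration only, the flag-conditioned structure recited in claim 12 (area information is present in the metadata only when the flag equals 1) can be sketched as follows. The type RegionMetadata, the function parse_region_metadata, the field names, and the flat integer field layout are all hypothetical and introduced for this sketch; they are not taken from the application, from Yin, or from any coding standard.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RegionMetadata:
    # Hypothetical field names chosen for this sketch only.
    post_processing_area_flag: int
    post_processing_area: Optional[Tuple[int, ...]] = None


def parse_region_metadata(fields: List[int]) -> RegionMetadata:
    """Read a flag, then read area information only when the flag is 1."""
    if not fields:
        raise ValueError("metadata is empty")
    flag = fields[0]
    if flag == 1:
        # Assumed layout: left, top, width, height follow the flag.
        if len(fields) < 5:
            raise ValueError("flag is 1 but area information is missing")
        return RegionMetadata(flag, tuple(fields[1:5]))
    # Flag is not 1: no area information is carried.
    return RegionMetadata(flag)


print(parse_region_metadata([1, 0, 0, 64, 32]).post_processing_area)  # (0, 0, 64, 32)
print(parse_region_metadata([0]).post_processing_area)  # None
```

The same gating pattern is what the quoted Yin passages describe for colour_enhancement_enabled_flag, where additional post-processing information may be included in the bitstream when the flag equals 1.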

Reference Notice 
Additional prior art, included in the Notice of References Cited, made of record and not relied upon, is considered pertinent to applicant's disclosure.

Contact Information


Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath Perungavoor, can be reached at 571-272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Philip P. Dang/
Primary Examiner, Art Unit 2488