DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Response to Arguments
3.	Applicant's arguments with respect to the rejections of claims 1-20 have been considered but are moot in view of the new grounds of rejection.  
          Regarding claims 1-20, Applicant argues that Oh does not disclose a visual attention region having a three-dimensional location in a three-dimensional scene corresponding to a gaze indication, as claimed in independent claims 1 and 15. 
Examiner respectfully disagrees.
First, Oh discloses a visual attention region having a three-dimensional location in a three-dimensional scene corresponding to a gaze indication (see Oh, para. 0095, the transmission-side feedback-processing unit may deliver the feedback information, received from the 360-degree video reception apparatus, to the data encoder, which may differently encode the regions.  For example, the transmission-side feedback-processing unit may deliver the viewport information, received from the reception side, to the data encoder; para. 0147, tiling may enable the user to enjoy or transmit only tiles corresponding to an important part or a predetermined part, such as the viewport that is being viewed by the user, to the reception side within a limited bandwidth; para’s 0017 and 0018, the renderer may re-project the stitched 360-degree video data in the 3D space; the signaling information may include ROI information indicating a region of interest (ROI), among the 360-degree video data, the ROI information may indicate the ROI, appearing in the 3D space, using pitch, yaw, and roll);
Second, in a new ground of rejection, Oh in view of Pepperell also discloses a visual attention region having a three-dimensional location in a three-dimensional scene corresponding to a gaze indication (see Pepperell, para’s 0012 and 0043-0047, fig. 1; the 3D data is accessed, objects in the scene are assigned a region of interest function, this being a function that defines a 3D volume enclosing the outer coordinates of the object, but extending beyond its contours to an extent that can be defined by a suitable user control interface, or determined by parameters pre-programmed into the device by a person skilled in the art. This volume acts as a visual fixation coordinate detection function and is incrementally calibrated, using computational techniques known to a person skilled in the art, such that the sensitivity of the function is lower at the outer extremities of the region and increases as it approaches the boundary of the object, being at its greatest sensitivity within the boundaries of the object. The visual fixation coordinate is set, using one of several processes as specified herein, and the distance of the head of the user from the image display is measured at step 706, via the eye or head “gaze” tracking sensors; see also para. 0012, a size of the detection region can be adjusted. The detection region may extend beyond boundaries of the respective object).


Response to Amendment
Claim Objections  
5.	Claims 17-19 are objected to because of the following informalities. These claims appear to be identical to claims 3-5. They should be cancelled or changed to method claims that depend on claim 15. Appropriate correction is required.

Claim Rejections - 35 USC § 103
6.	The text of those sections of Title 35, U.S. Code not included in this section can be found in a prior Office action.

7.	Claims 1-3, 10-17, and 20 are rejected under AIA 35 U.S.C. 103 as being unpatentable over Oh et al. (US Publication 2020/0084428, hereinafter Oh) in view of Pepperell et al. (US Publication 2020/0285308, hereinafter Pepperell).
Regarding claim 1, Oh discloses an apparatus for generating an image data stream comprising: 
a receiver circuit, wherein the receiver circuit is arranged to receive a gaze indication, wherein the gaze indication is indicative of a gaze direction, wherein the gaze direction is based on both a head pose and a relative eye pose of a viewer, wherein the head pose comprises a head position, wherein the relative eye pose is indicative of an eye pose relative to the head pose (Oh, para. 0076-0078, the head orientation information may be information about the position, angle, and movement of the head of the user. Information about the area that is being viewed by the user in the 360-degree video, i.e. the viewport information, may be calculated based on this information. Gaze analysis may be performed, and therefore it is possible to check the manner in which the user enjoys the 360-degree video, the area of the 360-degree video at which the user gazes, and the amount of time during which the user gazes at the 360-degree video. The gaze analysis may be performed at the reception side and may be delivered to the transmission side through a feedback channel. An apparatus, such as a VR display, may extract a viewport area based on the position/orientation of the head of the user and a vertical or horizontal FOV that is supported by the apparatus; the disclosure above indicates a unit for receiving a gaze indication, wherein the gaze indication is indicative of both a head position and a relative eye position of a viewer, relative to the head position); 
a determiner circuit, wherein the determiner circuit is arranged to determine a visual attention region based on the gaze indication, wherein the visual attention region comprises a three-dimensional location in a three-dimensional (3D) scene (Oh, para. 0095, the transmission-side feedback-processing unit may deliver the feedback information, received from the 360-degree video reception apparatus, to the data encoder, which may differently encode the regions.  For example, the transmission-side feedback-processing unit may deliver the viewport information, received from the reception side, to the data encoder. The data encoder may encode regions including the areas indicated by the viewport information at higher quality (UHD, etc.) than other regions; para. 0147, tiling may enable the user to enjoy or transmit only tiles corresponding to an important part or a predetermined part, such as the viewport that is being viewed by the user, to the reception side within a limited bandwidth; para’s 0017-0018, the renderer may re-project the stitched 360-degree video data in the 3D space; the signaling information may include ROI information indicating a region of interest (ROI), among the 360-degree video data, the ROI information may indicate the ROI, appearing in the 3D space, using pitch, yaw, and roll); and 
a generator circuit, wherein the generator circuit is arranged to generate the image data stream such that the data stream comprises image data for the 3D scene, wherein the image data is generated so as to comprise at least a first image data for the visual attention region and a second image data for the 3D scene outside the visual attention region, wherein the generator circuit is arranged to generate the image data such that the first image data comprises a higher quality level than for the second image data (Oh, para. 0076-0078, as described above, the gaze analysis may be performed at the reception side and may be delivered to the transmission side through a feedback channel; para. 0095, the transmission-side feedback-processing unit may deliver the feedback information, received from the 360-degree video reception apparatus, to the data encoder, which may differently encode the regions. For example, the transmission-side feedback-processing unit may deliver the viewport information, received from the reception side, to the data encoder. The data encoder may encode regions including the areas indicated by the viewport information at higher quality (UHD, etc.) than other regions; para’s 0017-0018, the renderer may re-project the stitched 360-degree video data in the 3D space; the signaling information may include ROI information indicating a region of interest (ROI), among the 360-degree video data, the ROI information may indicate the ROI, appearing in the 3D space, using pitch, yaw, and roll).
Oh does not explicitly disclose wherein the gaze indication is indicative of a gaze distance; and wherein the visual attention region comprises a three-dimensional size.
Pepperell discloses wherein the gaze indication is indicative of a gaze distance; determine a visual attention region based on the gaze indication, wherein the visual attention region comprises a three-dimensional size and a three-dimensional location in a three-dimensional (3D) scene (Pepperell, para’s 0012 and 0043-0047, fig. 1; the 3D data is accessed, objects in the scene are assigned a region of interest function, this being a function that defines a 3D volume enclosing the outer coordinates of the object, but extending beyond its contours to an extent that can be defined by a suitable user control interface, or determined by parameters pre-programmed into the device by a person skilled in the art. This volume acts as a visual fixation coordinate detection function and is incrementally calibrated, using computational techniques known to a person skilled in the art, such that the sensitivity of the function is lower at the outer extremities of the region and increases as it approaches the boundary of the object, being at its greatest sensitivity within the boundaries of the object. The visual fixation coordinate is set, using one of several processes as specified herein, and the distance of the head of the user from the image display is measured at step 706, via the eye or head “gaze” tracking sensors; see also para. 0012, a size of the detection region can be adjusted. The detection region may extend beyond boundaries of the respective object).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Pepperell’s features into Oh’s invention 

Regarding claim 2, Oh-Pepperell discloses the apparatus of claim 1, wherein the three dimensional size of the visual attention region has an extension in at least one axis, wherein the extension is less than or equal to 10 degrees (Pepperell, as shown in fig. 7, the boundary of an object in the scene is based on the size of the object and can be extended; for a small object the boundary can obviously be extended in one axis by less than or equal to 10 degrees).
The motivation and obviousness arguments are the same as claim 1.

Regarding claim 3, Oh-Pepperell discloses the apparatus of claim 1, wherein the visual attention region corresponds to a scene object (Pepperell, as shown in fig. 7, the boundary “region” encloses an object).
The motivation and obviousness arguments are the same as claim 1.

Regarding claim 10, Oh-Pepperell discloses the apparatus of claim 1, wherein the generator circuit is arranged to generate the image data stream as a video data stream, wherein the video data stream comprises images corresponding to viewports for the head pose (Oh, para. 0095, the transmission-side feedback-processing unit may deliver the feedback information according to head movement and gazing information, received from the 360-degree video reception apparatus, to the data encoder, which may differently encode the regions. For example, the transmission-side feedback-processing unit may deliver the viewport information, received from the reception side, to the data encoder. The data encoder may encode regions including the areas indicated by the viewport information at higher quality (UHD, etc.) than other regions; para. 0147, tiling may enable the user to enjoy or transmit only tiles corresponding to an important part or a predetermined part, such as the viewport that is being viewed by the user, to the reception side within a limited bandwidth. The limited bandwidth may be more efficiently utilized through tiling, and calculation load may be reduced because the reception side does not process the entire 360-degree video data at once).

Regarding claim 11, Oh-Pepperell discloses the apparatus of claim 1, wherein the determiner circuit is arranged to determine a confidence measure for the visual attention region in response to a correlation between movement of the visual attention region in the 3D scene and changes in the gaze indication, wherein the generator circuit is arranged to determine the quality for the first image data in response to the confidence measure (Oh, para. 0095, the transmission-side feedback-processing unit may deliver the feedback information according to head movement and gazing information, received from the 360-degree video reception apparatus, to the data encoder, which may differently encode the regions. For example, the transmission-side feedback-processing unit may deliver the viewport information, received from the reception side, to the data encoder. The data encoder identifies the area(s) or region(s) indicated by the viewport information, i.e., determines a confidence measure, and encodes the identified region(s) at higher quality than other regions).

Regarding claim 12, Oh-Pepperell discloses the apparatus of claim 1, further comprising a processor circuit, wherein the processor circuit is arranged to execute an application for the 3D scene, wherein the application is arranged to generate the gaze indication, wherein the application is arranged to render an image corresponding to a viewport for the viewer from the image gaze indication (Oh, para’s 0077-0078, the head orientation information may be information about the position, angle, and movement of the head of the user.  Information about the area that is being viewed by the user in the 360-degree video, i.e. the viewport information, may be calculated based on this information. An application performs gaze analysis to check the area of the 360-degree video at which the user gazes, and the amount of time during which the user gazes at the 360-degree video; Oh, para. 0095, the data encoder identifies the areas/regions indicated by the viewport information, i.e., determines a confidence measure, and encodes the identified region at higher quality than other regions; para. 0061, rendering the region(s) of the immersive video).

Regarding claim 13, Oh-Pepperell discloses the apparatus of claim 1, wherein the apparatus is arranged to receive the gaze indication from a remote client, wherein the apparatus is arranged to transmit the image data stream to the remote client (Oh, para. 0095, the transmission-side feedback-processing unit may deliver the feedback information according to head movement and gazing information, received from the remote 360-degree video reception apparatus, to the data encoder, which may differently encode the regions. For example, the transmission-side feedback-processing unit may deliver the viewport information, received from the reception side, to the data encoder.  The data encoder identifies the area(s) or region(s) indicated by the viewport information, i.e., determines a confidence measure, and encodes the identified region(s) at higher quality than other regions; para. 0061, fig. 1, transmitting the processed video region(s) to the remote reception apparatus for rendering).

Regarding claim 14, Oh-Pepperell discloses the apparatus of claim 1, wherein the generator circuit is arranged to determine a viewport for the image data in response to the head pose, wherein the generator circuit is arranged to determine the first data in response to the viewport (Oh, para’s 0077-0078, the head orientation information may be information about the position, angle, and movement of the head of the user.  Information about the area that is being viewed by the user in the 360-degree video, i.e. the viewport information, may be calculated based on this information. An application performs gaze analysis to check the area of the 360-degree video at which the user gazes, and the amount of time during which the user gazes at the 360-degree video; Oh, para. 0095, the data encoder identifies the areas/regions indicated by the viewport information, i.e., determines a confidence measure, and encodes the identified region at higher quality than other regions; para. 0061, rendering the region(s) of the immersive video).

Regarding claims 15-17 and 20, these claims comprise limitations substantially the same as claims 1-3; therefore, they are rejected under the same rationale set forth above.
Oh-Pepperell further discloses that functional code of the invention can be written on a processor-readable storage medium (see Oh, para. 0521).

8.	Claims 4, 9, and 18 are rejected under AIA  35 U.S.C. 103 as being unpatentable over Oh-Pepperell, as applied to claims 1 and 3, in view of Kim et al. (US Publication 2020/0228780, hereinafter Kim). 
Regarding claims 4 and 18, Oh-Pepperell discloses the apparatus of claim 3 and the 3D scene as described above.
Oh-Pepperell does not explicitly disclose but Kim discloses wherein the determiner circuit is arranged to track movement of the scene object in the scene, wherein the determiner circuit is arranged to determine the visual attention region in response to the tracked movement (Kim, para’s 0050-0051, the SOI description information may include tracking information regarding object speed and object vector; the space of interest “SOI” or region of interest “ROI” includes an object of interest “OOI”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Kim’s features into Oh-Pepperell’s invention for enhancing a user’s viewing of immersive 3D video when the user focuses on moving object(s) in the video content.

Regarding claim 9, Oh-Pepperell discloses the apparatus of claim 1, and further discloses the 3D scene as described above  and wherein the generator circuit is arranged to generate the image data to have a higher quality level for a third image data than for a portion of the second image data, wherein the portion of the second image data is outside the predicted visual attention region of the 3D scene (Oh, para. 0095, the transmission-side feedback-processing unit may deliver the feedback information, received from the 360-degree video reception apparatus, to the data encoder, which may differently encode the regions. For example, the transmission-side feedback-processing unit may deliver the viewport information, received from the reception side, to the data encoder.  The data encoder may encode regions including the areas indicated by the viewport information at higher quality (UHD, etc.) than other regions; para. 0147, tiling may enable the user to enjoy or transmit only tiles corresponding to an important part or a predetermined part, such as the viewport that is being viewed by the user, to the reception side within a limited bandwidth. The limited bandwidth may be more efficiently utilized through tiling, and calculation load may be reduced because the reception side does not process the entire 360-degree video data at once).
Oh-Pepperell does not explicitly disclose but Kim discloses wherein the determiner circuit is arranged to determine a predicted visual attention region in response to movement data of a scene object corresponding to the visual attention region, wherein the generator circuit is arranged to include the third image data for the predicted visual attention region (Kim, para’s 0050-0055, the SOI description information may include object speed and object vector; the object speed is an instantaneous speed at which an OOI moves; the object vector specifies a direction in which an OOI moves within a specific time range.  Once the OV of the OOI within a specific time or time range is known, the movement of the OOI may be predicted).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Kim’s features into Oh-Pepperell’s invention for enhancing a user’s viewing of immersive video by displaying, at different quality levels, specific portions of the immersive video predicted from the movement of an object within the immersive video.

9.	Claims 5-8 and 19 are rejected under AIA 35 U.S.C. 103 as being unpatentable over Oh-Pepperell, as applied to claim 1, in view of Pio et al. (US Publication 2017/0084086, hereinafter Pio). 
Regarding claims 5 and 19, Oh-Pepperell discloses the apparatus of claim 1 and the 3D scene as described above.
Oh-Pepperell does not explicitly disclose but Pio discloses wherein the determiner circuit is arranged to determine the visual attention region in response to a stored user viewing behavior for the 3D scene (Pio, para. 0067, FIG. 1C illustrates an example scene 160 for which it may be determined that users that view the scene 160 generally exhibit a vertical and horizontal viewing pattern without viewing the corners of the scene.  In this example, based on user behavior, a diamond viewport shape 162 may be utilized so that video content that is within the viewport shape boundary 162 can be streamed at a higher quality (e.g., bit rate) while video content outside 164 of the viewport shape 162 can be streamed at a lower quality).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Pio’s features into Oh-Pepperell’s invention for enhancing a user’s viewing of immersive video by presenting specific portions of the immersive video preferably viewed in previous viewing sessions.

Regarding claim 6, Oh-Pepperell-Pio discloses the apparatus of claim 5, wherein the determiner circuit is arranged to bias the visual attention region towards regions of the 3D scene for which the stored user viewing behavior indicates a higher view frequency (Pio, para. 0075, the computing device 350 can utilize the probability transition map to request and buffer the viewports, or streams, that the user accessing the computing device 350 is likely to view over some period of time. For example, if the probability transition map indicates that 99 percent of users who look at viewport A of the spherical video at playback time 1 second will continue to look at viewport A at playback time 5 seconds, then the computing device 350 can request and/or buffer data corresponding to viewport A. In another example, if the probability transition map indicates that 50 percent of users that look at viewport A at playback time 1 second will look at viewport A at playback time 5 seconds and 40 percent of users that look at viewport A at playback time 1 second will look at viewport B at playback time 5 seconds, then the computing device 350 can request and/or buffer data corresponding to both viewport A and viewport B).
The motivation and obviousness arguments are the same as claim 5.
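For illustration only, the buffering decision summarized from Pio, para. 0075, can be sketched as a selection over transition probabilities. The function name `viewports_to_buffer` and the 0.25 cutoff are assumptions made for this sketch and do not appear in Pio; only the percentages in the two example cases are taken from the cited paragraph.

```python
def viewports_to_buffer(transition_probs, min_prob=0.25):
    """Return viewport IDs worth buffering, most likely first.

    transition_probs maps a viewport ID to the fraction of users who,
    having looked at the current viewport at an earlier playback time,
    look at that viewport at the later playback time.
    min_prob is a hypothetical cutoff chosen for this sketch.
    """
    chosen = [(prob, vp) for vp, prob in transition_probs.items() if prob >= min_prob]
    # Sort by descending probability, then by ID for a stable order.
    chosen.sort(key=lambda item: (-item[0], item[1]))
    return [vp for _, vp in chosen]


# The two cases recited from Pio, para. 0075 (viewport A at 1 s, target at 5 s).
case_1 = {"A": 0.99}
case_2 = {"A": 0.50, "B": 0.40}

print(viewports_to_buffer(case_1))  # ['A']
print(viewports_to_buffer(case_2))  # ['A', 'B']
```

Under these assumptions, the first case buffers only viewport A and the second buffers both viewport A and viewport B, which matches the outcome described in the cited paragraph.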

Regarding claim 7, Oh-Pepperell discloses the apparatus of claim 1 and the 3D scene as described above.
Oh-Pepperell does not explicitly disclose but Pio discloses wherein the determiner circuit is arranged to determine a predicted visual attention region in response to relationship data, wherein the relationship data is indicative of previous viewing behavior relationships between different regions of the scene, wherein the generator circuit is arranged to include third image data for the predicted visual attention region in the image data stream, wherein the generator circuit is arranged to generate the image data to have a higher quality level for the third image data than for a portion of the second image data, wherein the portion of the second image data is outside the predicted visual attention region (Pio, para. 0067, FIG. 1C illustrates an example scene 160 for which it may be determined that users that view the scene 160 generally exhibit a vertical and horizontal viewing pattern without viewing the corners of the scene. In this example, based on user behavior, a diamond viewport shape 162 may be utilized so that video content that is within the viewport shape boundary 162 can be streamed at a higher quality (e.g., bit rate) while video content outside 164 of the viewport shape 162 can be streamed at a lower quality; para. 0073, FIGS. 3A-B illustrates examples of streaming a spherical video based on social predictive data, according to an embodiment of the present disclosure. In some embodiments, changes made by various users to a viewport direction while accessing a spherical video can be measured and evaluated, in the aggregate. These aggregated changes may be used to determine directions in which users generally position the viewport while watching the spherical video at a given playback time. These determined directions may be used to predict, for a user who has not yet viewed the spherical video, what direction the user may position the viewport at a given time. 
Such predictions may be utilized to enhance the playback of the video, for example, by sending the appropriate stream data for a certain direction prior to the viewport direction being changed to that direction (e.g., buffering the stream before it is in use). For example, a determination may be made that, while watching a spherical video, 70 percent of users changed the direction being viewed starting from viewport A to viewport B at playback time 5 seconds (i.e., 5 seconds into playback of the spherical video) while 30 percent of users changed the direction being viewed starting from viewport A to viewport C at playback time 5 seconds. In this example, viewport A corresponds to a first viewing direction of the spherical video, viewport B corresponds to a second viewing direction of the spherical video, and viewport C corresponds to a third viewing direction of the spherical video. In various embodiments, such user data can be used to generate a probability transition map (e.g., a Markov model) that provides a likelihood of a user viewing a first viewport direction transitioning to a second viewport direction at a given playback time).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Pio’s features into Oh-Pepperell’s invention for enhancing user’ viewing of immersive video by displaying in different quality levels specific portions of the immersive video previously viewed in other viewing sessions.

Regarding claim 8, Oh-Pepperell-Pio discloses the apparatus of claim 7, wherein the relationship data is indicative of previous gaze shifts by at least one viewer, wherein the determiner circuit is arranged to determine the predicted visual attention region as a first region of the 3D scene, wherein the first region of the scene comprises the relationship data, wherein the relationship data is indicative of a frequency of gaze shifts from the visual attention region to the first region that exceeds a threshold (Pio, para. 0067, FIG. 1C illustrates an example scene 160 for which it may be determined that users that view the scene 160 generally exhibit a vertical and horizontal viewing pattern without viewing the corners of the scene.  In this example, based on user behavior, a diamond viewport shape 162 may be utilized so that video content that is within the viewport shape boundary 162 can be streamed at a higher quality (e.g., bit rate) while video content outside 164 of the viewport shape 162 can be streamed at a lower quality; para. 0073, FIGS. 3A-B illustrates examples of streaming a spherical video based on social predictive data, according to an embodiment of the present disclosure.  In some embodiments, changes made by various users to a viewport direction while accessing a spherical video can be measured and evaluated, in the aggregate. These aggregated changes may be used to determine directions in which users generally position the viewport while watching the spherical video at a given playback time. These determined directions may be used to predict, for a user who has not yet viewed the spherical video, what direction the user may position the viewport at a given time. Such predictions may be utilized to enhance the playback of the video, for example, by sending the appropriate stream data for a certain direction prior to the viewport direction being changed to that direction (e.g., buffering the stream before it is in use).  
For example, a determination may be made that, while watching a spherical video, 70 percent of users changed the direction being viewed starting from viewport A to viewport B at playback time 5 seconds (i.e., 5 seconds into playback of the spherical video) while 30 percent of users changed the direction being viewed starting from viewport A to viewport C at playback time 5 seconds. In this example, viewport A corresponds to a first viewing direction of the spherical video, viewport B corresponds to a second viewing direction of the spherical video, and viewport C corresponds to a third viewing direction of the spherical video. In various embodiments, such user data can be used to generate a probability transition map that provides a likelihood of a user viewing a first viewport direction transitioning to a second viewport direction at a given playback time, i.e., a probability or likelihood of a user viewing a first viewport direction transitioning to a second viewport direction at a given playback time exceeds a threshold).
The motivation and obviousness arguments are the same as claim 7.
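As a non-authoritative sketch of the reasoning applied to claim 8, the aggregated viewport changes described in Pio, para. 0073, can be reduced to per-source shift frequencies and compared against a threshold. The function names, the list-of-pairs input format, and the 0.5 threshold are assumptions made for this illustration; Pio gives only the 70 percent and 30 percent figures and does not specify a data layout or a threshold value.

```python
from collections import Counter


def transition_frequencies(observed_shifts):
    """Map each (source, target) pair to the fraction of shifts leaving
    `source` that end at `target`.

    observed_shifts is a list of (source, target) viewport pairs, one per
    observed viewer shift at a given playback time (hypothetical format).
    """
    pair_counts = Counter(observed_shifts)
    source_totals = Counter(src for src, _ in observed_shifts)
    return {
        (src, dst): count / source_totals[src]
        for (src, dst), count in pair_counts.items()
    }


def predicted_regions(frequencies, current, threshold):
    """Return targets whose shift frequency from `current` exceeds `threshold`."""
    return sorted(
        dst
        for (src, dst), freq in frequencies.items()
        if src == current and freq > threshold
    )


# Ten hypothetical viewers reproducing the 70/30 split recited from Pio:
# seven shift from viewport A to B, three shift from viewport A to C.
shifts = [("A", "B")] * 7 + [("A", "C")] * 3
freqs = transition_frequencies(shifts)

print(predicted_regions(freqs, "A", threshold=0.5))  # ['B']
```

With the assumed threshold of 0.5, only viewport B qualifies as the predicted region, since 70 percent of the shifts out of viewport A lead there.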

Conclusion
10.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
11.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to LOI H TRAN whose telephone number is (571)270-5645. The examiner can normally be reached 8:00 AM-5:00 PM PST, with the first Friday of each biweek off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LOI H TRAN/           Primary Examiner, Art Unit 2484