Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This action is in response to Applicant’s amendments/remarks received on June 21, 2022.
3.	Claims 1-20 are pending in this application.
4.	Claims 1, 4-6, 8 and 15 have been amended.
Response to Arguments
5.	With regard to the claim rejections under 35 USC § 101 made to claims 15-20, Applicant’s arguments filed on June 21, 2022 are persuasive; therefore, the rejections have been withdrawn.
6.	With regard to the claim rejections under 35 USC § 112, Second Paragraph made to claims 4, 5 and 6, the amendments to claims 4-6 overcome the rejections; therefore, the rejections are withdrawn.
7.	With regard to the claim rejections under 35 USC § 103, Applicant's arguments filed June 21, 2022 have been fully considered but they are deemed moot in view of the new grounds of rejection necessitated by the amendments.

Claim Rejections - 35 USC § 103
8.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

9.	The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
10.	Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (M. Xu, Y. Song, J. Wang, M. Qiao, L. Huo and Z. Wang, "Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 11, pp. 2693-2708, 1 Nov. 2019, doi: 10.1109/TPAMI.2018.2858783) (hereinafter Xu) in view of Han et al. (US 2020/0128280 A1) (hereinafter Han) and in further view of Osman et al. (US 2018/0001205 A1) (hereinafter Osman).
Regarding claims 1, 8 and 15, Xu teaches a computer-implemented method of predicting a displayed region of a video frame, a computer system and a computer program product [See Xu: abstract, section 1 Introduction, Figs. 1-4 regarding methods for predicting FoVs (fields of view) and HM (head movement) positions in panoramic video, implemented in a computer system for head-mounted displays with the use of databases.], the computer-implemented method / wherein the processor performs processor operations / when executed on a processor system, causes the processor system to perform processor operations comprising:
 using a reinforcement learning (RL) system of a processor system / a reinforcement learning (RL) system to generate a first set of displayed region candidates based on inputs received at the RL system from online users while watching video [See Xu: at least abstract, Figs. 1-4, section 1 Introduction page 2694 last paragraph - page 2695 first paragraph, section 5 Online-DHP Approach pages 2699-2700 regarding deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent’s actions. More specifically, a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model…The online-DHP approach refers to predicting a specific subject’s HM position $(\hat{x}_{t+1}, \hat{y}_{t+1})$ at frame $t+1$, given his/her HM positions $(x_1, y_1), \ldots, (x_t, y_t)$ till frame $t$. Additionally, we define the subject as the viewer, whose HM positions need to be predicted online…].
Xu does not explicitly disclose using a recommendation system to rank the first set of displayed region candidates based on inputs received from a local user watching video.
However, ranking regions of panoramic video based on inputs received from a video observer was well known in the art at the time the invention was filed, as evident from the teaching of Han [See at least par. 0015-0021, 0031-0039, 0041-0048, 0058, 0063-0065, 0114-0115 regarding multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period. The predicted fields of view may be obtained from the equipment of the user or determined by a system performing the method, in which case they may be based on information received from the equipment of the user. Likewise, the rankings may be determined by a system performing the method or received from a source of the content… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1)… Moreover, the classifier can be employed to determine a ranking or priority of each cell site of the acquired network. A classifier is a function that maps an input attribute vector, x=(x1, x2, x3, x4, . . . , xn), to a confidence that the input belongs to a class, that is, f(x)=confidence (class). 
Such classification can employ a probabilistic and/or statistical-based analysis (e.g., factoring into the analysis utilities and costs) to determine or infer an action that a user desires to be automatically performed…].
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Xu with Han's teachings by including “using a recommendation system to rank the first set of displayed region candidates based on inputs received from a local user watching video” because this combination has the benefit of providing an iterative prioritization scheme to prioritize retrieval of the tiles (regions) needed to maximize QoE while minimizing wasted bandwidth [See Han: par. 0039].
Han also teaches or suggests a computer system comprising a processor communicatively coupled to a memory and a computer program product comprising a computer readable program stored on a computer readable storage medium, wherein the computer readable program, when executed on a processor system, causes the processor system to perform processor operations [See Han: Fig. 9, par. 0082-0086 regarding computer 402 comprising a processing unit 404, a system memory 406 and a system bus 408. A number of program modules can be stored in the drives and RAM 412, comprising an operating system 430, one or more application programs 432, other program modules 434 and program data 436].
Further on, Han teaches or suggests using the recommendation system to select a first highest ranked one of the first set of displayed region candidates; and based on the first highest ranked one of the first set of displayed region candidates, fetching a first section of a first raw video frame that matches the first highest ranked one of the first set of displayed candidate regions; wherein the first section of the first raw video frame comprises a first predicted display region of the video frame[See Han: at least par. 0015-0021, 0024,0031-0039,0041-0048, 0058, 0063-0065,  0114-0115 regarding  multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period. The predicted fields of view may be obtained from the equipment of the user or determined by a system performing the method, in which case they may be based on information received from the equipment of the user Likewise, the rankings may be determined by a system performing the method or received from a source of the content… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. 
With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1)… For the tiling scheme, we spatially segment a 360-degree video into tiles and deliver only tiles overlapping with predicted FoVs for viewport-adaptive video streaming. To increase the robustness, a player can also fetch the rest at lower qualities. Each 360-degree video chunk is pre-segmented into multiple smaller chunks, which are called tiles. The easiest way to generate the tiles is to evenly divide a chunk containing projected raw frames into m x n rectangles each corresponding to a tile.].
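For illustration only, the tiling scheme quoted from Han above (evenly dividing projected frames into m x n rectangles and delivering only tiles that overlap a predicted FoV) can be sketched as follows. This is a minimal sketch and not Han's implementation: the function names tile_grid and overlapping, the treatment of m as rows and n as columns, the pixel-rectangle representation of a FoV, and the omission of wrap-around at the panorama edges are all assumptions made for this example.

```python
def tile_grid(width, height, m, n):
    """Evenly divide a width x height frame into m rows by n columns.

    Returns a list of (x0, y0, x1, y1) rectangles in row-major order.
    Treating m as rows and n as columns is an assumption of this sketch.
    """
    tiles = []
    for r in range(m):
        for c in range(n):
            x0, x1 = c * width // n, (c + 1) * width // n
            y0, y1 = r * height // m, (r + 1) * height // m
            tiles.append((x0, y0, x1, y1))
    return tiles


def overlapping(tiles, fov):
    """Return indices of tiles that intersect the FoV rectangle.

    fov is a hypothetical (x0, y0, x1, y1) rectangle in frame pixels;
    wrap-around at the left/right edge of the panorama is ignored here.
    """
    fx0, fy0, fx1, fy1 = fov
    return [
        i
        for i, (x0, y0, x1, y1) in enumerate(tiles)
        if x0 < fx1 and fx0 < x1 and y0 < fy1 and fy0 < y1
    ]


tiles = tile_grid(400, 200, 2, 4)          # 2 rows x 4 columns of 100x100 tiles
wanted = overlapping(tiles, (150, 50, 250, 150))
print(wanted)                              # indices of tiles to fetch for this FoV
```

Under these assumptions, only the tiles returned by overlapping would be fetched at full quality, with the remainder optionally fetched at lower quality as the quoted passage describes.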
Xu and Han do not explicitly disclose using a reinforcement learning (RL) system of a processor system / a reinforcement learning (RL) system to generate a first set of displayed region candidates based on inputs received at the RL system from online users while the online users are actively watching video.
However, using a reinforcement learning (RL) system for actively receiving input from online users was well known in the art at the time the invention was filed, as evident from the teaching of Osman [See Osman: at least Figs. 1A-2, 4-5, par. 0019, 0060-0067, 0071-0080, 0089, 0093, 0099, 0119, 0123-0128 regarding the data in gaming profiles 131A-N may be fed back to the deep learning engine 146 of the user game play profiler 145. Deep learning engine 146 utilizes artificial intelligence, including deep learning algorithms, reinforcement learning, or other artificial intelligence-based algorithms. In that manner, the analysis on the collected data may be continually performed to provide updated analytics used for upgrading and/or building default game play profiles, and game play profiles for corresponding users. For example, a game play profile for a corresponding user may be updated to reflect new data. In one embodiment, successful responses to tasks as monitored from a plurality of game plays of a plurality of users is analyzed to determine the appropriate response to take in association with a particular task that is presented to the user… In one embodiment, the user may be provided with an outline view of physical objects in the room, to warn the user of their presence. The outline may, for example, be an overlay in the virtual environment. In some embodiments, the HMD user may be provided with a view to a reference marker, that is overlaid in, for example, the floor. For instance, the marker may provide the user a reference of where the center of the room is, in which the user is playing the game…].
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Xu and Han with Osman's teachings by including “using a reinforcement learning (RL) system of a processor system / a reinforcement learning (RL) system to generate a first set of displayed region candidates based on inputs received at the RL system from online users while the online users are actively watching video” because this combination has the benefit of providing an interactive gaming application to a user playing in real time.
Regarding claims 2, 9 and 16, Xu, Han and Osman teach all the limitations of claims 1, 8 and 15,  and are analyzed as previously discussed with respect to those claims. Further on, Han teaches or suggests further comprising/ further comprise / wherein the processor operations further comprise selectively applying a video enhancement technique to only the first predicted display region of the video frame [See Han: at least par. 0015-0021, 0024-0026,0031-0039,0041-0048, 0058, 0063-0065, 0114-0115 regarding multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period. The predicted fields of view may be obtained from the equipment of the user or determined by a system performing the method, in which case they may be based on information received from the equipment of the user Likewise, the rankings may be determined by a system performing the method or received from a source of the content… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1)… For the tiling scheme, we spatially segment a 360-degree video into tiles and deliver only tiles overlapping with predicted FoVs for viewport-adaptive video streaming. 
To increase the robustness, a player can also fetch the rest at lower qualities. Each 360-degree video chunk is pre-segmented into multiple smaller chunks, which are called tiles. The easiest way to generate the tiles is to evenly divide a chunk containing projected raw frames into m x n rectangles each corresponding to a tile. (Thus, a first predicted FoV will be fetched at higher quality, i.e., it is enhanced.)].
Regarding claim 3, Xu, Han and Osman teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim. Further on, Xu teaches or suggests further comprising: receiving from the local user an adjustment to the first predicted display region; using the RL system of the processor system to generate a second set of displayed region candidates based on updated inputs received from the online users while watching video [See Xu: at least abstract, Figs. 1-4, section 1 Introduction page 2694 last paragraph - page 2695 first paragraph, section 5 Online-DHP Approach pages 2699-2700 regarding deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent’s actions. More specifically, a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model…The online-DHP approach refers to predicting a specific subject’s HM position $(\hat{x}_{t+1}, \hat{y}_{t+1})$ at frame $t+1$, given his/her HM positions $(x_1, y_1), \ldots, (x_t, y_t)$ till frame $t$. Additionally, we define the subject as the viewer, whose HM positions need to be predicted online…(thus, the DRL-based HM prediction (DHP) approach is configured to track a specific subject’s changes in head movement (HM) positions to generate updated HM predicted positions)]; and 
Han teaches or suggests using the recommendation system to rank the second set of displayed region candidates based on the adjustments to the first predicted display region received from the local user watching video [See Han: at least par. 0015-0021, 0024-0026,0031-0039, 0041-0048, 0058, 0063-0065, 0114-0115 regarding  multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period. The predicted fields of view may be obtained from the equipment of the user or determined by a system performing the method, in which case they may be based on information received from the equipment of the user Likewise, the rankings may be determined by a system performing the method or received from a source of the content… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1). At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2). If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. 
If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first…(thus, tiles are ranked based on updated FoV prediction)].  
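For illustration only, the arithmetic of Han's quoted example (each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the buffer is empty at T1) can be reproduced with the short sketch below. It is a minimal sketch and not Han's implementation: the function name schedule and its parameters are hypothetical, and it assumes the FoV prediction does not change between periods, so tiles already buffered for a future period remain valid.

```python
def schedule(tiles_per_fov=10, capacity=12, periods=3):
    """Plan tile delivery per period under a fixed bandwidth budget.

    Returns one list per period T1, T2, ...; each list holds
    (target_period, tiles_sent) pairs, nearest target period first.
    Assumes an unchanged FoV prediction (an assumption of this sketch).
    """
    if tiles_per_fov <= 0 or capacity <= 0:
        raise ValueError("tiles_per_fov and capacity must be positive")
    buffered = {}  # target period -> tiles already delivered for it
    plan = []
    for now in range(1, periods + 1):
        budget = capacity
        sent = []
        target = now + 1  # tiles for the next period are sent first
        while budget > 0:
            need = tiles_per_fov - buffered.get(target, 0)
            n = min(need, budget)
            if n > 0:
                buffered[target] = buffered.get(target, 0) + n
                sent.append((target, n))
                budget -= n
            target += 1  # spend leftover bandwidth on later periods
        plan.append(sent)
    return plan


for period, sent in enumerate(schedule(), start=1):
    print(f"T{period}: {sent}")
```

With the default values, the first two periods match the quoted passage: 10 tiles for T2 plus 2 for T3 at T1, then the remaining 8 tiles for T3 plus 4 for T4 at T2. A changed prediction at T2 would invalidate some buffered T3 tiles, which is the case Han describes as requiring 9 or even all 10 T3 tiles to be sent.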
Regarding claim 4, Xu, Han and Osman teach all the limitations of claim 3, and are analyzed as previously discussed with respect to that claim. Further on, Han teaches or suggests further comprising using the recommendation system to rank the second set of displayed region candidates based on the adjustments to the first predicted display region received from the local user watching video [See Han: at least par. 0015-0021, 0024-0026,0031-0039, 0041-0048, 0058, 0063-0065, 0114-0115 regarding  multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period. The predicted fields of view may be obtained from the equipment of the user or determined by a system performing the method, in which case they may be based on information received from the equipment of the user Likewise, the rankings may be determined by a system performing the method or received from a source of the content… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1). At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2). 
If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first…(thus, tiles are ranked based on updated FoV prediction)].
Regarding claim 5, Xu, Han and Osman teach all the limitations of claim 4, and are analyzed as previously discussed with respect to that claim. Further on, Han teaches or suggests further comprising using the recommendation system to select a second highest ranked one of the second set of displayed region candidate [See Han: Fig. 7, at least par. 0015-0021, 0024-0026, 0031-0039, 0040-0048, 0058-0065, 0114-0115 regarding For each FoV prediction, tiles within or overlapping that FoV are ranked the highest, with adjacent tiles having the second highest rank, and so on…At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1).  At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2)… If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first… As shown in 204, the system also obtains a ranking for each tile for each of a plurality of viewports. Each viewport represents a different possible FoV for a user/viewer. There are many possible viewports in each 360 degree video. Each viewport implicates a different ranking for each tile. For example, as discussed above, those tiles that cover or overlap a viewport are ranked the highest for that viewport. 
The surrounding tiles are ranked the next highest for that same viewport. But, any specific tile may have different rankings for different viewports. For example, a tile at the center of a first viewport (and therefore having the highest ranking for that first viewport) may be completely OOS of a second viewport (and therefore having the lowest ranking for that second viewport). As shown in 206, the system receives a request to view the media content from equipment of the viewer/user. The user equipment (UE) may include any of the access terminal 112, the data terminals 114, the mobile devices 124, the vehicle 126, the media terminal 142, the media terminal 142, a VR headset, and/or any other equipment the user may use to request and/or consume the media content…(The system is configured to select a second highest ranked viewport and so on)]
Regarding claim 6, Xu, Han and Osman teach all the limitations of claim 5, and are analyzed as previously discussed with respect to that claim. Further on, Han teaches or suggests further comprising, based on the second highest ranked one of the second set of displayed region candidates, fetching a second section of a second raw video frame that matches the second highest ranked one of the second set of displayed candidate regions[See Han: Fig. 7, at least par. 0015-0021, 0024-0026, 0031-0039, 0040-0048, 0058-0065, 0114-0115 regarding For each FoV prediction, tiles within or overlapping that FoV are ranked the highest, with adjacent tiles having the second highest rank, and so on…At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1).  At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2)… If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first… As shown in 204, the system also obtains a ranking for each tile for each of a plurality of viewports. Each viewport represents a different possible FoV for a user/viewer. There are many possible viewports in each 360 degree video. Each viewport implicates a different ranking for each tile. 
For example, as discussed above, those tiles that cover or overlap a viewport are ranked the highest for that viewport. The surrounding tiles are ranked the next highest for that same viewport. But, any specific tile may have different rankings for different viewports. For example, a tile at the center of a first viewport (and therefore having the highest ranking for that first viewport) may be completely OOS of a second viewport (and therefore having the lowest ranking for that second viewport). As shown in 206, the system receives a request to view the media content from equipment of the viewer/user. The user equipment (UE) may include any of the access terminal 112, the data terminals 114, the mobile devices 124, the vehicle 126, the media terminal 142, the media terminal 142, a VR headset, and/or any other equipment the user may use to request and/or consume the media content…(The system is configured to select a second highest ranked viewport and so on, to send the tile or tiles related to the viewport/viewports in accordance to the ranking to the display device)].
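For illustration only, the per-viewport ranking that Han describes in the passage quoted above (tiles covering or overlapping a viewport ranked highest, surrounding tiles ranked next, and so on) can be sketched as follows. The function name `rank_tiles`, the 4 x 6 grid, and the use of Chebyshev grid distance to define "surrounding" are assumptions made for this sketch; Han does not specify a distance metric, and wrap-around at the edges of a 360-degree frame is ignored.

```python
def rank_tiles(grid_rows, grid_cols, viewport_tiles):
    """Rank every tile by grid distance to the nearest viewport tile.

    Rank 1: tiles covering or overlapping the viewport.
    Rank 2: tiles adjacent to a rank-1 tile, and so on outward.
    (Hypothetical helper; not taken from Han.)
    """
    ranks = {}
    for r in range(grid_rows):
        for c in range(grid_cols):
            # Chebyshev distance treats diagonal neighbours as adjacent (assumption).
            dist = min(max(abs(r - vr), abs(c - vc)) for vr, vc in viewport_tiles)
            ranks[(r, c)] = dist + 1
    return ranks


# Two hypothetical viewports over the same 4 x 6 tile grid.
viewport_a = {(1, 1), (1, 2)}
viewport_b = {(3, 5)}
ranks_a = rank_tiles(4, 6, viewport_a)
ranks_b = rank_tiles(4, 6, viewport_b)

# The same tile can rank highest for one viewport and low for another.
print(ranks_a[(1, 1)], ranks_b[(1, 1)])
```

In this sketch the tile at (1, 1) has the highest rank for the first viewport and a low rank for the second, which mirrors Han's statement that any specific tile may have different rankings for different viewports.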
Regarding claim 7, Xu, Han and Osman teach all the limitations of claim 1, and are analyzed as previously discussed with respect to that claim. Further on, Xu teaches or suggests wherein the machine learning algorithm has been trained to perform the machine learning task using a historical target environment analysis corpus comprising information from prior analyses performed by trained interpreters on other target environments [See Xu: at least abstract, Figs. 1-4, section 1 Introduction page 2694 last paragraph- page 2695 first paragraph, section 5 Online-DHP Approach pages 2699-2700 regarding deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent’s actions. More specifically, a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model…The online-DHP approach refers to predicting a specific subject’s HM position $(\hat{x}_{t+1}, \hat{y}_{t+1})$ at frame $t+1$, given his/her HM positions $(x_1, y_1), \ldots, (x_t, y_t)$ till frame $t$. Additionally, we define the subject as the viewer, whose HM positions need to be predicted online. Fig. 4 shows the framework of our online-DHP approach. It is intuitive that the current HM position is correlated with the previous HM scanpaths and video content. Therefore, the input to our online-DHP framework is the viewer’s HM scanpath $[(\alpha_1, v_1), \ldots, (\alpha_{t-1}, v_{t-1})]$ and frame content $F_1, \ldots, F_t$, and the output is the predicted HM position $(\hat{x}_{t+1}, \hat{y}_{t+1})$ at the next frame for the viewer…].
Regarding claims 10 and 17, Xu, Han and Osman teach all the limitations of claims 1 and 15, and are analyzed as previously discussed with respect to those claims. Further on, Xu teaches or suggests wherein the processor operations further comprise / wherein the processor operations further comprise: receiving from the local user an adjustment to the first predicted display region; and using the RL system of the processor system to generate a second set of displayed region candidates based on updated inputs received from the online users while watching video [See Xu: at least abstract, Figs. 1-4, section 1 Introduction page 2694 last paragraph- page 2695 first paragraph, section 5 Online-DHP Approach pages 2699-2700 regarding deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent’s actions. More specifically, a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model…The online-DHP approach refers to predicting a specific subject’s HM position $(\hat{x}_{t+1}, \hat{y}_{t+1})$ at frame $t+1$, given his/her HM positions $(x_1, y_1), \ldots, (x_t, y_t)$ till frame $t$. Additionally, we define the subject as the viewer, whose HM positions need to be predicted online…(thus, the DRL-based HM prediction (DHP) approach is configured to track a specific subject’s changes in head movement (HM) positions to generate updated HM predicted positions)].
Regarding claims 11 and 18, Xu, Han and Osman teach all the limitations of claims 10 and 17, and are analyzed as previously discussed with respect to those claims. Further on, Han teaches or suggests further comprising / wherein the processor operations further comprise using the recommendation system to rank the second set of displayed region candidates based on the adjustments to the first predicted display region received from the local user watching video [See Han: at least par. 0015-0021, 0024-0026, 0031-0039, 0041-0048, 0058, 0063-0065, 0114-0115 regarding multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period. The predicted fields of view may be obtained from the equipment of the user or determined by a system performing the method, in which case they may be based on information received from the equipment of the user. Likewise, the rankings may be determined by a system performing the method or received from a source of the content… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1). 
At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2). If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first…(thus, tiles are ranked based on updated FoV prediction)].  
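The arithmetic in Han's example quoted above (each FoV needs 10 tiles, bandwidth is constant at 12 tiles, buffer empty at T1) can be checked with a minimal sketch. The function `schedule`, the integer slot labels, and the integer tile identifiers are hypothetical names introduced for illustration; this is a sketch of the described behavior under those assumptions, not Han's implementation.

```python
def schedule(ranked_by_slot, already_sent, capacity):
    """Pick tiles to send now: nearest time slot first, highest rank first.

    ranked_by_slot: {slot: [tile ids, highest ranking first]} from the
                    current FoV prediction.
    already_sent:   {slot: set of tile ids already buffered for that slot}.
    capacity:       number of tiles that fit in the current interval.
    (Hypothetical helper; not taken from Han.)
    """
    plan = {}
    budget = capacity
    for slot in sorted(ranked_by_slot):
        if budget == 0:
            break
        buffered = already_sent.get(slot, set())
        pending = [t for t in ranked_by_slot[slot] if t not in buffered]
        chosen = pending[:budget]
        if chosen:
            plan[slot] = chosen
            budget -= len(chosen)
    return plan


fov = list(range(10))  # ten tile ids for the predicted FoV, highest ranking first
sent = {}

# At T1: 10 tiles for T2, then the 2 highest ranking tiles for T3.
plan_t1 = schedule({2: fov, 3: fov, 4: fov}, sent, capacity=12)
for slot, tiles in plan_t1.items():
    sent.setdefault(slot, set()).update(tiles)

# At T2 with an unchanged prediction: remaining 8 tiles for T3, then 4 for T4.
plan_t2 = schedule({3: fov, 4: fov, 5: fov}, sent, capacity=12)

# At T2 with a changed prediction that shares only tile 1 with what was sent:
# 9 tiles are needed for T3, leaving bandwidth for 3 tiles of T4.
shifted_fov = [1] + list(range(10, 19))
plan_shift = schedule({3: shifted_fov, 4: shifted_fov}, sent, capacity=12)

print({slot: len(tiles) for slot, tiles in plan_t1.items()})
print({slot: len(tiles) for slot, tiles in plan_t2.items()})
print({slot: len(tiles) for slot, tiles in plan_shift.items()})
```

Under these assumptions the sketch reproduces the quoted counts: 10 and 2 tiles at T1, 8 and 4 tiles at T2 when the prediction is unchanged, and 9 tiles for T3 when the updated prediction overlaps only one previously sent tile.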
Regarding claims 12 and 19, Xu, Han and Osman teach all the limitations of claims 11 and 18, and are analyzed as previously discussed with respect to those claims. Further on, Han teaches or suggests further comprising / wherein the processor operations further comprise using the recommendation system to select a second highest ranked one of the second set of displayed region candidates [See Han: at least par. 0015-0021, 0024-0026, 0031-0039, 0041-0048, 0058, 0063-0065, 0114-0115 regarding multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1). At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2). If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first…(thus, tiles are ranked from highest to lowest level based on updated FoV prediction. Accordingly, one of selected tiles or regions will be second highest ranked)].
Regarding claims 13 and 20, Xu, Han and Osman teach all the limitations of claims 12 and 19, and are analyzed as previously discussed with respect to those claims. Further on, Han teaches or suggests further comprising / wherein: the processor operations further comprise, based on the second highest ranked one of the second set of displayed region candidates, fetching a second section of a second raw video frame that matches the second highest ranked one of the second set of displayed candidate regions and the second section of the second raw video frame comprises a second predicted display region of the video frame[See Han: at least par. 0015-0021, 0024,0031-0039,0041-0048, 0058, 0063-0065,  0114-0115 regarding  multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1). At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2). 
If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first…(thus, tiles are ranked from highest to lowest level based on updated FoV prediction. Accordingly, one of selected tiles or regions will be second highest ranked)…For the tiling scheme, we spatially segment a 360-degree video into tiles and deliver only tiles overlapping with predicted FoVs for viewport-adaptive video streaming. To increase the robustness, a player can also fetch the rest at lower qualities. Each 360-degree video chunk is pre-segmented into multiple smaller chunks, which are called tiles. The easiest way to generate the tiles is to evenly divide a chunk containing projected raw frames into m x n rectangles each corresponding to a tile.].
 Regarding claim 14, Xu, Han and Osman teach all the limitations of claim 13, and are analyzed as previously discussed with respect to that claim. Further on, Han teaches or suggests wherein the second section of the second raw video frame comprises a second predicted display region of the video frame [See Han: at least par. 0015-0021, 0024,0031-0039,0041-0048, 0058, 0063-0065,  0114-0115 regarding  multiple predicted fields of view may be obtained periodically according to a second time period less than the first time period… With each prediction, the server re-prioritizes the tiles that need to be sent to the display device. For example, at T1, the server sends the display device whatever tiles it needs for T2, based on the FoV prediction performed at T1. If there is remaining bandwidth, the server sends the display device tiles for T3 (based on the FoV prediction performed at T1) according to the ranking, i.e. highest ranking tiles first… For example, let's assume each FoV needs 10 tiles, bandwidth capacity is constant at 12 tiles, and the display device's buffer is empty at T1. With that in mind: At T1, the server sends the display device the 10 tiles needed for T2, based on the FoV prediction performed at T1 and the 2 highest ranking tiles for T3 (based on the FoV prediction performed at T1). At T2, assuming the T2 FoV prediction is the same as the T1 FoV prediction, the server sends the display device the remaining 8 tiles for T3 and the 4 highest ranking tiles for T4 (based on the FoV prediction performed at T2). If the T2 FoV prediction is not the same as the T1 FoV prediction, at T2, the server may need to send 9 or even all 10 T3 tiles, based on the FoV prediction performed at T2. If there is remaining bandwidth, the server sends the display device tiles for T4 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first. 
If there is remaining bandwidth, after sending all of the T4 tiles, the server sends the display device tiles for T5 (based on the FoV prediction performed at T2) according to the ranking, i.e. highest ranking tiles first…(thus, tiles are ranked from highest to lowest level based on updated FoV prediction. Accordingly, one of selected tiles or regions will be second highest ranked)…For the tiling scheme, we spatially segment a 360-degree video into tiles and deliver only tiles overlapping with predicted FoVs for viewport-adaptive video streaming. To increase the robustness, a player can also fetch the rest at lower qualities. Each 360-degree video chunk is pre-segmented into multiple smaller chunks, which are called tiles. The easiest way to generate the tiles is to evenly divide a chunk containing projected raw frames into m x n rectangles each corresponding to a tile.].
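As an illustration of the tiling scheme quoted above (evenly dividing a projected frame into m x n rectangles and delivering only tiles overlapping a predicted FoV), a minimal sketch follows. The names `tile_grid` and `tiles_overlapping`, the 3840 x 1920 frame size, the reading of m as rows and n as columns, and the treatment of the FoV as an axis-aligned rectangle in the projected plane (with no wrap-around) are all assumptions made for this sketch and are not taken from the cited reference.

```python
def tile_grid(frame_w, frame_h, m, n):
    """Evenly divide a frame into m rows x n columns of rectangles.

    Returns {(row, col): (x0, y0, x1, y1)} in pixel coordinates.
    (Hypothetical helper for illustration.)
    """
    tile_w, tile_h = frame_w / n, frame_h / m
    return {
        (r, c): (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
        for r in range(m)
        for c in range(n)
    }


def tiles_overlapping(tiles, fov):
    """Return the sorted keys of tiles whose rectangle overlaps the FoV rectangle."""
    fx0, fy0, fx1, fy1 = fov
    return sorted(
        key
        for key, (x0, y0, x1, y1) in tiles.items()
        if x0 < fx1 and fx0 < x1 and y0 < fy1 and fy0 < y1
    )


tiles = tile_grid(3840, 1920, m=4, n=8)       # 32 tiles of 480 x 480 pixels
predicted_fov = (400, 500, 1000, 900)         # hypothetical predicted FoV rectangle
needed = tiles_overlapping(tiles, predicted_fov)
print(needed)
```

Only the tiles returned by `tiles_overlapping` would be fetched at full quality in this sketch; the remaining tiles correspond to "the rest" that the quoted passage says a player may fetch at lower qualities.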
References cited, not relied upon
11.	The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure.
Young et al. (US 2019/0354174 A1)
Conclusion
12.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
13.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANA J PICON-FELICIANO whose telephone number is (571)272-5252. The examiner can normally be reached Monday-Friday 9:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Christopher Kelley, can be reached at 571-272-7331. The fax number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Ana Picon-Feliciano/            Examiner, Art Unit 2482        


/CHRISTOPHER S KELLEY/            Supervisory Patent Examiner, Art Unit 2482