DETAILED ACTION
Applicant’s amendments and remarks submitted 19 January 2021 have been entered and considered. Claims 1, 3-5, 7-8, 10-12, 14-15, and 17-19 are currently pending in this application. Claims 1, 3, 7-8, 10, 14-15, and 17 have been amended. Claims 2, 6, 9, 13, 16, and 20 have been canceled.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given by Aborn C. Chao (Reg. No. 66,538) on 22 March 2021.
Replace claims 1, 8, 15, and 17 with the following:

Claim 1. A method, comprising:
simultaneously capturing 360 degree video data and audio data from a plurality of viewpoints within a real-world environment;
preprocessing and compressing the 360 degree video data and the audio data into a three-dimensional representation suitable for display;
rendering a virtual environment of the real-world environment;
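creating a blended virtual environment by combining the captured 360 degree video data and the audio data at multiple locations with the rendered virtual environment;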
displaying the blended virtual environment in a display apparatus of a user; and
estimating physical properties of a plurality of objects in multiple 360 degree video data, wherein the estimated physical properties of the plurality of objects in the multiple 360 degree video data are utilized to establish a correspondence between the plurality objects in the multiple 360 degree video data and physical and virtual objects in the blended virtual environment so that the physical and the virtual objects in the blended virtual environment have position, scale, and orientation that are consistent with the multiple 360 degree video data across multiple views,
	wherein the captured audio data and the 360 degree video data are in time synchronization when navigating between the multiple locations and during display of the blended virtual environment, and
	wherein during the time synchronization of the captured audio data and the 360 degree video data, time progresses continuously when navigating between the multiple locations.

Claim 8. An apparatus, comprising:
at least one processor; and
at least one memory comprising computer program code,
the at least one memory and the computer program code are configured, with the at least one processor to cause the apparatus at least to:
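simultaneously capture 360 degree video data and audio data from a plurality of viewpoints within a real-world environment;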
preprocess and compress the 360 degree video data and the audio data into a three-dimensional representation suitable for display;
render a virtual environment of the real-world environment;
create a blended virtual environment by combining the captured 360 degree video data and the audio data at multiple locations with the rendered virtual environment;
display the blended virtual environment in a display apparatus of a user; and
estimate physical properties of a plurality of objects in multiple 360 degree video data, wherein the estimated physical properties of the plurality of objects in the multiple 360 degree video data are utilized to establish a correspondence between the plurality objects in the multiple 360 degree video data and physical and virtual objects in the blended virtual environment so that the physical and the virtual objects in the blended virtual environment have position, scale, and orientation that are consistent with the multiple 360 degree video data across multiple views,
wherein the captured audio data and the 360 degree video data are in time synchronization when navigating between the multiple locations and during display of the blended virtual environment, and
wherein during the time synchronization of the captured audio data and the 360 degree video data, time progresses continuously when navigating between the multiple locations.

Claim 15. A computer program, embodied on a non-transitory computer readable medium, the computer program, when executed by a processor, causes the processor to:
simultaneously capture 360 degree video data and audio data from a plurality of viewpoints within a real-world environment;
preprocess and compress the 360 degree video data and the audio data into a three-dimensional representation suitable for display;
render a virtual environment of the real-world environment;
create a blended virtual environment by combining the captured 360 degree video data and the audio data at multiple locations with the rendered virtual environment;
display the blended virtual environment in a display apparatus of a user; and
estimate physical properties of a plurality of objects in multiple 360 degree video data, wherein the estimated physical properties of the plurality of objects in the multiple 360 degree video data are utilized to establish a correspondence between the plurality objects in the multiple 360 degree video data and physical and virtual objects in the blended virtual environment so that the physical and the virtual objects in the blended virtual environment have position, scale, and orientation that are consistent with the multiple 360 degree video data across multiple views,
wherein the captured audio data and the 360 degree video data are in time synchronization when navigating between the multiple locations and during display of the blended virtual environment, and
wherein during the time synchronization of the captured audio data and the 360 degree video data, time progresses continuously when navigating between the multiple locations.

Claim 17. The computer program according to claim 15, wherein the computer program, when executed by the processor, causes the processor to initialize a playback procedure to play back the 360 degree video data and the audio data as a photorealistic navigable virtual environment to the user to recreate the appearance and sound of the real-world environment from a given viewpoint at a given orientation.

Response to Arguments
Regarding claim objections raised in non-final rejection dated 18 August 2020, the applicant argued that “Claims 1, 3, 6, 8-10, 13, 15, 17, and 20 were objected to for informalities. As shown above, claims 1, 3, 8, 10, 15, and 17 have been amended to replace “360” with “360 degree,” claim 8 has been amended to include a colon after “cause the apparatus at least to.” As discussed and preliminarily agreed during the interview, the above amendments to the claims overcome this objection. Accordingly, withdrawal of this objection is respectfully requested.”
The objections are hereby withdrawn in light of the applicant’s amendment and the examiner’s amendment above. 
Allowable Subject Matter
Independent claims 1, 8, and 15 are allowed.
Regarding independent claim 1, the prior art of record Khalid et al., U.S. Pre-Grant Application Number 2017/0287220, hereinafter Khalid, discloses a method, comprising: capturing 360 degree video data from a plurality of viewpoints within a real-world environment; (Khalid Figure 4, 15, and Specification Paragraph [0053]: “Camera 402 may include any type of camera that is configured to capture data representative of a 360-degree image of real-world scenery 404 around a center point corresponding to camera 402. As used herein, a 360-degree image is any still or video image that depicts the surroundings (e.g., real-world scenery 404) of a center point (e.g., a center point associated with the location of camera 402) on all sides along at least one dimension.”; Paragraph [0107]: “To illustrate, FIG. 15 shows a plurality of exemplary 360-degree cameras arranged to capture real-world scenery 1500 from which an immersive virtual reality world may be generated. Specifically, 360-degree cameras 1502 (e.g., cameras 1502-1 through 1502-n) may be arranged to capture real-world scenery 1500 from a plurality of locations within the real world…Respective 360-degree images captured by each camera 1502 may then be used (e.g., by system 100, content creator 202, etc.) to generate a view of an immersive virtual reality world from a perspective of a center point that corresponds to the respective location of each camera 1502. As a result, users experiencing the immersive virtual reality world may move from center point to center point within the immersive virtual reality world to experience the immersive virtual reality world from different center points corresponding to the locations in the real world where cameras 1502 were placed.”)
rendering a virtual environment of the real-world environment; (Khalid Specification Paragraph [0023]: “For example, for point-to-multipoint delivery of virtual reality media content, an interactive media content provider system may generate overall data representative of an immersive virtual reality world.”; Paragraph [0025]: “Each of the media player devices to which the overall data is provided may be associated with a user, and may be configured to render a portion of the overall data within a field of view presented on a display screen of the media player device.”; Paragraph [0052]: “After preparing and/or processing the data representative of the 360-degree images to generate an immersive virtual reality world based on the 360-degree images, system 100 may provide overall data representative of the immersive virtual reality world to media player devices 206 (also described above in relation to FIG. 2).”; Paragraph [0059]: “Based on the camera-captured data representative of real-world scenery 404 (e.g., the 360-degree image), system 100 may generate and maintain an immersive virtual reality world (i.e., data representative of an immersive virtual reality world that may be experienced by a user).”)
creating a blended virtual environment by combining the captured 360 degree video data at multiple locations with the rendered virtual environment; and displaying the blended virtual environment in a display apparatus of a user. (Khalid Figure 5-6, 9, and Specification Paragraph [0059]: “system 100 may generate a three-dimensional ("3D") model of the immersive virtual reality world where virtual objects may be presented along with projections of real-world scenery 100 to a user experiencing the immersive virtual reality world.”; Paragraph [0064]: “To illustrate, FIG. 5 shows that content 506 may include real-world scenery depicting a beach with palm trees.”; Paragraph [0067]: “Additionally, FIG. 5 shows a virtual object 512 (i.e., a surfboard), which may be inserted into world 508 by system 100 during the generation of world 508. Any virtual object may be inserted along with the real-world scenery of world 508 as may serve a particular implementation.”; Paragraph [0069]: “As a first example of a media player device 600 that may be used to view and/or experience interactive media content, a head-mounted virtual reality device 602 may be mounted on the head of the user and arranged so that each of the user's eyes sees a distinct display screen 604 (e.g., display screens 604-1 and 604-2) within head-mounted virtual reality device 602. In some examples, a single display screen 604 may be presented and shared by both eyes of the user.”; Paragraph [0076]: “As such, world 700 may be formed from a 360-degree image that depicts the surroundings (e.g., real-world scenery such as real-world scenery 404 described above in relation to FIG. 4) of a center point associated with a position of user 502 within world 700 on all sides along the horizontal dimension.”; Paragraph [0107]: “Specifically, 360-degree cameras 1502 (e.g., cameras 1502-1 through 1502-n) may be arranged to capture real-world scenery 1500 from a plurality of locations within the real world.”)
While Khalid teaches feature of utilizing audio data in generating immersive virtual reality world (Khalid Specification Paragraph [0023]: “the immersive virtual reality world may, in certain examples, be generated based on data (e.g., image and/or audio data) representative of camera-captured real-world scenery”.), Khalid fails to expressly 
Following above, Khalid fails to expressly teach “A method, comprising…capturing 360 degree video data and audio data from a plurality of viewpoints within a real-world environment; preprocessing and compressing the 360 degree video data and the audio data into a three-dimensional representation suitable for display…creating a blended virtual environment by combining the captured 360 degree video data and the audio data at multiple locations with the rendered virtual environment”.
However, the prior art of record van Hoff et al., U.S. Patent Number 9,363,569, hereinafter Hoff, teaches feature of providing virtual reality display with reproduced audio by utilizing virtual reality content including compressed stream of three-dimensional video and three-dimensional audio (where the compressed stream is generated by processing received video captured from camera(s) and audio data captured from microphone(s)). (Hoff Figure 4C and Column 5 line 1-29: “Camera modules included in the camera array may have lenses mounted around a spherical housing and oriented in different directions with a sufficient diameter and field of view, so that sufficient view disparity may be captured by the camera array for rendering stereoscopic images…The microphone array is capable of capturing sounds from various directions. The microphone array may output the captured sounds and related directionalities to the content system, which allows the content system to reconstruct sounds from any arbitrary direction…The content system may include code and routines stored on a non-transitory memory for processing the raw video data and audio data received across multiple recording devices and for converting the raw video data and audio data into a single compressed stream of 3D video and audio data.”; Column 6 line 23-27: “The viewing system decodes and renders the 3D video and audio streams received from the content system on a virtual reality display device (e.g., a virtual reality display) and audio reproduction devices (e.g., headphones or other suitable speakers).”; Column 29 line 59-62: “the content server 139 stores content such as videos, images, music, video games, or any other VR content suitable for playback by the viewing system 133.”; Column 31 line 52-55: “The stream combination module 214 generates 430 VR content that includes the compressed stream of 3D video data and the stream of 3D audio data.”)
Khalid as modified by Hoff fails to expressly teach “simultaneously capturing 360 degree video data and audio data from a plurality of viewpoints within a real-world environment”. However, the prior art of record Khedkar et al., U.S. Pre-Grant Application Number 2017/0345215, hereinafter Khedkar, teaches feature of obtaining virtual reality content from cameras that simultaneously capture a 360 degree view of a real world scenery. (Khedkar Specification Paragraph [0016]: “An interactive VR content system disclosed herein includes a VR content creation system wherein VR content of various types is created. VR content obtained from cameras that simultaneously capture a 360 degree view of a real-world or virtual world scenery is fed to a VR content processor.”)
Regarding remaining limitations of “wherein the captured audio data and the 360 degree video data are in time synchronization when navigating between the multiple (Hoff Specification Column 20 line 21-28: “In some embodiments, the video module 208 receives raw video data describing image frames from the various camera modules 103 in the camera array 101. The video module 208 identifies a location and timing associated with each of the camera modules 103 and synchronizes the image frames based on locations and timings of the camera modules 103. The video module 208 synchronizes image frames captured by different camera modules 103 at the same times.”), Khalid as modified by Hoff and further modified by Khedkar fails to expressly teach “wherein the captured audio data and the 360 degree video data are in time synchronization when navigating between the multiple locations and during display of the blended virtual environment”.
However, prior art Venshtain et al., U.S. Pre-Grant Application Number 2019/0045157, hereinafter Venshtain, teaches feature of providing user-selected viewpoint for a virtual or augmented reality by receiving time-synchronized video frames of subject from video cameras at known locations. (Venshtain Figure 23 and Specification Paragraph [0409]-[0412]: “At step 2302, the HMD 112 receives time-synchronized video frames of a subject (e.g., the presenter 102) that were captured by video cameras (e.g., the camera assemblies 1024) at known locations in a shared geometry such as the shared geometry 1040...At step 2304, the HMD 112 obtains a time-synchronized 3D mesh of the subject…At step 2306, HMD 112 identifies a user-selected viewpoint for the shared geometry 1040. In various different embodiments, HMD 112 may carry out step 2306 on the basis of one or more factors such as eye gaze, head tilt, head rotation, and/or any other factors that are known in the art for determining a user-selected viewpoint for a VR or AR experience. At step 2308, HMD 112 calculates time-synchronized visible-vertices lists, again on a per-shared-frame-rate-time-period basis, from the vantage point of at least each of the camera assemblies that is necessary to render the 3D persona 116 based on the user-selected viewpoint that is identified in step 2306.”)
Prior art Cooley et al., U.S. Patent Number 9,786,027, hereinafter Cooley, teaches feature of allowing user to navigate between different locations during same target streaming time. (Cooley Specification Column 12 line 42-52: “If the user chooses to proceed to the hall (location C), player device 110 (FIG. 1) can predictively load the source assets for the subsequent locations, such as a master bedroom (location F), and child's bedroom (location G), and a hallway bathroom (location H), during a target streaming time for minimum playback of location C. The target streaming time for location C can be similar or identical to the target streaming time for location A. The author of the content clip (e.g., 200 (FIG. 2)) can set the target streaming time for location C the same or different from the target streaming time for location A.”)
Regarding remaining limitations of “estimating physical properties of a plurality of objects in multiple 360 degree video data, wherein the estimated physical properties of the plurality of objects in the multiple 360 degree video data are utilized to establish a correspondence between the plurality objects in the multiple 360 degree video data and Wigdor, teaches feature of allowing virtual representation(s) corresponding to real object(s) to be displayed at location(s) of the real object(s) where the virtual representation(s) are displayed so that characteristics of the virtual representation(s) such as shape, size, orientation, content can be consistent with determined characteristics of the real object(s). (Wigdor Figure 4-5 and Specification Paragraph [0026]: “As non-limiting examples, a gaming system may consider one or more characteristics of a physical object such as geometric shape, geometric size, weight and/or textile feel. One or more said characteristics may be used to match a physical object to a virtualized representation…The system may modify the appearance, such as the size and/or the perspective view of candidates 302 and 304, to more closely match the dimensions of physical objects 102 and 104.”; Paragraph [0035]: “FIG. 5 shows a simplified processing pipeline in which physical object 106 within physical environment 100 is spatially modeled so that the resulting model can be used to select and render an appropriate virtualized representation 506 on a display device.”)
Wigdor further teaches feature of associating continually progressing time when displaying content to a user. (Wigdor Figure 7-9 and Specification Paragraph [0043]-[0044]: “For a first example, FIG. 6 schematically shows a game player 10 in a physical environment 600 at different moments in time (e.g., time t.sub.0, and time t.sub.1) that corresponds to FIG. 7, which schematically shows a game play sequence that may be derived from detecting the user moving within the physical environment of FIG. 6. At time t.sub.0, game player 10 wearing display device 14 observes physical environment 600, which may include one or more physical objects incorporated into integrated virtual environment 700. As described above, display device 14 may display integrated virtual environment 700 to game player 10. At time t.sub.1, game player 10 moves within physical environment 600 such that game player 10 is closer to physical object 106.”; Paragraph [0045]-[0046]: “As another example, FIG. 8 schematically shows a game player 10 in a physical environment 800 at different moments in time (e.g., time t.sub.0, time t.sub.1, and time t.sub.2) that corresponds to FIG. 9, which schematically shows a gameplay sequence that may be derived from detecting the user interacting with the physical environment of FIG. 8. At time t.sub.0, game player 10 wearing display device 14 observes physical environment 800, which may include one or more physical objects incorporated into integrated virtual environment 900. As shown at time t.sub.0, game player 10 may extend a hand to grasp physical object 108…At time t.sub.1, game player 10 throws physical object 108. Such an interaction may change integrated virtual environment 900 by moving virtualized representation 206 of the physical object within integrated virtual environment 900.”)
However, while each of the prior art references of record, when taken individually, teaches a particular portion of the limitations of independent claim 1, the examiner concludes that there is no sufficient rationale to combine the prior art of record to disclose independent claim 1. Therefore, the examiner concludes that the prior art of record, either alone or in combination, fails to expressly teach or suggest independent claim 1.
Claims 8 and 15 are apparatus and computer program claims reciting functions that are similar in scope to the functions performed by the method of claim 1 and thus include allowable subject matter similar to that identified in claim 1. Therefore, claims 8 and 15 are allowed under the same rationale.
Claims 3-5, 7, 10-12, 14, and 17-19 are allowed as being dependent upon allowed independent claims 1, 8, and 15.
Conclusion
The prior art made of record and not relied upon is considered pertinent to the applicant’s disclosure: see PTO-892.
For example, Harrison, Donald, U.S. Pre-Grant Application Number 2016/0346494, teaches feature of providing the ability to virtually explore multiple locations at the same time. (Harrison Specification Paragraph [0043]: “the ability to virtually explore multiple destinations or locations at the same time via multiple guides or virtual travel guides 108 in multiple locations”.)
Yamazaki, Akio., U.S. Pre-Grant Application Number 2019/0180517 teaches feature of allowing virtual content to be superimposed at location of real object by utilizing characteristic of the real object such as the location and color. (Yamazaki Abstract: “A head-mounted display device with which a user can visually recognize a virtual image and an outside scene includes an image display unit configured to cause the user to visually recognize the virtual image, an augmented-reality processing unit configured to cause the image display unit to form the virtual image including a virtual object, at least a part of which is superimposed and displayed on a real object present in the real world, a color detecting unit configured to detect a real object color, which is a color of the real object, and a color adjusting unit configured to bring a visual observation color, which is a color obtained by superimposing a color of the virtual object on the real object color, close to a target color using the detected real object color.”)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAE WON YOON whose telephone number is (571)270-3051. The examiner can normally be reached on 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached on (571)272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR.
/SAE WON YOON/Primary Examiner, Art Unit 2612