DETAILED ACTION

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 2, 3, 8, 9, 10, 15, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Shakib et al. (Publication: US 2018/0144547 A1) in view of Todd et al. (Publication: WO 2018/064601) and Mildrew et al. (Publication: US 2018/0143756 A1).

Regarding claim 1, see rejection on claim 8.
Regarding claim 2, see rejection on claim 9.
Regarding claim 3, see rejection on claim 10.

	Regarding claim 8, Shakib discloses a system comprising: a video decoder to decode video data captured from a plurality of different cameras at an event to generate decoded video ([0035] – System 400 comprises a memory. [0104] - Memory stores a decoder that decodes video information of the image data [0035]. [0024] - Capture devices capture image data, which is received by the system.), the decoded video comprising a plurality of video images captured from a camera ([0104] - Memory stores a decoder that decodes video information of the image data [0035]. [0024] – Capture devices include mobile devices, wearable imaging devices, and devices fixed to a robotic arm or movable mount.);
image recognition hardware logic to perform image recognition on at least a portion of the video to identify objects within the plurality of video images ([0076] – navigation component 408 identifies the selected target points within the set of images.); 
a metadata generator to generate metadata with one or more of the objects ([0043] - generates the metadata that contains annotations associated with video of an object.); 
a point cloud data generator to generate point cloud data ([0042] – the 3D model generation component 406 generates a 3D model (point cloud) in the real-world environment. [0041] – the 3D model (point cloud) is generated based on the 3D information captured from the video camera [0040].), 
the point cloud data usable to render an environment for the event ([0042] – the 3D model generation component 406 generates a 3D model (point cloud) in the real-world environment. [0058] – the 3D model is displayed and allows the virtual camera to be arbitrarily orbited and navigated around the scene via mouse or finger dragging (or head motion, in the case of VR headsets).).
Shakib discloses point cloud data as discussed above.
Shakib does not disclose the following limitations; however, Todd discloses
	the decoded video comprising a plurality of video images captured from each of the plurality of different cameras ([00122] – decode the video streams. [00113] – streams are received from more than one camera.); 
a network interface to transmit the data or VR data derived from the data to a client device ([00123] - Client 1410 receives video streams via Network 1415 via different networks.);
generate data based on the decoded video ([00122] – decode the video streams for rendering the data for display.);
render an immersive virtual reality (VR) environment ([0001] - generates the video of the environment for rendering to the HMD. [0006] – HMD provides an immersive virtual reality experience.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib with the decoded video comprising a plurality of video images captured from each of the plurality of different cameras; a network interface to transmit the data or VR data derived from the data to a client device; generate data based on the decoded video; and render an immersive virtual reality (VR) environment, as taught by Todd. The motivation for doing so would have been to improve sharing of the VR experience as taught by Todd. 
Shakib in view of Todd does not disclose the following limitations; however, Mildrew discloses
identify objects captured in the images; image recognition ([0198] - objects captured in the images are identified.);
Metadata for one or more of the objects to indicate locations of the objects that have been identified by the image recognition ([0206] - the tag index 2306 can include information including but not limited to: information identifying points, areas or objects included in one or more 3D models that are associated with one or more tags, information identifying the one or more 3D models including the respective points, areas or objects; information identifying related 3D models (e.g., different 3D model versions of the same environment or object); information identifying real-world location, environment or objects represented by the one or more 3D models; information identifying respective 3D locations of the points, areas or objects in the one or more 3D models, respectively; information identifying one or more tags associated with the point, area, or object, including the content of the one or more tags; and in some implementations, additional metadata associated with the one or more tags.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib in view of Todd with identifying objects captured in the images, and metadata for one or more of the objects to indicate locations of the objects that have been identified by the image recognition, as taught by Mildrew. The motivation for doing so would have been to enhance the 3D model as taught by Mildrew. 
	
Regarding claim 9, Shakib in view of Todd and Mildrew discloses all the limitations of claim 8 including point cloud data.
Todd discloses the client device comprises a VR engine to render the immersive VR environment using the [[point cloud]] or VR data ([0023] - The display of the HMD renders the view of the virtual environment using the video data.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib in view of Todd and Mildrew with the client device comprising a VR engine to render the immersive VR environment using the [[point cloud]] or VR data as taught by Todd. The motivation for doing so would have been to improve sharing of the VR experience as taught by Todd. 

Regarding claim 10, Shakib in view of Todd and Mildrew discloses all the limitations of claim 9 including the client device and VR environment.
Shakib discloses wherein the device is to interpret the metadata to identify objects within the environment ([0042] – metadata contains annotations that identify the location and additional photographs of an object in the environment.).
Todd discloses to render graphical elements and superimpose the graphical elements on or around the objects within the VR environment ( [00133] - Video Source 1430 is optionally configured to provide overlays configured to be placed on other video, (virtual reality video [0023]). For example, these overlays may include a command interface, log in instructions, messages to a game player, images of other game players, video feeds of other game players (e.g., webcam video). In embodiments of Client 1410A including a touch screen interface or a gaze detection interface, the overlay may include a virtual keyboard, joystick, touch pad, and/or the like. In one example of an overlay a player's voice is overlaid on an audio stream. Video Source 1430 optionally further includes one or more audio sources. ).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib in view of Todd and Mildrew to render graphical elements and superimpose the graphical elements on or around the objects within the VR environment as taught by Todd. The motivation for doing so would have been to improve sharing of the VR experience as taught by Todd. 

Regarding claim 15, see rejection on claim 8.
Regarding claim 16, see rejection on claim 9.
Regarding claim 17, see rejection on claim 10.

Claims 4, 5, 11, 12, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shakib et al. (Publication: US 2018/0144547 A1) in view of Todd et al. (Publication: WO 2018/064601), Mildrew et al. (Publication: US 2018/0143756 A1), and Wu (Publication: CN 205408064 U).

Regarding claim 4, see rejection on claim 11.
Regarding claim 5, see rejection on claim 12.

Regarding claim 11, Shakib in view of Todd and Mildrew discloses all the limitations of claim 10 including the client device and VR environment.
Shakib in view of Todd and Mildrew does not disclose the following limitations; however, Wu discloses
	an audio processor to receive audio data captured from a plurality of microphones at the event (Page 1, last paragraph - the multimedia collection devices comprise a plurality of microphones; the audio processing device receives audio.) and to associate the audio data with portions of the video data based on timestamp values associated with the portions of the video data and portions of the audio data (Page 1, 8th paragraph - the multimedia collecting device corresponding to the media processing server sends the acquired multiplex video, carrying the current time stamp, to the media processing server, and sends the acquired multi-channel audio streams, carrying the timestamp, to the user client so that the user client generates a corresponding panoramic output audio. 
Page 4, 4th paragraph - the time stamp is a multimedia processing Device 203. The received panoramic video stream and the multi-channel audio -stream included in the panoramic Video stream and multiple audio streams for time synchronization. The corresponding panoramic include the position of the audio). 
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib in view of Todd and Mildrew with an audio processor to receive audio data captured from a plurality of microphones at the event and to associate the audio data with portions of the video data based on timestamp values associated with the portions of the video data and portions of the audio data as taught by Wu. The motivation for doing so would have been to improve the user experience as taught by Wu. 

Regarding claim 12, Shakib in view of Todd and Mildrew discloses all the limitations of claim 10 including the immersive VR environment.
Shakib in view of Todd and Mildrew does not disclose the following limitations; however, Wu discloses
wherein the network interface is to transmit the audio data to the client device, wherein audio of the event is generated on the client device synchronized with the environment (Page 1, 8th paragraph - the acquired multi-channel audio streams carrying the timestamp are sent to the user client.
Page 3 9th paragraph - Client device 200 include video and audio stream.
Page 3 tth paragraph The received panoramic video stream and the multi-channel audio, stream included in the panoramic video stream and multiple audio streams for time synchronization.) .
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib in view of Todd and Mildrew such that the network interface is to transmit the audio data to the client device, wherein audio of the event is generated on the client device synchronized with the environment, as taught by Wu. The motivation for doing so would have been to improve the user experience as taught by Wu. 

Regarding claim 18, see rejection on claim 11.
Regarding claim 19, see rejection on claim 12.

Claims 6, 7, 13, 14, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Shakib et al. (Publication: US 2018/0144547 A1) in view of Todd et al. (Publication: WO 2018/064601), Mildrew et al. (Publication: US 2018/0143756 A1), and Rowell et al. (Publication: US 2020/0342652 A1).

Regarding claim 6, see rejection on claim 13.
Regarding claim 7, see rejection on claim 14.

Regarding claim 13, Shakib in view of Todd and Mildrew discloses all the limitations of claim 8 including the immersive VR environment.
	Shakib in view of Todd and Mildrew does not disclose the following limitation; however, Rowell discloses 
wherein the image recognition hardware logic comprises a machine learning engine trained to identify one or more of the objects ([0005] - to generate a model identifying images including chairs, a neural network is trained on many images including images having chairs of various shapes and types, images without chairs, and many edge case images including objects that resemble chairs but are not chairs.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib in view of Todd and Mildrew such that the image recognition hardware logic comprises a machine learning engine trained to identify one or more of the objects as taught by Rowell. The motivation for doing so would have been to meet rising demand for more capable mobile computing devices as taught by Rowell. 

Regarding claim 14, Shakib in view of Todd and Mildrew discloses all the limitations of claim 8 including the immersive VR environment.
Shakib discloses video reconstruction hardware logic to determine a location and orientation of a plurality of virtual cameras ([0068] - Fig. 7 shows that the navigation component determines a position and orientation for each of several virtual cameras [0069].).
Todd discloses the network interface to transmit to the client device ([00123] - Client 1410 receives video streams via Network 1415 via different networks.). 
Shakib in view of Todd and Mildrew does not disclose the following limitations; however, Rowell discloses
the indication usable by the client device to render the environment from the perspective of one of the virtual cameras selected by an end user ([0051] - The camera tuning service view controlling the camera tuning service 107 may also display the performance of virtual camera devices having camera setting parameters identical to the selected camera device. The user interface 109 may include a click through menu for navigating between the above views. The menu may arrange the different views as tabs, icons, panels, or any other arrangement. The menu is selected by a user.);
to transmit an indication of the virtual cameras ([0051] - The camera tuning service view controlling the camera tuning service 107 may also display the performance of virtual camera devices having camera setting parameters identical to the selected camera device. The user interface 109 may include a click through menu for navigating between the above views. The menu may arrange the different views as tabs, icons, panels, or any other arrangement. The menu is selected by a user.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Shakib in view of Todd and Mildrew with the indication usable by the client device to render the environment from the perspective of one of the virtual cameras selected by an end user, and to transmit an indication of the virtual cameras, as taught by Rowell. The motivation for doing so would have been to meet rising demand for more capable mobile computing devices as taught by Rowell. 

Regarding claim 20, see rejection on claim 13.
Regarding claim 21, see rejection on claim 14.



Response to Arguments

Claim Rejection Under 35 U.S.C. 103
Applicant asserts “The cited Todd paragraph [0076] appears not to teach or suggest an image recognition on video data to identify objects captured within video images as in amended claim 1. Instead, it teaches how to find the best target points from a number of images ("The navigation component 408 can further examine a set of 2D images captured of the environment and identify a subset (e.g., including one or more 2D images) of the 2D images that show the 'best' view of the selected target point"). Finding the best target point (the best look here point) in images as taught by Todd is unrelated to performing image recognition on video data to identify objects captured within video images as recited: "performing image recognition on at least a portion of the video data to identify objects captured within the plurality of video images." For at least these reasons, Todd cited portion fails to teach or suggest the amended claim element. Additionally, the combination of Shakib and Todd appears not to teach or suggest "generating metadata for one or more of the objects to indicate locations of the objects that have been identified" as recited in amended claim 1. The amendment is supported by Specification, e.g., paragraph [00220]. For the claim element as previously presented, the Office Action cites Todd paragraph [0042], which merely teaches that metadata may be captured for 3D models as the imagery is captured”

The argument has been fully considered and is persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of the Mildrew reference.

Regarding claims 2 – 7, 9 – 14, and 16 – 21, the Applicant asserts that they are not obvious based on their dependency from independent claims 1, 8, and 15, respectively. The examiner respectfully cannot concur with the Applicant, for the same reasons noted in the examiner’s response to the arguments asserted for claims 1, 8, and 15, respectively. 

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ming Wu whose telephone number is (571) 270-0724.  The examiner can normally be reached on Monday-Thursday and alternate Fridays (9:30am - 6:00pm) PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at 571-272-7794.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Ming Wu/
Primary Examiner, Art Unit 2616