DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Applicant’s arguments filed on 11/05/2020 have been entered and reviewed. Accordingly, this action is made final.
Claim status:
Claims 24-43 are pending.
Claims 24 and 35 are amended.
No claim is cancelled.
No new claim is added.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.  

Appl./Pat. No. | Claim Correspondence
16449035 | 24, 27, 30, 34, 35, 39, 43 | 25, 36 | 26 | 27, 37 | 28
10332311 | 1, 10, 17 | 3 | 6, 12, 15, 19, 22 | 7 | 16


Claims 24, 27, 30, 34, 35, 39 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 10 and 17 of U.S. Patent No. 10332311 in view of Gregg et al. (US Patent No. 9032020, “Gregg”).


Appl. 16449035 claim 24
Patent 10332311 claim 1
A system, comprising: one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
A system, comprising: one or more computing devices comprising one or more respective processors and memory and configured to implement a video system comprising:

an image collection module implemented via the one or more respective processors and memory, configured to obtain digital images from one 


in response to the user interactions:
pause the playback of the video content;

generate a model of the real-world scene from images associated with the real-world scene;
generate a model of the scene in a virtual world according to the composite image;  one or more video processing modules implemented via the one or more respective processors and memory, configured to:

stream a video including the scene to a client device; receive user interactions from the client device exploring the scene in the video indicating a change in a viewpoint of the scene; and in response to the user interactions, render and stream additional video including the scene from the changed viewpoint and the model;
receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
receive additional user interactions from the client device to manipulate the scene in the video; determine, based at least in part on the additional user interactions, that a change in the scene is required as a result of the manipulation;
modify the model of the real-world scene to include the change that is required as a result of the manipulation;

and signal the virtual world generation engine to generate a model of the scene that includes the change in the scene that is required as a result of the manipulation.



pause the playback of the video content;
However, Gregg teaches input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device; in response to the user interactions: pause the playback of the video content (Col. 10, lines 5-8 and lines 18-19: “In operation 902, the edit processing server 114 transmits the first video stream, here the enhanced video stream 184, to the client 104, and causes it to be displayed as the enhanced video representation 304. In operation 905, the edit processing server 114 pauses streaming of the first video stream”).
Gregg and claim 1 of Patent 10332311 are analogous as they are from the field of video editing.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified Patent 10332311 to include input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device and, in response to the user interactions, pausing the playback of the video content, as taught by Gregg, and thereby stream the edited content to the client.
The motivation for the modification is that the user can view updated video without waiting for the pre-recorded video to finish.
10332311 in view of Gregg and therefore are also obvious over claims 1, 10 and 17 of the Patent 10332311 modified by Gregg.
Claim 28 of the instant application recites limitations that are similar to the limitations recited in claim 16 of the Patent 10332311 in view of Gregg and therefore are also obvious over claim 16 of the Patent 10332311 modified by Gregg.

Claims 25 and 36 of the instant application recite limitations that are similar to the limitations recited in claim 3 of the Patent 10332311 in view of Gregg and Groman (US Patent Publication No. 2015/0043892, “Groman”) and therefore are also obvious over claim 3 of the Patent 10332311 modified by Gregg and Groman to enhance applicability of the system of Kikinis by having the option of acquiring images from different sources.

Claims 26, 27 and 37 of the instant application recite limitations that are similar to the limitations recited in claims 6, 7, 12, 15, 19 and 22 of the Patent 10332311 in view of Gregg and Kuranov et al. (US Patent Publication No. 2008/0253685, “Kuranov”) and therefore are also obvious over claims 6, 7, 12, 15, 19 and 22 of the Patent 10332311 modified by Gregg and Kuranov to use a standard and well-known method of generating a 3D model from basic input images.

Appl./Pat. No. | Claim Correspondence
16449035 | 24, 30, 34-35, 39, 43
9894405 | 1, 11, 17


Claims 24, 30, 34-35, 39 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 11 and 17 of U.S. Patent No. 9894405 in view of Croen et al. (US Patent Publication No. 2014/0036022, “Croen”).


Appl. 16449035 claim 24
Pat. 9894405 claim 1
A system, comprising:
one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
A system, comprising: one or more computing devices comprising one or more hardware processors and memory and configured to implement a real-time video exploration (RVE) system comprising: a playback module implemented via the one or more hardware processors and memory and configured to begin playback of at least a portion of a pre-recorded video to a client device; and a graphics processing and rendering module implemented via the

receive identification input from the client device after playback of the pre-recorded video has begun, wherein the identification input identifies a particular object of a plurality of objects contained in a scene of the pre-recorded video;
in response to the user interactions:
pause the playback of the video content;
pause playback of the scene in response to the identification input;

receive manipulation input from the client device after playback of the pre-recorded video has begun directing manipulation of the particular object in the scene of the pre-recorded video;
generate a model of the real-world scene from images associated with the real-world scene; modify the model of the real-world scene to include the change that is required as a result of the manipulation;
obtain a model of the particular object according to graphics data for the pre-recorded video;  manipulate the model of the particular object in a model of the scene according to the manipulation input;
receive further input from the client device indicating further user interactions with 


particular object in the scene is replaced with the manipulated model of the 
particular object;


stream the new video content to the client device.
stream the new video of the scene including the rendering of the manipulated model of the particular object to the client device while playback of the scene is paused;  
and resume playback of the pre-recorded video to the client device in response to resume input from the client device, 
wherein streaming the new video is stopped in response to the resume input.



Croen teaches, determine, based at least in part on a further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12 step 1204 and 1206 determines whether a model needs to be updated in response to user interaction. “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Claim 1 of Pat. 9894405 and Croen are analogous as they are from the field of rendering videos.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified claim 1 of Pat. 9894405 to include determining, based at least in part on further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation, as taught by Croen.
The motivation to include this modification is to reduce unnecessary processing of model building.
Claims 30, 34-35, 39 and 43 of the instant application recite limitations that are similar to the limitations recited in claims 1, 11 and 17 of the Pat. 9894405 in view of Croen and therefore are also obvious over claims 1, 11 and 17 of the Pat. 9894405 modified by Croen.

Appl./Pat. No. | Claim Correspondence
16449035 | 24, 30, 34, 35, 39 and 43
9747727 | 1, 10 and 19


Claims 24, 30, 34-35, 39 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 10 and 17 of U.S. Patent No. 9747727 in view of Kikinis et al. (US Patent Publication No. 2009/0237492, “Kikinis”) and Croen.
16449035 claim 24
9747727 claim 1
A system, comprising: one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
A system, comprising: one or more computing devices comprising one or more processors and one or more memories storing program instructions executable by the one or more processors to implement a real-time video exploration (RVE) system comprising: a playback module implemented by at least one of the one or more computing devices and configured to begin playback of at least a portion of a pre-rendered video to a client device for display to a

rendering module implemented by at least one of the one or more computing 
devices and configured to, during playback of the pre-rendered video to the 
client device:

pause the playback of the video content;

generate a model of the real-world scene from images associated with the real-world scene;
pause playback of the pre-rendered video to the client device in response to input from the client device;
receive input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device;
receiving viewpoint input from the client device indicating a change of viewing angle, a viewpoint movement, or both, based on interactions of the viewer exploring one or more scenes in the pre-rendered video;

render and stream, to the client device, new video of the one or more scenes in the pre-rendered video in response to the viewpoint input, wherein the new video 

receive selection input from the client device selecting an object in the one or more scenes of the pre-rendered video;
in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;

modify the model of the real-world scene to include the change that is required as a result of the manipulation;
receive modification input from the client device specifying one or more modifications to be applied to the selected object in the one or more scenes of the pre-rendered video;  modify a model of the object according to the one or more modifications to generate a modified model of the object;
render new video content of the real-world scene based at least in part on the 


and resume playback of the pre-rendered video to the client device in response to resume input from the client device specifying that the pre-rendered video is to be resumed, wherein at least one portion of the pre-rendered video is replaced with the new video of the one or more scenes.


Claim 24 of the instant application differs from claim 1 of patent only: in response to the user interactions generate a model of the real-world scene from images associated with the real-world scene; in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
Kikinis teaches in response to the user interactions (“[0162] ….  In one embodiment, the update module 3036 communicates with the post-production engine 3040 for post-production effects.”):
generate a model of the real-world scene from images associated with the real-world scene; (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.”).
Claim 1 of Pat. 9747727 and Kikinis are analogous as they are from the field of rendering videos.
 Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 1 of Pat. 9747727 to have included in response to the user interactions generate a model of the real-world scene from images associated with the real-world scene as taught by Kikinis.
The motivation for the above is to follow standard procedure for 3d rendering.
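Claim 1 of Pat. 9747727 modified by Kikinis doesn’t expressly teach in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;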
Croen teaches, determine, based at least in part on a further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12 step 1204 and 1206 determines whether a model needs to be updated in response to user interaction. “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Claim 1 of Pat. 9747727 modified by Kikinis and Croen are analogous as they are from the field of rendering videos.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified claim 1 of Pat. 9747727, as modified by Kikinis, to include determining, based at least in part on further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation, as taught by Croen.
The motivation to include this modification is to reduce unnecessary processing of model building.
Claims 30, 34-35, 39 and 43 of the instant application recite limitations that are similar to the limitations recited in claims 1, 10 and 19 of the Pat. 9747727 in view of Kikinis and Croen and therefore are also obvious over claims 1, 10 and 19 of the Pat. 9747727 modified by Kikinis and Croen.

Appl./Pat. No. | Claim Correspondence
16449035 | 24, 30, 34-35, 39 and 43 | 24 and 35
9892556 | 1, 8, 13, 19 | 2, 9 and 20


Claims 24, 30, 34-35, 39 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 8, 13 and 19 of U.S. Patent No. 9892556 in view of Kikinis and Croen.

Claim 24 of 16449035
Claim 1 of 9892556
A system, comprising:
one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
A system, comprising: one or more computing devices comprising one or more hardware processors and memory and configured to implement a real-time video exploration (RVE) system comprising: a playback module implemented via the one or more hardware processors and memory and configured to begin playback of at least a portion of a pre-recorded video to a client device; and a graphics processing and rendering module implemented via the one or more hardware processors and memory and configured to:

receive, after playback of the pre-recorded video has begun, scene exploration input from the client device indicating an interaction with a scene of the pre-recorded video, wherein the interaction comprises a modification of a camera viewpoint of the scene, and wherein the scene comprises a plurality of objects;
in response to the user interactions:
pause the playback of the video content;
pause playback of the scene in response to the scene exploration input;
generate a model of the real-world scene from images associated with the real-world scene;
generate a model of the scene according to graphics data for the scene and the modified camera viewpoint;
in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;


render new video content of the real-world scene based at least in part on the model as modified; and
stream the new video content to the client device.
render new video of the scene from the model of the scene based at least in part on the scene exploration input received from the client device; stream the new video of the scene to the client device while playback of the scene is paused;

and resume playback of the pre-recorded video to the client device in response to resume input from the client device, wherein streaming the new video is stopped in response to the resume input.


Claim 24 of the instant application differs from claim 1 of patent only: in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; modify the model of the real-world scene to include the change that is required as a result of the manipulation;
Croen teaches, determine, based at least in part on a further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12 step 1204 and 1206 determines whether a model needs to be updated in response to user interaction. “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Claim 1 of Pat. 9892556 and Croen are analogous as they are from the field of rendering videos.
 Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 1 of Pat. 9892556 to have included based at least in part on a further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation as taught by Croen.
  The motivation to include this modification is to reduce unnecessary processing of model building.
Claim 1 of Pat. 9892556 modified by Croen doesn’t expressly teach modify the model of the real-world scene to include the change that is required as a result of the manipulation.
Kikinis teaches modify the model of the real-world scene to include the change that is required as a result of the manipulation (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.”).
Claim 1 of Pat. 9892556 modified by Croen and Kikinis are analogous as they are from the field of rendering videos.
 Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 1 of Pat. 9892556 modified by Croen to have included modify the model of the real-world scene to include the change that is required as a result of the manipulation as taught by Kikinis.
The motivation for the above is to follow standard procedure for 3d rendering.
Claims 30, 34-35, 39 and 43 of the instant application recite limitations that are similar to the limitations recited in claims 1, 8, 13 and 19 of the Pat. 9892556 in view of Kikinis and Croen and therefore are also obvious over claims 1, 8, 13 and 19 of the Pat. 9892556 modified by Kikinis and Croen.

Appl. No. | Claim Correspondence
16449035 | 24, 34-35, 43 | 27, 37 | 31, 40
14318026 | 1, 10, 18 | 17, 23 | 14-15, 22


Claims 24, 34-35 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 10 and 18 of US Patent Application No. 14318026 in view of Kikinis and Croen.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim 24 of 16449035
Claim 1 of 14318026
A system, comprising:
one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
A system, comprising:
one or more computing devices comprising one or more hardware processors and configured to implement a real-time video exploration (RVE) system comprising: a playback module configured to begin playback of a prerecorded video from a video source to a client device; a graphics processing and rendering module configured to
receive input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device;


receive input, from the client device, after playback of the prerecorded video has begun, wherein the input directs modification of one or more scenes of the pre-recorded video as displayed on the client device, and wherein the 


modify the model of the real-world scene to include the change that is required as a result of the manipulation;
modify, in response to the input one or more models of the one or more scenes based on updating graphics data [[for]] used in generating and rendering the one or more scenes of the pre-recorded video wherein the modified one or more models specify an updated view of the one or more scenes according to the modified viewing angle or the modified viewing position of the one or more scenes for rendering;

render modified video content from the modified one or more models wherein the rendered modified video content comprises content not presented in the one or more scenes of the pre-recorded video; and an output module configured to:
pause the playback of the video content;
pause playback of the pre-recorded video in response to the input from the client device;
render new video content of the real-world scene based at least in part on the model as modified; and
stream the new video content to the client device.
initiate playback of the modified video content to the client device while playback of the pre-recorded video is paused; combine at least a portion of the modified video content and at least a portion of the pre-recorded video to generate merged video content, wherein the one or more scenes of the pre-recorded video modified by the input are replaced by the at least the portion of the modified video content in the merged video content;

record the merged video content to a video destination, wherein the recorded merged video content is available for playback; and resume playback of the pre-recorded video to the client device and ending the playback of the modified video content in response to resume input from the client device.


Claim 24 of the instant application differs from claim 1 of the application only: in response to the user interactions generate a model of the real-world scene from images associated with the real-world scene; in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
Kikinis teaches in response to the user interactions (“[0162] ….  In one embodiment, the update module 3036 communicates with the post-production engine 3040 for post-production effects.”):
generate a model of the real-world scene from images associated with the real-world scene; (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.”).
Claim 1 of appl. 14318026 and Kikinis are analogous as they are from the field of rendering videos.
 Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 1 of appl. 14318026 to have included in response to the user interactions generate a model of the real-world scene from images associated with the real-world scene as taught by Kikinis.
The motivation for the above is to follow standard procedure for 3d rendering.
Claim 1 of appl. 14318026 modified by Kikinis doesn’t expressly teach in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
Croen teaches, determine, based at least in part on a further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12 step 1204 and 1206 determines whether a model needs to be updated in response to user interaction. “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Claim 1 of appl. 14318026 modified by Kikinis and Croen are analogous as they are from the field of rendering videos.
 Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 1 of appl. 14318026 modified by Kikinis to have included based at least in part on a further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation as taught by Croen.
  The motivation to include this modification is to reduce unnecessary processing of model building.
Claims 30, 31, 34-35, 39, 40 and 43 of the instant application recite limitations that are similar to the limitations recited in claims 1, 10 and 18 of the appl. 14318026 in view of Kikinis and Croen and therefore are also obvious over claims 1, 10 and 18 of the appl. 14318026 modified by Kikinis and Croen.

Claims 27 and 37 of the instant application recite limitations that are similar to the limitations recited in claims 14-15, 17 and 22-23 of the appl. 14318026 modified by Kikinis, Croen and Kuranov et al. (US Patent Publication No. 2008/0253685, “Kuranov”) and therefore are also obvious over claims 14-15, 17 and 22-23 of the appl. 14318026 modified by Kikinis, Croen and Kuranov to use a standard and well-known method of generating a 3D model from basic input images.


Appl. No. | Claim Correspondence
16449035 | 24, 34-35, 43 | 31, 40
15893500 | 21, 22, 31, 32, 36 | 24, 27, 28, 37, 40

Claims 24, 34-35 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 21, 22, 31, 32 and 36 of US Patent Application No. 15893500 in view of Kikinis and Croen.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim 24 of 16449035:
A system, comprising: one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
receive input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device;
generate a model of the real-world scene from images associated with the real-world scene; receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; in response to the further user interactions:
determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
modify the model of the real-world scene to include the change that is required as a result of the manipulation;
render new video content of the real-world scene based at least in part on the model as modified; and
stream the new video content to the client device.

Claim 31 of 15893500:
A system, comprising: one or more processors; and a memory storing instructions that, when executed by the one or more processors, cause the one or more processors to:
receive input that identifies an object of a plurality of objects in a scene of a pre-recorded video for which playback has begun; pause playback of the pre-recorded video responsive to the input; receive input directing manipulation of the identified object;
manipulate a model of the identified object based on graphics data for the pre-recorded video, wherein the model is manipulated according to the input directing manipulation;
render new video including a rendering of the manipulated model of the identified object wherein the new video comprises one or more portions of the scene that are not visible in the pre-recorded video;
stream the new video to a client device while the playback of the prerecorded video is paused; and
resume playback of the pre-recorded video to the client device in response to input from the client device to resume the pre-recorded video.


Claim 24 of the instant application differs from claim 31 of appl. 15893500 only in reciting: in response to the user interactions generate a model of the real-world scene from images associated with the real-world scene; receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
Kikinis teaches in response to the user interactions (“[0162] ….  In one embodiment, the update module 3036 communicates with the post-production engine 3040 for post-production effects.”):
generate a model of the real-world scene from images associated with the real-world scene; (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.”).
Claim 31 of appl. 15893500 and Kikinis are analogous as they are from the field of rendering videos.
 Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 31 of appl. 15893500 to have included in response to the user interactions generate a model of the real-world scene from images associated with the real-world scene as taught by Kikinis.
The motivation for the above is to follow the standard procedure for 3D rendering.
Claim 31 of appl. 15893500 modified by Kikinis doesn’t expressly teach in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
Croen teaches, receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12 step 1204 “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Claim 31 of appl. 15893500 as modified by Kikinis, and Croen, are analogous as they are from the field of rendering videos.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 31 of appl. 15893500 as modified by Kikinis to have included receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation as taught by Croen.
  The motivation to include this modification is to reduce unnecessary processing of model building.
Claims 24, 31, 34-35, 40 and 43 of the instant application recite limitations that are similar to the limitations recited in claims 21, 22, 24, 27, 28, 31, 32, 36, 37 and 40 of appl. 15893500 in view of Kikinis and Croen and therefore are also obvious over claims 21, 22, 24, 27, 28, 31, 32, 36, 37 and 40 of appl. 15893500 as modified by Kikinis and Croen.

Appl. No. | Claim Correspondence
16449035 | 24, 34, 35, 43 | 31, 40
15893522 | 24, 25, 27, 28, 29, 32, 33, 35, 36, 37, 40, 41 | 38, 42-44



Claims 24, 34-35 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 24, 25, 27, 28, 29, 32, 33, 35, 36, 37, 40, 41 of U.S. Appl. No. 15893522 in view of Kikinis and Croen.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim 24 of 16449035:
A system, comprising:
one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
receive input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device; in response to the user interactions:
pause the playback of the video content;
receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; modify the model of the real-world scene to include 
render new video content of the real-world scene based at least in part on the model as modified;
stream the new video content to the client device.

Claim 32 of 15893522:
A system, comprising: one or more processors; and a memory storing instructions that, when executed by the one or more processors, cause the one or more processors to:
pause playback of a pre-recorded video at a scene in response to input from a client device;
generate a model of the scene based at least in part on scene exploration input received from the client device, the scene exploration input directing a change of a viewpoint of the scene to view content that is not visible in the scene of the pre-recorded video;
initiate playback of a new video of the scene to the client device while playback of the pre-recorded video is paused, wherein the new video of the scene is generated based at least in part on the generated model of the scene; and
resume playback of the pre-recorded video to the client device in response to resume input from the client device.


Claim 24 of the instant application differs from claim 32 of appl. 15893522 only in reciting: receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; modify the model of the real-world scene to include the change that is required as a result of the manipulation; render new video content of the real-world scene based at least in part on the model as modified;
Croen teaches, determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12, steps 1204 and 1206, which determine whether a model needs to be updated in response to user interaction. “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Claim 32 of 15893522 and Croen are analogous as they are from the field of rendering videos.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 32 of 15893522 to have included determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation as taught by Croen.
  The motivation to include this modification is to reduce unnecessary processing of model building.
Claim 32 of 15893522 modified by Croen doesn’t expressly teach modify the model of the real-world scene to include the change that is required as a result of the manipulation.
Kikinis teaches modify the model of the real-world scene to include the change that is required as a result of the manipulation (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.”);
render new video content of the real-world scene based at least in part on the model as modified;  (“[0149] The recording engine 3010 comprises a background creation module 3012, a video scene creation module 3014 and an immersive audio-visual production module 3016. [0151] ”The production engine 3016 employs a plurality of immersive audio-visual production tools/systems, such as the video rendering engine 204 illustrated in FIG. 5,”  [0066]….The scene background and captured video objects and interaction commands are rendered by the video rendering  engine 204.”)
Claim 32 of 15893522 as modified by Croen, and Kikinis, are analogous as they are from the field of rendering videos.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 32 of 15893522 as modified by Croen to have included modify the model of the real-world scene to include the change that is required as a result of the manipulation; render new video content of the real-world scene based at least in part on the model as modified as taught by Kikinis.
The motivation for the above is to follow the standard procedure for 3D rendering.

Claims 24, 31, 34-35, 40 and 43 of the instant application recite limitations that are similar to the limitations recited in claims 24, 25, 27, 28, 29, 32, 33, 35, 36, 37, 38, 40 and 41-44 of appl. 15893522 in view of Kikinis and Croen and therefore are also obvious over claims 24, 25, 27, 28, 29, 32, 33, 35, 36, 37, 38, 40 and 41-44 of appl. 15893522 as modified by Kikinis and Croen.

Appl. No. | Claim Correspondence
16449035 | 24, 34-35, 39, 43 | 31, 40
15688637 | 22-23, 30, 34, 37-39 | 25


Claims 24, 34-35, 39 and 43 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 22-23, 30, 34, 37-39 of U.S. Appl. No. 15688637 in view of Kikinis and Croen.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim 24 of 16449035:
A system, comprising: one or more computing devices that implement a real-time video exploration (RVE) system, configured to:
receive input from a client device indicating user interactions with a 
generate a model of the real-world scene from images associated with the real-world scene;
in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; modify the model of the real-world scene to include the change that is required as a result of the manipulation; stream the new video content to the client device.

Claim 30 of 15688637:
A system, comprising: one or more computing devices comprising one or more processors and one or more memories storing instructions executable by the one or more processors to cause the one or more processors to:
modify a model of the object in the pre-recorded video according to the one or more modifications to generate a modified model of the object while the playback of the pre-recorded video is paused;
render new video of the scene including the modified model of the object;
identify at least a portion of the pre-recorded video to be replaced with the new video of the scene including the object as modified; and
in response to resume input from the client device, cause the playback of the pre-recorded video to be resumed to the client device, wherein at least the portion of the prerecorded video is replaced by the new video during the resumed playback.


Claim 24 of the instant application differs from claim 30 of appl. 15688637 only in reciting: receive further input from the client device indicating further user interactions with the real-world scene to manipulate the real-world scene; in response to the further user interactions: determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; modify the model of the real-world scene to include the change that is required as a result of the manipulation; stream the new video content to the client device.
Croen teaches, determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12, steps 1204 and 1206, which determine whether a model needs to be updated in response to user interaction. “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Claim 30 of 15688637 and Croen are analogous as they are from the field of rendering videos.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 30 of 15688637 to have included determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation as taught by Croen.
  The motivation to include this modification is to reduce unnecessary processing of model building.
Claim 30 of 15688637 modified by Croen doesn’t expressly teach modify the model of the real-world scene to include the change that is required as a result of the manipulation.
Kikinis teaches modify the model of the real-world scene to include the change that is required as a result of the manipulation (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.”);
stream the new video content to the client device. (Fig. 23 and “[0119]….The server 2303 is also communicatively coupled with the transmitter 2305 to send out the audio-visual data wirelessly to the hand held device 2301 via the transmitter 2305.  In another embodiment, the server 2303 sends the audio-visual data to the hand held device 2301 through land wire via the transmitter 2305.”)

Claim 30 of 15688637 as modified by Croen, and Kikinis, are analogous as they are from the field of rendering videos.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Claim 30 of 15688637 as modified by Croen to have included modify the model of the real-world scene to include the change that is required as a result of the manipulation; stream the new video content to the client device as taught by Kikinis.
The motivation for the above is to follow the standard procedure for 3D rendering.

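Claims 24, 31, 34-35, 39, 40 and 43 of the instant application recite limitations that are similar to the limitations recited in claims 22-23, 25, 30, 34, 37-39 of appl. 15688637 in view of Kikinis and Croen and therefore are also obvious over claims 22-23, 25, 30, 34, 37-39 of appl. 15688637 as modified by Kikinis and Croen.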



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 24, 28, 30-31, 35 and 39-40 are rejected under 35 U.S.C. 103 as being unpatentable over Kikinis et al. (US Patent Publication No. 2009/0237492, “Kikinis”) in view of Gregg et al. (US Patent No. 9032020, “Gregg”) and Croen et al. (US Patent Publication No. 20140036022, “Croen”).

Regarding claim 24, Kikinis teaches, a system (Fig. 1), comprising:
one or more computing devices ( Fig. 1 Clients 102A-N, and  immersive audio-visual system 120 or server) that implement a real-time video exploration (RVE) system, configured to:
receive input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device; (Fig. 30 and [161] “[0108] The exemplary screen of the video editing tool 1700 also shows a user interface window 1703 to control of elements of windows 1701 and 1702 and other items (such as virtual cameras and microphones not shown in the figure).  The user interface window 1703 has multiple controls 1703a-n, of which only control 1703c is shown.”
Fig. 30 and “[0161]. ….”In one embodiment, the update module 3036 updates the audio-visual production in real time, such as on-set editing the currently recorded video scenes using the editing tools illustrated in FIG. 17. In Fig. 17”)
in response to the user interactions:
generate a model of the real-world scene from images associated with the real-world scene; (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.”)
 (At Fig. 31 step  3109. The flow goes back to beginning to send additional input and update the model.  “[0167]…..The system may starts a new training session using the updated immersive audio-visual production or other training programs in step 3109, or optionally ends its operations.” 
 “[0162] In another embodiment, the update module 3036 updates the immersive audio-visual production during the post-production time period.  In one embodiment, the update module 3036 communicates with the post-production engine 3040 for post-production effects.”
Fig. 30 and “[0161]. ….”In one embodiment, the update module 3036 updates the audio-visual production in real time, such as on-set editing the currently recorded video scenes using the editing tools illustrated in FIG. 17. In Fig. 17”)
in response to the further user interactions (Fig. 31 and step  3109 and [0167]….. “The system may starts a new training session using the updated immersive audio-visual production or other training programs in step 3109, or optionally ends its operations.”) 
modify the model of the real-world scene to include the change that is required as a result of the manipulation; (“[0166] The wire frame editing module 3046 edits the wire frames used in the immersive audio-visual production.  A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics.  Using a wire frame model allows visualization of the underlying design structure of a 3D model.  The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object and/or selectively removing hidden lines of the 3D representation of the object.  In another embodiment, the wire frame editing module 3046 removes one or more wire frames from the recorded immersive audio-visual video scenes to create realistic simulation environment.“)
render new video content of the real-world scene based at least in part on the model as modified;  (“[0149] The recording engine 3010 comprises a background creation module 3012, a video scene creation module 3014 and an immersive audio-visual production module 3016. [0151] ”The production engine 3016 employs a plurality of immersive audio-visual production tools/systems, such as the video rendering engine 204 illustrated in FIG. 5,”  [0066]….The scene background and captured video objects and interaction commands are rendered by the video rendering  engine 204.”)  and
stream the new video content to the client device. (Fig. 23 and “[0119]….The server 2303 is also communicatively coupled with the transmitter 2305 to send out the audio-visual data wirelessly to the hand held device 2301 via the transmitter 2305.  In another embodiment, the server 2303 sends the audio-visual data to the hand held device 2301 through land wire via the transmitter 2305.”)
Kikinis doesn’t expressly teach, in response to the user interactions, pause the playback of the video content.

However, Gregg teaches, in response to the user interactions, pause the playback of the video content; (Col 10 lines 5-8 and lines 18-19 “In operation 902, the edit processing server 114 transmits the first video stream, here the enhanced video stream 184, to the client 104, and causes it to be displayed as the enhanced video representation 304. In operation 905, the edit processing server 114 pauses streaming of the first video stream”. It is noted that Gregg, at Col 10 lines 18-19 and lines 28-30, also streams the new video content to the client device.)
Gregg and Kikinis are analogous as they are from the field of video editing.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Kikinis to have included, in response to the user interactions, pause the playback of the video content as taught by Gregg and thereby stream the edited content to the client.
The motivation for the modification is that the user can view the updated video without waiting for the pre-recorded video to finish.
Kikinis as modified by Gregg doesn’t expressly teach, determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation;
Croen teaches, determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation; (Refer to Fig. 12, steps 1204 and 1206, which determine whether a model needs to be updated in response to user interaction. “ [0099]…. user interactions with the system are monitored and analyzed to determine whether any updates to the model are required and/or would be beneficial to the system (1204).”)
Kikinis as modified by Gregg, and Croen, are analogous as they are from the field of rendering videos.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg to have included determine, based at least in part on the further user interactions, that a change in the model of the real-world scene is required as a result of the manipulation as taught by Croen.
The motivation to include this modification is to reduce unnecessary processing of model building.

Claim 35 is directed to a method whose steps are similar in scope and function to the elements of system claim 24, and therefore claim 35 is rejected under the same rationale as specified in the rejection of claim 24.

Regarding claim 28, Kikinis as modified by Gregg and Croen teaches, wherein the RVE system is implemented as an online gaming service using resources provided by a network-accessible service provider network. (Kikinis, “[0064]….The application engine 540 enables post-production viewing and editing with respect to the type of application and other factors for a variety of  applications, such as online intelligent gaming, military training simulations, cultural-awareness training, and casino-type of interactive gaming.“)

Regarding claims 30 and 39, Kikinis as modified by Gregg and Croen teaches, wherein the RVE system is configured to:
responsive to additional input from the client device indicating user interactions to manipulate an object in the real-world scene, render and stream subsequent video content including the object as manipulated as a result of the user interactions. (Kikinis, refer to Fig. 31, step 3109: when a new training session starts, program update module 3036 receives additional input from the client based on Fig. 17 and [0108]. Refer to Kikinis, Fig. 17 and “[0108] Control 1703c is a palette/color/saturation/transparency selection tool that can be used to select colors for the areas 1702a-n.”, and the subsequent video content includes the object with the color or the texture as changed. Fig. 30 and [0162], [0166] disclose that program update module 3036 receives the input based on Fig. 17 from a client and then sends the input to the visual effect editing module and the wireframe editing module to update the model of the object. The rendering module 3016 then generates updated video based on the new model.)

Regarding claims 31 and 40, Kikinis as modified by Gregg and Croen teaches, the additional input from the client device indicates a change of a color or a texture of the object (Refer to Kikinis, Fig. 17 and “[0108] Control 1703c is a palette/color/saturation/transparency selection tool that can be used to select colors for the areas 1702a-n.”) and the subsequent video content includes the object with the color or the texture as changed.
(Kikinis, Fig. 30 and [0162], [0166] disclose that program update module 3036 receives the input based on Fig. 17 from a client and then sends the input to the visual effect editing module and the wireframe editing module to update the model of the object. [0154] discloses that the rendering module generates updated video based on the new model.)


Claims 25 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Kikinis as modified by Gregg and Croen and further in view of Groman (US Patent Publication No. 2015/0043892, “Groman”).
Regarding claims 25 and 36, Kikinis as modified by Gregg and Croen doesn’t expressly teach wherein to generate the model of the real-world scene from images, the RVE system is configured to: identify the images from a repository of crowdsourced images for different scenes, wherein at least some of the images in the repository are obtained from a plurality of online contributors.
However Groman teaches, identify the images from a repository of crowdsourced images for different scenes, wherein at least some of the images in the repository are obtained from a plurality of online contributors.  ([0086] “.....In the scenario posed by FIG. 9, wherein different contributors are capturing video feeds of a football game, the crowd sourcing may determine that one of the video feeds is particularly interesting (e.g., showing a dancing mascot, or continuously showing the opposing team members at various points in the game). As such, crowd sourcing may be used to further select and filter between the plurality of video feeds provided by the contributors, so that one video feed is selected at one time, another video feed is selected at another time, etc. to generate an edited video feed suitable for distribution”).
Groman and Kikinis as modified by Gregg and Croen are analogous arts as both of them are related to image processing and reconstructing.
Therefore it would have been obvious for an ordinary person skilled in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg and Croen by having one or more sources for images include one or more crowdsourcing techniques for soliciting and obtaining digital images from a plurality of online contributors as taught by Groman. 
The motivation for the above is to enhance the applicability of the system of Kikinis by providing the option of acquiring images from different sources.

Claims 26, 27 and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Kikinis as modified by Gregg and Croen and further in view of Kuranov et al. (US Patent Publication No. 2008/0253685, “Kuranov”).

Regarding claim 26, Kikinis as modified by Gregg and Croen doesn’t expressly teach, wherein to generate the model of the real-world scene from the images, the RVE system is configured to: align the images according to respective content of the images or respective image metadata indicating positioning of the images, merge at least portions of the aligned images to generate a composited image; and generate the model according to the composite image.
Kuranov teaches, align the images according to respective content of the images or respective image metadata indicating positioning of the images; (Kuranov, [0091], [0092]: images are aligned based on the points of the objects, which are position data or metadata.)
merge at least portions of the aligned images to generate a composited image; and (Kuranov, “[0096] FIG. 12 is a flowchart of an embodiment of a method 1200 of rending an image, which may be implemented by rendering system 92.  In step 1202, an image and/or video scene is joined.  The joining may employ software and maps indicating how to stitch the images together.  To render the resultant joined panorama image each of the initial individual images should be appropriately transformed and placed into the final image.”); 
generate the model according to the composite image. (Kuranov, [0061] “... may involve building a model (e.g., a model of the three dimensional layout) of the scene being photographed or filmed”);
Kuranov and Kikinis as modified by Gregg and Croen are analogous arts as both of them are related to image processing and reconstructing.
Therefore it would have been obvious for an ordinary person skilled in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg and Croen to generate the model of the real-world scene from the images, the RVE system is configured to: align the images according to respective content of the images or respective image metadata indicating positioning of the images, merge at least portions of the aligned images to generate a composited image; and generate the model according to the composite image as taught by Kuranov. 


Regarding claims 27 and 37, Kikinis as modified by Gregg and Croen teaches generating a 3D model (Kikinis [0166]) but doesn’t expressly teach, wherein to generate the model of the real-world scene from the images, the RVE system is configured to: generate a 3D model of the real-world scene from 2D images associated with the real-world scene.
However, Kuranov teaches, generate a 3D model of the real-world scene from 2D images associated with a real-world scene. (Kuranov [0037] captures or receives images using camera. These images are two dimensional as [0044] provides a transformation of two dimensional images.  “[0044] A perspective is a non-affine transformation determined by geometric principles applied to a two dimensional image”. Then [0061] discloses generating 3d model based on these two dimensional images. “[0061] FIG. 2A is flowchart of an example of automatically stitching a scene together.  Method 200 may be implemented by automatic stitcher system 100.  In an embodiment, an arbitrary number of images or video streams may be stitched together via method 200.  The stitching may include at least of two stages, which are configuration, step 201, and rendering, step 202.  During the configuration step (or phase) mappings are determined that map input still images or videos to a desired output scene.  The determination of the mappings may involve determining a mapping from one input mage to other input images and/or may involve building a model (e.g., a model of the three dimensional layout) of the scene being photographed or filmed.”)  
Kuranov and Kikinis as modified by Gregg and Croen are analogous arts as both of them are related to image processing and reconstructing.
Therefore it would have been obvious for an ordinary person skilled in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg and Croen to generate a 3D model of the real-world scene from 2D images associated with a real-world scene as taught by Kuranov. 
The motivation for the above is to use a standard and well-known method of generating a 3D model from basic input images.

Claims 29, 33, 38 and 42 are rejected under 35 U.S.C. 103 as being unpatentable over Kikinis as modified by Gregg and Croen and further in view of Uusitalo et al. (US Patent Publication No. 20090161963, “Uusitalo”).

Regarding claims 29 and 38, Kikinis as modified by Gregg and Croen doesn’t expressly teach, responsive to additional input from the client device indicating user interactions with the real-world scene, provide one or more of the images associated with the real-world scene to the client device.
However, Uusitalo teaches, responsive to input from a client device indicating user interactions with a real-world scene, provide one or more of the images associated with the real-world scene to the client device. (Refer to Fig. 2: terminal 16 is a client and computing system 52 is an audio visual computing system or server.  See [0047-0048]: “[0047]…..Accordingly, a semiotic region with defined interaction rules emulating real-world affordances of objects depicted within the semiotic region may act, for example, like an icon on a standard computer desktop insofar as, similar to clicking on an icon which establishes a link to corresponding functionality or content, clicking within or otherwise accessing a semiotic region with defined interaction rules may allow a user to execute corresponding functionality or access corresponding content depending on the associated user interaction rules.  Examples of objects having real-world affordances and their associated user interaction rules to emulate the real-world affordances within the virtual world include:  
[0048] a window object, such as a window in a building (e.g. window object 94 of FIG. 4), wherein annotated content beneath the window may be viewed,”)
Uusitalo and Kikinis as modified by Gregg and Croen are analogous arts as both of them are related to image processing and reconstructing.
Therefore it would have been obvious for an ordinary person skilled in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg and Croen to include, responsive to additional input from the client device indicating user interactions with the real-world scene, provide one or more of the images associated with the real-world scene to the client device, as taught by Uusitalo. 


Regarding claims 33 and 42, Kikinis as modified by Gregg and Croen teaches the additional input and subsequent video as shown in claim 30 but doesn’t expressly teach, wherein: the additional input from the client device indicates to open a door or a compartment in the real-world scene; and the subsequent video content includes an additional portion of the real-world scene that was not visible prior to the opening of the door or the compartment.
However, Uusitalo teaches, input from the client device indicates to open a door or a compartment in the real-world scene; and the subsequent video content includes an additional portion of the real-world scene that was not visible prior to the opening of the door or the compartment. (Refer to Fig.2 terminal is a client  and computing system is an audio visual computing system or server.  [0047-0048] discloses when client manipulates a door in an image or video, the subsequent video shows the additional portion of real content behind the door which was not shown prior to opening door.    “[0047]…..Accordingly, a semiotic region with defined interaction rules emulating real-world affordances of objects depicted within the semiotic region may act, for example, like an icon on a standard computer desktop insofar as, similar to clicking on an icon which establishes a link to corresponding functionality or content, clicking within or otherwise accessing a semiotic region with defined interaction rules may allow a user to execute corresponding functionality or access corresponding content depending on the associated user interaction rules.  Examples of objects having real-world affordances and their associated user interaction rules to emulate the real-world affordances within the virtual world include:  
[0048] a window object, such as a window in a building (e.g. window object 94 of FIG. 4), wherein annotated content beneath the window may be viewed,”)
Uusitalo and Kikinis as modified by Gregg and Croen are analogous arts as both of them are related to image processing and reconstructing.
Therefore it would have been obvious for an ordinary person skilled in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg and Croen so that the additional input from the client device indicates to open a door or a compartment in the real-world scene, and the subsequent video content includes an additional portion of the real-world scene that was not visible prior to the opening of the door or the compartment, as taught by Uusitalo. 
The motivation for the above is to emulate additional image related to a real world object in a video without even physically manipulating the real world object. 

Claims 32 and 41 are rejected under 35 U.S.C. 103 as being unpatentable over Kikinis as modified by Gregg and Croen and further in view of Bellamy et al. (US Patent Publication No. 2010/0088195, “Bellamy”).
Regarding claims 32 and 41, Kikinis as modified by Gregg and Croen teaches additional input and subsequent video content as shown in claims 30 and 39 but doesn’t expressly teach, wherein the RVE system is configured to receive additional input from the client device to purchase the object in the subsequent video content.
However Bellamy teaches, receive additional input from a client device to purchase an object in a subsequent content.  (Bellamy refer to Fig. 4, [0073] and at step 3050 user request an object to view and at 3060 server provides subsequent image having the image of the object.   “[0073]….. If the input is the request to view objects, at step 3060, the server device invokes the view objects handler 2070 to provide the objects to the client device (client device that requested viewing objects) with a visibly identifiable form (e.g., a web page with a mark-up on orderable objects, a photo catalog including objects along with objectIDs and sellerIDs). Then, the server device returns to the step 3000 and waits another input from a client device.”   At step 3070 user provides the additional input to purchase the object in a subsequent content.  “[0074] When the input (the input evaluated at step 3050) is not the request of viewing objects, at step 3070, the server device evaluates whether the input is a request of a reseller form.  If the input is the request of the reseller form, at step 3080, the server device invokes the (request) reseller form handler 2080 to generate the reseller form (e.g., by retrieving the reseller form from a database associated with an object for which the request of the reseller form is submitted to the server device) and to provide the reseller form to the client device (client device that sent the request of the reseller form to the server device).  Then, the server device returns to the step 3000 and waits another input from a client device.”)
Bellamy and Kikinis as modified by Gregg and Croen are analogous art as both of them are related to client server communication.
Therefore it would have been obvious to an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg and Croen to have the RVE system configured to receive additional input from the client device to purchase the object in the subsequent video content, similar to receiving additional input from a client device to purchase an object in the subsequent content as taught by Bellamy. 
The motivation for the above is to extend the application of Kikinis to the field of online sales systems.

Claims 34 and 43 are rejected under 35 U.S.C. 103 as being unpatentable over Kikinis as modified by Gregg and Croen and further in view of Robinson et al. (US Patent Publication No. 2015/0215564, “Robinson”).

Regarding claims 34 and 43, Kikinis as modified by Gregg and Croen  teaches, generation of the model after a pause as shown in claim 1 but doesn’t expressly teach, responsive to additional input from the client device, resume the playback of the video content depicting a real-world scene prior to the generation of the model.
Robinson teaches, responsive to additional input from the client device, resume the playback of the video content depicting a real-world scene at the point /position of pause. (“[0036]……For example, the user interface 206 could be manipulated by an operator to pause a set of audio/video content during playback, thereby creating a pause-point to resume at a later time or on another electronic device.”)
Robinson and Kikinis as modified by Gregg and Croen are analogous as they are from the field of video processing.
Therefore it would have been obvious for an ordinary skilled person in the art before the effective filing date of the claimed invention to have modified Kikinis as modified by Gregg and Croen to, responsive to additional input from the client device, resume the playback of the video content depicting a real-world scene at the point of pause, based on Robinson’s teaching of, responsive to additional input from the client device, resuming the playback of the video content depicting a real-world scene at the point/position of pause, and thereby to resume the playback of the video depicting a real-world scene prior to the generation of the model, as Kikinis generated the model after the pausing of the video.
The motivation for the above is to make sure that the user doesn’t miss any object or portion to modify or edit in the pre-recorded video.

Response to Arguments
Applicant’s arguments, see remarks Pages 9-10, filed 11/05/2020, with respect to rejection of claims under non-statutory double patenting have been fully considered and 

Applicant’s arguments, see remarks Page 10, filed 11/05/2020, with respect to rejection of claims 24-43 under 35 USC 112(b) have been fully considered and are persuasive.  Therefore the rejection has been withdrawn.

Applicant’s arguments, see remarks Page 10, filed 11/05/2020, with respect to rejection of claim 24 under 35 USC 103 have been fully considered and are not persuasive.  Therefore the rejection has been maintained.

Applicant argues see remarks Pages 11-12. “ The connections between elements of Applicant’s claim appear to have been overlooked in the rejection. For example, Applicant’s real-world scene links the claim element receive input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device with the claim element in response to the user interactions: [..........On pp. 46-47 of the Office Action, paragraphs 161 and 166 of Kikinis are cited for the above-noted features, with the quoted parts of paragraph 161 describing real-time updates to the video production during on-set editing, and paragraph 166 describing use of wire frame editing in the production of the video. Thus, paragraph 166 describes use of wire frame editing to produce the video, not generating a wire-frame from images associated with the real-world scene in response to user interactions with a playback of a video content. Using wire frame editing to produce the 

	Examiner replies, Kikinis Fig. 30 and [0161] and Fig. 17 and [0108] disclose receiving input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device. Fig. 30 and [0161] indicate that update module 3036 receives input from a client during playback of a video content, as shown by the user’s action in Fig. 17.  See Fig. 17 and “[0108] The exemplary screen of the video editing tool 1700 also shows a user interface window 1703 to control of elements of windows 1701 and 1702 …. The user interface window 1703 has multiple controls 1703a-n, of which only control 1703c is shown.”  Fig. 17 provides a user interface to provide an input to the scene of a video, and Fig. 17 is referred to by Fig. 30 and [0161].  The video of Fig. 17 is a real-world content.  Fig. 30 and Paragraph [0155] also disclose playback of a video. Then paragraph [0161] provides an input or edit command to update the video. Regarding applicant’s question of video of a real-world scene, the video which the user is editing is a real-world video because Paragraph [0181] indicates the video has real-world content.  “[0181] To create realistic simulation environment, in one embodiment, the training system interleaves simulated virtual reality and real world videos in response to fidelity requirements, or when emotional requirements of training game participants go above a predetermined level.” Fig. 20 and [0114] also provide video of a real-world object. “[0114] The exemplary recording set illustrated in FIG. 20 can be used to simulate any of several building environments and, similarly, outdoor environments. For example, a building on the recording set 2000 can be variously set in a grassy field, in a desert, in a town, or near a market, etc. 
Furthermore, post-production companies can bid on providing backgrounds as a set portraying a real area based on video images of said areas captured from satellite, aircraft, or local filming, and etc.”  Fig. 3A and [0069-0070] disclose a background creation module having a camera capturing a real-world scene of users.  Paragraph [0166] generates a wireframe model of an object of video content having a real-world object.   See [0166] “…. A wire frame model generally refers to a visual presentation of an electronic representation of a 3D or physical object used in 3D computer graphics…… The wire frame editing module 3046, in one embodiment, creates traditional 2D views and drawings of an object by appropriately rotating the 3D representation of the object.”

Applicant argues, see remarks Page 12, “Furthermore, Applicant’s attorney has reviewed the reference and nothing in the reference links paragraphs 161 and 166 as attempted in the Office Action. In fact, while the quoted portion of paragraph 161 is explicitly directed to real-time editing on-set, the following paragraphs 162, 163 move on 
	Examiner replies, Kikinis [0161] describes real-time editing, and Paragraph [0162] refers to the visual effect editing module and the wire frame editing module for performing editing.  Paragraph [0166] describes the visual effect editing module and the wire frame editing module. Therefore Paragraph [0166] is related to real-time on-set editing. 
	
	Applicant argues, see remarks Pages 12-13, “Furthermore, instead of video content depicting a real-world scene, Kikinis displays videos created by “mixing live videos, and computer-generated graphic images.” Paras. 61, 69, 72, 77, 97, FIG. 6, block 604. The mixture of live video and computer generated graphic images (e.g., via use of “blue screens”) in Kikinis creates video that does not depict a real-world scene on the client device at least because the computer-generated images in Kikinis depict computer generated images, instead. Even when combined with Gregg (cited for the claim “pause” feature) the combination still fails to teach or suggest Applicant’s recited receive input from a client device indicating user interactions with a playback of a video content depicting a real-world scene on the client device; in response to the user interactions: pause the playback of the video content, at least because the video that is being played back in cited FIG. 30 and para. 161 of Kikinis is video that is created from a mixture of live videos, and computer-generated graphic images, which as explained above, do not depict a real-world scene.”
Examiner replies, Kikinis [0181] discloses real-world video: “[0181] To create realistic simulation environment, in one embodiment, the training system interleaves simulated virtual reality and real world videos in response to fidelity requirements, or when emotional requirements of training game participants go above a predetermined level.” Kikinis Fig. 20 and [0114] also provide real-world video.  Kikinis “[0114] The exemplary recording set illustrated in FIG. 20 can be used to simulate any of several building environments and, similarly, outdoor environments. For example, a building on the recording set 2000 can be variously set in a grassy field, in a desert, in a town, or near a market, etc. Furthermore, post-production companies can bid on providing backgrounds as a set portraying a real area based on video images of said areas captured from satellite, aircraft, or local filming, and etc.”  Kikinis Fig. 3A and [0069-0070] disclose a background creation module having a camera capturing a real-world scene of users.  

Applicant argues, see remarks Page 13, “Applicant’s attorney has reviewed the latest office action and the reference and has been unable to determine what the model of the real-world scene depicted i…. Thus, because the Office Action has not adequately articulated what entity in the reference is mapped to Applicant’s model of the real-world scene depicted in the playback of a video content, the model generated from images associated with the real-world scene, a prima facie rejection has not been established. In re Jung, 637 F.3d 1356, 1362 (Fed. Cir. 2011) (the PTO fails to establish a prima facie case "when a rejection is so uninformative that it prevents the applicant 

Examiner replies, Kikinis Fig. 30 and Paragraphs [0161] and [0108] disclose that real-time editing with a user interface is performed on a pre-recorded video. This video contains a real-world object because Fig. 30 describes how the video is created based on a background scene in recording engine 3010. The inclusion of Fig. 30 in the mapping provides the support for real-world video. Therefore the prima facie rejection has been maintained. For further support of a real-world object in the played-back video and editing with user interaction, please see the reply above.

In response to applicant’s argument regarding rejections of claim 35, examiner refers applicant to the reply above for claim 24.

Applicant argues, see remarks Page 14-15 regarding rejection of claim 25 and 36, “On p. 53 of the Office Action, it is ceded that Kikinis as modified by Gregg and Croen fail to teach or suggest Applicant’s above-noted subject matter and para. 86 of Groman is cited to allegedly cure the deficiency. But the quoted part of Croen that appears on p. 53 of the Office Action has at least the same deficiency as Kikinis, noted above, describing generation of video based on crowd sourcing prior to distribution of the video, instead of generating the model in response to the user interactions with a 
Examiner replies, Page 53 of office action did not quote from Croen. Examiner quoted from Groman. Groman [0086] receives video of real-world images (football game) from contributors and selects (which is identification) which image can be used for a particular scene. 

Applicant argues, see remarks Page 15, “Also, the quoted part of Croen describes selecting the crowdsourced video feed to be the edited video feed, it does not describe generating a model from the crowdsourced video feeds. These overly-strained mappings are a clear indication that the combination of features as suggested in the Office Action is merely an attempt to combine disparate features of disparate references to achieve the Applicant’s claimed subject matter through improper hindsight analysis.”
Examiner replies, Croen is not used for the argued limitation. Groman’s identified images are included as pre-recorded video with Kikinis as modified by Gregg and Croen. Kikinis teaches generating a model from playback video as shown in claim 24.  Generating a model from playback video is not shown from Groman, and therefore the argument of hindsight is invalid. 

Applicant argues, see remarks Page 16, with respect to rejection of claim 26, “Kuranov’s description of building a model to determine mappings that are used to 
Examiner replies, Kuranov [0061] doesn’t indicate that the model is to be built before stitching. Paragraph [0061]’s generation of the stitched image refers to generating rendered stitched images.  This doesn’t only mean stitching or joining two images. That’s why rendering of the stitched image is divided into two parts. The first part is configuration and the second part is rendering. In the first part (configuration) the input images are joined and a model is built which will be used in rendering the stitched image. In Kuranov [0061] model building is done in the configuration phase. It doesn’t mean that model building is done before joining two images.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAPTARSHI MAZUMDER whose telephone number is (571)270-3454.  The examiner can normally be reached on 8am-5pm PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood can be reached on (571)272-2976.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 






/SAPTARSHI MAZUMDER/Primary Examiner, Art Unit 2612