DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Applicant’s amendment/response filed 08/29/2022 has been entered and made of record. Claims 1, 5-6, 15, 17, and 20 were amended. Claims 4 and 14 were cancelled. Claims 1-3, 5-13, and 15-20 are pending in the application.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-2, 8-10, 13, 15, 17-18, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wick et al. (US 2021/0150804) in view of Wang et al. (US 10055898), Holzer et al. (US 2018/0255290), Alesmaa et al. (US 2021/0166476), and Souchard (US 2017/0098312).
Regarding claim 1, Wick teaches/suggests: A method comprising: 
determining a three-dimensional representation of a scene captured in an action shot base video (Wick [0028] “Another exemplary embodiment can include the partial or complete, photorealistic creation of film scenes on the basis of 3D models” [0044]-[0045]: “A first step S11 includes generating a first individual image sequence with a real camera 10 … A second step S12 includes detecting the camera settings and the camera positions of the first image sequence”); 
generating an action shot video of the scene, the action shot video including a rendered object, the rendered object being positioned along the path through space (Wick [0028] “It is possible to superimpose real film scenes with computer animated scenes, for example virtual living organisms in real sceneries and/or actors in virtual sceneries or combinations of both” [0047]: “The virtual camera 12 can be, for example, a parameter set for settings of an image synthesis program 18 that can generate a virtual image sequence 14, resulting from a virtual scene 16, in accordance with the further step S14 a second image sequence, with the camera settings and camera positions”).
Wick is silent regarding:
the three-dimensional representation identifying a camera pose for motion of a camera along a path through space in the action shot base video; 
a rendered object determined based on the camera pose.
Wang, however, teaches/suggests:
the three-dimensional representation identifying a camera pose for motion of a camera along a path through space in the action shot base video (Wang col. 7 line 65 – col. 8 line 25: “A “three-dimensional scene reconstruction” or “3D scene reconstruction” refers to a 3D map or model of a 3D scene or trajectory over which video cameras travel while capturing videos. The 3D scene reconstruction can be generated using, for instance, a structure-from-motion (SfM) technique that determines camera poses along the trajectories over which the videos are captured”); 
Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the 3D model of Wick to be determined using an SfM technique as taught/suggested by Wang because that would have been well-understood, routine, and conventional for the 3D scene reconstruction. As such, Wick as modified by Wang teaches/suggests:
a rendered object determined based on the camera pose (Wick [0028] “It is possible to superimpose real film scenes with computer animated scenes, for example virtual living organisms in real sceneries and/or actors in virtual sceneries or combinations of both” Wang col. 7 line 65 – col. 8 line 25: “A “three-dimensional scene reconstruction” or “3D scene reconstruction” refers to a 3D map or model of a 3D scene or trajectory over which video cameras travel while capturing videos. The 3D scene reconstruction can be generated using, for instance, a structure-from-motion (SfM) technique that determines camera poses along the trajectories over which the videos are captured”).

Wick as modified by Wang does not teach/suggest:
determining a representation of an object by estimating a three-dimensional model from a multi-view representation of the object, the multi-view representation including a plurality of images of the object, each of the images being captured from a respective viewpoint, the multi-view representation being navigable in one or more dimensions; and 
a rendered object determined based on the representation.
Holzer, however, teaches/suggests:
determining a representation of an object from a multi-view representation of the object, the multi-view representation including a plurality of images of the object, each of the images being captured from a respective viewpoint, the multi-view representation being navigable in one or more dimensions (Holzer [0015] “For example, in certain embodiments, the MIDMR provides a three-dimensional view of the content without rendering and/or storing an actual three-dimensional model” [0117]: “With reference to FIG. 3, shown is illustrates an example of a process flow for generating a Multi-View Interactive Digital Media Representation (MIDMR). In the present example, a plurality of images is obtained at 302” [0120]: “In the present example embodiment, the plurality of images is fused into content and context models at 304” [0122]: “In the present embodiment, a MIDMR is generated from the content and context models at 308.”); 
Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the virtual living organisms of Wick as modified by Wang to be the MIDMRs of Holzer to provide 3D viewing without an actual 3D model. As such, Wick as modified by Wang and Holzer teaches/suggests:
a rendered object determined based on the representation (Wick [0028] “It is possible to superimpose real film scenes with computer animated scenes, for example virtual living organisms in real sceneries and/or actors in virtual sceneries or combinations of both” Holzer [0015] “For example, in certain embodiments, the MIDMR provides a three-dimensional view of the content without rendering and/or storing an actual three-dimensional model”).

Holzer is silent regarding by estimating a three-dimensional model. Alesmaa, however, teaches/suggests by estimating a three-dimensional model (Alesmaa [0027]: “Generating a realistic look of a 3D model from a 2D input image using fine-tuned deep neural networks which will be used in the visualization of 3D objects in AR devices for e-commerce purposes, and other similar or related solutions of AR. For this purpose, this invention proposes a framework to be used for the 3D reconstruction task. The algorithm benefits from deep neural networks to estimate a dense 3D model from a given 2D real-world image”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the MIDMRs of Wick as modified by Wang and Holzer to be estimated using deep neural networks as taught/suggested by Alesmaa to generate a realistic look.

Wick as modified by Wang, Holzer, and Alesmaa does not teach/suggest a rendered object animated via an animation effect. Souchard, however, teaches/suggests a rendered object animated via an animation effect (Souchard [0099]: “In other words, the source video and camera path 230 may be used to render additional content in the source video to generate the animation clip. For example, a 2D or 3D animated object such as a plate or an animated monster may be rendered on the table 210 of the source video. The camera path 230 may be used to determine where the camera is relative to the table 210, and thus where the animated object should be displayed in relation to the object(s) in the source video as the camera moves along the path 230”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the virtual living organisms of Wick as modified by Wang, Holzer, and Alesmaa to be animated as taught/suggested by Souchard to generate an animation clip.

Regarding claim 2, Wick as modified by Wang, Holzer, Alesmaa, and Souchard teaches/suggests: The method recited in claim 1, wherein the action shot base video is captured by a camera, and wherein determining the three-dimensional representation comprises applying a 3D reconstruction of the scene (Wick [0044]: “A first step S11 includes generating a first individual image sequence with a real camera 10” Wang col. 7 line 65 – col. 8 line 25: “A “three-dimensional scene reconstruction” or “3D scene reconstruction” refers to a 3D map or model of a 3D scene or trajectory over which video cameras travel while capturing videos. The 3D scene reconstruction can be generated using, for instance, a structure-from-motion (SfM) technique that determines camera poses along the trajectories over which the videos are captured”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Regarding claim 8, Wick as modified by Wang, Holzer, Alesmaa, and Souchard teaches/suggests: The method recited in claim 1, wherein the multi-view representation of the object is generated on a mobile computing device comprising a camera, and wherein each of the plurality of images of the object are captured by the camera (Holzer [0117]: “In other embodiments, the camera may be a camera on a smartphone. In some embodiments, the camera may be configured to capture the plurality of images as a continuous video”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Regarding claim 9, Wick as modified by Wang, Holzer, Alesmaa, and Souchard teaches/suggests: The method recited in claim 8, wherein the mobile computing device includes an inertial measurement unit configured to capture inertial measurement data (Holzer [0065]: “Another source of data that can be used to generate MIDMR includes location information obtained from sources such as accelerometers, gyroscopes, magnetometers, GPS, WiFi, IMU-like systems (Inertial Measurement Unit systems), and the like”), and wherein determining the representation of the object involves analyzing the inertial measurement data (Holzer [0066]: “In some embodiments, a MIDMR can be generated by a combination of data that includes both 2D images and location information, without any depth images provided. In other embodiments, depth images and location information can be used together”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Regarding claim 10, Wick as modified by Wang, Holzer, Alesmaa, and Souchard teaches/suggests: The method recited in claim 1, wherein determining the representation of the object comprises determining a respective segmentation mask for the object in each of the images (Holzer [0167]: “In some embodiments, the probability maps may then be passed onto the temporal dense conditional random field (CRF) smoothing system, further described below with reference to FIG. 12, to obtain a binary mask for every frame that is sharply aligned to the boundaries and temporally consistent (non-fluctuating). These binary masks are then used to mask out pixels in every frame to extract the person or other object of interest out of the frames”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Regarding claim 13, Wick as modified by Wang, Holzer, Alesmaa, and Souchard teaches/suggests: The method recited in claim 1, wherein determining the representation of the object comprises estimating a respective pose of the object for each of the images (Holzer [0070]: “In some embodiments, IMU data may be further implemented to generate a MIDMR including a three hundred sixty degree of an object based upon angle estimation using IMU data in accordance with embodiments of the present invention”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Regarding claim 15, Wick as modified by Wang, Holzer, Alesmaa, and Souchard teaches/suggests: The method recited in claim 1, wherein estimating the three-dimensional model of the object comprises applying a neural network to one or more of the images (Alesmaa [0027]: “Generating a realistic look of a 3D model from a 2D input image using fine-tuned deep neural networks which will be used in the visualization of 3D objects in AR devices for e-commerce purposes, and other similar or related solutions of AR. For this purpose, this invention proposes a framework to be used for the 3D reconstruction task. The algorithm benefits from deep neural networks to estimate a dense 3D model from a given 2D real-world image”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Claims 17 and 18 recite limitations similar in scope to those of claims 1 and 2, respectively, and are rejected for the same reasons. Wick as modified by Wang, Holzer, Alesmaa, and Souchard further teaches/suggests a communications interface; a memory module; and a processor (Wick [0022]: “Furthermore, the disclosure relates to a data processing system including means for carrying out the method according to the disclosure, and to a computer program” [0075]: “In this case, the lens/camera data can be made available to the rendering process in real time S46, for example with a wireless or wired transmission of the lens/camera data to a rendering computer, and be used for rendering”).

Claim 20 recites limitations similar in scope to those of claim 1 and is rejected for the same reasons. Wick as modified by Wang, Holzer, Alesmaa, and Souchard further teaches/suggests one or more non-transitory computer readable media having instructions stored thereon (Wick [0022]: “Furthermore, the disclosure relates to a data processing system including means for carrying out the method according to the disclosure, and to a computer program”).

Claim(s) 3 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wick et al. (US 2021/0150804) in view of Wang et al. (US 10055898), Holzer et al. (US 2018/0255290), Alesmaa et al. (US 2021/0166476), and Souchard (US 2017/0098312) as applied to claims 1 and 17 above, and further in view of Hillesland et al. (US 2014/0375634).
Regarding claim 3, Wick as modified by Wang, Holzer, Alesmaa, and Souchard does not teach/suggest: The method recited in claim 1, wherein the action shot base video is a virtual scene, and wherein determining the three-dimensional representation comprises retrieving 3D model information associated with the virtual scene. Hillesland, however, teaches/suggests wherein the action shot base video is a virtual scene, and wherein determining the three-dimensional representation comprises retrieving 3D model information associated with the virtual scene (Hillesland [0030]: “For example, rendering module 114 can obtain 3D model data along with point-of-view information and synthesize a 2D image describing the 3D model from the point of view. The process of generating an image from a 3D model is also known as "rendering." For example, in a computer generated 3D environment (e.g., a video game, 3D simulation, etc.) a user can navigate around a virtual environment using input commands that can indicate a direction of viewing or moving within the environment”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the first image sequence of Wick as modified by Wang, Holzer, Alesmaa, and Souchard to be computer generated (the claimed virtual) as taught/suggested by Hillesland for video gaming/3D simulation.

Claim 19 recites limitations similar in scope to those of claim 3 and is rejected for the same reasons.

Claim(s) 5-7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wick et al. (US 2021/0150804) in view of Wang et al. (US 10055898), Holzer et al. (US 2018/0255290), Alesmaa et al. (US 2021/0166476), and Souchard (US 2017/0098312) as applied to claim 1 above, and further in view of O’Connell et al. (US 2021/0225084).
Regarding claim 5, Wick as modified by Wang, Holzer, Alesmaa, and Souchard does not teach/suggest: The method recited in claim 1, wherein the object is a vehicle that includes one or more wheels, and wherein the animation effect comprises turning one or more of the wheels. O’Connell, however, teaches/suggests wherein the object is a vehicle that includes one or more wheels, and wherein the animation effect comprises turning one or more of the wheels (O’Connell [0039]: “By way of non-limiting illustration, a virtual object depicting a car may be animated to show the wheels spinning”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify an animated object of Wick as modified by Wang, Holzer, Alesmaa, and Souchard to be a car as taught/suggested by O’Connell to animate a car.

Regarding claim 6, Wick as modified by Wang, Holzer, Alesmaa, and Souchard does not teach/suggest: The method recited in claim 1, wherein applying the animation effect comprises animating one or more components of the rendered object. O’Connell, however, teaches/suggests wherein applying the animation effect comprises animating one or more components of the rendered object (O’Connell [0039]: “By way of non-limiting illustration, a virtual object depicting a car may be animated to show the wheels spinning”). The same rationale to combine as set forth in the rejection of claim 5 above is incorporated herein.

Regarding claim 7, Wick as modified by Wang, Holzer, Alesmaa, Souchard, and O’Connell teaches/suggests: The method recited in claim 6, wherein determining the representation of the object comprises generating a respective three-dimensional representation of each of the one or more components (Holzer [0349]: “For example, a combined embedded MIDMR may comprise general object MIDMR with one or more additional MIDMRs embedded within MIDMR 3500. Such additional embedded MIDMRs may be specific feature MIDMRs which display a detailed view of a particular portion of object 3550” O’Connell [0039]: “By way of non-limiting illustration, a virtual object depicting a car may be animated to show the wheels spinning”). The same rationale to combine as set forth in the rejection of claims 1 and 5 above is incorporated herein.

Claim(s) 11 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wick et al. (US 2021/0150804) in view of Wang et al. (US 10055898), Holzer et al. (US 2018/0255290), Alesmaa et al. (US 2021/0166476), and Souchard (US 2017/0098312) as applied to claim 1 above, and further in view of Rafii et al. (US 2018/0114264).
Regarding claim 11, Wick as modified by Wang, Holzer, Alesmaa, and Souchard does not teach/suggest: The method recited in claim 1, wherein determining the three-dimensional representation of the scene comprises estimating a location for a light source associated with the scene. Rafii, however, teaches/suggests estimating a location for a light source associated with the scene (Rafii [0105]: “In some embodiments of the present invention, the 3D models may also include one or more light sources. By incorporating the sources of light of the object within the 3D model, embodiments of the present invention can further simulate the effect of the object on the lighting of the environment”). The claimed estimating is a well-known feature of a light source if its location is not known (Official Notice). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the 3D model of Wick as modified by Wang, Holzer, Alesmaa, and Souchard to include one or more light sources as taught/suggested by Rafii to simulate lighting.

Regarding claim 12, Wick as modified by Wang, Holzer, Alesmaa, and Souchard does not teach/suggest: The method recited in claim 1, wherein generating an action shot video of the scene comprises rendering a reflection of the scene onto the object. Rafii, however, teaches/suggests rendering a reflection of the scene onto the object (Rafii [0105]: “As such, embodiments of the present invention can render a simulation of how the dining room would look with the light bulbs in the light fixture turned on, including the rendering of shadows and reflections from surfaces within the room”). The same rationale to combine as set forth in the rejection of claim 11 above is incorporated herein.

Claim(s) 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wick et al. (US 2021/0150804) in view of Wang et al. (US 10055898), Holzer et al. (US 2018/0255290), Alesmaa et al. (US 2021/0166476), and Souchard (US 2017/0098312) as applied to claim 1 above, and further in view of Taraki et al. (US 9088550).
Regarding claim 16, Wick as modified by Wang, Holzer, Alesmaa, and Souchard does not teach/suggest: The method recited in claim 1, the method further comprising: 
generating a transition sequence between the action shot base video and the action shot video.
Taraki, however, teaches/suggests generating a transition sequence (Taraki col. 12 ll. 31-36: “For example, the content transition module 112 may apply the blur 502 transition effect 418 to the eighteen image frames leading up to the transition point”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the animation clip of Wick as modified by Wang, Holzer, Alesmaa, and Souchard to include the first image sequence transitioning to the second image sequence as taught/suggested by Taraki for special effects. As such, Wick as modified by Wang, Holzer, Alesmaa, Souchard, and Taraki teaches/suggests:
generating a transition sequence between the action shot base video and the action shot video (Wick [0044]: “A first step S11 includes generating a first individual image sequence with a real camera 10” [0047]: “The virtual camera 12 can be, for example, a parameter set for settings of an image synthesis program 18 that can generate a virtual image sequence 14, resulting from a virtual scene 16, in accordance with the further step S14 a second image sequence, with the camera settings and camera positions” Taraki col. 12 ll. 31-36: “For example, the content transition module 112 may apply the blur 502 transition effect 418 to the eighteen image frames leading up to the transition point”).
Response to Arguments
Applicant's arguments filed 05/27/2022 have been fully considered but they are moot. Specifically, Applicant's arguments regarding "the rendered object being positioned along the path through space" are moot in view of the new ground(s) of rejection set forth in this Office action.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 2014/0206443 – camera pose estimation
US 2017/0109930 – multi-view image
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANH-TUAN V NGUYEN whose telephone number is 571-270-7513. The examiner can normally be reached on M-F 9AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KEE TUNG, can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ANH-TUAN V NGUYEN/
Primary Examiner, Art Unit 2611