DETAILED ACTIONS
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
The amendment filed July 22, 2022 has been entered. Claims 1-20 remain pending in the application.

Response to Arguments
Applicant’s arguments with respect to claims 1, 7, and 15 have been considered but some are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant’s argument that Dekel does not teach “wherein the set of sparse feature points are computed from changes in position of the reference object due to motion of the reference object across the video frames,” as recited in claim 1, has been considered but is not persuasive. When read in light of the specification, the reference object given as an example is a mountain range, which is a static object. Dekel teaches the same kind of static features. The object appears to move from frame to frame; however, that apparent motion is due to the motion of the camera/viewer, as seen in Fig. 5B, and not to motion of the object itself. Because the specification’s own example of motion of the reference object is the apparent motion of a static object across the frames, Dekel teaches “wherein the set of sparse feature points are computed from changes in position of the reference object due to motion of the reference object across the video frames” as recited in claim 1.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dekel et al. (US 20210090279 A1), hereinafter referred to as Dekel, in view of Agarwala et al. (US 20130128121 A1), hereinafter referred to as Agarwala, and in further view of Granados, "How Not to Be Seen – Object Removal from Videos of Crowded Scenes" (2012), hereinafter referred to as Granados.

Regarding claim 1, Dekel discloses a method (Fig. 9) in which one or more processing devices (Fig. 2, processor 206) performs operations (para. 0005, “the computing device to perform operations”) comprising:
accessing a scene (Fig. 9, step 900, “obtain, by a processor, a reference image and a target image representing an environment containing moving features and static features”; para. 0057, “Images 302-309 may represent static features of an environment (e.g., boxes 314 and 316) and moving features of the environment (e.g., human 312). Static features may include objects and other physical features that are expected to remain stationary for a predetermined period of time, such as building structures, trees, roads, or sidewalks, among other possibilities. Moving features may include objects and other physical features that are expected to move within the predetermined period of time, such as humans, animals, or vehicles, among other possibilities.”) depicting a reference object (Fig. 9, step 900, “static features”) and a set of sparse feature points defining a three-dimensional model of the reference object (Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image), wherein the set of sparse feature points are computed from changes in position of the reference object due to motion of the reference object across the video frames (para. 0002, “The static depth image may be determined based on motion parallax between the reference image and the target image, and may be valid for static features of the target image.”; Fig. 5B, the position of the reference object or static feature 316 changes across the images. According to the specification, the reference object is the mountain range, as shown in para. 0054, “three-dimensional feature points within a reference object such as a three-dimensional feature of the mountain range 204a-204b”; the mountain range is a static feature and does not move by itself, so the motion of the mountain range across the video frames is due to motion parallax, which is also seen in para. 0054, “geometric distortions can include parallax effects, pulling effects, perspective distortions, warping, axial rotations, radial distortions, asymmetries, etc. In this example, the mountain range 204 is axially rotated, compressed, and has a distanced perspective”), wherein the accessed scene has an annotation identifying a target region to be modified in one or more of the video frames (Fig. 5A, object mask 500; para. 0002, “an object mask configured to remove”).

Dekel does not explicitly disclose determining, from the three-dimensional model of the reference object, a motion constraint comprising a reference motion computed from the set of sparse feature points and computing a target motion of a target pixel subject to the motion constraint.
	However, Agarwala teaches determining, from the three-dimensional model of the reference object (Fig. 3C and 3D; Dekel discloses creating a three-dimensional image or depth image of the static features), a motion constraint (para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”) comprising a reference motion computed from the set of sparse feature points (para. 0090, “the video completion technique may apply a subspace constraint technique that finds and tracks feature points in the input video sequence to generate feature tracks, factors the feature tracks into a low-dimensional subspace, and generates a prediction of background scene motion according to the low-dimensional subspace”), and computing a target motion of a target pixel subject to the motion constraint (Dekel teaches computing target motion, as seen in Fig. 5B; Agarwala, Fig. 1, step 104, perform motion planning; para. 0090, “the technique warps the source frame using those predicted points as a guide, and thus arrives at a candidate region that can be used to at least partially fill the hole in the target frame. In at least some embodiments, a content-preserving warp technique may be used to warp the source frame. However, if there are large parallax effects, then the warped area in the source frame may not align correctly with the region around the hole in the target frame. Therefore, in at least some embodiments, the video completion technique may apply image consistency constraints to modify the warp so that the source frame region fills the hole in the target frame seamlessly”; para. 0061, “As indicated at 104 of FIG. 1, embodiments may perform motion planning on the eigen-trajectories, effectively smoothing the input motion while respecting the low rank relationship of the motion of points in the scene. Once the eigen-trajectories are computed using moving factorization, a final task before rendering is to smooth the eigen-trajectories to simulate smooth camera motion.”).
Dekel and Agarwala are both considered to be analogous to the claimed invention because they are in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by Dekel to incorporate the teachings of Agarwala of determining, from the three-dimensional model of the reference object, a motion constraint comprising a reference motion computed from the set of sparse feature points and computing a target motion of a target pixel subject to the motion constraint. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because image consistency constraints can be applied to modify the warp so that it fills the hole seamlessly (Agarwala, Abstract).
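
For illustration only, and not as a characterization of Agarwala's actual implementation, the general idea of a subspace constraint quoted above (feature tracks factored into a low-dimensional subspace, which is then used to predict background positions in frames where the background is occluded) can be sketched as follows. The function name, the array layout, and the fixed rank of 3 are assumptions chosen for this sketch; a plain truncated SVD is used in place of the moving factorization described in the reference.

```python
import numpy as np

def predict_occluded_track(full_tracks, partial_track, visible, rank=3):
    """Predict a feature track's hidden coordinates from a low-rank subspace.

    full_tracks   : (2F, N) array, stacked x/y coordinates of N tracks that
                    are visible in all F frames (hypothetical layout).
    partial_track : (2F,) array for one track; rows where `visible` is False
                    are ignored.
    visible       : (2F,) boolean array marking rows that were observed.
    rank          : assumed dimension of the trajectory subspace.
    """
    # Basis of the subspace spanned by the complete tracks.
    U, _, _ = np.linalg.svd(full_tracks, full_matrices=False)
    basis = U[:, :rank]
    # Fit subspace coefficients using only the observed rows.
    coeffs, *_ = np.linalg.lstsq(basis[visible], partial_track[visible], rcond=None)
    # Project back to all rows, which fills in the occluded positions.
    return basis @ coeffs
```

In this sketch the predicted positions play the role of the reference motion that constrains where background points may lie inside the hole.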

Dekel does not explicitly disclose updating color data of the target pixel to correspond to the target motion. 
	However, Granados teaches updating color data of the target pixel to correspond to the target motion (Dekel discloses “fill in or inpaint depth values for the moving features” in para. 0031, Granados teaches updating color data in Sec. 6, “we attempt to find a spatio-temporal displacement (or offset) that points to another unoccluded pixel from which to copy the missing color”).
Dekel and Granados are both considered to be analogous to the claimed invention because they are in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by Dekel to incorporate the teachings of Granados of updating color data of the target pixel to correspond to the target motion. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to make the result look natural and consistent with the hole boundary (Granados, Sec. 7).
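
For illustration only, the color-copy step quoted from Granados, Sec. 6 (a spatio-temporal offset that points to an unoccluded pixel from which the missing color is copied) can be sketched as below. The function name and array shapes are assumptions for this sketch, and the offsets are taken as given; how Granados selects the offsets is not reproduced here.

```python
import numpy as np

def fill_from_offsets(video, mask, offsets):
    """Copy colors into masked pixels from the pixels their offsets point to.

    video   : (T, H, W, 3) array of frame colors.
    mask    : (T, H, W) boolean array, True where color is missing.
    offsets : (T, H, W, 3) integer array of (dt, dy, dx) displacements
              (hypothetical input, assumed to be computed elsewhere).
    """
    out = video.copy()
    T, H, W = mask.shape
    for t, y, x in np.argwhere(mask):
        dt, dy, dx = offsets[t, y, x]
        st, sy, sx = t + dt, y + dy, x + dx
        inside = 0 <= st < T and 0 <= sy < H and 0 <= sx < W
        # Copy only from a source pixel that exists and is itself unoccluded.
        if inside and not mask[st, sy, sx]:
            out[t, y, x] = video[st, sy, sx]
    return out
```

Pixels whose offset leaves the video volume or lands on another masked pixel are left unchanged in this sketch.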

Regarding claim 2, the combination of Dekel in view of Agarwala and in further view of Granados discloses the method of claim 1 (Dekel, Fig. 9), the operations (Dekel, para. 0005, “the computing device to perform operations”) further comprising computing the set of sparse feature points by at least executing a structure from motion process that computes (Dekel, para. 0026, “the systems and operations disclosed herein approximate this ability of human depth perception by combining SfM and MVS methods with trained ML models”), from the video frames (Dekel, para. 0002, “determine depth values for images of a monoscopic video”), the three-dimensional model of the reference object (Dekel, Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image).
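
For illustration only, the relationship between motion parallax and depth for a static feature can be sketched with the standard triangulation formula for two rectified views related by a purely sideways camera translation. This simplified geometry and the parameter names are assumptions for this sketch; Dekel's cited approach combines SfM and MVS methods with trained ML models and is not limited to this case.

```python
def depth_from_parallax(focal_px, baseline_m, disparity_px):
    """Depth of a static point from its image shift between two views.

    focal_px     : focal length in pixels.
    baseline_m   : sideways camera displacement between the views, in meters.
    disparity_px : horizontal shift of the point between the views, in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    # Nearer static points shift more between views than distant ones.
    return focal_px * baseline_m / disparity_px
```

Under these assumptions, a distant static feature such as a mountain range shows a small shift between frames and therefore a large computed depth.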

Regarding claim 3, the combination of Dekel in view of Agarwala and in further view of Granados discloses the method of claim 1 (Dekel, Fig. 9), wherein determining the motion constraint (Agarwala, para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”) comprises identifying a motion of the reference object (Agarwala, para. 0033, “new, smooth motion trajectories that respect geometric relationships between points are needed, so that the motion trajectories appear as the motion of a plausible, non-distorted view of the scene.”; para. 0042, “task is to create a new matrix of trajectories M̂ that guides the rendering of a new, stabilized video; this may be performed either by traditional full-frame warping or by content-preserving warps.”) through the target region that is consistent with the three-dimensional model (Agarwala, para. 0042, “This new matrix should both contain smooth trajectories and be consistent with the original 3D scene imaged by a moving camera”) derived from changes to a relative position of the reference object across the video frames (Agarwala, para. 0076, the position of the object changes in each frame).
Dekel and Agarwala are both considered to be analogous to the claimed invention because they are in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by Dekel to incorporate the teachings of Agarwala wherein determining the motion constraint comprises identifying a motion of the reference object through the target region that is consistent with the three-dimensional model derived from changes to a relative position of the reference object across the video frames. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because image consistency constraints can be applied to modify the warp so that it fills the hole seamlessly (Agarwala, Abstract).

Regarding claim 4, the combination of Dekel in view of Agarwala and in further view of Granados discloses the method of claim 1 (Dekel, Fig. 9), wherein computing the target motion of the target pixel subject to the motion constraint (Agarwala, Fig. 1, step 104, perform motion planning; para. 0090, “the technique warps the source frame using those predicted points as a guide, and thus arrives at a candidate region that can be used to at least partially fill the hole in the target frame. In at least some embodiments, a content-preserving warp technique may be used to warp the source frame. However, if there are large parallax effects, then the warped area in the source frame may not align correctly with the region around the hole in the target frame. Therefore, in at least some embodiments, the video completion technique may apply image consistency constraints to modify the warp so that the source frame region fills the hole in the target frame seamlessly”; para. 0061, “As indicated at 104 of FIG. 1, embodiments may perform motion planning on the eigen-trajectories, effectively smoothing the input motion while respecting the low rank relationship of the motion of points in the scene. Once the eigen-trajectories are computed using moving factorization, a final task before rendering is to smooth the eigen-trajectories to simulate smooth camera motion.”) comprises:
	accessing a video frame in which the target region occludes the reference object (Agarwala, para. 0006, “background scene points in frames where the background is occluded”);
	specifying a first motion value for a first sub-region within the target region to the target motion (Agarwala, “At least some embodiments may assemble tracked features in the video into a trajectory matrix, factor the trajectory matrix into two low-rank matrices, and perform filtering or curve fitting in a low-dimensional linear space”, para. 0090, “in at least some embodiments, the video completion technique starts by predicting the location of static scene points in the area to fill (e.g., hole) in the target frame across one or more other frames”); and 
	applying, to the video frames, a machine-learning model that estimates a second motion value for a second sub-region within the target region based on both the specified first motion value and a boundary motion for the target region (Agarwala, para. 0090, “In at least some embodiments, to predict the location of static scene points, the video completion technique may apply a subspace constraint technique that finds and tracks feature points in the input video sequence to generate feature tracks, factors the feature tracks into a low-dimensional subspace, and generates a prediction of background scene motion according to the low-dimensional subspace”, where the prediction is the second motion value; para. 0092, “embodiments of the video completion technique may be applied to the video output of any video processing technique that crops edges of video frames, removes objects from video frames, or otherwise removes content from video frames to restore missing content, for example holes or border regions, in the video sequences”; Dekel discloses using a machine-learning model to estimate the motion within the target region, as explained in the Abstract).

Regarding claim 5, the combination of Dekel in view of Agarwala and in further view of Granados discloses the method of claim 1 (Dekel, Fig. 9), wherein the motion constraint is a first motion constraint (Agarwala, para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”), the operations further comprising: 
	obtaining one or more user-specified sparse feature points (Granados, see Sec. 4, user guided completion, provide a tool for the user to add semantic understanding of the scene to the algorithm, Fig. 6), the one or more user-specified sparse feature points further defining the three-dimensional model of the reference object (Granados, see Fig. 6, the user marks the target region to be refined and source region);
	determining, from the three-dimensional model of the reference object, a second motion constraint based at least in part on the one or more user-specified sparse feature points (Granados, see Fig. 6, Sec. 4, user constrains the search space to only-relevant spatio-temporal source region); and 
	re-computing the target motion of the target pixel subject to the first motion constraint and the second motion constraint (Agarwala discloses computing target motion; Granados teaches re-computing a solution using the constraints, see Fig. 6, after computing a solution on the constrained space, the error is corrected).
Granados is considered to be analogous to the claimed invention because it is in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method as taught by Dekel to incorporate the teachings of Granados to obtain one or more user-specified sparse feature points. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because it would assist the inpainting method in correcting artifacts (Granados, Sec 4).

Regarding claim 6, the combination of Dekel in view of Agarwala and in further view of Granados discloses the method of claim 1 (Dekel, Fig. 9), wherein the target region comprises an object to be removed or modified (Dekel, see Fig. 5A, [0002], and an object mask configured to remove, [0066], moving features will be removed from target image 308).

Regarding claim 7, Dekel discloses a computing system (Fig. 2) comprising: 
	a processing device (Fig. 2, processor 206); and 
	a non-transitory computer-readable medium (see [0131], computer readable medium may also include non-transitory computer readable media) communicatively coupled to the processing device (Fig. 2, processor 206) and storing program code, the processing device (Fig. 2, processor 206) configured to execute the program code (Fig. 9) and thereby performing operations (para. 0005, “the computing device to perform operations”) comprising: 
		accessing a set of video frames depicting a scene (Fig. 9, step 900, “obtain, by a processor, a reference image and a target image representing an environment containing moving features and static features”; para. 0057, “Images 302-309 may represent static features of an environment (e.g., boxes 314 and 316) and moving features of the environment (e.g., human 312). Static features may include objects and other physical features that are expected to remain stationary for a predetermined period of time, such as building structures, trees, roads, or sidewalks, among other possibilities. Moving features may include objects and other physical features that are expected to move within the predetermined period of time, such as humans, animals, or vehicles, among other possibilities.”), the scene comprising a reference object (Fig. 9, step 900, “static features”); 
		generating a set of sparse feature points based on a first motion of the reference object across the video frames depicting the scene (para. 0002, “The static depth image may be determined based on motion parallax between the reference image and the target image, and may be valid for static features of the target image.”; Fig. 5B, the position of the reference object or static feature 316 changes across the images. According to the specification, the reference object is the mountain range, as shown in para. 0054, “three-dimensional feature points within a reference object such as a three-dimensional feature of the mountain range 204a-204b”; the mountain range is a static feature and does not move by itself, so the motion of the mountain range across the video frames is due to motion parallax, which is also seen in para. 0054, “geometric distortions can include parallax effects, pulling effects, perspective distortions, warping, axial rotations, radial distortions, asymmetries, etc. In this example, the mountain range 204 is axially rotated, compressed, and has a distanced perspective”), the set of sparse feature points comprising one or more sparse feature points corresponding to the reference object (Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image).

Dekel does not explicitly disclose wherein the one or more sparse feature points define a motion constraint comprising a reference motion and interpolating a target motion of a target pixel within a target region of the scene, wherein the target motion is subject to the motion constraint.
	However, Agarwala teaches wherein the one or more sparse feature points (Fig. 3C and 3D; Dekel discloses creating a three-dimensional image or depth image of the static features) define a motion constraint (para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”) comprising a reference motion (para. 0090, “the video completion technique may apply a subspace constraint technique that finds and tracks feature points in the input video sequence to generate feature tracks, factors the feature tracks into a low-dimensional subspace, and generates a prediction of background scene motion according to the low-dimensional subspace”) and interpolating (para. 0076, “calculated by shifting the feature trajectory in time by λ and interpolating its position at consecutive frames”) a target motion of a target pixel within a target region of the scene, wherein the target motion is subject to the motion constraint (Fig. 1, step 104, perform motion planning; para. 0090, “the technique warps the source frame using those predicted points as a guide, and thus arrives at a candidate region that can be used to at least partially fill the hole in the target frame. In at least some embodiments, a content-preserving warp technique may be used to warp the source frame. However, if there are large parallax effects, then the warped area in the source frame may not align correctly with the region around the hole in the target frame. Therefore, in at least some embodiments, the video completion technique may apply image consistency constraints to modify the warp so that the source frame region fills the hole in the target frame seamlessly”; para. 0061, “As indicated at 104 of FIG. 1, embodiments may perform motion planning on the eigen-trajectories, effectively smoothing the input motion while respecting the low rank relationship of the motion of points in the scene. Once the eigen-trajectories are computed using moving factorization, a final task before rendering is to smooth the eigen-trajectories to simulate smooth camera motion.”).
Dekel and Agarwala are both considered to be analogous to the claimed invention because they are in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computing system as taught by Dekel to incorporate the teachings of Agarwala wherein the one or more sparse feature points define a motion constraint comprising a reference motion and interpolating a target motion of a target pixel within a target region of the scene, wherein the target motion is subject to the motion constraint. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because image consistency constraints can be applied to modify the warp so that it fills the hole seamlessly (Agarwala, Abstract).
Dekel does not explicitly disclose updating color data of the target pixel to correspond to the target motion. 
	However, Granados teaches updating color data of the target pixel to correspond to the target motion (Dekel discloses “fill in or inpaint depth values for the moving features” in para. 0031, Granados teaches updating color data in Sec. 6, “we attempt to find a spatio-temporal displacement (or offset) that points to another unoccluded pixel from which to copy the missing color”).
Dekel and Granados are both considered to be analogous to the claimed invention because they are in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computing system as taught by Dekel to incorporate the teachings of Granados of updating color data of the target pixel to correspond to the target motion. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to make the result look natural and consistent with the hole boundary (Granados, Sec. 7).

Regarding claim 8, the combination of Dekel in view of Agarwala and in further view of Granados discloses the computing system of claim 7 (Dekel, Fig. 2), wherein the operations (Dekel, para. 0005, “the computing device to perform operations”) further comprise: computing the set of sparse feature points by at least executing a structure from motion process (Dekel, para. 0026, “the systems and operations disclosed herein approximate this ability of human depth perception by combining SfM and MVS methods with trained ML models”) that generates a three-dimensional reconstruction of the reference object (Dekel, Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image), the three-dimensional reconstruction of the reference object comprising a three-dimensional pixel map (Dekel, Fig. 9, step 904, static depth image, the depth image has pixels that are in three dimensions; para. 0026, “these systems and operations are configured to generate depth maps from monoscopic images of a video containing both moving and static features, where the video is captured using a moving monoscopic camera”).

Regarding claim 9, the combination of Dekel in view of Agarwala and in further view of Granados discloses the computing system of claim 7 (Dekel, Fig. 2), wherein the target motion is a first target motion (Dekel, Fig. 5B), the target pixel is a first target pixel, the target region comprises a boundary point (Dekel, see Fig. 5A and 5B, object mask shows the boundary of the target region, which is used for the optical flow image), and wherein the operations further comprise: accessing a video frame of the set of video frames, the video frame comprising the reference object (Dekel, Fig. 9, step 900, static features), wherein the reference object is partially occluded by a sub-region of the target region in the video frame (Dekel, see Fig. 5B, the static feature 314 is partially occluded by the moving person); and interpolating a second target motion of a second target pixel within the target region of the scene, the second target motion corresponding to the sub-region of the target region (Dekel, see Fig. 5B, the optical flow image also shows motion flow for the moving person as well as the static feature), wherein the second target motion is based at least in part on both the first target motion and the boundary point of the target region (Dekel, Fig. 5B, the optical flow of the moving person is based on the object mask 500, which defines the boundary point of the target region).

Regarding claim 10, the combination of Dekel in view of Agarwala and in further view of Granados discloses the computing system of claim 7 (Dekel, Fig. 2), wherein the motion constraint is configured to provide a pixel value of the target pixel (Agarwala, Fig. 1, step 104, perform motion planning; para. 0090, “the technique warps the source frame using those predicted points as a guide, and thus arrives at a candidate region that can be used to at least partially fill the hole in the target frame. In at least some embodiments, a content-preserving warp technique may be used to warp the source frame. However, if there are large parallax effects, then the warped area in the source frame may not align correctly with the region around the hole in the target frame. Therefore, in at least some embodiments, the video completion technique may apply image consistency constraints to modify the warp so that the source frame region fills the hole in the target frame seamlessly”; para. 0061, “As indicated at 104 of FIG. 1, embodiments may perform motion planning on the eigen-trajectories, effectively smoothing the input motion while respecting the low rank relationship of the motion of points in the scene. Once the eigen-trajectories are computed using moving factorization, a final task before rendering is to smooth the eigen-trajectories to simulate smooth camera motion.”), and wherein the operations further comprise: determining the motion constraint by identifying a motion vector of the reference object through the target region (Agarwala, “At least some embodiments may assemble tracked features in the video into a trajectory matrix, factor the trajectory matrix into two low-rank matrices, and perform filtering or curve fitting in a low-dimensional linear space”; para. 0090, “in at least some embodiments, the video completion technique starts by predicting the location of static scene points in the area to fill (e.g., hole) in the target frame across one or more other frames”), the motion vector being derived from spatiotemporal changes to the reference object across the set of video frames (Agarwala, para. 0076, the position of the object changes in each frame and the trajectory or motion vector is derived from this); and updating the color data of the target pixel by changing a pixel value of the target pixel based at least in part on the motion constraint (Agarwala teaches changing the pixel or video completion using the motion constraint and Granados teaches updating the color data).

Regarding claim 11, the combination of Dekel in view of Agarwala and in further view of Granados discloses the computing system of claim 7 (Dekel, Fig. 2), wherein the operations (Dekel, Fig. 9) further comprise: receiving a user input comprising at least one sparse feature point corresponding to the reference object (Granados, see Sec. 4, user guided completion, provide a tool for the user to add semantic understanding of the scene to the algorithm, Fig. 6); and updating the set of sparse feature points based on the user input (Agarwala discloses computing target motion, Granados teaches re-computing a solution using the constraints, see Fig. 6, after computing a solution on the constrained space, the error is corrected).
Granados is considered to be analogous to the claimed invention because it is in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computing system as taught by Dekel to incorporate the teachings of Granados to obtain one or more user-specified sparse feature points. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because it would assist the inpainting method in correcting artifacts (Granados, Sec 4).

Regarding claim 12, the combination of Dekel in view of Agarwala and in further view of Granados discloses the computing system of claim 11 (Dekel, Fig. 2), wherein the motion constraint is a first motion constraint (Agarwala, para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”), and wherein the operations (Dekel, Fig. 9) further comprise: determining a second motion constraint (Granados, see Fig. 6, Sec. 4, the user constrains the search space to only relevant spatio-temporal source regions) based at least in part on the user input (Granados, see Sec. 4, user guided completion, provide a tool for the user to add semantic understanding of the scene to the algorithm, Fig. 6); and re-computing the target motion of the target pixel subject to the first motion constraint and the second motion constraint (Agarwala discloses computing target motion, Granados teaches re-computing a solution using the constraints, see Fig. 6, after computing a solution on the constrained space, the error is corrected).
Granados is considered to be analogous to the claimed invention because it is in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computing system as taught by Dekel to incorporate the teachings of Granados to obtain one or more user-specified sparse feature points. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because it would assist the inpainting method in correcting artifacts (Granados, Sec 4).

Regarding claim 13, the combination of Dekel in view of Agarwala and in further view of Granados discloses the computing system of claim 7 (Dekel, Fig. 2), wherein the operations further comprise: obtaining a subset of sparse feature points, the subset of sparse feature points being randomly selected or user-selected (Granados, see Sec. 4, user guided completion, provide a tool for the user to add semantic understanding of the scene to the algorithm, Fig. 6); and validating the updated color data of the target region based on the subset of sparse feature points by re-computing the color data of the target region across a subset of the set of video frames in a forward temporal order or a reverse temporal order (Granados, see Sec. 4, Fig. 6, “User-assisted refinement: (a) After automatic in-painting the leg of the person was incorrectly completed. (b, c) The user marks the target region to be refined (red), and marks a suitable source region in the video volume (green). (d) After computing a solution on the constrained space, the error is corrected.”).
Granados is considered to be analogous to the claimed invention because it is in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computing system as taught by Dekel to incorporate the teachings of Granados to validate the updated color data by re-computing. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because it would assist the inpainting method in correcting artifacts (Granados, Sec 4).

Regarding claim 14, the combination of Dekel in view of Agarwala and in further view of Granados discloses the computing system of claim 7 (Dekel, Fig. 2), wherein the target region comprises an object to be removed or modified (Dekel, see Fig. 5A, [0002], and an object mask configured to remove, [0066], moving features will be removed from target image 308).

Regarding claim 15, Dekel discloses a non-transitory computer-readable medium (see [0131], computer readable medium may also include non-transitory computer readable media) having program code of a video editing tool stored thereon, wherein the program code, when executed by one or more processing devices (Fig. 2, processor 206), configures the one or more processing devices (Fig. 2, processor 206) to perform operations (Fig. 9) comprising: 
	accessing a scene (Fig. 9, step 900, “obtain, by a processor, a reference image and a target image representing an environment containing moving features and static features”, para. 0057, “Images 302-309 may represent static features of an environment (e.g., boxes 314 and 316) and moving features of the environment (e.g., human 312). Static features may include objects and other physical features that are expected to remain stationary for a predetermined period of time, such as building structures, trees, roads, or sidewalks, among other possibilities. Moving features may include objects and other physical features that are expected to move within the predetermined period of time, such as humans, animals, or vehicles, among other possibilities.”) depicting a reference object (Fig. 9, step 900, “static features”) and a set of sparse feature points defining a three-dimensional model of the reference object (Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image), wherein the set of sparse feature points are generated from changes in relative positions across the video frames of the features that define the model of the reference object (Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image; the position of the objects changes between the source image and the target image), wherein the changes in relative position of the features is due to motion of the reference object across the video frames (para. 0002, “The static depth image may be determined based on motion parallax between the reference image and the target image, and may be valid for static features of the target image.”, Fig. 5B, the position of the reference object or static feature 316 changed across the images; according to the specification, the reference object is the mountain range as shown in para. 0054, “three-dimensional feature points within a reference object such as a three-dimensional feature of the mountain range 204a-204b”; the mountain range is a static feature and does not move by itself, so the motion of the mountain range across the video frames is due to motion parallax, which is also seen in para. 0054, “geometric distortions can include parallax effects, pulling effects, perspective distortions, warping, axial rotations, radial distortions, asymmetries, etc. In this example, the mountain range 204 is axially rotated, compressed, and has a distanced perspective”), wherein the set of sparse feature points comprises one or more two-dimensional feature points or three-dimensional feature points (Fig. 9, the depth image comprises three-dimensional feature points), wherein the accessed scene has an annotation identifying a target region to be modified in one or more of the video frames (Fig. 5A, object mask 500, para. 0002, “an object mask configured to remove”).
	
Dekel does not explicitly disclose determining, based on the reference object, a motion constraint comprising a reference motion computed from the set of sparse feature points and computing a target motion of a target pixel subject to the motion constraint.
	However, Agarwala teaches determining, based on the reference object (Fig. 3C and 3D, Dekel discloses creating a three-dimensional image or depth image of the static features), a motion constraint (para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”) comprising a reference motion computed from the set of sparse feature points (para. 0090, “the video completion technique may apply a subspace constraint technique that finds and tracks feature points in the input video sequence to generate feature tracks, factors the feature tracks into a low-dimensional subspace, and generates a prediction of background scene motion according to the low-dimensional subspace”), computing a target motion of a target pixel subject to the motion constraint (Dekel teaches computing target motion as seen in Dekel, Fig. 5B; Agarwala, Fig. 1, step 104, perform motion planning, para. 0090, “the technique warps the source frame using those predicted points as a guide, and thus arrives at a candidate region that can be used to at least partially fill the hole in the target frame. In at least some embodiments, a content-preserving warp technique may be used to warp the source frame. However, if there are large parallax effects, then the warped area in the source frame may not align correctly with the region around the hole in the target frame. Therefore, in at least some embodiments, the video completion technique may apply image consistency constraints to modify the warp so that the source frame region fills the hole in the target frame seamlessly”, para. 0061, “As indicated at 104 of FIG. 1, embodiments may perform motion planning on the eigen-trajectories, effectively smoothing the input motion while respecting the low rank relationship of the motion of points in the scene. Once the eigen-trajectories are computed using moving factorization, a final task before rendering is to smooth the eigen-trajectories to simulate smooth camera motion.”).
Dekel and Agarwala are both considered to be analogous to the claimed invention because they are in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the non-transitory computer-readable medium as taught by Dekel to incorporate the teachings of Agarwala of determining, from the three-dimensional model of the reference object, a motion constraint comprising a reference motion computed from the set of sparse feature points and computing a target motion of a target pixel subject to the motion constraint. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because image consistency constraints can be applied to modify the warp so that it fills the hole seamlessly (Agarwala, Abstract).

Dekel does not explicitly disclose updating color data of the target pixel to correspond to the target motion. 
	However, Granados teaches updating color data of the target pixel to correspond to the target motion (Dekel discloses “fill in or inpaint depth values for the moving features” in para. 0031, Granados teaches updating color data in Sec. 6, “we attempt to find a spatio-temporal displacement (or offset) that points to another unoccluded pixel from which to copy the missing color”).
Dekel and Granados are both considered to be analogous to the claimed invention because they are in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the non-transitory computer-readable medium as taught by Dekel to incorporate the teachings of Granados of updating color data of the target pixel to correspond to the target motion. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been to make the result look natural and consistent with the hole boundary (Granados, Sec. 7).

Regarding claim 16, the combination of Dekel in view of Agarwala and in further view of Granados discloses the non-transitory computer-readable medium of claim 15 (Dekel, see [0131], computer readable medium may also include non-transitory computer readable media), wherein the motion constraint is a first motion constraint (Agarwala, para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”), the operations further comprising: 
	obtaining one or more user-specified sparse feature points (Granados, see Sec. 4, user guided completion, provide a tool for the user to add semantic understanding of the scene to the algorithm, Fig. 6), the one or more user- specified sparse feature points further defining the three-dimensional model of the reference object (Granados, see Fig. 6, the user marks the target region to be refined and source region); 
	determining, from the three-dimensional model of the reference object, a second motion constraint based at least in part on the one or more user-specified sparse feature points (Granados, see Fig. 6, Sec. 4, user constrains the search space to only-relevant spatio-temporal source region); and 
	re-computing the target motion of the target pixel subject to the first motion constraint and the second motion constraint (Agarwala discloses computing target motion, Granados teaches re-computing a solution using the constraints, see Fig. 6, after computing a solution on the constrained space, the error is corrected).
Granados is considered to be analogous to the claimed invention because it is in the same field of video completion or video inpainting. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the non-transitory computer-readable medium as taught by Dekel to incorporate the teachings of Granados to obtain one or more user-specified sparse feature points. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. The motivation for the proposed modification would have been because it would assist the inpainting method in correcting artifacts (Granados, Sec 4).

Regarding claim 17, the combination of Dekel in view of Agarwala and in further view of Granados discloses the non-transitory computer-readable medium of claim 15 (Dekel, see [0131], computer readable medium may also include non-transitory computer readable media), wherein the target region comprises an object to be removed or modified (Dekel, see Fig. 5A, [0002], and an object mask configured to remove, [0066], moving features will be removed from target image 308).

Regarding claim 18, the combination of Dekel in view of Agarwala and in further view of Granados discloses the non-transitory computer-readable medium of claim 15 (Dekel, see [0131], computer readable medium may also include non-transitory computer readable media), wherein the model of the reference object is a three-dimensional model (Dekel, Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image), and wherein the operations (Dekel, Fig. 9) further comprise: computing the set of computer-generated sparse feature points by at least executing a structure from motion process (Dekel, para. 0026, “the systems and operations disclosed herein approximate this ability of human depth perception by combining SfM and MVS methods with trained ML models”) that computes, from the video frames (Dekel, para. 0002, “determine depth values for images of a monoscopic video”), the three-dimensional model of the reference object (Dekel, Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image).

Regarding claim 19, the combination of Dekel in view of Agarwala and in further view of Granados discloses the non-transitory computer-readable medium of claim 15 (Dekel, see [0131], computer readable medium may also include non-transitory computer readable media), wherein the model is a three-dimensional model (Dekel, Fig. 9, step 904, determine, by the processor and based on motion parallax between the reference image and the target image, a static depth image that represents depth values of the static features in the target image), and wherein determining the motion constraint (Agarwala, para. 0006, “the video completion technique applies a subspace constraint technique that finds and tracks feature points in the video. These subspace tracks are used to form a model of the camera motion, and are also used to predict the locations of background scene points in frames where the background is occluded”) comprises: identifying a motion vector of the reference object through the target region (Agarwala, “At least some embodiments may assemble tracked features in the video into a trajectory matrix, factor the trajectory matrix into two low-rank matrices, and perform filtering or curve fitting in a low-dimensional linear space”, para. 0090, “in at least some embodiments, the video completion technique starts by predicting the location of static scene points in the area to fill (e.g., hole) in the target frame across one or more other frames”) that is consistent with the three-dimensional model (Agarwala, para. 0042, “This new matrix should both contain smooth trajectories and be consistent with the original 3D scene imaged by a moving camera”) derived from changes to a relative position of the reference object across the video frames (Agarwala, para. 0076, the position of the object changes in each frame).

Regarding claim 20, the combination of Dekel in view of Agarwala and in further view of Granados discloses the non-transitory computer-readable medium of claim 15 (Dekel, see [0131], computer readable medium may also include non-transitory computer readable media), wherein computing the target motion of the target pixel subject to the motion constraint (Agarwala, Fig. 1, step 104, perform motion planning, para. 0090, “the technique warps the source frame using those predicted points as a guide, and thus arrives at a candidate region that can be used to at least partially fill the hole in the target frame. In at least some embodiments, a content-preserving warp technique may be used to warp the source frame. However, if there are large parallax effects, then the warped area in the source frame may not align correctly with the region around the hole in the target frame. Therefore, in at least some embodiments, the video completion technique may apply image consistency constraints to modify the warp so that the source frame region fills the hole in the target frame seamlessly”, para. 0061, “As indicated at 104 of FIG. 1, embodiments may perform motion planning on the eigen-trajectories, effectively smoothing the input motion while respecting the low rank relationship of the motion of points in the scene. Once the eigen-trajectories are computed using moving factorization, a final task before rendering is to smooth the eigen-trajectories to simulate smooth camera motion.”) comprises:
	accessing a video frame in which the target region occludes the reference object (Agarwala, para. 0006, “background scene points in frames where the background is occluded”);
	specifying a first motion value for a first sub-region within the target region to the target motion (Agarwala, “At least some embodiments may assemble tracked features in the video into a trajectory matrix, factor the trajectory matrix into two low-rank matrices, and perform filtering or curve fitting in a low-dimensional linear space”, para. 0090, “in at least some embodiments, the video completion technique starts by predicting the location of static scene points in the area to fill (e.g., hole) in the target frame across one or more other frames”); and 
	applying, to the video frames, a machine-learning model that estimates a second motion value for a second sub-region within the target region based on both the specified first motion value and a boundary motion for the target region (Agarwala, para. 0090, “In at least some embodiments, to predict the location of static scene points, the video completion technique may apply a subspace constraint technique that finds and tracks feature points in the input video sequence to generate feature tracks, factors the feature tracks into a low-dimensional subspace, and generates a prediction of background scene motion according to the low-dimensional subspace”, the prediction is the second motion value, para. 0092, “embodiments of the video completion technique may be applied to the video output of any video processing technique that crops edges of video frames, removes objects from video frames, or otherwise removes content from video frames to restore missing content, for example holes or border regions, in the video sequences”, Dekel discloses using a machine-learning model to estimate the motion within the target region as explained in Abstract).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DENISE G ALFONSO whose telephone number is (571)272-1360. The examiner can normally be reached Monday - Friday 7:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang can be reached on 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DENISE G ALFONSO/Examiner, Art Unit 2663

/CLAIRE X WANG/Supervisory Patent Examiner, Art Unit 2663