Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 8/14/2020, 5/27/2021, and 12/17/2021 are being considered by the examiner.

Claim Objections
Claim 3 is objected to because of the following informalities:
“a the 3D convolution kernel” in line 4.  Appropriate correction is required.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6, 8-13 and 15-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Non-patent literature “Adversarial Spatio-Temporal Learning for Video Deblurring” by Zhang et al. (hereinafter Zhang).  
A copy of Zhang provided by the applicant has a publication date of 08/2015 as shown below. Therefore, it qualifies as a 102(a)(1) reference.

    [Image: media_image1.png (greyscale)]

For claim 1, Zhang teaches a video deblurring method (see, e.g., the proposed DBLRNet framework in FIG. 2, which teaches generating deblurred frames of a video), the method comprising: 
acquiring, by processing circuitry of an electronic device, N continuous image frames from a video clip, the N being a positive integer, and the N continuous image frames including a blurry image frame to be processed (see, e.g., section III A and Fig. 2, which teach receiving five time-consecutive blurry frames of a video as input); 
performing, by the processing circuitry of the electronic device, three-dimensional (3D) convolution processing on the N continuous image frames with a generative adversarial network model, to acquire spatio-temporal information corresponding to the blurry image frame (see, e.g., section III A and FIGS. 2 and 3, which teach performing 3D convolution on the input to acquire joint spatial-temporal representations using a GAN architecture including a generator and a discriminator), the spatio-temporal information including spatial feature information of the blurry image frame and temporal feature information between the blurry image frame and a neighboring image frame of the N continuous image frames (see, e.g., abstract, section III A and FIG. 2, which teach 3D convolutions with a kernel size of 3x3x3; the examiner interprets the first two 3s as the claimed spatial information of the blurry image frame in the spatial domain, i.e., the image plane, and the last 3 as the claimed temporal information in the temporal domain, i.e., between neighboring frames); and 
performing, by the processing circuitry of the electronic device, deblurring processing on the blurry image frame by using the spatio-temporal information corresponding to the blurry image frame through the generative adversarial network model, to output a sharp image frame (see, e.g., section III A and FIGS. 2 and 3, which disclose the DBLRNet framework outputting a deblurred central frame using the joint spatial-temporal feature representations, wherein the architecture of the framework consists of a generator and a discriminator; the examiner interprets “Output” in FIG. 2 as the claimed sharp image frame).

Claims 8 and 15 recite video deblurring apparatus and non-transitory computer readable medium, each of which has a scope that is similar to that of claim 1.  As such, claims 8 and 15 are rejected for the rationales provided above with respect to claim 1.
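
For illustration only, and not as part of Zhang or of the claims, the 3x3x3 kernel interpretation applied to claim 1 above can be sketched in Python. The function name conv3d_valid, the toy five-frame clip, and the all-ones kernel are hypothetical, and the absence of padding is an assumption. The sketch only shows that a single 3D kernel spans two spatial axes and one temporal axis, so each output value draws on a frame and its neighboring frames.

```python
def conv3d_valid(frames, kernel):
    """Unpadded 3D cross-correlation of a T x H x W clip with a kt x kh x kw kernel."""
    T, H, W = len(frames), len(frames[0]), len(frames[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0
                for dt in range(kt):          # temporal extent: neighboring frames
                    for dy in range(kh):      # spatial extent: rows of the image plane
                        for dx in range(kw):  # spatial extent: columns of the image plane
                            acc += kernel[dt][dy][dx] * frames[t + dt][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy clip: five 5x5 frames, where frame t is filled with the value t.
clip = [[[t] * 5 for _ in range(5)] for t in range(5)]
# Hypothetical 3x3x3 kernel of ones (a box sum), chosen only to keep the arithmetic exact.
box = [[[1] * 3 for _ in range(3)] for _ in range(3)]

features = conv3d_valid(clip, box)
shape = (len(features), len(features[0]), len(features[0][0]))
```

Under these assumptions, five input frames and a temporal kernel extent of 3 leave three temporal positions in the output, each combining three neighboring frames.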

For claim 2, Zhang teaches the method according to claim 1, wherein the generative adversarial network model includes a generative network model and a discriminative network model (see, e.g., the architecture of the DBLRGAN in section III B and FIG. 3); and 
before the performing, by the processing circuitry of the electronic device, the three-dimensional (3D) convolution processing on the N continuous image frames with the generative adversarial network model (see, e.g., sections IV A and B, which teach training the DBLRNet by performing deblurring until the content loss decrease is minimal; as the deblurring performed for training purposes is the same as the deblurring after training, albeit with updated parameters, the portions cited above with respect to claim 1 correspond to the limitations below as well), the method further includes 
acquiring, by the processing circuitry of the electronic device, N continuous sample image frames and a real sharp image frame for discrimination from a video sample library, the N continuous sample image frames comprising a blurry sample image frame for training, and the real sharp image frame corresponding to the blurry sample image frame (see, e.g., section III A and FIGS. 2 and 3, which teach receiving five time-consecutive blurry frames of a video, i.e., the “Blurry” image frame in FIG. 3, as input; the examiner interprets the “Sharp” image frame in FIG. 3 as the claimed real sharp image frame); 
extracting, by the processing circuitry of the electronic device, spatio-temporal information corresponding to the blurry sample image frame from the N continuous sample image frames by using a 3D convolution kernel in the generative network model (see, e.g., section III A and FIGS. 2 and 3, which teach performing 3D convolution on the input to acquire joint spatial-temporal representations using a GAN architecture including a generator and a discriminator); 
performing, by the processing circuitry of the electronic device, the deblurring processing on the blurry sample image frame by using the spatio-temporal information corresponding to the blurry sample image frame through the generative network model, to output a sharp sample image frame (see, e.g., section III A and FIGS. 2 and 3, which disclose the DBLRNet framework outputting a deblurred central frame using the joint spatial-temporal feature representations, wherein the architecture of the framework consists of a generator and a discriminator; the examiner interprets “Output” image frame in FIG. 3 as the claimed sharp sample image frame); and 
training, by the processing circuitry of the electronic device, the generative network model and the discriminative network model alternately according to the sharp sample image frame and the real sharp image frame (see, e.g., section III B and FIG. 3, which teach training the generator and the discriminator alternately according to the Output and Sharp image frames).

Claims 9 and 16 recite video deblurring apparatus and non-transitory computer readable medium, each of which has a scope that is similar to that of claim 2.  As such, claims 9 and 16 are rejected for the rationales provided above with respect to claim 2.

For claim 3, Zhang teaches the method according to claim 2, wherein the generative network model includes a first 3D convolution kernel, and a second 3D convolution kernel (see, e.g., 3x3x3 kernels in layers 1 and 2 of DBLRNet, i.e., the generator, in section III A, FIG. 2 and Table I); and 
the extracting, by the processing circuitry of the electronic device, the spatio-temporal information corresponding to the blurry sample image frame from the N continuous sample image frames by using a the 3D convolution kernel in the generative network model includes 
performing convolution processing on the N continuous sample image frames with the first 3D convolution kernel, to acquire low-level spatio-temporal features corresponding to the blurry sample image frame (see, e.g., section III A and Layer 1 in FIG. 2 and Table I, which teach using a 3x3x3 kernel in Layer 1 to acquire Feature map 1, corresponding to 1st set of spatio-temporal information of the Input image frame; the examiner interprets the feature map 1 as the claimed low-level spatio-temporal features because the last paragraph in section III A states that the feature map 1 is a lower level feature map compared to the feature map 2); 
performing the convolution processing on the low-level spatio-temporal features with the second 3D convolution kernel, to acquire high-level spatio-temporal features corresponding to the blurry sample image frame (see, e.g., section III A and Layer 2 in FIG. 2 and Table I, which teach using a 3x3x3 kernel in Layer 2 to acquire Feature map 2, corresponding to a 2nd set of spatio-temporal information of the Input image frame; the examiner interprets the feature map 2 as the claimed high-level spatio-temporal features because the last paragraph in section III A states that the feature map 2 is a higher level feature map compared to the feature map 1); and 
fusing the high-level spatio-temporal features corresponding to the blurry sample image frame, to acquire the spatio-temporal information corresponding to the blurry sample image frame (see, e.g., the feature map 2 between Layer 2 and Layer 3 in FIG. 2, which is the feature map 2 in Layer 2 combined, i.e., fused, into a single string to acquire spatio-temporal information of the input image frame).

Claims 10 and 17 each has a scope that is similar to that of claim 3.  As such, claims 10 and 17 are rejected for the rationales provided above with respect to claim 3.

For claim 4, Zhang teaches the method according to claim 2, wherein the generative network model further includes M 2D convolution kernels, the M being a positive integer (see, e.g., 3x3x1 convolution kernels in Layers 3-35 in FIG. 2 and Table I); and 
the performing, by the processing circuitry of the electronic device, the deblurring processing on the blurry sample image frame by using the spatio-temporal information corresponding to the blurry sample image frame through the generative network model, to output a sharp sample image frame includes 
performing convolution processing on the spatio-temporal information corresponding to the blurry sample image frame by using each 2D convolution kernel of the M 2D convolution kernels in sequence, and acquiring the sharp sample image frame after the convolution processing is performed by using the last 2D convolution kernel of the M 2D convolution kernels (see, e.g., the layers and kernel sizes in Table I and FIG. 2, which show convolution (Conv) being performed on the feature map 2 using 3x3x1 convolution kernels in sequence, i.e., from layer 3 to layer 35, to reach the output).

Claims 11 and 18 each has a scope that is similar to that of claim 4.  As such, claims 11 and 18 are rejected for the rationales provided above with respect to claim 4.
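
For illustration only, and not as part of Zhang or of the claims, the "in sequence" limitation addressed for claim 4 above can be sketched in Python. The names conv2d_same and apply_in_sequence, the zero padding, and the toy kernels and feature map are hypothetical. The sketch only shows that each 2D kernel consumes the output of the previous one and that the result of the last kernel is taken as the output.

```python
def conv2d_same(frame, kernel):
    """Zero-padded 2D cross-correlation that preserves the frame size."""
    H, W = len(frame), len(frame[0])
    kh, kw = len(kernel), len(kernel[0])
    py, px = kh // 2, kw // 2
    out = [[0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0
            for dy in range(kh):
                for dx in range(kw):
                    yy, xx = y + dy - py, x + dx - px
                    if 0 <= yy < H and 0 <= xx < W:  # positions outside the frame count as zero
                        acc += kernel[dy][dx] * frame[yy][xx]
            out[y][x] = acc
    return out


def apply_in_sequence(feature_map, kernels):
    """Apply each 2D kernel to the output of the previous one; return the last output."""
    for k in kernels:
        feature_map = conv2d_same(feature_map, k)
    return feature_map


# Hypothetical kernels: one identity kernel and two kernels that double the center pixel.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
double = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
feature_map = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

result = apply_in_sequence(feature_map, [identity, double, double])
```

Here M is 3, and the value returned after the third kernel plays the role of the output of the last 2D convolution kernel.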

For claim 5, Zhang teaches the method according to claim 4, wherein an odd 2D convolution kernel of the M 2D convolution kernels includes a first convolutional layer, a normalization layer, and an activation function, and an even 2D convolution kernel of the M 2D convolution kernels includes a second convolutional layer and an activation function (see, e.g., Table I and FIG. 2, which show odd layers with Batch Normalization (BN) and ReLU and even layers with BN only, and see also Table II, which teaches that the ReLU represents the activation function).

Claims 12 and 19 each has a scope that is similar to that of claim 5.  As such, claims 12 and 19 are rejected for the rationales provided above with respect to claim 5.

For claim 6, Zhang teaches the method according to claim 2, wherein the training the generative network model and the discriminative network model alternately, includes 
acquiring a reconstruction loss function according to the sharp sample image frame and the real sharp image frame (see, e.g., section III D and FIG. 3, which teach acquiring the content loss function “Loss 1” according to the sharp sample image frame and the real sharp image frame); 
training the generative network model through the reconstruction loss function (see, e.g., section III E, which teaches updating the generator through an adversarial loss function which is based on the content loss function); 
training the discriminative network model by using the real sharp image frame and the sharp sample image frame, to acquire an adversarial loss function outputted by the discriminative network model (see, e.g., section III E and FIG. 3, which teach training the discriminator using the real sharp image frame and the sharp image frame to acquire the adversarial loss function “Loss 2”); and 
training the generative network model continually through the adversarial loss function (see, e.g., section III E, which teaches using the adversarial loss function to update parameters of the generator during training such that the generator can generate sharp frames similar to the real-world frames).

Claims 13 and 20 each has a scope that is similar to that of claim 6.  As such, claims 13 and 20 are rejected for the rationales provided above with respect to claim 6.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over NPL titled “Removing Motion Blur With Space-Time Processing” by Takeda et al. (hereinafter Takeda) in view of Chinese Patent Application Publication No. 108416752 A to Chen et al. (hereinafter Chen).
For claim 1, Takeda as applied teaches a video deblurring method (see, e.g., abstract, lines 1-19 in left col. on page 2990, lines 8-14 in left col. on page 2992, which teach space-time (3D) video deblurring method), the method comprising: 
acquiring, by processing circuitry of an electronic device, N continuous image frames from a video clip, the N being a positive integer, and the N continuous image frames including a blurry image frame to be processed (see, e.g., FIG. 4 and lines 5-13 in left col. on page 2994, which teach acquiring frames of the desired video to be deblurred); 
performing, by the processing circuitry of the electronic device, three-dimensional (3D) convolution processing on the N continuous image frames (see, e.g., FIGS. 5 and 6 and lines 14-36 in left col. on page 2994, lines 27-30 and 46-47 in right col. on page 2994, which teach using spatiotemporal (3D) upscaling using 3-D PSF (point spread function) kernels to acquire the space-time upscaled frames to be deblurred; the examiner interprets performing the spatiotemporal (3D) upscaling using 3-D PSF kernels as the claimed 3D convolution processing because it acquires the upscaled video that includes the spatio-temporal information corresponding to the frames to be deblurred), the spatio-temporal information including spatial feature information of the blurry image frame (see, e.g., lines 46-51 in right col. on page 2994, which teach that performing the spatiotemporal (3D) upscaling includes using the spatial upscaling factor, which corresponds to acquiring the claimed spatial feature information of the video sequence, which includes the blurry frames) and temporal feature information between the blurry image frame and a neighboring image frame of the N continuous image frames (see, e.g., lines 46-51 in right col. on page 2994, which teach that performing the spatiotemporal (3D) upscaling includes using the temporal upscaling factor, which corresponds to acquiring the claimed temporal feature information of the video sequence, which includes the blurry frames);
performing, by the processing circuitry of the electronic device, deblurring processing on the blurry image frame by using the spatio-temporal information corresponding to the blurry image frame (see, e.g., FIG. 5 and lines 20-22, 45-49 in left col. on page 2996, lines 1-8 in right col. on page 2996, which teach performing a 3-D deblurring method, using 3-D spatial and temporal PSF kernels, on the spatiotemporally upscaled video sequence to recover the pixels across space and time).
Takeda as applied does not explicitly teach using a generative adversarial network model to perform convolution and deblurring processing.  But in the analogous art, Chen teaches using a trained GAN to convolve a blurred image to extract semantic information therefrom and to deblur the input image using the extracted information to output a deblurred image (see, e.g., lines 19-21 and 36-46 on page 13 of a machine-translated copy of Chen).
It would have been obvious to one of ordinary skill in the art at the time of the claimed invention to modify the teachings of Takeda with the teaching of Chen such that the above described convolution and deblurring processing of Takeda is performed using the GAN of Chen because doing so would generate a clear image with more realistic deblurring and more in line with human perception (see, e.g., lines 30-47 on page 14 of a machine-translated copy of Chen).

Claims 8 and 15 recite video deblurring apparatus and non-transitory computer readable medium, each of which has a scope that is similar to that of Claim 1.  As such, claims 8 and 15 are rejected for the rationales provided above with respect to claim 1.

Allowable Subject Matter
Claims 7 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Please see appended form-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WOO RHIM whose telephone number is (571)272-6560. The examiner can normally be reached Mon - Fri 8:30 am - 5:00 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Leonard Chang, can be reached on 571-270-3691. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

WOO CHUL RHIM
Examiner
Art Unit 4174




/CHAN S PARK/
Supervisory Patent Examiner, Art Unit 2669