Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicants
This communication is in response to the Application filed on 7/31/2018.
Claims 1-13 are pending.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (U.S. Publication No. 2018/0114056) (hereafter, "Wang") in view of CHEN et al. (WO 2019015466) (hereafter, "CHEN") and further in view of Kim et al. (U.S. Publication No. 2019/0205334) (hereafter, "Kim").
Regarding claim 1, Wang teaches acquiring at least two frames of facial images extracted from a target video; and inputting the at least two frames of facial images into a pre-trained generative model ([0010] First and second neural networks for respectively receiving first and second facial images are prepared, with each neural network sharing the same parameters and initialized with the same pre-trained convolutional neural network ... first and second facial images can be used to determine spatio-temporal constraints derived from video image frames); the standard facial image and the single facial generative image containing facial information of a same person ([0008] the multiple face tracking module can use a Loopy Belief Propagation (LBP) algorithm to provide person identities for selected trajectories based on extracted facial features. In still other embodiments a face tracklet module can be used to generate constraints that are provided to the face pair module, with constraints including a finding that faces in a single tracklet are of the same person, and that faces that appear at different positions in the same frame are different persons; [0010] A contrastive loss function can be used to ensure that determined metric distances between faces of different persons are greater than determined metric distances between faces of a same person), wherein the method is performed by at least one processor ([0048] Implementations of the systems, devices, and methods disclosed herein may comprise or utilize a special purpose or general-purpose computer including computer hardware, such as, for example, one or more processors and system memory, as discussed herein).
Wang does not expressly teach a method for generating an image, comprising: to generate a single facial image, wherein the generative model is obtained through: inputting a single facial generative image outputted by an initial generative model into a pre-trained discriminative model to generate a probability of the single facial generative image being a real facial image; determining a loss function of the initial generative model based on the probability and a similarity between the single facial generative image and a standard facial image, and updating a model parameter of the initial generative model using the loss function to obtain the generative model.
However, CHEN teaches a method for generating an image, comprising: to generate a single facial image ([0025] S12, inputting the credential face image into a pre-trained generative adversarial network, and obtaining a reconstructed face image corresponding to the credential face image according to the output of the generative adversarial network), determining a loss function of the initial generative model based on the probability and a similarity between the single facial generative image and a standard facial image ([0037] training the network parameters of the discriminator network and the network parameters of the generator network based on the perceptual loss function; the perceptual loss function is a function of the probability of discriminating the reconstruction face image output by the generator network as a real natural light face image), and updating a model parameter of the initial generative model using the loss function to obtain the generative model ([0036] The certificate face image sample is used as the input of the generator network, and the network parameters of the generator network are trained based on the square loss function until the square loss function is minimized; the square loss function is a function of the pixel-wise squared difference between the natural light face image sample and the reconstructed face image output by the generator network).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Wang to incorporate the step/system of generating a face image using a generative adversarial network whose loss function is based on the probability of discriminating the reconstructed face image output by the generator network as a real face image, and of updating the network parameters based on the loss function, as taught by CHEN.
The suggestion/motivation for doing so would have been to improve the accuracy of verification ([0053] the authentication verification can be performed based on the reconstructed face image and the natural light face image collected in real time to improve the accuracy of verification). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
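For illustration only, the two training signals cited from CHEN (a term that depends on the probability that the discriminator judges the reconstructed image to be real, and a pixel-wise squared difference against the reference image) can be combined as in the following minimal sketch. The function name `generator_loss`, the negative-log form of the probability term, and the `weight` parameter are assumptions made for this example; none of them is taken from Wang, CHEN, or the claims.

```python
import math


def generator_loss(prob_real, generated, standard, weight=1.0):
    """Combine a probability-based term with a pixel-wise similarity term.

    prob_real: probability in (0, 1] that the discriminator assigns to the
        generated image being a real facial image.
    generated, standard: equal-length flat lists of pixel values.
    weight: hypothetical balancing factor (an assumption of this sketch).
    """
    if len(generated) != len(standard):
        raise ValueError("images must have the same number of pixels")
    # Probability term: small when the discriminator believes the image is real.
    adversarial = -math.log(prob_real)
    # Similarity term: mean of the pixel-wise squared differences.
    similarity = sum((g - s) ** 2 for g, s in zip(generated, standard)) / len(standard)
    return adversarial + weight * similarity
```

Under these assumptions the loss falls both when the discriminator probability rises and when the generated image moves closer to the standard image, which is the behavior a parameter update of the generator would exploit.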
The combination of Wang and CHEN does not expressly teach wherein the generative model is obtained through: inputting a single facial generative image outputted by an initial generative model into a pre-trained discriminative model to generate a probability of the single facial generative image being a real facial image.
However, Kim teaches wherein the generative model is obtained through: inputting a single facial generative image outputted by an initial generative model into a pre-trained discriminative model to generate a probability of the single facial generative image being a real facial image ([0045] Generator GAB translates input image XA from domain A into XAB in domain B as represented by Equation 5. The generated image XAB is then translated into a domain-A image XABA to match the original input image XA as represented by Equation 6. Various forms of distance functions, such as MSE, cosine distance, and hinge-loss, can be used as the reconstruction loss d as represented by Equation 7. The translated output XAB is then scored by discriminator DB which compares the translated output XAB to a real sample xB in domain B; [0048] this case is where the mapping GAB maps images of cars in two different orientations into the same mode of face images).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of the combination of Wang and CHEN to incorporate the step/system of inputting a translated image output by a generator into a discriminator, which scores the translated image by comparing it with a real image, as taught by Kim.
The suggestion/motivation for doing so would have been to improve training efficiency ([0060] The generators and discriminators are alternately and repeatedly trained, when the set of sample images used in the earlier training stage can be reused in the next training stage, although it is efficient for training to use a new set of sample images). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Wang, CHEN and Kim to obtain the invention as specified in claim 1.
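For illustration only, the data flow described in the cited Kim passage [0045] (translate an image from domain A to domain B, translate it back, measure a reconstruction distance, and have the domain-B discriminator score the translated output) can be sketched as follows. The names `cycle_step` and `mse`, and the use of plain Python callables for the generators and the discriminator, are assumptions of this sketch and are not Kim's implementation.

```python
def mse(a, b):
    """Mean squared error, one of the distance functions the passage lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cycle_step(x_a, g_ab, g_ba, d_b, distance=mse):
    """Run one forward pass of the A -> B -> A cycle.

    g_ab, g_ba: callables standing in for the two generators (hypothetical).
    d_b: callable standing in for the domain-B discriminator (hypothetical).
    Returns the reconstruction loss and the discriminator score.
    """
    x_ab = g_ab(x_a)            # translate the input from domain A into domain B
    x_aba = g_ba(x_ab)          # translate back into domain A
    recon = distance(x_aba, x_a)  # reconstruction loss against the original input
    score = d_b(x_ab)           # discriminator scores the translated output
    return recon, score
```

With toy generators that are exact inverses of each other, the reconstruction loss is zero, so only the discriminator score would drive any update in such a case.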
Regarding claim 2, the combination of Wang with CHEN and Kim teaches all the limitations of claim 1 above. Wang teaches wherein the determining a loss function of the initial generative model comprises: extracting respectively feature information of the single facial generative image and feature information of the standard facial image using a pre-trained recognition model ([0007] A fine-tuning module can be connected between the pre-trained neural network and the multiple face tracking module to adaptively extract discriminative face features; [0019] A facial recognition system and method 100 uses an external face recognition dataset 110 to provide input to a pre-trained neural network 112; [0020] a convolutional neural network (CNN) can be used as the neural network. Advantageously, raw input video can be taken as the input, and detection, tracking, clustering, and feature adaptation in a fully automatic way. In one embodiment, a deep convolutional neural network (CNN) is pre-trained for extracting generic face features on a large-scale external face recognition dataset), and calculating a Euclidean distance between the feature information of the single facial generative image and the feature information of the standard facial image ([0010] A face distinguishing model that determines whether the first and second facial images are the same or different based on the determined metric distance completes the evaluation, which can have a determined metric based on Euclidean distance); and obtaining the loss function of the initial generative model according to the Euclidean distance ([0010] A contrastive loss function can be used to ensure that determined metric distances between faces of different persons are greater than determined metric distances between faces of a same person).
Wang does not expressly teach obtaining the loss function according to the probability.
However, CHEN teaches obtaining the loss function according to the probability ([0037] training the network parameters of the discriminator network and the network parameters of the generator network based on the perceptual loss function; the perceptual loss function is a function of the probability of discriminating the reconstruction face image output by the generator network as a real natural light face image).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Wang to incorporate the step/system of using the probability of discriminating the reconstructed face image output by the generator network as a real face image, as taught by CHEN.
The suggestion/motivation for doing so would have been to improve the accuracy of verification ([0053] the authentication verification can be performed based on the reconstructed face image and the natural light face image collected in real time to improve the accuracy of verification). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Wang and CHEN to obtain the invention as specified in claim 2.
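For illustration only, the Euclidean distance between two feature vectors and a contrastive loss built on that distance, as discussed for claim 2, can be sketched as follows. The cited Wang passage does not give a formula, so the commonly used margin-based form shown here, the `margin` parameter, and both function names are assumptions of this sketch.

```python
import math


def euclidean_distance(f1, f2):
    """Euclidean distance between two equal-length feature vectors."""
    if len(f1) != len(f2):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def contrastive_loss(f1, f2, same_person, margin=1.0):
    """Margin-based contrastive loss (assumed form, not quoted from Wang).

    Pairs of the same person are penalized by their squared distance;
    pairs of different persons are penalized only when closer than `margin`.
    """
    d = euclidean_distance(f1, f2)
    if same_person:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

Minimizing a loss of this form pulls features of the same person together and pushes features of different persons at least `margin` apart, which matches the ordering of distances the cited passage describes.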
Regarding claim 3, the combination of Wang with CHEN and Kim teaches all the limitations of claim 1 above. CHEN teaches wherein the initial generative model is trained and obtained by: using at least two frames of initial training facial sample images extracted from an initial training video as an input, and using a preset initial training facial image as an output by using a machine learning method, the at least two frames of initial training facial sample images and the initial training facial image containing the facial information of the given person ([0034] The training method can be: first, pre-train the generative adversarial network based on the ImageNet database; then retrain the pre-trained generative adversarial network based on the preset witness sample library, until the generative adversarial network that meets the preset conditions is obtained. Wherein, the ImageNet database is currently the largest database for image recognition in the world; the witness sample database includes a plurality of certificate photo samples and natural light face image samples corresponding to each certificate photo sample. Through the training of different databases in two stages, a generative adversarial network that meets the needs of witness verification can be obtained to convert low-resolution ID face images into high-resolution reconstructed face images).
Regarding claim 4, the combination of Wang with CHEN and Kim teaches all the limitations of claim 1 above. Wang teaches wherein the discriminative model is trained and obtained by: using a first sample image as an input and using annotation information of the first sample image as an output by using a machine learning method, the first sample image comprising a positive sample image with annotation information and a negative sample image with annotation information, wherein the negative sample image is an image outputted by the generative model ([0011] Another described embodiment is a facial recognition method that uses online sparse learning. Steps to practice the method include initializing target position and scale; extracting positive and negative samples; [0035] From each of these samples and associated Haar-like features, a high dimensional Haar-like feature vector $\vec{b}_i \in \mathbb{R}^m$ can be extracted, along with a corresponding label $Y_i \in \{-1, 1\}$ (+1 corresponds to a positive sample and −1 corresponds to a negative sample)).
Regarding claim 5, the combination of Wang with CHEN and Kim teaches all the limitations of claim 2 above. Wang teaches wherein the recognition model is trained and obtained by: using a second sample image as an input and feature information of the second sample image as an output by using a machine learning method ([0035] From each of these samples and associated Haar-like features, a high dimensional Haar-like feature vector $\vec{b}_i \in \mathbb{R}^m$ can be extracted, along with a corresponding label $Y_i \in \{-1, 1\}$ (+1 corresponds to a positive sample and −1 corresponds to a negative sample); [0034] High dimensional Haar-like features, denoted as $\vec{B}$, are extractable from these samples to learn the appearance model, where every dimension of the Haar-like feature $b_i \in \vec{B}$ is selected randomly at the first time. Haar-like features can include, but are not limited to, digital images useful in face recognition and having features such as defined adjacent rectangular regions at a specific location in a detection window. Haar-like features have pixel intensities that can be summed in each region and computationally efficient calculation of the difference between these sums can be used to categorize subsections of an image. For example, in most faces the eye region is darker than the cheek region. This allows for use of a common Haar-like feature that is a set of two adjacent rectangles that lie above the eye and the cheek region).
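For illustration only, the two-rectangle Haar-like feature described in the cited Wang passage [0034] (sum the pixel intensities in each of two adjacent rectangles and take the difference) can be sketched as follows. The cited text does not say how the sums are computed; the integral-image (summed-area table) technique used here is a substitute chosen for this sketch, and all function names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def region_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(img, top, left, height, width):
    """Upper-rectangle sum minus the sum of the rectangle directly below it."""
    ii = integral_image(img)
    upper = region_sum(ii, top, left, height, width)
    lower = region_sum(ii, top + height, left, height, width)
    return upper - lower
```

For a patch in which a darker band (for example an eye region) lies above a brighter band (for example a cheek region), the feature value is negative, and its magnitude grows with the intensity contrast between the two bands.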
With respect to claim 7, arguments analogous to those presented for claim 1 are applicable.
With respect to claim 8, arguments analogous to those presented for claim 2 are applicable.
With respect to claim 9, arguments analogous to those presented for claim 3 are applicable.
With respect to claim 10, arguments analogous to those presented for claim 4 are applicable.
With respect to claim 11, arguments analogous to those presented for claim 5 are applicable.
With respect to claim 13, arguments analogous to those presented for claim 1 are applicable.

Claims 6 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (U.S. Publication No. 2018/0114056) (hereafter, "Wang") in view of CHEN et al. (WO 2019015466) (hereafter, "CHEN") and further in view of Kim et al. (U.S. Publication No. 2019/0205334) (hereafter, "Kim") and HONKALA et al. (U.S. Publication No. 2019/0012581) (hereafter, "HONKALA").
Regarding claim 6, the combination of Wang with CHEN and Kim teaches all the limitations of claim 1 above. The combination of Wang with CHEN and Kim does not expressly teach wherein the generative model is a Long-Short Term Memory Model.
However, HONKALA teaches wherein the generative model is a Long-Short Term Memory Model ([0074] In FIG. 5, a pre-trained classification neural network 510 is used for extracting feature maps from multiple layers. This is done separately for real samples 520 and for generated samples from a generative model G 530; [0079] For videos or other temporal data the models C and G may include RNN or LSTM cells, and the distance can be computed by computing the activations of model C over all the time steps; [0050] the most advanced and effective types of RNN are the Long Short-Time Memory (LSTM) and Convolutional Long Short-Time Memory (Conv-LSTM)).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of the combination of Wang with CHEN and Kim to incorporate the step/system of using a Long Short-Time Memory for the generative model, as taught by HONKALA.
The suggestion/motivation for doing so would have been to improve the accuracy of detecting objects in images ([0035] Deep learning techniques allow for recognizing and detecting objects in images or videos with great accuracy, outperforming previous methods). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Wang, CHEN, Kim and HONKALA to obtain the invention as specified in claim 6.
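For illustration only, the way an LSTM cell carries state across the time steps of a video, as mentioned in the cited HONKALA passage [0079], can be sketched with a scalar cell using the standard gate equations. The parameter layout `p`, the function names, and the reduction of each frame to a single number are simplifying assumptions of this sketch and do not reflect HONKALA's models.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, p):
    """One step of a scalar LSTM cell.

    p maps each gate name to a tuple (w_x, w_h, bias); this layout is hypothetical.
    Returns the next hidden state and the next cell state.
    """
    def gate(name, activation):
        w_x, w_h, b = p[name]
        return activation(w_x * x + w_h * h + b)

    i = gate("input", sigmoid)        # how much new information to admit
    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # proposed new cell content
    c_next = f * c + i * g
    h_next = o * math.tanh(c_next)
    return h_next, c_next


def run_over_frames(frame_features, p):
    """Feed one feature value per frame through the cell, in order."""
    h, c = 0.0, 0.0
    for x in frame_features:
        h, c = lstm_step(x, h, c, p)
    return h
```

The final hidden state depends on every frame seen so far, which is the property that makes recurrent cells of this kind a natural fit for inputs made of multiple video frames.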
With respect to claim 12, arguments analogous to those presented for claim 6, are applicable.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL C. CHANG whose telephone number is (571)270-1277. The examiner can normally be reached Monday-Thursday and Alternate Fridays 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan S. Park, can be reached on (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DANIEL C CHANG/Examiner, Art Unit 2669 
/CHAN S PARK/Supervisory Patent Examiner, Art Unit 2669