DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 are pending under this Office action.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 6 and 14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The term “$z_i$” in the cited limitation of “wherein $D_{\theta_D}(y_i)$ indicates whether the discriminator D considers the target datum $z_i$ to be real” may be “$y_i$”.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 7-9, 11, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Fu et al. (US 20190295302 A1) in view of “Dual Encoder-Decoder based Generative Adversarial Networks for Disentangled Facial Representation Learning” (by Cong Hu, Zhen-Hua Feng, Xiao-Jun Wu, and Josef Kittler, Digital Object Identifier 10.1109/ACCESS.2017.DOI, IEEE TRANSACTIONS and JOURNALS, Volume 4, 2016, hereinafter referred to as Hu).
Regarding claim 1, Fu teaches a method of training a generator G of a Generative Adversarial Network (GAN) (See Fu: Fig. 4, and [0049], “FIG. 3 illustrates a process of training an image generator, i.e., an image generator system, comprising a generator 301, discriminator 302, and segmentor 303 according to an embodiment. It is noted that in FIG. 3, components, such as the generator 301, discriminator 302, and segmentor 303, are depicted multiple times and this is done to simplify the diagram”), comprising:
generating, by the generator G and in response to receiving a first input Z, a first output G(Z), wherein the first output G(Z) being a first ambient space representation of the first input Z (See Fu: Fig. 4, and [0050], “The generator 301 is configured to receive three inputs, an input image (source image) 304, a target segmentation 305, and a vector of target attributes 306. A goal of the training process is to configure the generator 301 to translate the input image 304 into a generated image (fake image) 307, which complies with the target segmentation 305 and attribute labels 306”); 
generating, by an encoder E of the GAN and in response to receiving the first output G(Z) as input, a second output E(G(Z)), wherein the second output E(G(Z)) being a first latent Space representation of the first output G(Z) (See Fu: Figs. 5-12, and [0095], “A deep encoder-decoder architecture was employed for both G and D with several residual blocks to increase the depth of the network while avoiding gradient vanishing. For the discriminator network, state-of-the-art loss function and training procedures were adopted from improved WGAN with gradient penalty [Gulrajani et al., “Improved training of wasserstein gans,” arXiv preprint arXiv:1704.00028, 2017], to stabilize the training process. In bottleneck layers, k=6 residual blocks were implemented for the generator G and k=4 residual blocks for the segmentor S. Three Adam optimizers were employed with beta1 of 0.5 and beta2 of 0.999 to optimize the networks. The learning rates were set to be 0.0001 for both G and D and 0.0002 for S”); 
generating, by the generator G and in response to receiving the second output E(G(Z)) as input, a third output G(E(G(Z))), wherein the third output G(E(G(Z))) being a second ambient space representation of the second output E(G(Z)) (See Fu: Fig. 4, and [0053], “The third path of generator training is a reconstruction loss path which takes the generated image 307 as an input to the generator 301, as well as two other inputs, a source segmentation 315 (which may be a ground-truth landmark based segmentation) and a source attributes label 316. This path is expected to reconstruct an image 317 from the generated fake image 307 that should match the input source image 304. The reconstructed image 317 is then compared with the input source image 304 to compute a reconstruction loss 318 which is provided to the generator optimizer 310”); 
generating, by the encoder E and in response to receiving the third output G(E(G(Z))) as input, a fourth output E(G(E(G(Z)))), wherein the fourth output E(G(E(G(Z)))) being a second latent space representation of the third output G(E(G(Z))) (See Fu: Fig. 4, and [0082], “The image generation network, SGGAN, according to an embodiment, generates two types of images. The first image is the fake image generated by the generator G, e.g., the image 307, generated from the real image, the target segmentation, and target attributes denoted G(x, s′, c′). The second image generated by the generator G is the reconstructed image, e.g., the image 317, generated from the fake image, source segmentation, and source attributes denoted by the label G(G(x, s′, c′), s, c). An embodiment, adopts adversarial loss to the former path and thus, forms a generative adversarial network with the discriminator D. The later path reconstructs the input image in the source domain using the fake image, which can be trained with supervision using the input image that additional adversarial loss is unnecessary”); 
training the encoder E to minimize a difference between the second output E(G(Z)) and the fourth output E(G(E(G(Z)))) (See Fu: Fig. 4, and [0054], “The fake adversarial loss term 313, the fake segmentation loss 309, the fake classification loss 314, and the reconstruction loss 318 are used by the optimizer 310 to optimize the generator 301. In an embodiment, the optimizer 310 sums up the loses 313, 309, 314, and 318 with weights, i.e., weights the losses differently, to determine a generator loss, which is used by the optimizer 310 to do back-propagation and update the parameters in a neural network implementing the generator 301. According, to an embodiment, losses are summed as shown in the equation below”. Note that the generator G has the generator G, Encoder E, and decoder); and 
using the second output E(G(Z)) and fourth output E(G(E(G(Z)))) to constrain a training of the generator G (See Fu: Fig. 4, and [0054], “The fake adversarial loss term 313, the fake segmentation loss 309, the fake classification loss 314, and the reconstruction loss 318 are used by the optimizer 310 to optimize the generator 301. In an embodiment, the optimizer 310 sums up the loses 313, 309, 314, and 318 with weights, i.e., weights the losses differently, to determine a generator loss, which is used by the optimizer 310 to do back-propagation and update the parameters in a neural network implementing the generator 301. According, to an embodiment, losses are summed as shown in the equation below”).
However, Fu fails to explicitly disclose training the encoder E to minimize a difference.
Hu, however, teaches training the encoder E to minimize a difference (See Hu: Figs. 1A-D, and Section III. The Proposed Approach, Page 4, “In contrast to DR-GAN, we add a decoder to the discriminator, which is optimized for pixel-wise loss defined in terms of the Wasserstein distance, to balance the generator and discriminator. We also code the pose using a continuous variate instead of the discrete variate commonly specified by a one-hot vector. As a result, the task of pose disentanglement in the discriminator can be formulated as one of pose regression instead of classification, which further benefits the learning process”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fu to include training the encoder E to minimize a difference, as taught by Hu, in order to explicitly disentangle face imaging factors to obtain an interpretable face representation for PIFR and face synthesis across poses (See Hu: Fig. 1D, and Section B. Dual Encoder-Decoder Based GAN, Page 4, “Our DED-GAN explicitly disentangles face imaging factors to obtain an interpretable face representation for PIFR and face synthesis across poses. The backbone of DED-GAN consists of an encoder-decoder based generator and encoder-decoder based discriminator, as depicted in Fig. 1d. It learns the representation of a face by using the generator, where the encoded output of the generator is the identity-preserving representation”). Fu teaches a method and system that may train the image generator with neural networks, an encoder, and a decoder, while Hu teaches a system and method that may disentangle facial representation learning with a dual encoder-decoder based GAN and training of the generator, encoder, and discriminator. Therefore, it would have been obvious to one of ordinary skill in the art to modify Fu in view of Hu to perform dual encoder-decoder GAN techniques with training of the generator, encoder, decoder, and discriminator. The motivation to modify Fu in view of Hu is “Use of known technique to improve similar devices (methods, or products) in the same way”.
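For reference, the chain of outputs recited in claim 1 can be traced with a minimal sketch. It assumes toy linear maps in place of the generator G and encoder E, and the mean squared difference as the measure of "difference"; neither the claims nor the cited references specify these choices. All names (`generator`, `encoder`, `latent_cycle`, `W_G`, `W_E`) and all numeric values are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def generator(z: Vector, w_g: Matrix) -> Vector:
    """Toy linear stand-in for G: latent space -> ambient space (assumption)."""
    return matvec(w_g, z)


def encoder(x: Vector, w_e: Matrix) -> Vector:
    """Toy linear stand-in for E: ambient space -> latent space (assumption)."""
    return matvec(w_e, x)


def latent_cycle(z: Vector, w_g: Matrix, w_e: Matrix) -> Tuple[Vector, Vector, Vector, Vector]:
    """Return the four outputs recited in claim 1, in order."""
    first = generator(z, w_g)        # G(Z)
    second = encoder(first, w_e)     # E(G(Z))
    third = generator(second, w_g)   # G(E(G(Z)))
    fourth = encoder(third, w_e)     # E(G(E(G(Z))))
    return first, second, third, fourth


def mean_squared_difference(a: Vector, b: Vector) -> float:
    """One possible 'difference' between two latent vectors (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical weights: 2-dimensional latent space, 3-dimensional ambient space.
W_G: Matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_E: Matrix = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]
Z: Vector = [2.0, -4.0]

first, second, third, fourth = latent_cycle(Z, W_G, W_E)
latent_loss = mean_squared_difference(second, fourth)
print(latent_loss)
```

In this sketch the quantity `latent_loss` is what training the encoder E would drive toward zero; it vanishes when the encoder returns the same latent vector on both passes.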
Regarding claim 3, Fu and Hu teach all the features with respect to claim 1 as outlined above. Further, Fu teaches the method of claim 1, wherein using the second output E(G(Z)) and fourth output E(G(E(G(Z)))) to constrain the training of the generator G comprises: using the second output E(G(Z)) and fourth output E(G(E(G(Z)))) in a loss function that is used to update weights of the generator G (See Fu: Fig. 3, and [0082], “The image generation network, SGGAN, according to an embodiment, generates two types of images. The first image is the fake image generated by the generator G, e.g., the image 307, generated from the real image, the target segmentation, and target attributes denoted G(x, s′, c′). The second image generated by the generator G is the reconstructed image, e.g., the image 317, generated from the fake image, source segmentation, and source attributes denoted by the label G(G(x, s′, c′), s, c). An embodiment, adopts adversarial loss to the former path and thus, forms a generative adversarial network with the discriminator D. The later path reconstructs the input image in the source domain using the fake image, which can be trained with supervision using the input image that additional adversarial loss is unnecessary. The adversarial loss is defined as: Equation (4)”).
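The weighted summation of loss terms that Fu describes at [0054], as quoted above with respect to claim 1, can be illustrated with a minimal sketch. The loss names mirror the four terms Fu lists (313, 309, 314, and 318); the numeric loss values, the weights, and the function name `weighted_generator_loss` are hypothetical, since the quoted passage does not give them.

```python
from typing import Dict


def weighted_generator_loss(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum the individual loss terms, each scaled by its own weight."""
    missing = set(losses) - set(weights)
    if missing:
        raise KeyError(f"no weight given for loss terms: {sorted(missing)}")
    return sum(weights[name] * value for name, value in losses.items())


# Hypothetical values for the four terms named in Fu [0054].
losses = {
    "fake_adversarial": 0.5,       # term 313
    "fake_segmentation": 0.25,     # term 309
    "fake_classification": 0.125,  # term 314
    "reconstruction": 1.0,         # term 318
}
# Hypothetical weights; Fu states only that the terms are weighted differently.
weights = {
    "fake_adversarial": 1.0,
    "fake_segmentation": 2.0,
    "fake_classification": 4.0,
    "reconstruction": 8.0,
}

generator_loss = weighted_generator_loss(losses, weights)
print(generator_loss)
```

In an actual training loop the resulting scalar would be back-propagated to update the generator's parameters; that step is omitted here because it depends on the network implementation.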
Regarding claim 7, Fu and Hu teach all the features with respect to claim 1 as outlined above. Further, Fu teaches that the method of claim 1, wherein the encoder E comprising a neural network (See Fu: Fig. 15, and [0154], “To implement embodiments used the obtain the results, residual up-sampling blocks were leveraged instead of transposed convolution layers for up-sampling operation. An encoder-decoder structure with several residual blocks [He et al., “Deep residual learning for image recognition” In CVPR (2016)] as a bottleneck was used in the segmentor network. Batch normalization [Ioffe et al., “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” In International Conference on Machine Learning, 448-456 (2015)] in both the generator and the segmentor was replaced with instance normalization [Ulyanov et al., “Instance Normalization: The Missing Ingredient for Fast Stylization,” arXiv preprint arXiv:1607.08022 (2016)]. The PatchGAN structure [Isola et al., “Image-to-image translation with conditional adversarial networks,” arXiv preprint arXiv:1611.07004 (2016)] was followed with a no normalization method in constructing the discriminator network. Three Adam optimizers [Kingma et al., “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014)] with β1 of 0.5 and β2 of 0.999 were used to optimize the networks. The learning rates were set to be 0.0001. The proposed SCGAN was implemented in Pytorch [Paszke et al., “Automatic differentiation in PyTorch,” (2017)]”).
Regarding claim 8, Fu and Hu teach all the features with respect to claim 7 as outlined above. Further, Fu teaches that the method of claim 7, wherein the neural network comprising an upscaling layer, and wherein each of the second output E(G(Z)) and the fourth output E(G(E(G(Z)))) of the encoder E having a first size equal to a second size of first input Z (See Fu: Figs. 2A-C, and [0041], “In detail, an embodiment of the proposed SGGAN framework comprises three networks, depicted in FIGS. 2A, 2B, and 2C, respectively, a generator network 220, a discriminator network 240, and a segmentor network 260. In an embodiment, the generator network 220 includes a convolutional block 221, a down-sampling convolutional block 222, a residual block 223, up-sampling convolutional block 224, and convolutional block 225. In an example embodiment, the residual block 223 is employed to provide bottleneck layers. Moreover, according to an embodiment, the residual block 223 is implemented as described in [Kaiming, et al., “Deep residual learning for image recognition,” Proceedings of the IEEE conference on computer vision and pattern recognition, 2016]. The generator 240 takes as inputs, a target segmentation 227, a given image 226, and a vector 228 indicating desired attributes of the image to be generated. The generator 220 implemented with the blocks 221-225 is configured to receive the inputs 226, 227, 228 and generate a target image 229 that is based on, i.e., a translated version of, the input image 226 and consistent with the input segmentation 227 and attributes 228”).
Regarding claim 9, Fu and Hu teach all the features with respect to claim 1 as outlined above. Further, Fu and Hu teach a system for training a generator G of a Generative Adversarial Network (GAN) (See Fu: Fig. 4, and [0049], “FIG. 3 illustrates a process of training an image generator, i.e., an image generator system, comprising a generator 301, discriminator 302, and segmentor 303 according to an embodiment. It is noted that in FIG. 3, components, such as the generator 301, discriminator 302, and segmentor 303, are depicted multiple times and this is done to simplify the diagram”), comprising:
a processor (See Fu: Fig. 3, and [0065], “An example embodiment is directed to a system for training an image generator. In an embodiment, the system comprises a processor”); and 
a memory, wherein the processor is configured to execute instructions stored in the memory (See Fu: Fig. 3, and [0065], “and the memory, with the computer code instructions, are configured to cause the system to provide a generator, discriminator, and segmentor”) to:
generate, by the generator G and in response to receiving a first input Z, a first output G(Z), wherein the first output G(Z) being a first ambient space representation of the first input Z (See Fu: Fig. 4, and [0050], “The generator 301 is configured to receive three inputs, an input image (source image) 304, a target segmentation 305, and a vector of target attributes 306. A goal of the training process is to configure the generator 301 to translate the input image 304 into a generated image (fake image) 307, which complies with the target segmentation 305 and attribute labels 306”); 
generate, by an encoder E of the GAN and in response to receiving the first output G(Z) as input, a second output E(G(Z)), wherein the second output E(G(Z)) being a first latent space representation of the first output G(Z) (See Fu: Figs. 5-12, and [0095], “A deep encoder-decoder architecture was employed for both G and D with several residual blocks to increase the depth of the network while avoiding gradient vanishing. For the discriminator network, state-of-the-art loss function and training procedures were adopted from improved WGAN with gradient penalty [Gulrajani et al., “Improved training of wasserstein gans,” arXiv preprint arXiv:1704.00028, 2017], to stabilize the training process. In bottleneck layers, k=6 residual blocks were implemented for the generator G and k=4 residual blocks for the segmentor S. Three Adam optimizers were employed with beta1 of 0.5 and beta2 of 0.999 to optimize the networks. The learning rates were set to be 0.0001 for both G and D and 0.0002 for S”); 

generate, by the generator G and in response to receiving the second output E(G(Z)) as input, a third output G(E(G(Z))),  wherein the third output G(E(G(Z))) being a second ambient space representation of the second output E(G(Z)) (See Fu: Fig. 4, and [0053], “The third path of generator training is a reconstruction loss path which takes the generated image 307 as an input to the generator 301, as well as two other inputs, a source segmentation 315 (which may be a ground-truth landmark based segmentation) and a source attributes label 316. This path is expected to reconstruct an image 317 from the generated fake image 307 that should match the input source image 304. The reconstructed image 317 is then compared with the input source image 304 to compute a reconstruction loss 318 which is provided to the generator optimizer 310”); 
generate, by the encoder E and in response to receiving the third output G(E(G(Z))) as input, a fourth output E(G(E(G(Z)))), wherein the fourth output E(G(E(G(Z)))) being a second latent space representation of the third output G(E(G(Z))) (See Fu: Fig. 4, and [0082], “The image generation network, SGGAN, according to an embodiment, generates two types of images. The first image is the fake image generated by the generator G, e.g., the image 307, generated from the real image, the target segmentation, and target attributes denoted G(x, s′, c′). The second image generated by the generator G is the reconstructed image, e.g., the image 317, generated from the fake image, source segmentation, and source attributes denoted by the label G(G(x, s′, c′), s, c). An embodiment, adopts adversarial loss to the former path and thus, forms a generative adversarial network with the discriminator D. The later path reconstructs the input image in the source domain using the fake image, which can be trained with supervision using the input image that additional adversarial loss is unnecessary”); 
train the encoder E to minimize a difference (See Hu: Figs. 1A-D, and Section III. The Proposed Approach, Page 4, “In contrast to DR-GAN, we add a decoder to the discriminator, which is optimized for pixel-wise loss defined in terms of the Wasserstein distance, to balance the generator and discriminator. We also code the pose using a continuous variate instead of the discrete variate commonly specified by a one-hot vector. As a result, the task of pose disentanglement in the discriminator can be formulated as one of pose regression instead of classification, which further benefits the learning process”) between the second output E(G(Z)) and the fourth output E(G(E(G(Z)))) (See Fu: Fig. 4, and [0054], “The fake adversarial loss term 313, the fake segmentation loss 309, the fake classification loss 314, and the reconstruction loss 318 are used by the optimizer 310 to optimize the generator 301. In an embodiment, the optimizer 310 sums up the loses 313, 309, 314, and 318 with weights, i.e., weights the losses differently, to determine a generator loss, which is used by the optimizer 310 to do back-propagation and update the parameters in a neural network implementing the generator 301. According, to an embodiment, losses are summed as shown in the equation below”. Note that the generator G has the generator G, Encoder E, and decoder); and 
use the second output E(G(Z)) and fourth output E(G(E(G(Z)))) to constrain a training of the generator G (See Fu: Fig. 4, and [0054], “The fake adversarial loss term 313, the fake segmentation loss 309, the fake classification loss 314, and the reconstruction loss 318 are used by the optimizer 310 to optimize the generator 301. In an embodiment, the optimizer 310 sums up the loses 313, 309, 314, and 318 with weights, i.e., weights the losses differently, to determine a generator loss, which is used by the optimizer 310 to do back-propagation and update the parameters in a neural network implementing the generator 301. According, to an embodiment, losses are summed as shown in the equation below”).
Regarding claim 11, Fu and Hu teach all the features with respect to claim 9 as outlined above. Further, Fu teaches the system of claim 9, wherein to use the second output E(G(Z)) and fourth output E(G(E(G(Z)))) to constrain the training of the generator G comprises to: use the second output E(G(Z)) and fourth output E(G(E(G(Z)))) in a loss function that is used to update weights of the generator G (See Fu: Fig. 3, and [0082], “The image generation network, SGGAN, according to an embodiment, generates two types of images. The first image is the fake image generated by the generator G, e.g., the image 307, generated from the real image, the target segmentation, and target attributes denoted G(x, s′, c′). The second image generated by the generator G is the reconstructed image, e.g., the image 317, generated from the fake image, source segmentation, and source attributes denoted by the label G(G(x, s′, c′), s, c). An embodiment, adopts adversarial loss to the former path and thus, forms a generative adversarial network with the discriminator D. The later path reconstructs the input image in the source domain using the fake image, which can be trained with supervision using the input image that additional adversarial loss is unnecessary. The adversarial loss is defined as: Equation (4)”).
Regarding claim 15, Fu and Hu teach all the features with respect to claim 9 as outlined above. Further, Fu teaches that the system of claim 9, wherein the encoder E comprising a neural network (See Fu: Fig. 15, and [0154], “To implement embodiments used the obtain the results, residual up-sampling blocks were leveraged instead of transposed convolution layers for up-sampling operation. An encoder-decoder structure with several residual blocks [He et al., “Deep residual learning for image recognition” In CVPR (2016)] as a bottleneck was used in the segmentor network. Batch normalization [Ioffe et al., “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” In International Conference on Machine Learning, 448-456 (2015)] in both the generator and the segmentor was replaced with instance normalization [Ulyanov et al., “Instance Normalization: The Missing Ingredient for Fast Stylization,” arXiv preprint arXiv:1607.08022 (2016)]. The PatchGAN structure [Isola et al., “Image-to-image translation with conditional adversarial networks,” arXiv preprint arXiv:1611.07004 (2016)] was followed with a no normalization method in constructing the discriminator network. Three Adam optimizers [Kingma et al., “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014)] with β1 of 0.5 and β2 of 0.999 were used to optimize the networks. The learning rates were set to be 0.0001. The proposed SCGAN was implemented in Pytorch [Paszke et al., “Automatic differentiation in PyTorch,” (2017)]”).
Regarding claim 16, Fu and Hu teach all the features with respect to claim 15 as outlined above. Further, Fu teaches that the system of claim 15, wherein the neural network comprising an upscaling layer, and wherein each of the second output E(G(Z)) and the fourth output E(G(E(G(Z)))) of the encoder E having a first size equal to a second size of first input Z (See Fu: Figs. 2A-C, and [0041], “In detail, an embodiment of the proposed SGGAN framework comprises three networks, depicted in FIGS. 2A, 2B, and 2C, respectively, a generator network 220, a discriminator network 240, and a segmentor network 260. In an embodiment, the generator network 220 includes a convolutional block 221, a down-sampling convolutional block 222, a residual block 223, up-sampling convolutional block 224, and convolutional block 225. In an example embodiment, the residual block 223 is employed to provide bottleneck layers. Moreover, according to an embodiment, the residual block 223 is implemented as described in [Kaiming, et al., “Deep residual learning for image recognition,” Proceedings of the IEEE conference on computer vision and pattern recognition, 2016]. The generator 240 takes as inputs, a target segmentation 227, a given image 226, and a vector 228 indicating desired attributes of the image to be generated. The generator 220 implemented with the blocks 221-225 is configured to receive the inputs 226, 227, 228 and generate a target image 229 that is based on, i.e., a translated version of, the input image 226 and consistent with the input segmentation 227 and attributes 228”).
Regarding claim 17, Fu and Hu teach all the features with respect to claim 1 as outlined above. Further, Fu and Hu teach a method for style transfer (See Fu: Fig. 4, and [0049], “FIG. 3 illustrates a process of training an image generator, i.e., an image generator system, comprising a generator 301, discriminator 302, and segmentor 303 according to an embodiment. It is noted that in FIG. 3, components, such as the generator 301, discriminator 302, and segmentor 303, are depicted multiple times and this is done to simplify the diagram”), comprising:
receiving, by a generator G, an input corresponding to a source image including a first participant (See Fu: Fig. 4, and [0050], “The generator 301 is configured to receive three inputs, an input image (source image) 304, a target segmentation 305, and a vector of target attributes 306. A goal of the training process is to configure the generator 301 to translate the input image 304 into a generated image (fake image) 307, which complies with the target segmentation 305 and attribute labels 306”); and 
outputting, from the generator G, an output image, wherein the first participant is modified in the output image, wherein the generator G is trained using a Generative Adversarial Network (GAN) by steps (See Fu: Figs. 2A-D, and [0046], “In the training procedure 270, the generator 220 is configured to receive the target segmentation 271, desired attributes vector 272, and real image 273 and from the inputs 271-273, generate the image 274. Further, the generator 220 (which is depicted twice in FIG. 2D to show additional processing) is configured to perform a reconstruction process that attempts to reconstruct the input image 273 using a segmentation 275 that is based on the real image 273, attributes 276 of the real image 273, and the generated image 274. To train the generator 220, the generated image 274 is provided to the discriminator 240. The discriminator 240 makes a determination if the image 274 is real or fake and also determines attributes of the image 274. Then, based on these determinations, weights of the neural network implementing the generator 220 are adjusted so as to improve the generator's 240 ability to generate images that are in accordance with the desired attributes 272 and target segmentation 271 while also being indistinguishable from real images. Similarly, the generator 220 is also adjusted, i.e., weights of the neural network implementing the generator 220 are adjusted based on the reconstruction loss. In an embodiment, the reconstruction loss is the difference between the reconstructed image 277 and the real image 273”) comprising: 
generating, by the generator G and in response to receiving a first input Z, a first output G(Z), wherein the first output G(Z) being a first ambient space representation of the first input Z (See Fu: Fig. 4, and [0050], “The generator 301 is configured to receive three inputs, an input image (source image) 304, a target segmentation 305, and a vector of target attributes 306. A goal of the training process is to configure the generator 301 to translate the input image 304 into a generated image (fake image) 307, which complies with the target segmentation 305 and attribute labels 306”); 
generating, by an encoder E of the GAN and in response to receiving the first output G(Z) as input, a second output E(G(Z)), wherein the second output E(G(Z)) being a first latent space representation of the first output G(Z) (See Fu: Figs. 5-12, and [0095], “A deep encoder-decoder architecture was employed for both G and D with several residual blocks to increase the depth of the network while avoiding gradient vanishing. For the discriminator network, state-of-the-art loss function and training procedures were adopted from improved WGAN with gradient penalty [Gulrajani et al., “Improved training of wasserstein gans,” arXiv preprint arXiv:1704.00028, 2017], to stabilize the training process. In bottleneck layers, k=6 residual blocks were implemented for the generator G and k=4 residual blocks for the segmentor S. Three Adam optimizers were employed with beta1 of 0.5 and beta2 of 0.999 to optimize the networks. The learning rates were set to be 0.0001 for both G and D and 0.0002 for S”); 
generating, by the generator G and in response to receiving the second output E(G(Z)) as input, a third output G(E(G(Z))), wherein the third output G(E(G(Z))) being a second ambient space representation of the second output E(G(Z)) (See Fu: Fig. 4, and [0053], “The third path of generator training is a reconstruction loss path which takes the generated image 307 as an input to the generator 301, as well as two other inputs, a source segmentation 315 (which may be a ground-truth landmark based segmentation) and a source attributes label 316. This path is expected to reconstruct an image 317 from the generated fake image 307 that should match the input source image 304. The reconstructed image 317 is then compared with the input source image 304 to compute a reconstruction loss 318 which is provided to the generator optimizer 310”); 
generating, by the encoder E and in response to receiving the third output G(E(G(Z))) as input, a fourth output E(G(E(G(Z)))), wherein the fourth output E(G(E(G(Z)))) being a second latent space representation of the third output G(E(G(Z))) (See Fu: Fig. 4, and [0082], “The image generation network, SGGAN, according to an embodiment, generates two types of images. The first image is the fake image generated by the generator G, e.g., the image 307, generated from the real image, the target segmentation, and target attributes denoted G(x, s′, c′). The second image generated by the generator G is the reconstructed image, e.g., the image 317, generated from the fake image, source segmentation, and source attributes denoted by the label G(G(x, s′, c′), s, c). An embodiment, adopts adversarial loss to the former path and thus, forms a generative adversarial network with the discriminator D. The later path reconstructs the input image in the source domain using the fake image, which can be trained with supervision using the input image that additional adversarial loss is unnecessary”); 
training the encoder E to minimize a difference (See Hu: Figs. 1A-D, and Section III. The Proposed Approach, Page 4, “In contrast to DR-GAN, we add a decoder to the discriminator, which is optimized for pixel-wise loss defined in terms of the Wasserstein distance, to balance the generator and discriminator. We also code the pose using a continuous variate instead of the discrete variate commonly specified by a one-hot vector. As a result, the task of pose disentanglement in the discriminator can be formulated as one of pose regression instead of classification, which further benefits the learning process”) between the second output E(G(Z)) and the fourth output E(G(E(G(Z)))) (See Fu: Fig. 4, and [0054], “The fake adversarial loss term 313, the fake segmentation loss 309, the fake classification loss 314, and the reconstruction loss 318 are used by the optimizer 310 to optimize the generator 301. In an embodiment, the optimizer 310 sums up the loses 313, 309, 314, and 318 with weights, i.e., weights the losses differently, to determine a generator loss, which is used by the optimizer 310 to do back-propagation and update the parameters in a neural network implementing the generator 301. According, to an embodiment, losses are summed as shown in the equation below”. Note that the generator G comprises an encoder E and a decoder); and 
using the second output E(G(Z)) and fourth output E(G(E(G(Z)))) to constrain a training of the generator G (See Fu: Fig. 4, and [0054], “The fake adversarial loss term 313, the fake segmentation loss 309, the fake classification loss 314, and the reconstruction loss 318 are used by the optimizer 310 to optimize the generator 301. In an embodiment, the optimizer 310 sums up the loses 313, 309, 314, and 318 with weights, i.e., weights the losses differently, to determine a generator loss, which is used by the optimizer 310 to do back-propagation and update the parameters in a neural network implementing the generator 301. According, to an embodiment, losses are summed as shown in the equation below”).
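For orientation, the chain of four outputs recited in the limitations above can be sketched as follows. This is a toy illustration only: the linear maps `G_W` and `E_W`, the dimensions, and the squared Euclidean distance are all assumptions chosen for readability, and none of them is taken from Fu, Hu, or the application.

```python
# Toy sketch only: G and E are hypothetical linear maps, not the networks of Fu or Hu.

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Hypothetical generator weights: latent space (2-D) -> ambient space (3-D).
G_W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# Hypothetical encoder weights: ambient space (3-D) -> latent space (2-D).
E_W = [[0.9, 0.0, 0.1], [0.0, 0.9, 0.1]]

def G(z):
    return matvec(G_W, z)

def E(x):
    return matvec(E_W, x)

def cycle_outputs(z):
    """Return the four outputs in the order the claim recites them."""
    first = G(z)        # G(Z): first ambient space representation
    second = E(first)   # E(G(Z)): first latent space representation
    third = G(second)   # G(E(G(Z))): second ambient space representation
    fourth = E(third)   # E(G(E(G(Z)))): second latent space representation
    return first, second, third, fourth

def latent_difference(z):
    """Squared Euclidean distance between the second and fourth outputs."""
    _, second, _, fourth = cycle_outputs(z)
    return sum((a - b) ** 2 for a, b in zip(second, fourth))

z = [1.0, -2.0]
first, second, third, fourth = cycle_outputs(z)
print(len(first), len(second), len(third), len(fourth))
print(latent_difference(z))
```

In a training loop, `latent_difference` would play the role of the quantity the encoder E is trained to minimize; how that term is weighted against other losses is not specified here.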


Claims 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Fu et al. (US 20190295302 A1) in view of “Dual Encoder-Decoder based Generative Adversarial Networks for Disentangled Facial Representation Learning” (by CONG HU, ZHEN-HUA FENG, XIAO-JUN WU, AND JOSEF KITTLER, Digital Object Identifier 10.1109/ACCESS.2017.DOI, IEEE TRANSACTIONS and JOURNALS, Volume 4, 2016, hereinafter referred to as Hu), further in view of Brubaker et al. (US 20170103161 A1).
Regarding claim 5, Fu and Hu teach all the features with respect to claim 1 as outlined above. However, Fu fails to explicitly disclose the method of claim 1, wherein the generator G is trained by applying a Lipschitz condition so as to upper bound a first difference between the output G(Z) and the third output G(E(G(Z))) to a second difference between the second output E(G(Z)) and fourth output E(G(E(G(Z)))). 
However, Brubaker teaches the method of claim 1, wherein the generator G is trained by applying a Lipschitz condition so as to upper bound a first difference between the output G(Z) and the third output G(E(G(Z))) to a second difference between the second output E(G(Z)) and fourth output E(G(E(G(Z)))) (See Brubaker: Fig. 3, and [0083], “The SAGD technique requires a Lipschitz constant L which is not generally known. Instead it is estimated using a line search algorithm where an initial value of L is increased until the instantiated Lipschitz condition is met. The line search for the Lipschitz constant L is only performed once per predetermined number of iterations, e.g. 20 iterations. More sophisticated line search could be performed if desired. A good initial value of L may be found using a bisection search where the upper bound is the smallest L found so far to satisfy the condition and the lower bound is the largest L found so far which fails the condition. In between line searches, L is gradually decreased to try to take larger steps”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fu to have the method of claim 1, wherein the generator G is trained by applying a Lipschitz condition so as to upper bound a first difference between the output G(Z) and the third output G(E(G(Z))) to a second difference between the second output E(G(Z)) and fourth output E(G(E(G(Z)))) as taught by Brubaker in order to efficiently marginalize over unknown pose and position for each particle image, minimizing computational burden (See Brubaker: Fig. 3, and [0042], “The system 20 executes a method 200 for 3D molecular structure estimation as illustrated in FIG. 2. The method 200 comprises, at block 202 receiving a set of 2D images of a target specimen from an electron microscope or other imaging system 74, at block 204 carrying out a reconstruction technique to determine a likely molecular structure, and at block 206 outputting the estimated 3D structure of the specimen. The reconstruction technique of block 204 comprises, as illustrated more particularly in FIG. 3: at block 220, establishing a probabilistic generative model of target's density; at block 222, establishing a marginalized likelihood function for the generative model, upon which to perform a MAP estimation of structure; at block 224, using stochastic optimization—such as Stochastic Gradient Descent (SGD), or more specifically Stochastic Average Gradient Descent (SAGD)—to determine which structure is most likely; and, regarding block 226, optionally utilizing importance sampling to efficiently marginalize over unknown pose and position for each particle image, minimizing computational burden”). 
Fu teaches a method and system that may train the image generator with neural networks, an encoder, and a decoder, while Brubaker teaches a system and method that may reconstruct a 3D structure from 2D images by minimizing the cost function under the Lipschitz condition in order to have a continuous solution to the optimization problem. Therefore, it would have been obvious to one of ordinary skill in the art to modify Fu by Brubaker to minimize the cost function under the Lipschitz condition. The motivation to modify Fu by Brubaker is “Use of known technique to improve similar devices (methods, or products) in the same way”.
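For reference, one possible symbolic reading of the Lipschitz limitation recited in claim 5 is given below. The constant k and the choice of norm are assumptions introduced here for illustration; neither is stated in the claim language quoted above or in the cited passages.

```latex
\[
  \lVert G(Z) - G(E(G(Z))) \rVert
  \;\le\;
  k \,\lVert E(G(Z)) - E(G(E(G(Z)))) \rVert ,
  \qquad k > 0 .
\]
```

Under this reading, the first difference (between ambient space outputs) is bounded above by a multiple of the second difference (between latent space outputs).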
Regarding claim 13, Fu and Hu teach all the features with respect to claim 9 as outlined above. Further, Brubaker teaches the system of claim 9, wherein the generator G is trained by applying a Lipschitz condition so as to upper bound a first difference between the output G(Z) and the third output G(E(G(Z))) to a second difference between the second output E(G(Z)) and fourth output E(G(E(G(Z)))) (See Brubaker: Fig. 3, and [0083], “The SAGD technique requires a Lipschitz constant L which is not generally known. Instead it is estimated using a line search algorithm where an initial value of L is increased until the instantiated Lipschitz condition is met. The line search for the Lipschitz constant L is only performed once per predetermined number of iterations, e.g. 20 iterations. More sophisticated line search could be performed if desired. A good initial value of L may be found using a bisection search where the upper bound is the smallest L found so far to satisfy the condition and the lower bound is the largest L found so far which fails the condition. In between line searches, L is gradually decreased to try to take larger steps”).
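The line search described in the quoted Brubaker passage, in which an initial value of L is increased until a Lipschitz condition is met, can be sketched roughly as follows. The quoted passage does not spell out the condition itself, so the inequality used here, f(x - g/L) <= f(x) - |g|^2 / (2L), is an assumption borrowed from common stochastic average gradient practice. The function name `estimate_lipschitz`, the doubling factor, and the quadratic test objective are likewise hypothetical.

```python
def estimate_lipschitz(f, grad, x, L0=1.0, factor=2.0, max_steps=60):
    """Increase L from L0 until f(x - g/L) <= f(x) - |g|^2 / (2L) holds.

    The inequality is an assumed form of the Lipschitz condition; the
    cited passage does not state which condition is instantiated.
    """
    g = grad(x)
    g_sq = sum(gi * gi for gi in g)
    fx = f(x)
    L = L0
    for _ in range(max_steps):
        # Take a gradient step of size 1/L and test the condition.
        trial = [xi - gi / L for xi, gi in zip(x, g)]
        if f(trial) <= fx - g_sq / (2.0 * L):
            return L
        L *= factor
    raise RuntimeError("no L satisfied the condition within max_steps")

# Hypothetical objective f(x) = 4 * |x|^2, whose gradient 8x is 8-Lipschitz.
def f(x):
    return 4.0 * sum(xi * xi for xi in x)

def grad(x):
    return [8.0 * xi for xi in x]

L = estimate_lipschitz(f, grad, [1.0, -1.0])
print(L)
```

For this quadratic the search stops at the first trial value of L that is at least the true gradient Lipschitz constant. The bisection refinement and the gradual decrease of L between searches mentioned in the passage are not modeled.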




Allowable Subject Matter
Claims 2, 4, 10, 12, and 18-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GORDON G LIU whose telephone number is (571)270-0382. The examiner can normally be reached Monday - Friday 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached at 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GORDON G LIU/
Primary Examiner, Art Unit 2612