DETAILED ACTION

Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in PEOPLE’S REPUBLIC OF CHINA on 06/30/2020. It is noted, however, that applicant has not filed a certified copy of the foreign priority application as required by 37 CFR 1.55.


Claim Interpretation
The essence of Applicant’s claimed invention in the independent claims relies on the recited “SPAdaIn.”  Its interpretation is essential.
	“SPAdaIn” seems to be an abbreviation.  However, the use of an abbreviation does not make the claim indefinite.  The Examiner uses the claim interpretation approach adopted by the PTAB panel in (USPTO, PTAB Final Decision, Ex parte Wang, USPTO PTAB Appeal 2017-4435, 2018 BL 217530 (04/20/2018)).  Abbreviations shall be interpreted in light of the specification.
	How to read “SPAdaIn” in light of the specification also depends on how the term is used in the art, though the specification generally controls when there is an inconsistency.  The Examiner finds that the term is not commonly used in the art.  The term is used narrowly, in this application and in an academic publication (Wang, Jiashun, et al. "Neural pose transfer by spatially adaptive instance normalization." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.) authored by a group of scholars that includes all the inventors of this Application.  The academic publication closely mirrors this application.  The Examiner attached Google search printouts for “‘spadain’ neural network” and “‘spadain’ pose transfer.”  The result sets contain fewer than 100 results.  The most relevant results are different versions of the academic publication (Wang, Jiashun, et al.) and a few articles that reference the publication.  The Examiner concludes that the term “SPAdaIn” is not widely used in the art.
	The fact that the term “SPAdaIn” is probably not known to a person of ordinary skill in the art does not make the claim indefinite, because the term can still be interpreted in light of the specification, generally within the four corners of the specification.  The following are the most relevant disclosures:
“Fig. 4 illustratively shows the detailed architecture of the network component, wherein Fig. 4(a) shows the architecture of Pose Extractor E, Fig. 4(b) shows the architecture of SPAdaIN and Fig. 4 (c) shows the architecture of SPAdaIN ResBlock.”  Spec. 61.


[Image: media_image1.png]

“See Fig. 4(b) for the detail of SPAdaIN. In this invention, spatially conditional normalization is proposed to generate the 3D human shape applied to pose transfer tasks while keeping the identity of meshes. In particular, SPAdaIN is a generalization of the work of Huang et al. ("Arbitrary style transfer in real-time with adaptive instance normalization", ICCV, 2017) and Park et al. ("Semantic image synthesis with spatially-adaptive normalization", CVPR, 2019) to deal with points. Similar to IN, the activation is normalized across the spatial dimensions independently for each channel and instance, and then modulated with learned scale γ and bias β. It is assumed that in the i-th layer, M is the 3D model providing identity, V' is the number of 3D shape vertices in this layer, C' is the number of feature channel, N denotes the batch size, and h is the activation value of network (the footnote indicate specific index where [Image: media_image2.png]). The value normalized by SPAdaIN can be computed as follows,
[Equation image: media_image3.png]
where γ and β are learnable affine parameters, ε = 1e-5 for numerical stability. As shown in Fig. 4(b), in SPAdaIN the external data M_id is fed into 2 different 1 × 1 convolution layers to produce the modulation parameters γ and β. The parameters are multiplied and added to the normalized feature.”  Spec. 63.
“SPAdaIN is different from some other conditional normalization. Compare to SPADE (Park et al., "Semantic image synthesis with spatially-adaptive normalization", CVPR, 2019), instance normalization is used here. Since each instance may have different features to guide the transfer, normalize the activation of the network in channel-wise is not reasonable. So, we normalize the spatially-variant parameters instance-wise, which is more suitable for the neural pose transfer task. Compare to CIN (Dumoulin et al. "A learned representation for artistic style", 2016), in this invention normalization parameter vectors are not selected from a fixed set of identities or pose, the corresponding parameters γ and β are adaptively learned, therefore, their approach cannot adapt to new identities or pose without re-training. Also, their parameters are aggregated across spatial axes; thus they may lose some detailed feature in particular spatial positions. Additionally, AdaIN (Huang et al., "Arbitrary style transfer in real-time with adaptive instance normalization", ICCV, 2017) is also not suitable for pose transfer. Though AdaIN can handle arbitrary new identities or pose as guidance, there are no learnable parameters in AdaIN. Due to the lack of learnable parameters, when adopting AdaIN as normalization, the network will tend to imitate the shape of M rather than use it as a condition to produce new posture.”  Spec. 64.
	Because “SPAdaIN” is a term coined by Applicant and is not a term generally used in the art, the implementation of “SPAdaIN” disclosed in the specification is the only source that gives meaning to the term.  Therefore, the Examiner reads “SPAdaIN” to mean a neural network structure that is substantially similar to Fig. 4(b) and that implements the following equation:

[Equation image: media_image3.png]
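	For illustration only, the operation described in the quoted disclosure (Spec. 63) can be sketched in plain Python.  The sketch is not taken from the specification.  It assumes the equation has the usual instance-normalization form, in which each channel of one instance is normalized over its vertices and then multiplied by a scale and shifted by a bias.  It further assumes that the scale and bias are produced per vertex and per channel by two 1 × 1 convolutions applied to the identity mesh.  The names `conv1x1` and `spadain`, and all argument names, are hypothetical.

```python
import math


def conv1x1(points, weight, bias):
    """Apply the same linear map at every vertex (a 1 x 1 convolution).

    points: V vertices, each a list of C_in floats.
    weight: C_out rows of C_in floats; bias: C_out floats.
    """
    return [
        [sum(w * x for w, x in zip(row, p)) + b for row, b in zip(weight, bias)]
        for p in points
    ]


def spadain(h, identity, w_scale, b_scale, w_bias, b_bias, eps=1e-5):
    """Assumed SPAdaIN form for a single instance.

    h: V x C activations; identity: V x 3 identity-mesh vertices.
    The scale and bias are computed from the identity mesh (assumption),
    so they vary from vertex to vertex.
    """
    num_vertices, num_channels = len(h), len(h[0])
    scale = conv1x1(identity, w_scale, b_scale)  # V x C
    shift = conv1x1(identity, w_bias, b_bias)    # V x C
    out = [[0.0] * num_channels for _ in range(num_vertices)]
    for c in range(num_channels):
        column = [h[v][c] for v in range(num_vertices)]
        mean = sum(column) / num_vertices
        var = sum((x - mean) ** 2 for x in column) / num_vertices
        std = math.sqrt(var + eps)  # eps keeps the division stable
        for v in range(num_vertices):
            out[v][c] = scale[v][c] * (column[v] - mean) / std + shift[v][c]
    return out
```

	With a constant scale of one and a bias of zero, the sketch reduces to ordinary instance normalization; the spatially varying scale and bias derived from the identity mesh are what distinguish the assumed SPAdaIN form.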

	The terms “Instance Norm layer,” “tanh layer,” “residual block,” and “convolution layer” are commonly used and well-defined in the art.  Further, Applicant’s descriptions and uses of the terms in the disclosure are consistent with how these terms are used in the art.   




Allowable Subject Matter
Claims 1-20 are allowed.
The following is an examiner’s statement of reasons for allowance: the prior art does not disclose, and would not have rendered obvious, the combination of features recited in Applicant’s independent claims 1, 12 and 20, namely:

a network for neural pose transfer, comprising a pose feature extractor, and a style transfer decoder sequential to the pose feature extractor, wherein: 
the pose feature extractor comprises a plurality of sequential extracting stacks, and each extracting stack consists of a first convolution layer and an Instance Norm layer sequential to the first convolution layer; 
the style transfer decoder comprises a plurality of sequential decoding stacks, a second convolution layer sequential to the plurality of decoding stacks and a tanh layer sequential to the second convolution layer; 
each decoding stack consisting of a third convolution layer and a SPAdaIn residual block; 
each SPAdaIn residual block comprises a plurality of SPAdaIn sub-stacks, and 
each SPAdaIn sub-stack comprises a SPAdaIn unit and a fourth convolution layer following the SPAdaIn unit; 
each SPAdaIn unit comprises an Instance Norm layer and a plurality of fifth convolution layers; and 
a source pose mesh is input to the pose feature extractor, and an identity mesh is concatenated with the output of the pose feature extractor and meanwhile fed to each SPAdaIn residual block of the style transfer decoder, as recited in claim 1.

A network for neural pose transfer, comprising a pose feature extractor for receiving an input pose, and a style transfer decoder for receiving for extract pose feature, wherein: 
the pose feature extractor comprises a first number of convolution layers and a first number of Instance Norm layers; 
the first number of convolution layers and the first number of Instance Norm layers are iteratively and serially connected; 
the style transfer decoder comprises a second number of convolution layers, a third number of SPAdaIn residual blocks, and a tanh layer; 
the second number is one more than the third number;
the second number of convolution layers and the third number of SPAdaIn residual blocks are iteratively and serially connected, with the convolution layers sandwiching the SPAdaIn residual blocks; and
the tanh layer is the last layer of the style transfer decoder; and 
an input identity mesh is concatenated with the output of the pose feature extractor and then input to the style transfer decoder, and
the input identity mesh is meanwhile fed to each SPAdaIn residual block of the style transfer decoder, as recited in claim 12. 

a system for neural pose transfer, comprising an input device, a processor for processing input data, and an output device for outputting processed data; 
wherein the processor is configured to build a computing model comprising: a pose feature extractor for receiving an input pose, and a style transfer decoder for receiving for extract pose feature, wherein: 
the pose feature extractor comprises a first number of convolution layers and a first number of Instance Norm layers; 
the first number of convolution layers and the first number of Instance Norm layers are iteratively and serially connected; 
the style transfer decoder comprises a second number of convolution layers, a third number of SPAdaIn residual blocks, and a tanh layer; 
the second number is one more than the third number; 
the second number of convolution layers and the third number of SPAdaIn residual blocks are iteratively and serially connected, with the convolution layers sandwiching the SPAdaIn residual blocks; and 
the tanh layer is the last layer of the style transfer decoder; and 
an input identity mesh is concatenated with the output of the pose feature extractor and then input to the style transfer decoder, and the input identity mesh is meanwhile fed to each SPAdaIn residual block of the style transfer decoder, as recited in claim 20. 
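
	For illustration only, the layer ordering recited above can be sketched as simple layer inventories in plain Python.  The sketch is not taken from the specification or the claims.  All function names, layer labels, and default counts are hypothetical; the claims recite only "a plurality" or unspecified "numbers" of layers.  The lists record which layers are present and in what order; they do not model the data flow of the identity mesh into each SPAdaIn residual block.

```python
def extractor_layout(first_number):
    """Claims 12 and 20: convolution and Instance Norm layers alternate."""
    return ["conv", "instance_norm"] * first_number


def decoder_layout(third_number):
    """Claims 12 and 20: convolution layers sandwich the residual blocks.

    The number of convolution layers (the "second number") is one more
    than the number of SPAdaIn residual blocks (the "third number"),
    and the tanh layer comes last.
    """
    if third_number < 1:
        raise ValueError("at least one SPAdaIn residual block is assumed")
    layout = ["conv"]
    for _ in range(third_number):
        layout += ["spadain_res_block", "conv"]
    layout.append("tanh")
    return layout


def spadain_res_block(num_sub_stacks=2, num_fifth_convs=2):
    """Claim 1: each sub-stack is a SPAdaIn unit followed by a fourth conv.

    Each unit holds one Instance Norm layer and several fifth convs.
    The default counts are assumptions chosen only for illustration.
    """
    unit = ["instance_norm"] + ["conv5"] * num_fifth_convs
    return (unit + ["conv4"]) * num_sub_stacks
```

	For example, `decoder_layout(3)` yields four convolution layers interleaved with three residual blocks and ending in a tanh layer, which matches the recited relationship that the second number is one more than the third number.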

	As a preliminary matter, please see the above section “Claim Interpretation” for details on how the above claim features are interpreted for allowance.  
The closest prior art to the above features of Applicant’s claims includes what has been placed on record in this application’s prosecution history, and what is cited in the attached PTO-892 with this official action.  Of the cited references, the following is noted: 
WO 2021/156511 A1: the instant reference teaches a generator network and/or a discriminator network for processing sequences of images, said network having a plurality of layers (see Abstract and paras. 50-53).  At least one of these layers can include one or more batch normalization layers, e.g. conditional batch normalization layers (see e.g. paras. 50-52).  As described by the instant reference, a “batch normalization layer is a layer which transforms its input values into respective output values which have a predefined mean value (e.g. zero) and predefined variance (e.g. unit variance), by applying a gain and bias to the input values (which in the case of conditional batch normalization may depend on a data set referred to below as a conditional vector which is input to the batch normalization layer).” (see para. 52). 
U.S. Patent Application Publication No. 2021/0097691: the instant reference is related to image generation using one or more neural networks (see Abstract).  The instant reference teaches a network that can include a conditional, spatially-adaptive normalization layer for propagating semantic information from semantic layout to other layers of a trained network. In at least one embodiment, this conditional normalization layer can be tailored for semantic image synthesis (paras. 47-50).  The instant reference also teaches spatially-adaptive normalization, accomplished using a conditional normalization layer (see para. 58).  In another embodiment, conditional normalization layers include representatives such as Conditional Batch Normalization (Conditional BN) and Adaptive Instance Normalization (AdaIN) (see para. 63). Likewise, in yet another embodiment, an exemplary architecture includes several ResNet blocks with upsampling layers. In at least one embodiment, affine parameters of normalization layers are learned using SPADE. In at least one embodiment, since each residual block operates in a different scale, SPADE can downsample a semantic mask to match a spatial resolution (para. 67).  The instant reference also teaches embodiments whereby an image encoder is included, which can have a series of convolutional layers followed by two linear layers that output a mean vector μ and a variance vector σ of output distribution (para. 74). Several other embodiments are taught by this instant reference that are relevant and pertinent to Applicant’s above claim features. 
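
	For illustration only, the conditional batch normalization quoted from WO 2021/156511 A1 above can be sketched in plain Python.  The sketch is not taken from either cited reference.  It assumes that the values are standardized using statistics computed over the batch and that the gain and bias are linear functions of the conditional vector.  The names `batch_norm` and `conditional_batch_norm` and the weight arguments are hypothetical.

```python
import math


def batch_norm(values, gain=1.0, bias=0.0, eps=1e-5):
    """Standardize a batch of scalar values, then apply a gain and a bias."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    std = math.sqrt(var + eps)
    return [gain * (x - mean) / std + bias for x in values]


def conditional_batch_norm(values, cond, w_gain, w_bias, eps=1e-5):
    """Derive the gain and bias from a conditional vector (assumed linear)."""
    gain = sum(w * c for w, c in zip(w_gain, cond))
    bias = sum(w * c for w, c in zip(w_bias, cond))
    return batch_norm(values, gain, bias, eps)
```

	In this sketch a single gain and a single bias apply to every value in the batch, and the statistics are shared across the batch.  By contrast, the SPAdaIn interpretation set out in the section “Claim Interpretation” normalizes each instance separately and uses modulation parameters generated from the identity mesh.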
	
However, the above cited references, alone or in combination with remaining references of record, would not have rendered obvious the specific combination of features as recited in Applicant’s independent claims. 
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sarah Lhymn whose telephone number is (571)270-0632. The examiner can normally be reached M-F, 9:00 AM to 6:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached on 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Sarah Lhymn
Primary Examiner
Art Unit 2613



/Sarah Lhymn/Primary Examiner, Art Unit 2613