DETAILED ACTION

Introduction
1.         This Office Action is in response to Applicant’s submission filed on 09/01/2021 in response to the Non-Final Rejection mailed 06/01/2021. Claims 1-9 and 12-22 are pending in the application. As such, Claims 1-9 and 12-22 have been reconsidered and examined.

Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
3.         The response filed on 09/01/2021 has been accepted and considered in this Office Action. Claims 1-9 and 12-22 have been examined.

Response to Arguments 
4.         In view of Applicant’s amendments to independent Claims 1, 12, and 19, and the corresponding persuasive Remarks on pp. 6-8, filed together on 09/01/2021, the previous rejections of Claims 1-4, 7, 9, 10, 12-15, and 17-19 under 35 U.S.C. §102(a)(1) and/or 102(b)(1) as being anticipated by Barbulescu et al. (A. Barbulescu, R. Ronfard and G. Bailly, "A Generative Audio-Visual Prosodic Model for Virtual Actors," in IEEE Computer Graphics and Applications, vol. 37, no. 6, pp. 40-51, November/December 2017), hereinafter referred to as BARBULESCU, and Pham et al. (H. X. Pham, S. Cheung and V. Pavlovic, "Speech-Driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach," 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017, pp. 2328-2336), hereinafter referred to as PHAM, are respectfully reconsidered and withdrawn.

Allowable Subject Matter
5.	The following is an Examiner’s Statement of Reasons for Allowance:
Claims 1-9 and 12-22 are found allowable over the prior art of record for at least the following rationale.
At best, BARBULESCU evidences an architecture comprising, see e.g., “…generating audio-visual speaking styles…use the phonotactic information to predict prosodic feature contours…predicted rhythm is used to compute phoneme durations…expressive speech is synthesized with a vocoder that uses the neutral utterance, predicted rhythm, energy, and voice pitch, and the facial animation parameters are obtained by adding the warped neutral motion to the reconstructed and warped predicted motion contours…” (See e.g., BARBULESCU pp. 41-46, Fig. 2). BARBULESCU further evidences that said architecture discloses, see e.g., “…choos[ing] neural networks for carrying this type of nonlinear mapping between phonotactic contour values…the model should also be able to extrapolate in the case of new phonotactic information—that is, when we want to generate contours for an utterance with a number of syllables different from the ones seen in the training set. Expressive modeling is carried out separately for each feature (melody, rhythm, energy, and motion) by training a feed-forward neural network with a hidden layer of 17 neurons and a logistic activation function…” and “…learning audio-visual speaking styles…extract audio and visual prosodic features from the training example and learn SFC models and GV equalization parameters for all dramatic attitudes, resulting in a database of audio-visual prosodic contours, including melody, rhythm, and differential motion…” (See e.g., BARBULESCU pp. 41-45, Fig. 2).
Notwithstanding, BARBULESCU’s teachings still fail to teach or fairly suggest, either individually or in a reasonable combination, the recited limitations in amended independent Claims 1, 12, and 19. Similarly, dependent Claims 2-9, 13-18, and 20-22 further limit allowable independent Claims 1, 12, and 19, respectively, and thus said claims are also found allowable over the prior art of record by virtue of their dependency.
Any comments considered necessary by Applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion
6.       The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.  
Sadoughi et al. (N. Sadoughi and C. Busso, "Expressive Speech-Driven Lip Movements with Multitask Learning," 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 2018, pp. 409-415) discloses, see e.g., “…a conditional generative adversarial network, called conditional sequential GAN (CSG), which learns the relationship between emotion and lexical content in a principled manner. This model uses a set of articulatory and emotional features directly extracted from the speech signal as conditioning inputs, generating realistic movements…to create emotionally dependent models by either adapting the base model with the target emotional data (CSG-Emo-Adapted), or adding emotional conditions as the input of the model (CSG-Emo-Aware). Objective evaluations of these models show improvements for the CSG-Emo-Adapted compared with the CSG model, as the trajectory sequences are closer to the original sequences…” (Sadoughi et al., Abstract, § 4, Fig. 4).
Please see the additional references in form PTO-892 for more details.
7.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Edgar Guerra-Erazo, whose telephone number is (571) 270-3708. The examiner can normally be reached M-F, 7:30 a.m.-5:00 p.m. EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Bhavesh Mehta, can be reached at (571) 272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications 
/EDGAR X GUERRA-ERAZO/            Primary Examiner, Art Unit 2656