DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments, see remarks, filed 2/1/2022, with respect to claims 1-5, 7-21 have been fully considered and are persuasive. The 103 rejections of claims 1-21 have been withdrawn.
See reasons for allowance below.
Allowable Subject Matter
Claims 1-5, 7-21 are allowed.

Applicant has amended claim 1 as follows:
“encode, via a linguistic encoder, the linguistic sequence to generate an embedded linguistic sequence comprising a vector representation of a phoneme in a phonetic context; generate, via a trained prosody info predictor, combined prosody info comprising a plurality of observations based on the linguistic sequence, wherein the plurality of observations comprise linear combinations of statistical measures evaluating a plurality of prosodic components over a plurality of hierarchical time spans, wherein the observations are normalized; [[and]] modify the combined prosody info based on the prosody info offset, wherein the prosody info offset adjusts the target observation of the plurality of observations at the specified time span of the hierarchical time spans; embed the modified combined prosody info into a latent space to generate embedded prosody info; and generate, via a trained neural network, an acoustic sequence based on the embedded prosody info concatenated with the linguistic embedding, wherein a prosodic characteristic of the generated acoustic sequence is adjusted based on the prosody info offset.”
Applicant’s specification states in par. [0023] that “segment refers to a time span within this hierarchical temporal structure of paragraph/sentence/phrase/word/syllable/phone”. The cited references, Li and Zhao, are silent with regard to hierarchical time spans as described.
Li and Zhao do not disclose modifying combined prosody info based on the prosody info offset, where the prosody info offset adjusts the target observation of the plurality of observations at the specified time span of the hierarchical time spans. Nor do they disclose embedding the modified combined prosody info into a latent space to generate embedded prosody info, or generating, via a trained neural network, an acoustic sequence based on the embedded prosody info concatenated with the linguistic embedding.
Li is directed towards a method and a device for training personalized multiple acoustic models for speech synthesis, and a method and a device for speech synthesis. Zhao is generally directed towards an intent-recognition system and an emotional text-to-speech system and does not cure Li’s deficiencies. Fernandez teaches training a prosody predictor based on extracted unlabeled data but is also silent with regard to the above-mentioned limitations.
Claims 8 and 15 include similar recitations to claim 1 and are allowable for the same reasons. Dependent claims 2-5, 7, 9-14, and 16-21 further narrow the allowable parent claims and are therefore allowable as well.
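For illustration only, the modification step recited in amended claim 1 (adjusting a target observation at a specified time span by a prosody info offset) can be sketched as follows. This sketch is not taken from the application or from the cited art. The names SPANS, normalize, and apply_offset, the choice of z-score normalization, and all sample values are assumptions. The sketch takes the observations as given and does not model the linear combinations of statistical measures, the latent-space embedding, or the neural network.

```python
from statistics import mean, pstdev

# Hypothetical hierarchy of time spans, following the levels named in par. [0023].
SPANS = ("paragraph", "sentence", "phrase", "word", "syllable", "phone")


def normalize(values):
    """Z-score normalize raw observations (the normalization method is an assumption)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def apply_offset(prosody_info, span, observation, offset):
    """Return a copy of prosody_info with one observation at one time span shifted by offset."""
    if span not in SPANS:
        raise ValueError(f"unknown time span: {span}")
    # Copy each per-span dict so the caller's data is left untouched.
    modified = {s: dict(obs) for s, obs in prosody_info.items()}
    modified[span][observation] += offset
    return modified


# Illustrative values only: normalized observations keyed by time span.
raw_f0_means = [110.0, 120.0, 130.0]
f0_norm = normalize(raw_f0_means)
combined = {
    "sentence": {"f0_mean": f0_norm[0], "energy_mean": 0.2},
    "word": {"f0_mean": f0_norm[1], "energy_mean": -0.1},
    "syllable": {"f0_mean": f0_norm[2], "energy_mean": 0.4},
}

# Raise the word-level pitch observation while leaving the other spans unchanged.
modified = apply_offset(combined, span="word", observation="f0_mean", offset=0.5)
```

In this sketch the offset touches exactly one observation at one level of the hierarchy, which is the feature the reasons for allowance identify as absent from Li and Zhao.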

A new search was performed and no art was found which teaches the claimed invention. See below.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The pertinent prior art is listed on form PTO-892.
Arel ’299 teaches an acoustic model which outputs prosodic features at regular or irregular intervals, see col. 6 lines 4-38. However, it is silent with regard to a hierarchical temporal structure as claimed.
Kim ’998 teaches a prosody feature extractor which passes input speech to a neural network to extract vectors corresponding to predetermined time units, see par. [0057]. However, it is silent with regard to hierarchical time structures.
Fernandez, “Recognizing affect from speech prosody using hierarchical graphical models”, mentions applying a class of hierarchical models to recognize prosody in natural speech, see abstract. However, it is silent with regard to the temporal segments discussed above.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Ortiz-Sanchez, whose telephone number is (571) 270-3711. The examiner can normally be reached Monday-Friday, 9AM-6PM.
Examiner interviews are available via telephone, in person, and by video conference using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL ORTIZ-SANCHEZ/
Primary Examiner, Art Unit 2656