DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


The term “sufficient to” in claims 2 and 12 is a relative term which renders the claim indefinite. The term “sufficient to” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Appropriate correction is required.
The term “statistical models” in claims 7-8 and 17-18 is a relative term which renders the claim indefinite. The term “statistical models” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Appropriate correction is required.
The term “certain function words” in claims 8 and 18 is a relative term which renders the claim indefinite. The term “certain function words” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2 and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Honeycutt (US 2012/0265533) in view of Kaszczuk et al. (US 9,484,014).

Claims 1 and 11,
Honeycutt teaches a text-to-speech conversion system comprising: a text converter adapted to convert input text to at least one phoneme selected from a plurality of phonemes stored in memory; a machine-learning model storing voice patterns for a plurality of individuals and adapted to receive the at least one phoneme and an identity of a speaker and to generate acoustic features for each phoneme; and to receive the generated acoustic features and to generate a speech signal simulating a voice of the identified speaker ([0018-0022] TTS system 200 for outputting speech having voice characteristics based on a speaker profile; receives communications (e.g., e-mail, text message) and identifies metadata; metadata (e.g., e-mail address, contact card information) is used by metadata module 204 to generate a speaker profile; the raw text and the speaker profile are input to TTS engine 210; TTS engine 210 uses the speaker profile to select voice data from voice database 208; the voice data is used by TTS engine 210 to convert the raw text to speech having voice characteristics that best match the speaker profile; TTS engine 210 includes a synthesizer that incorporates a model of the human vocal tract or other human voice characteristics to create a synthetic speech output according to the speaker profile; TTS engine 210 performs text-to-phoneme or grapheme-to-phoneme conversion where phonetic transcriptions are assigned to each word and the text is divided; phonetic transcriptions and prosody information together make up a symbolic linguistic representation of the raw text; the synthesizer converts the symbolic linguistic representation into sound; the synthesizer can include the computation of a target prosody (e.g., pitch contour, phoneme durations), which is applied to the output speech; the target prosody can be determined based on the voice data that is selected based on a speaker profile).
The difference between the prior art and the claimed invention is that Honeycutt does not teach a decoder adapted to receive the generated acoustic features and to generate a speech signal simulating a voice of the identified speaker in a language.
Kaszczuk teaches a decoder adapted to receive the generated acoustic features and to generate a speech signal simulating a voice of the identified speaker in a language ([col. 8 lines 4-49] encoder/decoder for encoding and decoding speech data, such as digitized audio data, feature vectors, etc.; the speech synthesis engine 218 may include specialized databases or models to account for such user preferences; TTS device 202 may also be configured to perform TTS processing in multiple languages; for each language, the TTS module 214 may include specially configured data, instructions and/or components to synthesize speech in the desired language(s)).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Honeycutt with the teachings of Kaszczuk by modifying the voice assignment for text-to-speech output as taught by Honeycutt to include a decoder (Kaszczuk [col. 8 lines 47-49]).

Claims 2 and 12,
Kaszczuk further teaches the system of claim 1, wherein the plurality of phonemes stored in memory comprise phonemes sufficient to generate speech for a plurality of languages ([Fig. 2] [col. 1 lines 42-54] [col. 4 lines 47-52] [col. 8 line 44] a local device may also be configured with a smaller speech unit database to produce high-quality results for certain text; mapping to one or more phonetic units using a language dictionary stored in the TTS device 202, TTS storage module; synthesize speech in the desired language(s)).

Claims 3-4 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Honeycutt (US 2012/0265533) in view of Kaszczuk et al. (US 9,484,014) and further in view of Arik et al. (US 2018/0036880).

Claims 3 and 13,
Honeycutt and Kaszczuk teach all the limitations in claim 2. The difference between the prior art and the claimed invention is that neither Honeycutt nor Kaszczuk teaches wherein the machine-learning model comprises a neural network model.
Arik teaches wherein the machine-learning model comprises a neural network model ([0048] TTS system using deep neural network).
(Arik [0048]).

Claims 4 and 14,
Arik further teaches the system of claim 3, wherein the neural network model comprises a deep learning neural network model ([0048] deep neural network).

Claims 5-10 and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Honeycutt (US 2012/0265533) in view of Kaszczuk et al. (US 9,484,014) in view of Arik et al. (US 2018/0036880) and further in view of Vanreusel et al. (US 2017/0309272).

Claims 5 and 15,
Honeycutt, Kaszczuk and Arik teach all the limitations in claim 4. The difference between the prior art and the claimed invention is that none of Honeycutt, Kaszczuk and Arik teaches wherein the text converter is further adapted to detect a language of the input text to be converted.
Vanreusel teaches wherein the text converter is further adapted to detect a language of the input text to be converted ([0019] [0030] a process for searching a user's voice against the phonetic inventory of regional nouns to find a matching transcription and outputting corresponding orthographic text and a process for personalizing a speech synthesis and recognition experience to a user; the term "regional noun" refers to a word that is tied to a particular language or region).
([0017]).

Claims 6 and 16,
Kaszczuk further teaches the system of claim 5, wherein the language of the input text to be converted is detected using an n-gram approach ([col. 6 line 17] Hidden Markov Models (HMM)).

Claims 7 and 17,
Kaszczuk further teaches the system of claim 5, wherein the language of the input text to be converted is detected using statistical methods ([col. 6 lines 17-19] HMMs to determine probabilities that audio output should match textual input).

Claims 8 and 18,
Kaszczuk further teaches the system of claim 7, wherein the statistical methods are based on the prevalence of certain function words ([col. 1 lines 45-47] a local device may also be configured with a smaller speech unit database to produce high-quality results for certain text).

Claims 9 and 19,
Vanreusel further teaches the system of claim 6, wherein the generated acoustic features include accent acoustic features and the generated speech signal further simulates a voice of the ([0031] the phonetic inventory includes a mapping of words (e.g., regional nouns), user accent classifications, and phonetic transcriptions).

Claims 10 and 20,
Vanreusel further teaches the system of claim 9, wherein the accent corresponds to a native accent of the identified speaker ([0027] user accent classification).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHREYANS A PATEL whose telephone number is (571)270-0689. The examiner can normally be reached Monday-Friday 8am-5pm PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like 

SHREYANS A. PATEL
Examiner
Art Unit 2657



/SHREYANS A PATEL/Examiner, Art Unit 2656