DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 6 is objected to because of the following informalities:  “for with the primary language” should be “for .  Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2-4 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention. In particular, Claim 2 recites the limitation “causing the alternate audio data to be rendered via the one or more speakers of the computing device”. There is insufficient antecedent basis for “the alternate audio data” in the claim. The claim is interpreted as meaning “an alternate audio data”. Claims 3 and 4 are rejected based on their dependency from claim 2.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 11-13 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of text processing without significantly more. The claims recite mental processes, as well as certain methods of organizing human activity, of analyzing received text to be converted into speech, determining languages and phonemes associated with the text, and processing the phonemes to generate audio that mimics a human. The steps can be achieved by a human with a pen and paper and by vocalizing retrieved phonemes corresponding to written text, where the vocalized phonemes can mimic another human. This judicial exception is not integrated into a practical application because, absent the abstract idea, the generically recited computer elements (computing system, client device) do not add meaningful limitations to the abstract idea; they amount to simply implementing the abstract idea on a computer. The claims further do not include additional elements that are sufficient to amount to significantly more than the judicial exception because “processing the first set of phonemes and the modified second set of phonemes to generate audio data that mimics a human speaker speaking the first set of phonemes and the modified second set of phonemes” corresponds to well-understood, routine, conventional computer functions as recognized by the cited references (see Rogers, Roberts – PTO). Claims 12 and 13 do not add significantly more than the abstract idea and are similarly rejected.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

1. Claims 1, 2, 5-7 and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Rogers et al., US PGPUB 2010/0082328 A1 (“Rogers” - IDS), in view of Qian et al., “A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin–English) TTS” (“Qian”).
Per Claim 1, Rogers discloses a method for generating computer generated speech from a natural language textual data stream, the method implemented by one or more processors and comprising:
  receiving a natural language textual data stream to be converted into computer generated speech for rendering to a user via one or more speakers of a computing 
determining whether the secondary language portion of the natural language textual data stream is in a secondary language that is not assigned as a familiar language for the user (Such text strings may also originate in one or more native languages and may need to be converted into one or more other target languages that are familiar to certain users…, para. [0014]; para. [0050]-[0052]; para. [0055]); 
processing the primary portion of the natural language textual data stream to determine a first set of phonemes that are assigned to the primary language and that correspond to the primary portion (one or more phonemes corresponding to the normalized text may be obtained in the text's native language…, para. [0054]-[0055]); 
processing the secondary portion of the natural language textual data stream to determine a second set of phonemes in a set that corresponds to the secondary portion, wherein the set includes at least phonemes corresponding to the primary language and the secondary language (This determination may be implemented using a technique that may be referred to as phoneme mapping, which may be used in conjunction with a table look up…, para. [0054]-[0055], mapping table as set); 

generating a modified second set of phonemes by replacing the one or more second phonemes, in the second set of phonemes, with the correlated phonemes in the primary language (para. [0014]; para. [0054]-[0055]; para. [0101]); 
processing the first set of phonemes and the modified second set of phonemes to generate audio data that mimics a human speaker speaking the first set of phonemes and the modified second set of phonemes (para. [0014]; para. [0088]); and 
causing the audio data to be rendered via the one or more speakers of the computing device (Abstract; para. [0010]; para. [0015], speech playback by player device as implying one or more speakers).
Rogers does not explicitly disclose the use of a universal phoneme set.
However, this feature is taught by Qian (TABLE 1; International Phonetic Alphabet (IPA) is an international standard phonetic symbol set for transcribing speech sounds of any spoken language…, pg. 1232, sec. IIA).
It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to implement the use of a universal phoneme set by substituting the International Phonetic Alphabet set of Qian with the mapping table of Rogers in 
Per Claim 2, Rogers in view of Qian discloses the method of claim 1, 
   Rogers discloses in response to determining that the secondary language portion is not in the secondary language that is not assigned as a familiar language for the user and instead is in an additional secondary language that is assigned as a familiar language for the user: processing the first set of phonemes and the second set of phonemes without mapping the second set of phonemes to phonemes in the primary language (para. [0054]-[0055]); and 
causing the alternate audio data to be rendered via the one or more speakers of the computing device (Abstract; para. [0010]; para. [0015]; para. [0054]-[0055]).
         Per Claim 5, Rogers in view of Qian discloses the method of claim 1,
    Rogers discloses wherein a remote computing system provides the natural language textual data stream and provides, with the natural language textual data stream, an indication that the secondary language portion is not in the primary language (fig. 1; fig. 2; para. [0055]; para. [0066]; para. [0068]; para. [0088]). 
Per Claim 6, Rogers in view of Qian discloses the method of claim 1, 
  Rogers discloses determining that the secondary language portion of the natural language textual data stream is not in the primary language, wherein determining that the secondary language portion is not in the primary language comprises: determining that one or more secondary words in the natural language 
Per Claim 7, Rogers in view of Qian discloses the method of claim 6, 
Rogers discloses wherein processing the secondary portion of the natural language textual data stream to determine the second set of phonemes in the set that correspond to the secondary portion comprises: determining that the one or more second words that are not in the primary language lexicon for the primary language are in an alternate lexicon (para. [0054]; para. [0068]; para. [0073]; para. [0104]); and
retrieving the second set of phonemes for the secondary language portion in the alternate lexicon (para. [0054]; para. [0068]; para. [0073]; para. [0104]).
Qian discloses the universal phoneme set (TABLE 1; International Phonetic Alphabet (IPA) is an international standard phonetic symbol set for transcribing speech sounds of any spoken language…, pg. 1232, sec. IIA).
Per Claim 11, Rogers discloses a method for generating computer generated speech from a natural language textual data stream, the method implemented by one or more processors and comprising: 
receiving, at a computing system remote from a client device, a natural language textual data stream to be converted into computer generated speech for rendering to a user via one or more speakers of the client device, wherein the natural language textual data stream includes a primary portion that is in a primary language assigned to the user, and a secondary language portion that is not in the primary language assigned to the user (Abstract; fig. 1; fig. 2; Such text strings may also originate in one or more native languages and may need to be converted…, para. [0014]; para. [0038]; para. 
determining whether the secondary language portion of the natural language textual data stream is in a secondary language that is not assigned as a familiar language for the user (Such text strings may also originate in one or more native languages and may need to be converted into one or more other target languages that are familiar to certain users…, para. [0014]; para. [0050]-[0052]; para. [0055]);  
processing the primary portion of the natural language textual data stream to determine a first set of phonemes that are assigned to the primary language and that correspond to the primary portion (one or more phonemes corresponding to the normalized text may be obtained in the text's native language…, para. [0054]-[0055]);
processing the secondary portion of the natural language textual data stream to determine a second set of phonemes in a set that correspond to the secondary portion, wherein the set includes at least phonemes corresponding to the primary language and the secondary language (This determination may be implemented using a technique that may be referred to as phoneme mapping, which may be used in conjunction with a table look up…, para. [0054]-[0055], mapping table as set);
in response to determining that the secondary language portion is in the secondary language that is not assigned as a familiar language for the user: mapping the one or more second phonemes, that correspond to the secondary portion and that are not for the primary language, to one or more correlated phonemes in the primary 
generating a modified second set of phonemes by replacing the one or more second phonemes, in the second set of phonemes, with the correlated phonemes in the primary language (para. [0014]; para. [0054]-[0055]; para. [0101]);
processing the first set of phonemes and the modified second set of phonemes to generate audio data that mimics a human speaker speaking the first set of phonemes and the modified second set of phonemes (para. [0014]; para. [0088]).
Rogers does not explicitly disclose the use of a universal phoneme set.
However, this feature is taught by Qian (TABLE 1; International Phonetic Alphabet (IPA) is an international standard phonetic symbol set for transcribing speech sounds of any spoken language…, pg. 1232, sec. IIA).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the use of a universal phoneme set by substituting the International Phonetic Alphabet set of Qian for the mapping table of Rogers in arriving at the claimed universal phoneme set, because such substitution would have resulted in obtaining a phoneme sound for any letter of a word in a spoken language (Qian, TABLE 1; pg. 1232, sec. IIA).
Per Claim 12, Rogers in view of Qian discloses the method of claim 11, 
Rogers discloses wherein the natural language textual data stream is generated by the computing system remote from the client device (fig. 1; fig. 2).
Per Claim 13, Rogers in view of Qian discloses the method of claim 12,

Per Claim 14, Rogers discloses a system comprising one or more processors and memory operably coupled with the one or more processors, wherein the memory stores instructions that, in response to execution of the instructions by one or more processors, cause the one or more processors to perform: 
receiving a natural language textual data stream to be converted into computer generated speech for rendering to a user via one or more speakers of a computing device, wherein the natural language textual data stream includes a primary portion that is in a primary language assigned to the user, and a secondary language portion that is not in the primary language assigned to the user (Abstract; fig. 1; fig. 2; Such text strings may also originate in one or more native languages and may need to be converted…, para. [0014]; para. [0038]; para. [0051]; certain normalized texts need not need a pronunciation change from one language to another, as indicated by the dotted line arrow bypassing steps 206 and 208.  This may be true for text having a native language that corresponds to the target language…, para. [0055]; para. [0088]);
determining whether the secondary language portion of the natural language textual data stream is in a secondary language that is not assigned as a familiar language for the user (Such text strings may also originate in one or more native languages and may need to be converted into one or more other target languages that are familiar to certain users…, para. [0014]; para. [0050]-[0052]; para. [0055]);  
processing the primary portion of the natural language textual data stream to determine a first set of phonemes that are assigned to the primary language and that 
 processing the secondary portion of the natural language textual data stream to determine a second set of phonemes in a set that corresponds to the secondary portion, wherein the set includes at least phonemes corresponding to the primary language and the secondary language (This determination may be implemented using a technique that may be referred to as phoneme mapping, which may be used in conjunction with a table look up…, para. [0054]-[0055], mapping table as set);
in response to determining that the secondary language portion is in the secondary language that is not assigned as a familiar language for the user: mapping the one or more second phonemes, that correspond to the secondary portion and that are not for the primary language, to one or more correlated phonemes in the primary language, wherein mapping the one or more second phonemes to the one or more correlated phonemes is based on defined mappings between phonemes in the set to primary language phonemes (para. [0014]; para. [0054]);
generating a modified second set of phonemes by replacing the one or more second phonemes, in the second set of phonemes, with the correlated phonemes in the primary language (para. [0014]; para. [0054]-[0055]; para. [0101]);
processing the first set of phonemes and the modified second set of phonemes to generate audio data that mimics a human speaker speaking the first set of phonemes and the modified second set of phonemes (para. [0014]; para. [0088]) and 
speech playback by player device as implying one or more speakers)
Rogers does not explicitly disclose the use of a universal phoneme set.
However, this feature is taught by Qian (TABLE 1; International Phonetic Alphabet (IPA) is an international standard phonetic symbol set for transcribing speech sounds of any spoken language…, pg. 1232, sec. IIA).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the use of a universal phoneme set by substituting the International Phonetic Alphabet set of Qian for the mapping table of Rogers in arriving at the claimed universal phoneme set, because such substitution would have resulted in obtaining a phoneme sound for any letter of a word in a spoken language (Qian, TABLE 1; pg. 1232, sec. IIA).
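For illustration only, the mapping and replacement steps discussed above for claims 1, 11 and 14 can be sketched as a table lookup over a shared phoneme inventory. This is a minimal sketch under stated assumptions and is not the implementation of Rogers, Qian or the applicant: the table entries, the language codes and the function name map_to_primary are hypothetical, and IPA symbols are used only as an example of a shared phoneme set.

```python
from typing import Dict, List, Set

# Hypothetical mapping from secondary-language phonemes (French, in IPA)
# to correlated primary-language phonemes (English). Entries are
# illustrative assumptions, not taken from any cited reference.
SECONDARY_TO_PRIMARY: Dict[str, str] = {
    "ʁ": "ɹ",
    "y": "u",
    "ø": "ə",
}


def map_to_primary(
    second_phonemes: List[str],
    familiar_languages: Set[str],
    secondary_language: str,
) -> List[str]:
    """Return the modified second set of phonemes.

    If the secondary language is familiar to the user, the phonemes are
    returned unchanged (no mapping). Otherwise each phoneme with a defined
    mapping is replaced by its correlated primary-language phoneme;
    phonemes without a mapping are kept as-is.
    """
    if secondary_language in familiar_languages:
        return list(second_phonemes)
    return [SECONDARY_TO_PRIMARY.get(p, p) for p in second_phonemes]


# Primary portion (English) and secondary portion (French "rue").
first_set = ["aɪ", "l", "ʌ", "v"]
second_set = ["ʁ", "y"]

# User familiar with English only: the French phonemes are mapped.
modified = map_to_primary(second_set, {"en"}, "fr")

# User familiar with English and French: mapping is bypassed.
unmodified = map_to_primary(second_set, {"en", "fr"}, "fr")

# The combined sequence would then be passed to a speech synthesizer.
combined = first_set + modified
```

The bypass branch corresponds to the case addressed in claim 2, where the secondary language is assigned as a familiar language and the second set of phonemes is processed without mapping.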

2. Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Rogers in view of Qian as applied to claim 2 above, and further in view of Roberts et al., US 8,831,948 B2 (“Roberts”).
Per Claim 3, Rogers in view of Qian discloses the method of claim 2,
Rogers in view of Qian does not explicitly disclose wherein the additional secondary language is assigned as a familiar language for the user based on data provided by the computing device or based on data stored in association with an account assigned to the user.
However, this feature is taught by Roberts (col. 5, ln 3-6).


3. Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Rogers in view of Qian as applied to claim 6 above, and further in view of Legat, US PGPUB 2014/0222415 A1 (“Legat” - IDS).
Per Claim 8, Rogers in view of Qian discloses the method of claim 6, 
Qian discloses the use of a universal phoneme set (TABLE 1; International Phonetic Alphabet (IPA) is an international standard phonetic symbol set for transcribing speech sounds of any spoken language…, pg. 1232, sec. IIA).
Rogers in view of Qian does not explicitly disclose wherein processing the secondary portion of the natural language textual data stream to determine the second set of phonemes in the set that correspond to the secondary portion comprises: automatically determining the second set of phonemes using a grapheme to phoneme model.
However, this feature is taught by Legat (para. [0004]-[0005]; para. [0007]; para. [0009]-[0012]).

4. Claims 4, 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Rogers in view of Qian as applied to Claim 1 above, and further in view of Fructuoso et al., US PGPUB 2015/0186359 A1 (“Fructuoso”).
Per Claim 4, Rogers in view of Qian discloses the method of claim 2, 
  Qian discloses wherein processing the first set of phonemes and the second set of phonemes to generate the alternate audio data comprises processing the first set of phonemes and the second set of phonemes using a trained HMM model trained at least in part based on audio data from a human speaker that is fluent in the primary language and is fluent in the additional secondary language (pg. 1233, sec. III; pg. 1235, sec. B).
Rogers in view of Qian does not explicitly disclose using a neural network model.
However, this feature is taught by Fructuoso (Abstract).
 It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to implement the use of a neural network model by substituting the neural network model of Fructuoso with the HMM model of Qian in arriving at the 
           Per Claim 9, Rogers in view of Qian discloses the method of claim 1, 
Qian discloses wherein processing the first set of phonemes and the modified second set of phonemes to generate audio data that mimics a human speaker speaking the first set of phonemes and the modified second set of phonemes comprises processing the first set of phonemes and the second set of phonemes using a HMM model trained to generate human speech using phonemes that are specific to each of multiple languages (pg. 1233, sec. III; pg. 1235, sec. B).
Rogers in view of Qian does not explicitly disclose using a neural network model.
However, this feature is taught by Fructuoso (Abstract).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the use of a neural network model by substituting the neural network model of Fructuoso for the HMM model of Qian in arriving at the claimed model, because such substitution would have resulted in facilitating cross-lingual learning and as a matter of design choice (Fructuoso, para. [0003]; para. [0037]).
Per Claim 10, Rogers in view of Qian and Fructuoso discloses the method of claim 9, 
Qian discloses wherein the HMM model is trained by: training the HMM model based on a plurality of training instances that each includes a corresponding cross-lingual spoken utterance from a multilingual user and corresponding cross-lingual phonemes corresponding to the cross-lingual spoken utterance (pg. 1233, sec. III; pg. 1235, sec. B).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLUJIMI A ADESANYA, whose telephone number is (571) 270-3307. The examiner can normally be reached Monday-Friday, 8:30 am-5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

/OLUJIMI A ADESANYA/Primary Examiner, Art Unit 2658