DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference signs mentioned in the description: 601, 602, and 603, on page 14, lines 4-5.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The disclosure is objected to because of the following informalities:
Page 14, line 5 recites “shown in FIG. 6,” but the drawings do not contain a Figure 6.
Appropriate correction is required.
Claim Objections
Claim 11 is objected to because of the following informalities:
In line 11, “between a text and speech,” should read “between a text and speech.” (i.e., the trailing comma should be a period).
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1 – 15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  The claim recites a speech output method, comprising: determining a target text to be processed; determining a preset text corresponding to the target text by matching the target text with a local text database; and determining, based on the preset text, output speech of the target text from a local speech database to output the output speech; wherein the local speech database is pre-configured based on a correspondence between a text and speech.
The claim 1 limitations, under their broadest reasonable interpretation, cover performance of the limitations in the mind.  For example, “determining a target text” in the context of this claim encompasses a person reading text to be converted to speech, “determining a preset text” in the context of this claim encompasses the person matching the text to be converted to speech with text in a text-to-speech database, and “determining, based on the preset text, output speech of the target text” in the context of this claim encompasses the person selecting the matching text in a text-to-speech database to output the corresponding speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  In particular, the claim only recites two additional elements, a text database and a speech database.  The databases are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.  The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a text database and speech database amount to no more than mere instructions to apply the exception using generic computer components.  Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
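For illustration only, the steps recited in claim 1 can be sketched as a pair of lookups.  This is a minimal sketch and not a characterization of applicant's disclosed implementation: the names LOCAL_TEXT_DB, LOCAL_SPEECH_DB, match_preset_text, and output_speech, the sample entries, and the case-insensitive normalization are all assumptions introduced here.

```python
from typing import Optional

# Hypothetical stand-in for the "local text database" (assumed contents).
LOCAL_TEXT_DB = {"turn left", "turn right"}

# Hypothetical stand-in for the "local speech database", pre-configured as a
# correspondence between a text and speech (placeholder bytes, not real audio).
LOCAL_SPEECH_DB = {
    "turn left": b"<audio:turn left>",
    "turn right": b"<audio:turn right>",
}


def match_preset_text(target_text: str) -> Optional[str]:
    """Match the target text against the local text database.

    Lower-casing and trimming are assumptions; the claim does not say how
    the match is performed.
    """
    normalized = target_text.strip().lower()
    return normalized if normalized in LOCAL_TEXT_DB else None


def output_speech(target_text: str) -> Optional[bytes]:
    """Return the stored speech for the target text, or None if no match."""
    preset_text = match_preset_text(target_text)
    if preset_text is None:
        return None
    return LOCAL_SPEECH_DB.get(preset_text)
```

Under these assumptions, output_speech("Turn Left") returns the stored entry for "turn left", while a text with no matching preset text returns None.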
Claim 2 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 2 depends from claim 1, and thus recites the limitations of claim 1, with the additional limitations: wherein determining the preset text corresponding to the target text by matching the target text with the local text database comprises: in response to failing to determine the preset text corresponding to the target text by matching the target text as a whole with the local text database, splitting the target text to obtain at least two target keywords; and matching the at least two target keywords with the local text database respectively to determine preset keywords corresponding to the target keywords; and determining, based on the preset text, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, the output speech of the target text from the local speech database.
For the reasons discussed above for claim 1, the claim 1 limitations recite abstract ideas.  The additional limitations of claim 2 do not preclude the steps of claim 1 from practically being performed in the mind.  For example, “splitting” in the context of this claim encompasses the person reading individual words of the text to be converted to speech, “matching” in the context of this claim encompasses the person matching the words to be converted to speech with words in a text-to-speech database, and “determining” in the context of this claim encompasses the person selecting the matching words in a text-to-speech database to output the corresponding speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 1, the text database and speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 1, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
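For illustration only, the additional limitations of claim 2 can be sketched as a whole-text match followed by a keyword-level fallback.  This is a minimal sketch under stated assumptions: the whitespace split, the sample database entries, and the names LOCAL_TEXT_DB and match_with_fallback are hypothetical, and the claim does not specify how the target text is split.

```python
from typing import List, Optional

# Hypothetical local text database holding both whole phrases and keywords.
LOCAL_TEXT_DB = {"turn left", "in", "200", "meters"}


def match_with_fallback(target_text: str) -> List[Optional[str]]:
    """Match the text as a whole; on failure, split and match each keyword.

    Returns one entry per matched unit. An entry is None when the
    corresponding target keyword has no preset keyword in the database.
    """
    # Step 1: try to match the target text as a whole.
    if target_text in LOCAL_TEXT_DB:
        return [target_text]

    # Step 2: whole-text match failed, so split into target keywords.
    # Splitting on whitespace is an assumption made for this sketch.
    target_keywords = target_text.split()

    # Step 3: match each target keyword against the database respectively.
    return [kw if kw in LOCAL_TEXT_DB else None for kw in target_keywords]
```

Under these assumptions, "turn left" matches as a whole, while "in 200 meters" is split and matched keyword by keyword.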
Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 3 depends from claim 2, and thus recites the limitations of claim 2, with the additional limitations: wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, speech segments corresponding to the target keywords from the local speech database; and splicing the speech segments based on a sequence of the target keywords in the target text, to obtain the output speech of the target text.
For the reasons discussed above for claim 2, the claim 2 limitations recite abstract ideas.  The additional limitations of claim 3 do not preclude the steps of claim 2 from practically being performed in the mind.  For example, “determining” in the context of this claim encompasses the person selecting the matching words in a text-to-speech database to output the corresponding speech, and “splicing” in the context of this claim encompasses the person combining the speech segments from a text-to-speech database to output speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 2, the text database and speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 2, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
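For illustration only, the splicing recited in claim 3 can be sketched as an ordered concatenation of stored segments.  This is a minimal sketch: LOCAL_SPEECH_DB, its placeholder byte strings, and splice_speech are hypothetical, and simple byte concatenation stands in for whatever audio joining an actual system would perform.

```python
from typing import List

# Hypothetical local speech database mapping preset keywords to speech
# segments (placeholder bytes rather than real audio data).
LOCAL_SPEECH_DB = {
    "in": b"[in]",
    "200": b"[200]",
    "meters": b"[meters]",
}


def splice_speech(target_keywords: List[str]) -> bytes:
    """Look up a speech segment per keyword and join them in text order.

    The list order is taken to be the sequence of the target keywords in
    the target text. Raises KeyError if a keyword has no stored segment.
    """
    segments = [LOCAL_SPEECH_DB[kw] for kw in target_keywords]
    return b"".join(segments)
```

The output order follows the order of the keywords passed in, not the order of entries in the database.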
Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 4 depends from claim 3, and thus recites the limitations of claim 3, with the additional limitations: wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises: for a specific keyword that fails to match with a preset keyword from the local text database in the at least two target keywords, determining a synthesized speech segment corresponding to the specific keyword by adopting offline text to speech; and splicing, based on the sequence of the target keywords in the target text, the synthesized speech segment and the speech segment determined from the local speech database to obtain the output speech of the target text.
For the reasons discussed above for claim 3, the claim 3 limitations recite abstract ideas.  The additional limitations of claim 4 do not preclude the steps of claim 3 from practically being performed in the mind.  For example, “determining” in the context of this claim encompasses the person selecting text segments without matches in a text-to-speech database and selecting a speech synthesizer to generate speech segments for the unmatched text segments, and “splicing” in the context of this claim encompasses the person combining speech segments from a text-to-speech database and speech segments from a speech synthesizer to output speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 3, the text database and speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 3, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
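For illustration only, the additional limitations of claim 4 can be sketched as a per-keyword fallback to synthesis before splicing.  This is a minimal sketch: offline_tts_stub is a hypothetical placeholder that merely tags the keyword and does not perform real text-to-speech, and LOCAL_SPEECH_DB and splice_with_fallback are likewise assumed names.

```python
from typing import Callable, List

# Hypothetical local speech database (placeholder bytes, not real audio).
LOCAL_SPEECH_DB = {
    "in": b"[in]",
    "meters": b"[meters]",
}


def offline_tts_stub(keyword: str) -> bytes:
    """Placeholder for an offline text-to-speech engine (assumption)."""
    return b"[tts:" + keyword.encode("utf-8") + b"]"


def splice_with_fallback(
    target_keywords: List[str],
    synthesize: Callable[[str], bytes] = offline_tts_stub,
) -> bytes:
    """Splice stored and synthesized segments in keyword order."""
    segments = []
    for keyword in target_keywords:
        segment = LOCAL_SPEECH_DB.get(keyword)
        if segment is None:
            # No preset keyword matched: synthesize this segment instead.
            segment = synthesize(keyword)
        segments.append(segment)
    return b"".join(segments)
```

Passing the synthesizer as a parameter keeps the sketch independent of any particular text-to-speech engine.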
Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 5 depends from claim 1, and thus recites the limitations of claim 1, with the additional limitations: wherein the local speech database comprises navigation terms.
For the reasons discussed above for claim 1, the claim 1 limitations recite abstract ideas.  The additional limitations of claim 5 do not preclude the steps of claim 1 from practically being performed in the mind.  For example, a person converting text to speech could convert navigation terms from text to speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 1, the text database and speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 1, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 6 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  The claim recites an electronic device, comprising: at least one processor; and a storage device communicatively connected to the at least one processor; wherein, the storage device stores an instruction executable by the at least one processor, and when the instruction executed by the at least one processor, the processor implements a speech output method, and the speech output method comprises: determining a target text to be processed; determining a preset text corresponding to the target text by matching the target text with a local text database; and determining, based on the preset text, output speech of the target text from a local speech database to output the output speech; wherein the local speech database is pre-configured based on a correspondence between a text and speech.
The claim 6 limitations, under their broadest reasonable interpretation, cover performance of the limitations in the mind.  For example, “determining a target text” in the context of this claim encompasses a person reading text to be converted to speech, “determining a preset text” in the context of this claim encompasses the person matching the text to be converted to speech with text in a text-to-speech database, and “determining, based on the preset text, output speech of the target text” in the context of this claim encompasses the person selecting the matching text in a text-to-speech database to output the corresponding speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  In particular, the claim only recites the additional elements of a processor, a storage device, a text database, and a speech database.  These elements are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.  The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply the exception using generic computer components.  Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 7 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 7 depends from claim 6, and thus recites the limitations of claim 6, with the additional limitations: wherein determining the preset text corresponding to the target text by matching the target text with the local text database comprises: in response to failing to determine the preset text corresponding to the target text by matching the target text as a whole with the local text database, splitting the target text to obtain at least two target keywords; and matching the at least two target keywords with the local text database respectively to determine preset keywords corresponding to the target keywords; and determining, based on the preset text, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, the output speech of the target text from the local speech database.
For the reasons discussed above for claim 6, the claim 6 limitations recite abstract ideas.  The additional limitations of claim 7 do not preclude the steps of claim 6 from practically being performed in the mind.  For example, “splitting” in the context of this claim encompasses the person reading individual words of the text to be converted to speech, “matching” in the context of this claim encompasses the person matching the words to be converted to speech with words in a text-to-speech database, and “determining” in the context of this claim encompasses the person selecting the matching words in a text-to-speech database to output the corresponding speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 6, the additional elements of a processor, a storage device, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 6, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 8 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 8 depends from claim 7, and thus recites the limitations of claim 7, with the additional limitations: wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, speech segments corresponding to the target keywords from the local speech database; and splicing the speech segments based on a sequence of the target keywords in the target text, to obtain the output speech of the target text.
For the reasons discussed above for claim 7, the claim 7 limitations recite abstract ideas.  The additional limitations of claim 8 do not preclude the steps of claim 7 from practically being performed in the mind.  For example, “determining” in the context of this claim encompasses the person selecting the matching words in a text-to-speech database to output the corresponding speech, and “splicing” in the context of this claim encompasses the person combining the speech segments from a text-to-speech database to output speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 7, the additional elements of a processor, a storage device, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 7, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 9 depends from claim 8, and thus recites the limitations of claim 8, with the additional limitations: wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises: for a specific keyword that fails to match with a preset keyword from the local text database in the at least two target keywords, determining a synthesized speech segment corresponding to the specific keyword by adopting offline text to speech; and splicing, based on the sequence of the target keywords in the target text, the synthesized speech segment and the speech segment determined from the local speech database to obtain the output speech of the target text.
For the reasons discussed above for claim 8, the claim 8 limitations recite abstract ideas.  The additional limitations of claim 9 do not preclude the steps of claim 8 from practically being performed in the mind.  For example, “determining” in the context of this claim encompasses the person selecting text segments without matches in a text-to-speech database and selecting a speech synthesizer to generate speech segments for the unmatched text segments, and “splicing” in the context of this claim encompasses the person combining speech segments from a text-to-speech database and speech segments from a speech synthesizer to output speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 8, the additional elements of a processor, a storage device, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 8, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 10 depends from claim 6, and thus recites the limitations of claim 6, with the additional limitations: wherein the local speech database comprises navigation terms.
For the reasons discussed above for claim 6, the claim 6 limitations recite abstract ideas.  The additional limitations of claim 10 do not preclude the steps of claim 6 from practically being performed in the mind.  For example, a person converting text to speech could convert navigation terms from text to speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 6, the additional elements of a processor, a storage device, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 6, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  The claim recites a non-transitory computer-readable storage medium having a computer instruction stored thereon, wherein the computer instruction is configured to make a computer implement a speech output method, and the speech output method comprises: determining a target text to be processed; determining a preset text corresponding to the target text by matching the target text with a local text database; and determining, based on the preset text, output speech of the target text from a local speech database to output the output speech; wherein the local speech database is pre-configured based on a correspondence between a text and speech.
The claim 11 limitations, under their broadest reasonable interpretation, cover performance of the limitations in the mind.  For example, “determining a target text” in the context of this claim encompasses a person reading text to be converted to speech, “determining a preset text” in the context of this claim encompasses the person matching the text to be converted to speech with text in a text-to-speech database, and “determining, based on the preset text, output speech of the target text” in the context of this claim encompasses the person selecting the matching text in a text-to-speech database to output the corresponding speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  In particular, the claim only recites the additional elements of a computer-readable storage medium, a computer, a text database, and a speech database.  These elements are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.  The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply the exception using generic computer components.  Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 12 depends from claim 11, and thus recites the limitations of claim 11, with the additional limitations: wherein determining the preset text corresponding to the target text by matching the target text with the local text database comprises: in response to failing to determine the preset text corresponding to the target text by matching the target text as a whole with the local text database, splitting the target text to obtain at least two target keywords; and matching the at least two target keywords with the local text database respectively to determine preset keywords corresponding to the target keywords; and determining, based on the preset text, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, the output speech of the target text from the local speech database.
For the reasons discussed above for claim 11, the claim 11 limitations recite abstract ideas.  The additional limitations of claim 12 do not preclude the steps of claim 11 from practically being performed in the mind.  For example, “splitting” in the context of this claim encompasses the person reading individual words of the text to be converted to speech, “matching” in the context of this claim encompasses the person matching the words to be converted to speech with words in a text-to-speech database, and “determining” in the context of this claim encompasses the person selecting the matching words in a text-to-speech database to output the corresponding speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 11, the additional elements of a computer-readable storage medium, a computer, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 11, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 13 depends from claim 12, and thus recites the limitations of claim 12, with the additional limitations: wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, speech segments corresponding to the target keywords from the local speech database; and splicing the speech segments based on a sequence of the target keywords in the target text, to obtain the output speech of the target text.
For the reasons discussed above for claim 12, the claim 12 limitations recite abstract ideas.  The additional limitations of claim 13 do not preclude the steps of claim 12 from practically being performed in the mind.  For example, “determining” in the context of this claim encompasses the person selecting the matching words in a text-to-speech database to output the corresponding speech, and “splicing” in the context of this claim encompasses the person combining the speech segments from a text-to-speech database to output speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 12, the additional elements of a computer-readable storage medium, a computer, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 12, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 14 depends from claim 13, and thus recites the limitations of claim 13, with the additional limitations: wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises: for a specific keyword that fails to match with a preset keyword from the local text database in the at least two target keywords, determining a synthesized speech segment corresponding to the specific keyword by adopting offline text to speech; and splicing, based on the sequence of the target keywords in the target text, the synthesized speech segment and the speech segment determined from the local speech database to obtain the output speech of the target text.
For the reasons discussed above for claim 13, the claim 13 limitations recite abstract ideas.  The additional limitations of claim 14 do not preclude the steps of claim 13 from practically being performed in the mind.  For example, “determining” in the context of this claim encompasses the person selecting text segments without matches in a text-to-speech database and selecting a speech synthesizer to generate speech segments for the unmatched text segments, and “splicing” in the context of this claim encompasses the person combining speech segments from a text-to-speech database and speech segments from a speech synthesizer to output speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 13, the additional elements of a computer-readable storage medium, a computer, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 13, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 15 depends from claim 11, and thus recites the limitations of claim 11, with the additional limitations: wherein the local speech database comprises navigation terms.
For the reasons discussed above for claim 11, the claim 11 limitations recite abstract ideas.  The additional limitations of claim 15 do not preclude the steps of claim 11 from practically being performed in the mind.  For example, a person converting text to speech could convert navigation terms from text to speech.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application.  For the reasons discussed above for claim 11, the additional elements of a computer-readable storage medium, a computer, a text database, and a speech database amount to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  For the reasons discussed above for claim 11, mere instructions to apply an exception using generic computer components cannot provide an inventive concept.  The claim is not patent eligible.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1 – 4, 6 – 9 and 11 – 14 are rejected under 35 U.S.C. 102(a)(1) and 102(a)(2) as being anticipated by Meyer et al. (US Patent No. 8,949,128), hereinafter Meyer.
Regarding claim 1, Meyer discloses a speech output method, comprising:
determining a target text to be processed (Column 4, lines 20-24, "One embodiment is directed to a method for providing a speech output for a speech-enabled application, the method comprising receiving from the speech-enabled application a text input comprising a text transcription of a desired speech output");
determining a preset text corresponding to the target text by matching the target text with a local text database (Column 4, lines 24-27, "selecting, using at least one computer system, at least one audio recording provided by a developer of the speech-enabled application, the at least one audio recording corresponding to at least a first portion of the text input");
and determining, based on the preset text, output speech of the target text from a local speech database to output the output speech (Column 4, lines 24-29, "selecting, using at least one computer system, at least one audio recording provided by a developer of the speech-enabled application, the at least one audio recording corresponding to at least a first portion of the text input; and providing for the speech-enabled application a speech output comprising the at least one audio recording.");
wherein the local speech database is pre-configured based on a correspondence between a text and speech (Column 7, lines 20-26, "In accordance with some embodiments of the present invention, the developer of the speech-enabled application may decide which portions of desired output speech prompts to pre-record as prompt recordings and to provide to the synthesis system, and may engage a desired voice talent to speak the prompt recordings in precisely the style the developer prefers.").
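For illustration only, the lookup addressed in the claim 1 mapping above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Meyer or from the application: the dictionaries `TEXT_DB` and `SPEECH_DB`, the function `output_speech`, the case and whitespace normalization, and the byte strings standing in for audio recordings are all hypothetical.

```python
from typing import Optional

# Hypothetical local text database: normalized text -> preset text identifier.
TEXT_DB = {
    "turn left": "TURN_LEFT",
    "turn right": "TURN_RIGHT",
}

# Hypothetical local speech database, pre-configured from text/speech pairs.
# Byte strings stand in for pre-recorded audio.
SPEECH_DB = {
    "TURN_LEFT": b"<audio:turn-left>",
    "TURN_RIGHT": b"<audio:turn-right>",
}


def output_speech(target_text: str) -> Optional[bytes]:
    """Return the stored recording for target_text, or None if nothing matches."""
    # Normalization (lowercase, collapsed whitespace) is an assumption of this sketch.
    key = " ".join(target_text.lower().split())
    # Match the target text against the local text database to find the preset text.
    preset = TEXT_DB.get(key)
    if preset is None:
        return None
    # Use the preset text to retrieve the output speech from the local speech database.
    return SPEECH_DB.get(preset)
```

In this sketch a text with no entry in the text database simply yields no output; handling of unmatched text is the subject of the dependent claims.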
Regarding claim 2, Meyer discloses the method as claimed in claim 1, wherein determining the preset text corresponding to the target text by matching the target text with the local text database comprises:
in response to failing to determine the preset text corresponding to the target text by matching the target text as a whole with the local text database, splitting the target text to obtain at least two target keywords; and matching the at least two target keywords with the local text database respectively to determine preset keywords corresponding to the target keywords (Column 23, lines 38-47, "If no audio recording is available whose metadata information and/or constraints match all of the information and/or constraints of a portion of the text input, one or more matches may be identified as audio recordings whose metadata information and/or constraints match some subset of the information and/or constraints of that portion of the text input, without conflicting constraints. If the determination at act 440 is that a match is available, method 400 may proceed to act 450, at which one or more best matches may be selected.");
and determining, based on the preset text, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, the output speech of the target text from the local speech database (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; The audio recordings selected in the iterations of “act 450” read on the output speech of the target text from the local speech database.).
Regarding claim 3, Meyer discloses the method as claimed in claim 2, wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises:
determining, based on the preset keywords, speech segments corresponding to the target keywords from the local speech database (Column 23, lines 38-47, "If no audio recording is available whose metadata information and/or constraints match all of the information and/or constraints of a portion of the text input, one or more matches may be identified as audio recordings whose metadata information and/or constraints match some subset of the information and/or constraints of that portion of the text input, without conflicting constraints. If the determination at act 440 is that a match is available, method 400 may proceed to act 450, at which one or more best matches may be selected.");
and splicing the speech segments based on a sequence of the target keywords in the target text, to obtain the output speech of the target text (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; Concatenating the audio recordings selected in the iterations of “act 450” reads on splicing the speech segments.).
Regarding claim 4, Meyer discloses the method as claimed in claim 3, wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises:
for a specific keyword that fails to match with a preset keyword from the local text database in the at least two target keywords, determining a synthesized speech segment corresponding to the specific keyword by adopting offline text to speech (Column 24, lines 40-45, "If at any iteration it is determined at act 440 that no matching audio recording is available for any remaining unmatched portion(s) of the text input, method 400 may proceed to act 470, at which additional audio segment(s) for the unmatched portion(s) of the text input may be generated using TTS synthesis.");
and splicing, based on the sequence of the target keywords in the target text, the synthesized speech segment and the speech segment determined from the local speech database to obtain the output speech of the target text (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; The audio segments generated at “act 470” read on the synthesized speech segment, the audio recordings selected in “act 450” read on the speech segment determined from the local speech database, and concatenating the audio recordings and audio segments reads on splicing the speech segments.).
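For illustration only, the sequence of limitations addressed in claims 2 – 4 (whole-text matching, splitting into keywords, keyword matching, synthesis for unmatched keywords, and splicing in text order) can be sketched as follows. This is a minimal sketch under stated assumptions and reproduces neither Meyer's method 400 nor the applicant's implementation: the names `speak`, `synthesize_offline`, `DEMO_TEXT_DB`, and `DEMO_SPEECH_DB` are hypothetical, splitting on whitespace is an assumed keyword-splitting rule, and byte strings stand in for audio.

```python
from typing import Callable, Dict, List

# Hypothetical local databases; byte strings stand in for recorded audio.
DEMO_TEXT_DB: Dict[str, str] = {"turn left": "P1", "turn": "P2", "left": "P3"}
DEMO_SPEECH_DB: Dict[str, bytes] = {
    "P1": b"[turn left]",
    "P2": b"[turn]",
    "P3": b"[left]",
}


def synthesize_offline(keyword: str) -> bytes:
    """Placeholder for an offline text-to-speech engine (hypothetical)."""
    return b"<tts:" + keyword.encode("utf-8") + b">"


def speak(
    target_text: str,
    text_db: Dict[str, str] = DEMO_TEXT_DB,
    speech_db: Dict[str, bytes] = DEMO_SPEECH_DB,
    tts: Callable[[str], bytes] = synthesize_offline,
) -> bytes:
    key = " ".join(target_text.lower().split())

    # Step 1: try to match the target text as a whole.
    if key in text_db:
        return speech_db[text_db[key]]

    # Step 2: on failure, split into keywords (whitespace split is an assumption).
    segments: List[bytes] = []
    for keyword in key.split():
        preset = text_db.get(keyword)
        if preset is not None:
            # Step 3: matched keyword -> recorded segment from the speech database.
            segments.append(speech_db[preset])
        else:
            # Step 4: unmatched keyword -> synthesized segment.
            segments.append(tts(keyword))

    # Step 5: splice the segments in the order the keywords appear in the text.
    return b"".join(segments)
```

Under these assumptions, `speak("turn left")` returns the single whole-phrase recording, while `speak("turn sharply left")` returns the recorded segment for "turn", a synthesized segment for "sharply", and the recorded segment for "left", concatenated in that order.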
Regarding claim 6, Meyer discloses an electronic device, comprising:
at least one processor (Figure 5, Processor 510);
and a storage device communicatively connected to the at least one processor (Figure 5, Memory 520);
wherein, the storage device stores an instruction executable by the at least one processor, and when the instruction executed by the at least one processor, the processor implements a speech output method, and the speech output method comprises:
determining a target text to be processed (Column 4, lines 20-24, "One embodiment is directed to a method for providing a speech output for a speech-enabled application, the method comprising receiving from the speech-enabled application a text input comprising a text transcription of a desired speech output");
determining a preset text corresponding to the target text by matching the target text with a local text database (Column 4, lines 24-27, "selecting, using at least one computer system, at least one audio recording provided by a developer of the speech-enabled application, the at least one audio recording corresponding to at least a first portion of the text input");
and determining, based on the preset text, output speech of the target text from a local speech database to output the output speech (Column 4, lines 24-29, "selecting, using at least one computer system, at least one audio recording provided by a developer of the speech-enabled application, the at least one audio recording corresponding to at least a first portion of the text input; and providing for the speech-enabled application a speech output comprising the at least one audio recording.");
wherein the local speech database is pre-configured based on a correspondence between a text and speech (Column 7, lines 20-26, "In accordance with some embodiments of the present invention, the developer of the speech-enabled application may decide which portions of desired output speech prompts to pre-record as prompt recordings and to provide to the synthesis system, and may engage a desired voice talent to speak the prompt recordings in precisely the style the developer prefers.").
Regarding claim 7, Meyer discloses the electronic device as claimed in claim 6, wherein determining the preset text corresponding to the target text by matching the target text with the local text database comprises:
in response to failing to determine the preset text corresponding to the target text by matching the target text as a whole with the local text database, splitting the target text to obtain at least two target keywords; and matching the at least two target keywords with the local text database respectively to determine preset keywords corresponding to the target keywords (Column 23, lines 38-47, "If no audio recording is available whose metadata information and/or constraints match all of the information and/or constraints of a portion of the text input, one or more matches may be identified as audio recordings whose metadata information and/or constraints match some subset of the information and/or constraints of that portion of the text input, without conflicting constraints. If the determination at act 440 is that a match is available, method 400 may proceed to act 450, at which one or more best matches may be selected.");
and determining, based on the preset text, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, the output speech of the target text from the local speech database (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; The audio recordings selected in the iterations of “act 450” read on the output speech of the target text from the local speech database.).
Regarding claim 8, Meyer discloses the electronic device as claimed in claim 7, wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises:
determining, based on the preset keywords, speech segments corresponding to the target keywords from the local speech database (Column 23, lines 38-47, "If no audio recording is available whose metadata information and/or constraints match all of the information and/or constraints of a portion of the text input, one or more matches may be identified as audio recordings whose metadata information and/or constraints match some subset of the information and/or constraints of that portion of the text input, without conflicting constraints. If the determination at act 440 is that a match is available, method 400 may proceed to act 450, at which one or more best matches may be selected.");
and splicing the speech segments based on a sequence of the target keywords in the target text, to obtain the output speech of the target text (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; Concatenating the audio recordings selected in the iterations of “act 450” reads on splicing the speech segments.).
Regarding claim 9, Meyer discloses the electronic device as claimed in claim 8, wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises:
for a specific keyword that fails to match with a preset keyword from the local text database in the at least two target keywords, determining a synthesized speech segment corresponding to the specific keyword by adopting offline text to speech (Column 24, lines 40-45, "If at any iteration it is determined at act 440 that no matching audio recording is available for any remaining unmatched portion(s) of the text input, method 400 may proceed to act 470, at which additional audio segment(s) for the unmatched portion(s) of the text input may be generated using TTS synthesis.");
and splicing, based on the sequence of the target keywords in the target text, the synthesized speech segment and the speech segment determined from the local speech database to obtain the output speech of the target text (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; The audio segments generated at “act 470” read on the synthesized speech segment, the audio recordings selected in “act 450” read on the speech segment determined from the local speech database, and concatenating the audio recordings and audio segments reads on splicing the speech segments.).
Regarding claim 11, Meyer discloses a non-transitory computer-readable storage medium having a computer instruction stored thereon (Column 25, lines 39-44, “To perform any of the functionality described herein, the processor 510 may execute one or more instructions stored in one or more computer-readable storage media (e.g., the memory 520), which may serve as non-transitory computer-readable storage media storing instructions for execution by the processor 510.”),
wherein the computer instruction is configured to make a computer implement a speech output method, and the speech output method comprises:
determining a target text to be processed (Column 4, lines 20-24, "One embodiment is directed to a method for providing a speech output for a speech-enabled application, the method comprising receiving from the speech-enabled application a text input comprising a text transcription of a desired speech output");
determining a preset text corresponding to the target text by matching the target text with a local text database (Column 4, lines 24-27, "selecting, using at least one computer system, at least one audio recording provided by a developer of the speech-enabled application, the at least one audio recording corresponding to at least a first portion of the text input");
and determining, based on the preset text, output speech of the target text from a local speech database to output the output speech (Column 4, lines 24-29, "selecting, using at least one computer system, at least one audio recording provided by a developer of the speech-enabled application, the at least one audio recording corresponding to at least a first portion of the text input; and providing for the speech-enabled application a speech output comprising the at least one audio recording.");
wherein the local speech database is pre-configured based on a correspondence between a text and speech (Column 7, lines 20-26, "In accordance with some embodiments of the present invention, the developer of the speech-enabled application may decide which portions of desired output speech prompts to pre-record as prompt recordings and to provide to the synthesis system, and may engage a desired voice talent to speak the prompt recordings in precisely the style the developer prefers.").
Regarding claim 12, Meyer discloses the storage medium as claimed in claim 11, wherein determining the preset text corresponding to the target text by matching the target text with the local text database comprises:
in response to failing to determine the preset text corresponding to the target text by matching the target text as a whole with the local text database, splitting the target text to obtain at least two target keywords; and matching the at least two target keywords with the local text database respectively to determine preset keywords corresponding to the target keywords (Column 23, lines 38-47, "If no audio recording is available whose metadata information and/or constraints match all of the information and/or constraints of a portion of the text input, one or more matches may be identified as audio recordings whose metadata information and/or constraints match some subset of the information and/or constraints of that portion of the text input, without conflicting constraints. If the determination at act 440 is that a match is available, method 400 may proceed to act 450, at which one or more best matches may be selected.");
and determining, based on the preset text, the output speech of the target text from the local speech database comprises: determining, based on the preset keywords, the output speech of the target text from the local speech database (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; The audio recordings selected in the iterations of “act 450” read on the output speech of the target text from the local speech database.).
Regarding claim 13, Meyer discloses the storage medium as claimed in claim 12, wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises:
determining, based on the preset keywords, speech segments corresponding to the target keywords from the local speech database (Column 23, lines 38-47, "If no audio recording is available whose metadata information and/or constraints match all of the information and/or constraints of a portion of the text input, one or more matches may be identified as audio recordings whose metadata information and/or constraints match some subset of the information and/or constraints of that portion of the text input, without conflicting constraints. If the determination at act 440 is that a match is available, method 400 may proceed to act 450, at which one or more best matches may be selected.");
and splicing the speech segments based on a sequence of the target keywords in the target text, to obtain the output speech of the target text (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; Concatenating the audio recordings selected in the iterations of “act 450” reads on splicing the speech segments.).
Regarding claim 14, Meyer discloses the storage medium as claimed in claim 13, wherein determining, based on the preset keywords, the output speech of the target text from the local speech database comprises:
for a specific keyword that fails to match with a preset keyword from the local text database in the at least two target keywords, determining a synthesized speech segment corresponding to the specific keyword by adopting offline text to speech (Column 24, lines 40-45, "If at any iteration it is determined at act 440 that no matching audio recording is available for any remaining unmatched portion(s) of the text input, method 400 may proceed to act 470, at which additional audio segment(s) for the unmatched portion(s) of the text input may be generated using TTS synthesis.");
and splicing, based on the sequence of the target keywords in the target text, the synthesized speech segment and the speech segment determined from the local speech database to obtain the output speech of the target text (Column 25, lines 18-21, "At act 480, any audio recording(s) selected in the various iterations of act 450 and any additional audio segment(s) generated at act 470 may be concatenated to produce a speech output."; The audio segments generated at “act 470” read on the synthesized speech segment, the audio recordings selected in “act 450” read on the speech segment determined from the local speech database, and concatenating the audio recordings and audio segments reads on splicing the speech segments.).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5, 10 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Meyer in view of Zhao et al. (US Patent No. 8,996,377), hereinafter Zhao.
Regarding claim 5, Meyer discloses the method as claimed in claim 1.
Meyer does not specifically disclose: which is applied to an offline navigation scene, wherein the local speech database comprises navigation terms.
Zhao teaches:
which is applied to an offline navigation scene, wherein the local speech database comprises navigation terms (Column 1, lines 27-33, "A text-to-speech (TTS) engine combines recorded speech with synthesized speech from a TTS synthesizer based on text input. The TTS engine receives the text input and identifies the domain for the speech (e.g. navigation, dialing, . . . ). The identified domain is used in selecting domain specific speech recordings (e.g. pre-recorded static phrases such as "turn left", "turn right" . . . ) from the input text.";  Column 2, lines 14-19, "The recorded speech is stored within a data store, such as voice data 130. Some of the voice data may be static prompts for the specific domains. For example, prompts for a specific domain such as a navigation system are stored as static phrases (e.g. turn left, turn left onto, arrive at, stay to the right, merge onto, and the like.").
Zhao teaches using recorded speech to convert text to speech for navigation systems in order to generate speech from text with high quality speech recordings (Column 2, lines 13-25, "Narrators are typically used to create recorded speech of high quality for different domains. The recorded speech is stored within a data store, such as voice data 130. Some of the voice data may be static prompts for the specific domains. For example, prompts for a specific domain such as a navigation system are stored as static phrases (e.g. turn left, turn left onto, arrive at, stay to the right, merge onto, and the like. Other voice data is also stored in voice data 130 and/or some other data store. The speech is segmented into one or more of: phonemes, diphones, syllables, morphemes, words, phrases and sentences. Generally, each sound in a chosen language is recorded in at least one voice such that the TTS engine can select the appropriate sounds to create the desired speech.").
Meyer and Zhao are considered to be analogous to the claimed invention because they are in the same field of text-to-speech systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Meyer to incorporate the teachings of Zhao to use recorded speech to convert text to speech for navigation systems.  Doing so would allow for generating speech from text with high quality speech recordings.
Regarding claim 10, Meyer discloses the electronic device as claimed in claim 6.
Meyer does not specifically disclose: wherein the local speech database comprises navigation terms.
Zhao teaches:
wherein the local speech database comprises navigation terms (Column 1, lines 27-33, "A text-to-speech (TTS) engine combines recorded speech with synthesized speech from a TTS synthesizer based on text input. The TTS engine receives the text input and identifies the domain for the speech (e.g. navigation, dialing, . . . ). The identified domain is used in selecting domain specific speech recordings (e.g. pre-recorded static phrases such as "turn left", "turn right" . . . ) from the input text.";  Column 2, lines 14-19, "The recorded speech is stored within a data store, such as voice data 130. Some of the voice data may be static prompts for the specific domains. For example, prompts for a specific domain such as a navigation system are stored as static phrases (e.g. turn left, turn left onto, arrive at, stay to the right, merge onto, and the like.").
Zhao teaches using recorded speech to convert text to speech for navigation systems in order to generate speech from text with high quality speech recordings (Column 2, lines 13-25, "Narrators are typically used to create recorded speech of high quality for different domains. The recorded speech is stored within a data store, such as voice data 130. Some of the voice data may be static prompts for the specific domains. For example, prompts for a specific domain such as a navigation system are stored as static phrases (e.g. turn left, turn left onto, arrive at, stay to the right, merge onto, and the like. Other voice data is also stored in voice data 130 and/or some other data store. The speech is segmented into one or more of: phonemes, diphones, syllables, morphemes, words, phrases and sentences. Generally, each sound in a chosen language is recorded in at least one voice such that the TTS engine can select the appropriate sounds to create the desired speech.").
Meyer and Zhao are considered to be analogous to the claimed invention because they are in the same field of text-to-speech systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Meyer to incorporate the teachings of Zhao to use recorded speech to convert text to speech for navigation systems.  Doing so would allow for generating speech from text with high quality speech recordings.
Regarding claim 15, Meyer discloses the storage medium as claimed in claim 11.
Meyer does not specifically disclose: wherein the local speech database comprises navigation terms.
Zhao teaches:
wherein the local speech database comprises navigation terms (Column 1, lines 27-33, "A text-to-speech (TTS) engine combines recorded speech with synthesized speech from a TTS synthesizer based on text input. The TTS engine receives the text input and identifies the domain for the speech (e.g. navigation, dialing, . . . ). The identified domain is used in selecting domain specific speech recordings (e.g. pre-recorded static phrases such as "turn left", "turn right" . . . ) from the input text.";  Column 2, lines 14-19, "The recorded speech is stored within a data store, such as voice data 130. Some of the voice data may be static prompts for the specific domains. For example, prompts for a specific domain such as a navigation system are stored as static phrases (e.g. turn left, turn left onto, arrive at, stay to the right, merge onto, and the like.").
Zhao teaches using recorded speech to convert text to speech for navigation systems in order to generate speech from text with high-quality speech recordings (Column 2, lines 13-25, "Narrators are typically used to create recorded speech of high quality for different domains. The recorded speech is stored within a data store, such as voice data 130. Some of the voice data may be static prompts for the specific domains. For example, prompts for a specific domain such as a navigation system are stored as static phrases (e.g. turn left, turn left onto, arrive at, stay to the right, merge onto, and the like. Other voice data is also stored in voice data 130 and/or some other data store. The speech is segmented into one or more of: phonemes, diphones, syllables, morphemes, words, phrases and sentences. Generally, each sound in a chosen language is recorded in at least one voice such that the TTS engine can select the appropriate sounds to create the desired speech.").
Meyer and Zhao are considered to be analogous to the claimed invention because they are in the same field of text-to-speech systems.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Meyer to incorporate the teachings of Zhao to use recorded speech to convert text to speech for navigation systems.  Doing so would allow for generating speech from text with high-quality speech recordings.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James Boggs whose telephone number is (571)272-2968. The examiner can normally be reached M-F 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JAMES BOGGS/Examiner, Art Unit 2657

/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657