DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments with respect to the rejection of claims 1 and 12 under 35 U.S.C. 102 have been considered but are moot because of the new grounds of rejection necessitated by the amendments. See the detailed rejection below.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 14 recites the limitation "an emotion."  There is insufficient antecedent basis for this limitation in the claim. Appropriate correction is required. 

Claim Rejections - 35 USC § 102
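In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.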
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-2, 5-6, 8-9, 11-13, 16-17 and 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Agapi et al. (US 2009/0299733).

Claims 1 and 12,
Agapi teaches an electronic apparatus comprising: a memory configured to store at least one instruction; and a processor configured to execute the at least one instruction stored in the memory, which when executed causes the processor to control to ([0014] a system 100 for generating and representing XML data, according to one embodiment of the invention. The system 100 illustratively includes one or more processors 102, which can comprise a plurality of registers, logic-gates, and other logic-based circuitry (not explicitly shown) for processing electronic data according to a predetermined set of processing instructions): 
acquire input data to be input into a text-to-speech (TTS) module for outputting a voice through the TTS module, acquire a voice signal corresponding to the input data through the TTS module ([0020] the text T1 can be input to the text-to-speech engine 104. The audible rendering of the text T1 produced by the text-to-speech engine 104), 
identify at least one of a length of the voice signal, an emotion of the voice signal or a spacing of the voice signal, detect an error in the voice signal based on the identified at least one of the length of ([0021] [0058-0060] the parsed version of the synthesized speech output can be used by the comparator-annotator 110 as a reference or basis for comparison; the comparator-annotator 110 can compare characteristics of the recorded user speech to the corresponding portions of the base and determine whether the difference exceeds a predetermined threshold; if the difference exceeds the predetermined threshold, the comparator-annotator 110 can be configured to insert an appropriate element; determining whether a difference between an amplitude of a word contained in the recorded voice utterances and an amplitude of a corresponding word contained in the synthesized speech output exceeds a predetermined threshold; determining whether a difference between a speech length of a word contained in the recorded voice utterances and a synthesized speech length of a corresponding word contained in the synthesized speech output exceeds a predetermined threshold; determining whether a difference between a period of silence between a pair of words contained in the recorded voice utterances and a period of silence between a pair of corresponding words contained in the synthesized speech output exceeds a predetermined threshold).
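For illustration only, the threshold comparison described in the cited paragraphs can be sketched as follows. The sketch assumes per-word durations and inter-word silences (in seconds) have already been measured for both the recorded utterance and the synthesized output. The names Word and find_deviations and the 0.15-second threshold are hypothetical and do not appear in Agapi.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    duration: float     # seconds the word is spoken (assumed pre-measured)
    pause_after: float  # seconds of silence before the next word


def find_deviations(recorded, synthesized, threshold=0.15):
    """Compare corresponding words and report differences above the threshold.

    Returns a list of (word_index, kind, difference) tuples, where kind is
    "length" or "silence" and difference is recorded minus synthesized.
    The threshold value is an illustrative assumption.
    """
    deviations = []
    for i, (rec, syn) in enumerate(zip(recorded, synthesized)):
        length_diff = rec.duration - syn.duration
        if abs(length_diff) > threshold:
            deviations.append((i, "length", round(length_diff, 3)))
        pause_diff = rec.pause_after - syn.pause_after
        if abs(pause_diff) > threshold:
            deviations.append((i, "silence", round(pause_diff, 3)))
    return deviations
```

A word pair whose speech length or trailing silence differs by more than the threshold is reported; pairs within the threshold produce no entry.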

Claims 2 and 13,
Agapi further teaches the electronic apparatus of claim 1, wherein the input data comprises first text data, and the processor when executing the at least one instruction is further configured to: convert the voice signal into second text data, compare the first text data included in the input data and the second text data, and detect the error in the voice signal based on a result of comparing the first text data and the second text data ([0048] the system 100 can detect where an SSML <sub> is to be inserted by conveying the recorded voice utterances as an audio file to a transcription server; the server returns the text, and a comparison is made between the original text input and the transcribed text, showing where any differences occur).
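For illustration only, the comparison between the original text input and the transcribed text can be sketched with a word-level diff. The sketch assumes the transcription has already been returned as a string. The function name text_differences is hypothetical, and the use of Python's difflib is a swapped-in technique chosen for the example, not a method disclosed in Agapi.

```python
import difflib


def text_differences(original, transcribed):
    """Return the word-level differences between two texts.

    Each entry is (operation, original_words, transcribed_words), where
    operation is "replace", "delete", or "insert". Case is ignored, which
    is an assumption made for this sketch.
    """
    orig_words = original.lower().split()
    trans_words = transcribed.lower().split()
    matcher = difflib.SequenceMatcher(a=orig_words, b=trans_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, orig_words[i1:i2], trans_words[j1:j2]))
    return diffs
```

An empty result indicates that the transcribed text matches the original text input word for word; each non-empty entry marks a location where the two texts differ.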

Claims 5 and 16,
Agapi further teaches the electronic apparatus of claim 1, wherein the processor when executing the at least one instruction is further configured to: based on detecting the error in the voice signal, correct at least one of the spacing or a punctuation mark of text data included in information on the input data, and input corrected input data having the at least one of the spacing or the punctuation mark of the text data into the TTS module ([0060] the comparison performed at step 212 comprises determining whether a difference between a period of silence between a pair of words contained in the recorded voice utterances and a period of silence between a pair of corresponding words contained in the synthesized speech output exceeds a predetermined threshold; the annotating step 212 then further comprises annotating the XML-based speech synthesis document with an XML break element to increase the period of silence between the pair of corresponding words if the difference exceeds the predetermined threshold).
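For illustration only, annotating text with an SSML break element where the recorded silence exceeds the synthesized silence by more than a threshold can be sketched as follows. The sketch assumes the silences between adjacent words (in seconds) are already known. The function name annotate_with_breaks and the 0.15-second threshold are hypothetical, and the output is a minimal fragment that omits the version and namespace attributes a complete SSML document would carry.

```python
from xml.sax.saxutils import escape


def annotate_with_breaks(words, recorded_pauses, synthesized_pauses, threshold=0.15):
    """Build a minimal SSML fragment with break elements where needed.

    recorded_pauses[i] and synthesized_pauses[i] are the silences, in
    seconds, between words[i] and words[i + 1]. A break is inserted only
    when the recorded silence is longer by more than the threshold.
    """
    parts = []
    for i, word in enumerate(words):
        parts.append(escape(word))  # escape &, <, > so the markup stays well-formed
        if i < len(words) - 1:
            diff = recorded_pauses[i] - synthesized_pauses[i]
            if diff > threshold:
                millis = round(recorded_pauses[i] * 1000)
                parts.append(f'<break time="{millis}ms"/>')
    return "<speak>" + " ".join(parts) + "</speak>"
```

The annotated fragment could then be supplied to a text-to-speech engine in place of the unannotated text, so that the synthesized silence is lengthened at the marked position.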

Claims 6 and 17,
Agapi further teaches the electronic apparatus of claim 1, wherein the processor when executing the at least one instruction is further configured to: based on detecting the error in the voice signal, correct the input data by applying a speech synthesis markup language (SSML) to text data included in the input data, and input corrected input data having the speech synthesis markup language (SSML) applied to the text data into the TTS module ([0018] the procedures and functions performed by the system 100 can be initiated when a system user begins the SSML or other XML-based speech synthesis application for auto tagging a speech synthesis document).

Claims 8 and 19,
Agapi further teaches the electronic apparatus of claim 1, further comprising: a speaker, wherein the processor when executing the at least one instruction is further configured to: add an indicator indicating correction to the voice signal, and output the voice signal having the indicator through the speaker ([0006] [0021] recording and output device; the comparator-annotator 110 can be configured to insert an appropriate element; it is inherent that the system requires a speaker to audibly output the synthesized speech).

Claims 9 and 20,
Agapi further teaches the electronic apparatus of claim 1, further comprising: a speaker; and a microphone, wherein the processor when executing the at least one instruction is further configured to: output the voice signal through the speaker, and based on the voice signal output through the speaker being received through the microphone, detect the error in the voice signal received through the microphone based on the input data ([0006] [0024] recording and output device; it is inherent that the system requires a speaker to audibly output the synthesized speech; the speech synthesis document comparator-annotator 110 can be further configured to annotate the XML-based speech synthesis document with an XML prosody element to increase the synthesized speech length of the corresponding word if the difference exceeds the predetermined threshold).

Claim 11,
Agapi further teaches the electronic apparatus of claim 1, further comprising: a communicator, wherein the processor when executing the at least one instruction is further configured to: transmit the ([Fig. 1] [0048] the system can convey the recorded voice utterances as an audio file to a transcription server).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 7 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Agapi et al. (US 2009/0299733) and further in view of Pore et al. (US 2019/0295527).

Claims 7 and 18, 
Agapi teaches all the limitations in claim 1. The difference between the prior art and the claimed invention is that Agapi does not explicitly teach convert a received user voice into text data by using a voice recognition module, analyze an intent of the text data, and acquire response information corresponding to the received user voice as the input data.
Pore teaches convert a received user voice into text data by using a voice recognition module, analyze an intent of the text data, and acquire response information corresponding to the received user voice as the input data ([0016] speech-to-text; NLP for extracting key topic, sentiment and the features).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Agapi with teachings of Pore by modifying the method and system for creating and editing an XML-based speech synthesis document as (Pore [0001]).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Agapi et al. (US 2009/0299733) and further in view of Wood et al. (US 2006/0149546).

Claim 10,
Agapi teaches all the limitations in claim 9. The difference between the prior art and the claimed invention is that Agapi does not explicitly teach identify an identity of the voice signal received through the microphone, based on the voice signal received through the microphone being the voice signal output through the speaker based on the identity, detect the error in the voice signal, and based on the voice signal received through the microphone having been uttered by a user based on the identity, convert the voice signal into text data by using a voice recognition module, and analyze an intent of the text data and acquire response information corresponding to the received user voice as the input data.
Wood teaches identify an identity of the voice signal received through the microphone, based on the voice signal received through the microphone being the voice signal output through the speaker based on the identity, detect the error in the voice signal, and based on the voice signal received through the microphone having been uttered by a user based on the identity, convert the voice signal into text data by using a voice recognition module, and analyze an intent of the text data and acquire response information corresponding to the received user voice as the input data ([0006] (i) the at least one user request intent, determined based on the user request, and (ii) a transcribed text of the at least one user request, the transcribed text being based on speech recognition processing of the user request).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Agapi with the teachings of Wood by modifying the method and system for creating and editing an XML-based speech synthesis document as taught by Agapi to include identify an identity of the voice signal received through the microphone, based on the voice signal received through the microphone being the voice signal output through the speaker based on the identity, detect the error in the voice signal, and based on the voice signal received through the microphone having been uttered by a user based on the identity, convert the voice signal into text data by using a voice recognition module, and analyze an intent of the text data and acquire response information corresponding to the received user voice as the input data as taught by Wood, for the benefit of registering new intents, which increases the knowledge of the general purpose virtual assistant or overloads the handling of an existing intent (Wood [Abstract]).

Allowable Subject Matter
Claims 3 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Runge et al. (US 2006/0149546) – Communication system, communication emitter and appliance for detecting erroneous text messages
Zhao et al. (US 2014/0257815) – Speech recognition assisted evaluation on text-to-speech pronunciation issue detection
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHREYANS A PATEL whose telephone number is (571)270-0689. The examiner can normally be reached Monday-Friday 8am-5pm PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit 

SHREYANS A. PATEL
Examiner
Art Unit 2657



/SHREYANS A PATEL/Examiner, Art Unit 2656