DETAILED ACTION
Introduction
This office action is in response to Applicant’s submission filed on 03/02/2021. Claims 1-19 are pending in the application and have been examined.
	
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 7-12 and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Slifka et al., US Patent 10,163,436, in view of Wieman et al., US Patent Application Publication 2021/0174783.
Regarding claim 1, Slifka teaches a method of controlling an electronic device, the method comprising: performing natural language understanding for a first text included in learning data (see Slifka, col. 3, lines 51-57: the server 120 may receive (150) text data from the application server 125; this data is used to train the NLU component of the system); obtaining first information associated with a speech corresponding to the first text being uttered based on a result of the natural language understanding (see Slifka, col. 5, lines 9-53, which teaches how the speech obtained based on the prompt is processed using the NLU component); obtaining second information associated with an acoustic feature corresponding to the speech corresponding to the first text being uttered based on the first information (see Slifka, col. 7, lines 11-26, which teaches processing of audio data to obtain different features for ASR processing); and training a speech recognition model based on the plurality of obtained speech signals and the first text (see Slifka, col. 5, lines 30-35: the server 120 may then use (160) the created text to further train the NLU component regarding how spoken utterances triggering the command associated with the application server 125 are spoken by users). However, Slifka fails to teach obtaining a plurality of speech signals corresponding to the first text by converting a first speech signal corresponding to the first text based on the first information and the second information.
Wieman teaches obtaining a plurality of speech signals corresponding to the first text by converting a first speech signal corresponding to the first text based on the first information and the second information (see Wieman, [0144]: FIG. 10 shows a diagram of a neural TTS generator 1001. It accepts text of variable values as input; Wieman, [0146]: create a corpus of a diverse range of voices; Wieman, [0163]-[0170], which teaches determining a plurality of parameter sets that represent the diversity of users' voices with minimal bias; interpreted as obtaining speech segments for training based on first information and second information).
Slifka and Wieman are considered to be analogous to the claimed invention because they relate to training speech recognition systems. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Slifka on a speech processing system including a dynamic NLU component that enables variable utterance structures with the speech synthesis parameter set teachings of Wieman to reduce corpora for domain-specific language models (see Wieman, [0015]).
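For illustration only, the sequence of steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Slifka, Wieman, or the application: the names `understand`, `FEATURES`, `convert`, and `augment`, the place labels, and the numeric values are all hypothetical, and a speech signal is modeled as a plain list of samples.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AcousticFeature:
    gain: float         # hypothetical stand-in for a speaker or microphone characteristic
    noise_level: float  # hypothetical stand-in for noise at the utterance place


def understand(text):
    """Toy stand-in for NLU: returns 'first information' as a probability per place."""
    # Assumption for illustration: oven-related text is mostly uttered in a kitchen.
    if "oven" in text:
        return {"kitchen": 0.8, "living_room": 0.2}
    return {"kitchen": 0.5, "living_room": 0.5}


# 'Second information': one hypothetical acoustic feature per place.
FEATURES = {
    "kitchen": AcousticFeature(gain=0.7, noise_level=0.2),
    "living_room": AcousticFeature(gain=1.0, noise_level=0.05),
}


def convert(signal, feature, rng):
    """Convert a signal by scaling it and adding bounded uniform noise."""
    return [
        feature.gain * s + rng.uniform(-feature.noise_level, feature.noise_level)
        for s in signal
    ]


def augment(text, first_signal, n, seed=0):
    """Derive n (signal, text) training pairs from one first speech signal."""
    rng = random.Random(seed)
    first_info = understand(text)
    places = list(first_info)
    weights = [first_info[p] for p in places]
    pairs = []
    for _ in range(n):
        # Pick a place according to the first information, then apply its feature.
        place = rng.choices(places, weights=weights, k=1)[0]
        pairs.append((convert(first_signal, FEATURES[place], rng), text))
    return pairs
```

Under these assumptions, calling `augment("preheat the oven", [0.0, 0.5, -0.5], n=4)` returns four converted signals, each paired with the unchanged first text, which is the form of data a speech recognition model would then be trained on.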
Regarding claim 2, Slifka in view of Wieman teaches the method of claim 1. Wieman further teaches wherein the first information comprises probability information for each of a plurality of parameters that indicate the speech corresponding to the first text being uttered (see Wieman, [0010]-[0011]: Phrasings within grammars can be expressed in various specific formats such as regular expression format, or proprietary formats. FIG. 3 shows an example grammar with the phrasing “what's the weather [going to be] in <PLACE> [on] <TIME>”. Grammars produce intents and scores. For example, the grammar shown in FIG. 3 will produce an intent with a weather API universal resource locator (URL) that includes arguments filled with the PLACE and TIME slot values. Grammars also can give scores based on the probability of particular phrasings; the scores are interpreted as probability information).
Regarding claim 3, Slifka in view of Wieman teaches the method of claim 2. Wieman further teaches wherein the plurality of parameters comprise at least one of a parameter for a main body of the speech, a parameter for a receiving device of the speech, or a parameter for an utterance place of the speech (see Wieman, [0009] and Fig. 3, which show that the transcription hypotheses are compared to grammars 301. Grammars include phrasings and slots. Slots are place holders for information that can be filled from lists of possible values), wherein probability information for each of the plurality of parameters comprises at least one of a probability distribution regarding who is a main body of the speech, a probability distribution regarding what is a receiving device of the speech, and a probability distribution regarding where is an utterance place of the speech (see Wieman, [0011] and Fig. 3, which teach that grammars produce intents and scores, which are interpreted as probability information of parameters in parts of the speech).
Regarding claim 7, Slifka in view of Wieman teaches the method of claim 1. Wieman further teaches, based on a speech signal corresponding to the first text not existing in the learning data, obtaining the first speech signal by inputting the first text to a speech synthesis model (see Wieman, [0144]: FIG. 10 shows a diagram of a neural TTS generator 1001. It accepts text of variable values as input. Systems may accept words, letters, or phonemes as input. Systems may accept different numbers of inputs. The TTS generator 1001 uses a neural network with a set of weights to convert the input text to output speech audio).
Regarding claim 8, Slifka in view of Wieman teaches the method of claim 1. Wieman further teaches wherein the first text is obtained by inputting the first speech signal to the speech recognition model (see Wieman, Fig. 2, which shows transcript or text data obtained from input speech data).
Regarding claim 9, Slifka in view of Wieman teaches the method of claim 1. Slifka further teaches wherein the training further comprises, based on the plurality of obtained speech signals being input to the speech recognition model, training the speech recognition model to output the first text (see Slifka, col. 4, lines 4-13 and col. 5, lines 30-35: the server 120 may then use (160) the created text to further train the NLU component regarding how spoken utterances triggering the command associated with the application server 125 are spoken by users).
Claim 10 is directed to a device claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claim 11 is directed to a device claim corresponding to the method claim presented in claim 2 and is rejected under the same grounds stated above regarding claim 2.
Claim 12 is directed to a device claim corresponding to the method claim presented in claim 3 and is rejected under the same grounds stated above regarding claim 3.
Claim 16 is directed to a device claim corresponding to the method claim presented in claim 7 and is rejected under the same grounds stated above regarding claim 7.
Claim 17 is directed to a device claim corresponding to the method claim presented in claim 8 and is rejected under the same grounds stated above regarding claim 8.
Claim 18 is directed to a device claim corresponding to the method claim presented in claim 9 and is rejected under the same grounds stated above regarding claim 9.
Claim 19 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claims 4-6 and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Slifka et al., US Patent 10,163,436, in view of Wieman et al., US Patent Application Publication 2021/0174783, and further in view of Kim et al., US Patent Application Publication 2018/0061394.
Regarding claim 4, Slifka in view of Wieman teaches the method of claim 3. Slifka further teaches wherein the second information comprises at least one of information about a speech feature of the main body of utterance, wherein the obtaining the plurality of speech signals comprises obtaining a plurality of speech signals comprising at least one of the speech feature of the main body of the utterance (see Slifka, col. 7, lines 11-26, which teaches processing of audio data to obtain different features for ASR processing).
However, Slifka in view of Wieman fails to teach information about a microphone feature of the receiving device, wherein the obtaining the plurality of speech signals comprises obtaining a plurality of speech signals comprising at least one of the speech feature of the main body of the utterance, the microphone feature of the receiving device.
Kim teaches information about a microphone feature of the receiving device, or information about a noise feature of the utterance place, wherein the obtaining the plurality of speech signals comprises obtaining a plurality of speech signals comprising at least one of the speech feature of the main body of the utterance, the microphone feature of the receiving device, or the noise feature of the utterance place (see Kim, [0071] and Fig. 3: in operation 310, user information and device information are obtained. The user information refers to information associated with the user in context information, which is used to analyze an intent of an utterance of the user. For example, the user information includes at least one of user profile information including, for example, an age of the user, a gender of the user, and history information of the user, or surrounding environment information; the surrounding environment information is interpreted as the noise feature of the utterance place).
Slifka, Wieman and Kim are considered to be analogous to the claimed invention because they relate to training speech recognition systems. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Slifka and Wieman on a speech processing system and training with speech synthesis parameter sets with the probability distribution of intent teachings of Kim to analyze an intent or meaning from the received voice signal and execute a task corresponding to the analyzed intent (see Kim, [0004]).
Regarding claim 5, Slifka in view of Wieman and Kim teaches the method of claim 4. Wieman further teaches wherein the obtaining the plurality of speech signals comprises obtaining the plurality of speech signals so that an entire speech signal corresponding to the first text includes an acoustic feature of a ratio corresponding to probability information for each of the plurality of parameters in the learning data (see Wieman, [0156]-[0161] and [0172], which teach generating synthesized speech signals based on probability or intent scores).
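For illustration only, one possible reading of the "ratio corresponding to probability information" limitation is that the number of speech signals generated for each parameter value is proportional to that value's probability. The sketch below assumes that reading and is not taken from Wieman or the application; the function name `allocate` and the speaker labels are hypothetical, and the probabilities are assumed to sum to 1.

```python
import math


def allocate(probabilities, total):
    """Split `total` signals across parameter values in proportion to their probabilities."""
    exact = {k: p * total for k, p in probabilities.items()}
    counts = {k: math.floor(v) for k, v in exact.items()}
    leftover = total - sum(counts.values())
    # Largest-remainder rule: hand the remaining signals to the values
    # whose exact share was rounded down the most.
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


# Hypothetical distribution over who is the main body of the speech.
speaker_counts = allocate({"adult": 0.5, "child": 0.3, "elderly": 0.2}, total=7)
```

The largest-remainder rule is chosen here only so that the per-value counts always add up to the requested total; any other proportional rounding scheme would serve the same illustrative purpose.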
Regarding claim 6, Slifka in view of Wieman and Kim teaches the method of claim 5. Wieman further teaches wherein the probability information of the at least one parameter, among the plurality of parameters, is preset (see Wieman, [0144], which teaches a TTS generator using a set of weights to convert the input text to output speech audio; the set of weights is interpreted as preset probability information of one parameter).
Claim 13 is directed to a device claim corresponding to the method claim presented in claim 4 and is rejected under the same grounds stated above regarding claim 4.
Claim 14 is directed to a device claim corresponding to the method claim presented in claim 5 and is rejected under the same grounds stated above regarding claim 5.
Claim 15 is directed to a device claim corresponding to the method claim presented in claim 6 and is rejected under the same grounds stated above regarding claim 6.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Wang et al., US Patent Application Publication 2020/0335084 (cited in IDS), teaches performing a plurality of augmentations to the plurality of voice data articulations of predetermined phrases, to generate a corpus audio data set that includes the first quantity of audio samples and a second quantity of audio samples including augmented versions of the first quantity of audio samples (see Wang, abstract).
Li, Jason, et al. "Training neural speech recognition systems with synthetic speech augmentation." arXiv preprint arXiv:1811.00707 (2018) teaches using synthetic data to build large neural speech recognition systems (see Li, sections 2 and 3).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday, 12:00 pm - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta, can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/NANDINI SUBRAMANI/Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656