DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office action is in response to correspondence filed 06 July 2022 in reference to application 16/996,408.  Claims 1, 3, 4, and 6-8 are pending and have been examined.

Response to Amendment
The amendment filed 06 July 2022 has been accepted and considered in this Office action.  Claims 1 and 8 have been amended, and claims 2 and 5 have been cancelled.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 3, 4, and 6-8 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim(s) 1, 3, 4, and 6-8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Keskin et al. (Many-to-many Voice Conversion with Out-of-Dataset Speaker Support) in view of Qian et al. (US PAP 2012/0253781).

Consider claim 1, Keskin teaches a training method of training a speaker identification model which receives voice data as an input and outputs speaker identification information for identifying a speaker of an utterance included in the voice data (abstract, using generative adversarial network), the training method comprising:
performing voice quality conversion of first voice data of a first speaker to generate second voice data of a second speaker (section 2.1, Voice transformed from one source voice to target voice); and 
performing training of the speaker identification model using, as training data, the first voice data and the second voice data (section 2.1, discriminator network, which is trained based on an error function that measures the difference between real and generated speech).
Keskin does not specifically teach wherein the voice quality conversion is processing based on voice data of a first language of the first speaker and voice data of a second language of the second speaker, 
the first and second languages are different languages, and 
in the performing of the voice quality conversion, the voice quality conversion is performed on the first voice data of the first language to generate the second voice data of the first language.
In the same field of voice conversion, Qian teaches wherein the voice quality conversion is processing based on voice data of a first language of the first speaker and voice data of a second language of the second speaker (abstract, 0003, 0017-19, voice conversion of voice in first language to voice originally in a second language, in a first language.), 
the first and second languages are different languages (0018, for example English and Mandarin), and 
in the performing of the voice quality conversion, the voice quality conversion is performed on the first voice data of the first language to generate the second voice data of the first language (abstract, 0003, 0017-19, voice conversion of voice in first language to voice originally in a second language, in a first language.).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use cross-lingual conversion as taught by Qian in the system of Keskin in order to allow for more voice conversion options to generate more diverse data accurately (Qian Background).

Consider claim 3, Keskin teaches the training method according to claim 1, wherein the voice quality conversion includes inputting the first voice data to a voice quality conversion model and outputting the second voice data from a voice quality conversion model (section 2.1, converting source voice to target voice while maintaining source content using generator network), the voice quality conversion model having been trained in advance to output voice data of the second speaker upon receiving, as the input, voice data of the first speaker (section 2.1, model is fed to discriminator after each training round, and thus is trained in advance of each time the discriminator is applied).

Consider claim 4, Keskin teaches the training method according to claim 3, wherein the voice quality conversion model includes a deep neural network (section 2.1, section 3.3, generator networks) which receives, as the input, voice data in waveform audio (WAV) format and outputs voice data in the WAV format (section 3.2, data used is the Voice Conversion Challenge 2018 dataset, which is waveform data; see attached description listed on PTO-892).

Consider claim 6, Keskin teaches the training method according to claim 1, wherein the speaker identification model includes a deep neural network which obtains, as the input, an utterance feature amount indicating a feature of an utterance included in voice data and outputs a speaker-dependent feature amount indicating a feature of a speaker (section 2.2, output from generator is fed to discriminator, which classifies whether the utterance is real or an imposter).

Consider claim 7, Keskin teaches a method of identifying a speaker, comprising: inputting voice data to the speaker identification model which has been trained in advance using the training method according to claim 1 (section 2.2, output from generator is fed to discriminator); and
outputting the speaker identification information from the speaker identification model (section 2.2, output from generator is fed to discriminator, which classifies whether the utterance is real or an imposter).

Consider claim 8, Keskin teaches execut[ing] training of a speaker identification model which receives voice data as an input and outputs speaker identification information for identifying a speaker of an utterance included in the voice data (abstract, using generative adversarial network), wherein the training includes:
performing voice quality conversion of first voice data of a first speaker to generate second voice data of a second speaker (section 2.1, Voice transformed from one source voice to target voice); and
performing training of the speaker identification model using, as training data, the first voice data and the second voice data (section 2.1, Voice transformed from one source voice to target voice).
Keskin does not specifically teach a non-transitory computer-readable recording medium having a program recorded thereon,
wherein the voice quality conversion is processing based on voice data of a first language of the first speaker and voice data of a second language of the second speaker, 
the first and second languages are different languages, and 
in the performing of the voice quality conversion, the voice quality conversion is performed on the first voice data of the first language to generate the second voice data of the first language.

In the same field of training models, Qian teaches a non-transitory computer-readable recording medium having a program recorded thereon (0029, memory, RAM, ROM),
wherein the voice quality conversion is processing based on voice data of a first language of the first speaker and voice data of a second language of the second speaker (abstract, 0003, 0017-19, voice conversion of voice in first language to voice originally in a second language, in a first language.), 
the first and second languages are different languages (0018, for example English and Mandarin), and 
in the performing of the voice quality conversion, the voice quality conversion is performed on the first voice data of the first language to generate the second voice data of the first language (abstract, 0003, 0017-19, voice conversion of voice in first language to voice originally in a second language, in a first language.).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use cross-lingual conversion as taught by Qian in the system of Keskin in order to allow for more voice conversion options to generate more diverse data accurately, and to implement model training using widely available and ubiquitous computer components (Qian Background).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655



/DOUGLAS GODBOLD/Primary Examiner, Art Unit 2655