DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 08/01/2022 has been entered.
 
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Response to Arguments
Applicant’s arguments with respect to claims 1, 4, 6-8, 11, and 13-15 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6-8, and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Hasegawa (US 2021/0398540 A1), in view of Kajarekar et al. (US 2013/0144414 A1), hereinafter referred to as Kajarekar, and further in view of Ding et al. (US 10,832,685 B2), hereinafter referred to as Ding.

Regarding claim 1, Hasegawa teaches:
An electronic apparatus, comprising: 
a microphone (para [0067], where a microphone is used);
a processor (Fig. 3, element 301, para [0062], where a CPU is used) configured to
perform a voice recognition function corresponding to each of a plurality of first user voice inputs to the microphone (para [0034], where a learned model is generated from learning data, associating voice inputs with speaker labels, and para [0067], where a microphone is used), 
identify utterance characteristics of each of the plurality of first user voice inputs (para [0042], where voice characteristics are acquired and used for classification),
obtain a plurality of voice groups in which the plurality of first user voice inputs are classified according to the identified utterance characteristics (para [0042], where speech sections are classified into one or more groups based on voice characteristics),
select a voice group corresponding to a predetermined user among the classified plurality of voice groups (para [0031], [0042], where speech sections are classified into groups on the basis of voice characteristics, and assigned an ID), the selected voice group including a largest data of the first user voice inputs among the plurality of voice groups (para [0116], where the speaker who has spoken most frequently is identified as the speaker), 
generate a speaker model corresponding to the predetermined user based on the utterance characteristics of the selected voice group (para [0034], [0042], where the identification model or learned model is learned from voice information, and where speech is classified based on the voice characteristics), the generated speaker model corresponding to an initial speaker model for the predetermined user (para [0035-37], where first learning data is used to train the model), 
perform speaker recognition of the user and the voice recognition function for a second user voice input to the microphone based on the generated speaker model (para [0039], where speaker identification is performed on conversation voice information),
update the generated speaker model corresponding to the predetermined user based on the second user voice input (para [0035-37], where the learned model is updated using learning data from the known and unknown speakers), and
perform the speaker recognition of the user and the voice recognition function for a third user voice input to the microphone based on the updated speaker model (para [0038-39], where speaker recognition is performed using the updated model).
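For illustration only, and not as a characterization of how Hasegawa implements these steps, the grouping, selection, and initial model generation mapped above can be sketched as follows. The function names, the precomputed group labels, and the use of a mean feature vector as the speaker model are all assumptions made for this sketch.

```python
from collections import defaultdict

def group_inputs(features, labels):
    """Collect feature vectors into voice groups keyed by group label."""
    groups = defaultdict(list)
    for vec, label in zip(features, labels):
        groups[label].append(vec)
    return dict(groups)

def largest_group(groups):
    """Return the label of the group holding the most voice inputs."""
    return max(groups, key=lambda label: len(groups[label]))

def mean_vector(vectors):
    """Average the feature vectors (hypothetical stand-in for a speaker model)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

# Hypothetical utterance characteristics and group labels for three voice inputs.
features = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
labels = ["A", "A", "B"]

groups = group_inputs(features, labels)
selected = largest_group(groups)                # group with the most inputs
initial_model = mean_vector(groups[selected])   # initial speaker model
```

In this sketch group "A" holds two of the three inputs, so it is selected and its averaged characteristics serve as the initial speaker model.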
Hasegawa does not teach:
update the generated speaker model corresponding to the predetermined user based on the second user voice input whose similarity with the generated speaker model is equal to or greater than a threshold, 
identify a user voice input of another user corresponding to the second user voice input whose similarity with the generated speaker model is lower than the threshold, and
generate a new speaker model corresponding to the another user based on a speaker model corresponding to the another user not being identified.
Kajarekar teaches:
update the generated speaker model corresponding to the predetermined user based on the second user voice input whose similarity with the generated speaker model is equal to or greater than a threshold (para [0030], [0049], where models with similarity above a threshold are merged, which is interpreted as correcting the model).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Hasegawa by using the model merging of Kajarekar (Kajarekar para [0049]) for the models of Hasegawa (Hasegawa para [0042]) in order to integrate two groups of speaker models together (Kajarekar para [0030]).
Ding teaches:
identify a user voice input of another user corresponding to the second user voice input whose similarity with the generated speaker model is lower than the threshold (col. 5 lines 38-51, where similarity is lower than a threshold), and
generate a new speaker model corresponding to the another user based on a speaker model corresponding to the another user not being identified (col. 5 lines 38-51, where a new speaker model corresponding to the unknown speaker is generated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Hasegawa in view of Kajarekar by using the speaker model generation of Ding (Ding col. 5 lines 38-51) in the speaker identification process of Hasegawa in view of Kajarekar (Hasegawa para [0039]), so that clusters corresponding to unknown speakers receive their own model and are stored (Ding col. 5 lines 23-37).
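For illustration only, the combined threshold logic relied upon above (update an existing speaker model when similarity is equal to or greater than a threshold, otherwise generate a new speaker model for the other user) can be sketched as follows. Cosine similarity, the running-mean update, and every name in the block are assumptions made for this sketch; none is drawn from Hasegawa, Kajarekar, or Ding.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def process_input(models, counts, vec, threshold):
    """Update the best-matching speaker model, or create a new one.

    models: dict of speaker id -> model vector
    counts: dict of speaker id -> number of inputs folded into the model
    Returns the id of the model that was updated or created.
    """
    best_id, best_sim = None, -1.0
    for sid, model in models.items():
        sim = cosine(model, vec)
        if sim > best_sim:
            best_id, best_sim = sid, sim

    if best_id is not None and best_sim >= threshold:
        # Similarity meets the threshold: fold the input into the existing model.
        n = counts[best_id]
        models[best_id] = [(m * n + v) / (n + 1) for m, v in zip(models[best_id], vec)]
        counts[best_id] = n + 1
        return best_id

    # Similarity is below the threshold for every model: treat as another user.
    new_id = f"speaker_{len(models) + 1}"
    models[new_id] = list(vec)
    counts[new_id] = 1
    return new_id

models = {"speaker_1": [1.0, 0.0]}
counts = {"speaker_1": 1}
first = process_input(models, counts, [1.0, 0.0], threshold=0.8)   # matches speaker_1
second = process_input(models, counts, [0.0, 1.0], threshold=0.8)  # no match, new model
```

The first input matches the existing model and updates it; the second falls below the threshold and receives its own model.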

Regarding claim 6, Hasegawa in view of Kajarekar and Ding teaches:
The electronic apparatus of claim 1, wherein the processor is further configured to: 
identify whether the speaker model corresponding to the predetermined user is generated (Hasegawa para [0036-37], where learning data corresponding to other speakers is categorized as second learning data); and 
generate the speaker model corresponding to the predetermined user when the speaker model corresponding to the predetermined user is not generated (Hasegawa para [0037], where the identification model including the dummy speakers is generated).  

Regarding claim 7, Hasegawa in view of Kajarekar and Ding teaches:
The electronic apparatus of claim 1, wherein the processor is configured to:
generate a plurality of speaker models for the user (Kajarekar para [0049], where multiple speaker models are generated, where similar models correspond to the same user); 
identify similarity of utterance characteristics between the plurality of speaker models (Kajarekar para [0049], where speaker models are compared for similarity); and 
merge two or more speaker models having the similarity equal to or greater than a threshold (Kajarekar para [0049], where similar speaker models are merged).  
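For illustration only, the merging of speaker models whose similarity is equal to or greater than a threshold can be sketched as follows. The greedy pass, the cosine similarity measure, and the unweighted averaging are simplifying assumptions made for this sketch and are not taken from Kajarekar.

```python
import math

def cosine(a, b):
    """Cosine similarity between two model vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def merge_similar(models, threshold):
    """Greedily merge models whose similarity meets the threshold."""
    merged = []
    for vec in models:
        for i, kept in enumerate(merged):
            if cosine(kept, vec) >= threshold:
                # Similar enough: replace the kept model with the pairwise average.
                merged[i] = [(k + v) / 2 for k, v in zip(kept, vec)]
                break
        else:
            # No sufficiently similar model was found: keep this one separately.
            merged.append(list(vec))
    return merged

result = merge_similar([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], threshold=0.9)
```

Here the first two models are identical and collapse into one, while the third remains a separate model.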

Regarding claim 8, Hasegawa teaches:
A control method of an electronic apparatus, comprising: 
performing a voice recognition function corresponding to each of a plurality of first user voice inputs to a microphone (para [0034], where a learned model is generated from learning data, associating voice inputs with speaker labels, and para [0067], where a microphone is used); 
identifying utterance characteristics of each of the plurality of first user voice inputs (para [0042], where voice characteristics are acquired and used for classification), and obtaining a plurality of voice groups in which the plurality of first user voice inputs are classified according to the identified utterance characteristics (para [0042], where speech sections are classified into one or more groups based on voice characteristics),
selecting a voice group corresponding to a predetermined user among the classified plurality of voice groups (para [0031], [0042], where speech sections are classified into groups on the basis of voice characteristics, and assigned an ID), the selected voice group including a largest data of the first user voice inputs among the plurality of voice groups (para [0116], where the speaker who has spoken most frequently is identified as the speaker); and 
generating a speaker model corresponding to the predetermined user based on the utterance characteristics of the selected voice group (para [0034], [0042], where the identification model or learned model is learned from voice information, and where speech is classified based on the voice characteristics), the generated speaker model corresponding to an initial speaker model for the predetermined user (para [0035-37], where first learning data is used to train the model), 
performing speaker recognition of the user and the voice recognition function for a second user voice input to the microphone based on the generated speaker model (para [0039], where speaker identification is performed on conversation voice information);  
updating the generated speaker model corresponding to the predetermined user based on the second user voice input (para [0035-37], where the learned model is updated using learning data from the known and unknown speakers), 
performing the speaker recognition of the user and the voice recognition function for a third user voice input to the microphone based on the updated speaker model (para [0038-39], where speaker recognition is performed using the updated model).
Hasegawa does not teach:
updating the generated speaker model corresponding to the predetermined user based on the second user voice input whose similarity with the generated speaker model is equal to or greater than a threshold, 
identifying a user voice input of another user corresponding to the second user voice input whose similarity with the generated speaker model is lower than the threshold, and
generating a new speaker model corresponding to the another user based on a speaker model corresponding to the another user not being identified.
Kajarekar teaches:
updating the generated speaker model corresponding to the predetermined user based on the second user voice input whose similarity with the generated speaker model is equal to or greater than a threshold (para [0030], [0049], where models with similarity above a threshold are merged, which is interpreted as correcting the model).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Hasegawa by using the model merging of Kajarekar (Kajarekar para [0049]) for the models of Hasegawa (Hasegawa para [0042]) in order to integrate two groups of speaker models together (Kajarekar para [0030]).
Ding teaches:
identifying a user voice input of another user corresponding to the second user voice input whose similarity with the generated speaker model is lower than the threshold (col. 5 lines 38-51, where similarity is lower than a threshold), and
generating a new speaker model corresponding to the another user based on a speaker model corresponding to the another user not being identified (col. 5 lines 38-51, where a new speaker model corresponding to the unknown speaker is generated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Hasegawa in view of Kajarekar by using the speaker model generation of Ding (Ding col. 5 lines 38-51) in the speaker identification process of Hasegawa in view of Kajarekar (Hasegawa para [0039]), so that clusters corresponding to unknown speakers receive their own model and are stored (Ding col. 5 lines 23-37).

Regarding claim 13, Hasegawa in view of Kajarekar and Ding teaches:
The control method of claim 8, wherein the generating of the speaker model includes: 
identifying whether the speaker model corresponding to the predetermined user is generated (Hasegawa para [0036-37], where learning data corresponding to other speakers is categorized as second learning data); and 
generating the speaker model corresponding to the predetermined user when the speaker model corresponding to the predetermined user is not generated (Hasegawa para [0037], where the identification model including the dummy speakers is generated).  

Regarding claim 14, Hasegawa in view of Kajarekar and Ding teaches:
The control method of claim 8, wherein the updating of the speaker model includes:
generating a plurality of speaker models for the user (Kajarekar para [0049], where multiple speaker models are generated, where similar models correspond to the same user); 
identifying similarity of utterance characteristics between the plurality of speaker models (Kajarekar para [0049], where speaker models are compared for similarity); and 
merging two or more speaker models having the similarity equal to or greater than a threshold (Kajarekar para [0049], where similar speaker models are merged).  

Regarding claim 15, Hasegawa teaches:
A non-transitory recording medium stored with a computer program (Fig. 3 element 305, para [0062], where a recording medium is used) including a code performing a control method of an electronic apparatus as a computer-readable code, wherein the control method of the electronic apparatus includes: 
performing a voice recognition function corresponding to each of a plurality of first user voice inputs to a microphone (para [0034], where a learned model is generated from learning data, and para [0067], where a microphone is used); 
identifying utterance characteristics of each of the plurality of first user voice inputs (para [0042], where voice characteristics are acquired and used for classification), and obtaining a plurality of voice groups in which the plurality of first user voice inputs are classified according to the identified utterance characteristics (para [0042], where speech sections are classified into one or more groups based on voice characteristics),
selecting a voice group corresponding to a predetermined user among the plurality of voice groups (para [0031], [0042], where speech sections are classified into groups on the basis of voice characteristics, and assigned an ID), the selected voice group including a largest data of the first user voice inputs among the plurality of voice groups (para [0116], where the speaker who has spoken most frequently is identified as the speaker);  
generating a speaker model corresponding to the predetermined user based on the utterance characteristics of the selected voice group (para [0034], [0042], where the identification model or learned model is learned from voice information, and where speech is classified based on the voice characteristics), the generated speaker model corresponding to an initial speaker model for the predetermined user (para [0035-37], where first learning data is used to train the model), 
performing speaker recognition of the user and the voice recognition function for a second user voice input to the microphone based on the generated speaker model (para [0039], where speaker identification is performed on conversation voice information);
updating the generated speaker model corresponding to the predetermined user based on the second user voice input (para [0035-37], where the learned model is updated using learning data from the known and unknown speakers), and
performing the speaker recognition of the user and the voice recognition function for a third user voice input to the microphone based on the updated speaker model (para [0038-39], where speaker recognition is performed using the updated model).
Hasegawa does not teach:
updating the generated speaker model corresponding to the predetermined user based on the second user voice input whose similarity with the generated speaker model is equal to or greater than a threshold, 
identifying a user voice input of another user corresponding to the second user voice input whose similarity with the generated speaker model is lower than the threshold, and
generating a new speaker model corresponding to the another user based on a speaker model corresponding to the another user not being identified.
Kajarekar teaches:
updating the generated speaker model corresponding to the predetermined user based on the second user voice input whose similarity with the generated speaker model is equal to or greater than a threshold (para [0030], [0049], where models with similarity above a threshold are merged, which is interpreted as correcting the model).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Hasegawa by using the model merging of Kajarekar (Kajarekar para [0049]) for the models of Hasegawa (Hasegawa para [0042]) in order to integrate two groups of speaker models together (Kajarekar para [0030]).
Ding teaches:
identifying a user voice input of another user corresponding to the second user voice input whose similarity with the generated speaker model is lower than the threshold (col. 5 lines 38-51, where similarity is lower than a threshold), and
generating a new speaker model corresponding to the another user based on a speaker model corresponding to the another user not being identified (col. 5 lines 38-51, where a new speaker model corresponding to the unknown speaker is generated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Hasegawa in view of Kajarekar by using the speaker model generation of Ding (Ding col. 5 lines 38-51) in the speaker identification process of Hasegawa in view of Kajarekar (Hasegawa para [0039]), so that clusters corresponding to unknown speakers receive their own model and are stored (Ding col. 5 lines 23-37).

Claims 4 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Hasegawa in view of Kajarekar and Ding, and further in view of Fleizach et al. (US 9,495,129 B2), hereinafter referred to as Fleizach.

Regarding claim 4, Hasegawa in view of Kajarekar and Ding teaches:
The electronic apparatus of claim 1.
Hasegawa in view of Kajarekar and Ding does not teach:
wherein the utterance characteristics include at least one of tone, strength, and speed of the plurality of input first user voice inputs.
Fleizach teaches:
wherein the utterance characteristics include at least one of tone, strength, and speed of the plurality of input first user voice inputs (col. 17 line 60 - col. 18 line 16, where voice characteristics may include pitch, speed, and volume).  
The prior art (Hasegawa in view of Kajarekar and Ding) contained a device which differed from the claimed device by the substitution of the voice characteristics used for classification (Hasegawa para [0042]) with tone, strength, and speed; the substituted characteristics and their functions were known in the art, as shown by Fleizach (col. 17 line 60 - col. 18 line 16); one of ordinary skill in the art could have substituted one known element for another, and the results of the substitution would have been predictable.

Regarding claim 11, Hasegawa in view of Kajarekar and Ding teaches:
The control method of claim 8.
Hasegawa in view of Kajarekar and Ding does not teach:
wherein the utterance characteristics include at least one of tone, strength, and speed of the plurality of input first user voice inputs.
Fleizach teaches:
wherein the utterance characteristics include at least one of tone, strength, and speed of the plurality of input first user voice inputs (col. 17 line 60 - col. 18 line 16, where voice characteristics may include pitch, speed, and volume).  
The prior art (Hasegawa in view of Kajarekar and Ding) contained a method which differed from the claimed method by the substitution of the voice characteristics used for classification (Hasegawa para [0042]) with tone, strength, and speed; the substituted characteristics and their functions were known in the art, as shown by Fleizach (col. 17 line 60 - col. 18 line 16); one of ordinary skill in the art could have substituted one known element for another, and the results of the substitution would have been predictable.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2020/0243094 A1 para [1124-1125] teaches creation of a new speaker voice model if a comparison fails to yield a match.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRYAN S BLANKENAGEL whose telephone number is (571)270-0685. The examiner can normally be reached 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRYAN S BLANKENAGEL/Primary Examiner, Art Unit 2658