DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Response to Arguments
Applicant's arguments filed 04/21/2022 have been fully considered but are not persuasive. Regarding the arguments on pages 8-9, the Examiner notes that the claims do not preclude the voice interaction system or device from being a server. Indeed, the voice interaction system may be interpreted to include both the handheld device and the server. Furthermore, Ljolje para [0038] teaches that components and modules may be distributed between systems. Therefore, the amended limitations are taught by the cited references.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 4-8 are rejected under 35 U.S.C. 103 as being unpatentable over Ljolje et al. (US 2015/0248884 A1), hereinafter referred to as Ljolje, in view of Pearce et al. (US 9,953,634 B1), hereinafter referred to as Pearce.

Regarding claim 1, Ljolje teaches:
A control system comprising: 
a central processing unit configured to control a voice interaction system including a plurality of voice recognition models (Fig. 1 element 120, Fig. 3, para [0033], where the system includes multiple speaker dependent models),
wherein the central processing unit instructs, when a conversation with a target person is started, the voice interaction system to first perform voice recognition and response generation by one voice recognition model that has been tentatively selected from among the plurality of voice recognition models (Figs. 3, 5, para [0045], where an utterance initiates parallel ASR using each of the selected models), determines a voice recognition model that is estimated to be optimal among the plurality of voice recognition models held in a storage of the voice interaction system based on results of the voice recognition of a speech made by the target person using a voice recognition model stored in a storage of a voice recognition server (Fig. 4, para [0042], where after the second utterance, a dominant model is determined and the others dropped), and
Ljolje does not teach:
instructs, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction system to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation.
Pearce teaches:
instructs, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction system to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation (Fig. 4 element 406, col. 6 lines 34-39, where the device switches from a speaker independent model to a speaker dependent model).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ljolje by using the model switching of Pearce (Pearce col. 6 lines 34-39) to change the models of Ljolje (Ljolje Fig. 3), in order to improve quality and accuracy of detection of a keyword over time the more the keyword is uttered and detected (Pearce col. 2 lines 13-15).
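For illustration only, the control flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not the method of Ljolje or Pearce: the class `VoiceInteractionSystem`, the function `control_conversation`, the model names, and the server-side scores are all hypothetical, and the "optimal" model is assumed to be the one with the highest score.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceInteractionSystem:
    """Hypothetical local device that holds several recognition models by name."""
    models: list
    active: str = ""
    log: list = field(default_factory=list)

    def use(self, model: str) -> None:
        """Make `model` the active model and record the change."""
        if model not in self.models:
            raise ValueError(f"unknown model: {model}")
        self.active = model
        self.log.append(model)


def control_conversation(system, tentative, server_scores):
    """Start on the tentative model, then switch if another model is estimated optimal.

    `server_scores` maps model name -> hypothetical recognition score obtained
    on the server side; the highest-scoring model is assumed to be optimal.
    """
    system.use(tentative)  # first recognition and response use the tentative model
    candidates = {m: s for m, s in server_scores.items() if m in system.models}
    if not candidates:
        return system.active  # nothing to compare against; keep the tentative model
    optimal = max(candidates, key=candidates.get)
    if optimal != tentative:  # switch only when the two models differ
        system.use(optimal)
    return system.active
```

With hypothetical models `["adult", "child", "elderly"]`, a tentative model of `"adult"`, and server scores favoring `"child"`, the sketch starts on `"adult"` and then switches to `"child"`; when the scores favor `"adult"`, no switch occurs.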

Regarding claim 4, Ljolje in view of Pearce teaches:
The control system according to Claim 1, wherein the central processing unit takes into account information other than a voice regarding the target person when the central processing unit determines the voice recognition model that is estimated to be optimal (Ljolje Fig. 6 para [0047], where the user is identified using caller identification information, or using user profile, username, age, gender, or ethnicity, then identifies environmental conditions to properly select the model).  

Regarding claim 5, Ljolje teaches:
A voice interaction system comprising: 
a plurality of voice recognition models and a controller (Fig. 3, para [0033], where the system includes multiple speaker dependent models), wherein
the controller first performs, when a conversation with a target person is started, voice recognition and response generation by one voice recognition model that has been tentatively selected from among the plurality of voice recognition models (Figs. 3, 5, para [0045], where an utterance initiates parallel ASR using each of the selected models), determines a voice recognition model that is estimated to be optimal among the plurality of voice recognition models based on results of the voice recognition of a speech made by the target person in a voice recognition server (Fig. 4, para [0042], where after the second utterance, a dominant model is determined and the others dropped), and
Ljolje does not teach:
switches, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice recognition model to the one that is estimated to be optimal and performs voice recognition and response generation.
Pearce teaches:
switches, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice recognition model to the one that is estimated to be optimal and performs voice recognition and response generation (Fig. 4 element 406, col. 6 lines 34-39, where the device switches from a speaker independent model to a speaker dependent model).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ljolje by using the model switching of Pearce (Pearce col. 6 lines 34-39) to change the models of Ljolje (Ljolje Fig. 3), in order to improve quality and accuracy of detection of a keyword over time the more the keyword is uttered and detected (Pearce col. 2 lines 13-15).

Regarding claim 6, Ljolje teaches:
A voice recognition server (para [0025], where the device is a server) comprising: 
a controller (Fig. 3, para [0024], where a CPU connected with various components is the controller),
wherein the controller instructs, when a conversation with a target person is started, a voice interaction system including a plurality of voice recognition models to first perform voice recognition and response generation by one voice recognition model that has been tentatively selected from among the plurality of voice recognition models (Figs. 3, 5, para [0045], where an utterance initiates parallel ASR using each of the selected models), determines a voice recognition model that is estimated to be optimal among the plurality of voice recognition models held in the voice interaction system based on results of the voice recognition of a speech made by the target person (Fig. 4, para [0042], where after the second utterance, a dominant model is determined and the others dropped), and
Ljolje does not teach:
instructs, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction system to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation.
Pearce teaches:
instructs, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction system to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation (Fig. 4 element 406, col. 6 lines 34-39, where the device switches from a speaker independent model to a speaker dependent model).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ljolje by using the model switching of Pearce (Pearce col. 6 lines 34-39) to change the models of Ljolje (Ljolje Fig. 3), in order to improve quality and accuracy of detection of a keyword over time the more the keyword is uttered and detected (Pearce col. 2 lines 13-15).

Regarding claim 7, Ljolje teaches:
A computer readable non-transitory storage medium storing a control program for controlling a voice interaction system including a plurality of voice recognition models (Fig. 1, para [0026], where a hard disk is used), wherein 
the control program causes a computer to execute the following processing procedures of: 
a processing procedure for instructing, when a conversation with a target person is started, the voice interaction system to first perform voice recognition and response generation by one voice recognition model that has been tentatively selected from among the plurality of voice recognition models (Figs. 3, 5, para [0045], where an utterance initiates parallel ASR using each of the selected models); 
a processing procedure for determining a voice recognition model that is estimated to be optimal among the plurality of voice recognition models held in the voice interaction system based on results of the voice recognition of a speech made by the target person in a voice recognition server (Fig. 4, para [0042], where after the second utterance, a dominant model is determined and the others dropped); and
Ljolje does not teach:
a processing procedure for instructing, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction system to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation.  
Pearce teaches:
a processing procedure for instructing, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction system to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation (Fig. 4 element 406, col. 6 lines 34-39, where the device switches from a speaker independent model to a speaker dependent model).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ljolje by using the model switching of Pearce (Pearce col. 6 lines 34-39) to change the models of Ljolje (Ljolje Fig. 3), in order to improve quality and accuracy of detection of a keyword over time the more the keyword is uttered and detected (Pearce col. 2 lines 13-15).

Regarding claim 8, Ljolje teaches:
A method (Fig. 5, where a method is used) of controlling a voice interaction system including a plurality of voice recognition models, the method comprising the following steps of: 
instructing, when a conversation with a target person is started, the voice interaction system to first perform voice recognition and response generation by one voice recognition model that has been tentatively selected from among the plurality of voice recognition models (Figs. 3, 5, para [0045], where an utterance initiates parallel ASR using each of the selected models); 
determining a voice recognition model that is estimated to be optimal among the plurality of voice recognition models held in the voice interaction apparatus based on results of the voice recognition of a speech made by the target person in a voice recognition server (Fig. 4, para [0042], where after the second utterance, a dominant model is determined and the others dropped); and
Ljolje does not teach:
instructing, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction apparatus to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation.
Pearce teaches:
instructing, when the voice recognition model that is estimated to be optimal and the one voice recognition model that has been tentatively selected are different from each other, the voice interaction apparatus to switch the voice recognition model to the one that is estimated to be optimal and to perform voice recognition and response generation (Fig. 4 element 406, col. 6 lines 34-39, where the device switches from a speaker independent model to a speaker dependent model).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ljolje by using the model switching of Pearce (Pearce col. 6 lines 34-39) to change the models of Ljolje (Ljolje Fig. 3), in order to improve quality and accuracy of detection of a keyword over time the more the keyword is uttered and detected (Pearce col. 2 lines 13-15).

Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Ljolje, in view of Pearce, and further in view of Beattie et al. (US 5,865,626 A), hereinafter referred to as Beattie.

Regarding claim 2, Ljolje in view of Pearce teaches:
The control system according to Claim 1
Ljolje in view of Pearce does not teach:
wherein the one voice recognition model that has been tentatively selected is a voice recognition model that has been determined to be used most frequently among the plurality of voice recognition models included in the voice interaction system based on past conversation information.
Beattie teaches:
wherein the one voice recognition model that has been tentatively selected is a voice recognition model that has been determined to be used most frequently among the plurality of voice recognition models included in the voice interaction system based on past conversation information (Fig. 4, col. 6 lines 9-19, where the most frequently used model is selected).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Ljolje in view of Pearce by using the model selection of Beattie (Beattie col. 6 lines 9-19) to select the models of Ljolje in view of Pearce (Ljolje Fig. 3), in order to provide models for speech dialects and/or channels which produce high accuracy in real time for a large variety of speakers (Beattie col. 2 lines 65-67).
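For illustration only, the claim 2 limitation of tentatively selecting the most frequently used model based on past conversation information can be sketched as follows. This is a minimal sketch under stated assumptions: the function `tentative_model`, the representation of past conversation information as a list of model names, and the fallback `default` are hypothetical and are not taken from Beattie or the application.

```python
from collections import Counter


def tentative_model(past_conversations, available, default):
    """Return the most frequently used available model, or `default` if no history exists.

    `past_conversations` is a hypothetical list naming the model used in each
    past conversation; models no longer available are ignored.
    """
    counts = Counter(m for m in past_conversations if m in available)
    if not counts:
        return default
    return counts.most_common(1)[0][0]
```

The result of this selection would serve as the tentatively selected model at the start of a conversation, before any optimal model has been estimated.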

Regarding claim 3, Ljolje in view of Pearce and Beattie teaches:
The control system according to Claim 2, wherein the central processing unit causes, when the voice interaction system switches the voice recognition model to the one that is estimated to be optimal, the voice interaction system to switch, in stages, the voice recognition model to the one that is estimated to be optimal from a voice recognition model whose similarity level with the one voice recognition model that has been tentatively selected is high (Pearce Fig. 4 element 406, col. 6 lines 41-47, where the threshold for keyword recognition/sensing is tightened).  

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US 2019/0096396 A1 Fig. 1, paras [0025]-[0029] teaches switching from a current speech recognition model to a new model; US 2008/0300871 A1 para [0026] teaches switching to an optimal acoustic model for automated speech recognition.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
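For illustration only, the timing rule stated above can be sketched as date arithmetic. This is a minimal sketch under stated assumptions: the functions `add_months` and `shortened_period_end` are hypothetical, the month addition clamps to the last day of a shorter month, and weekends, holidays, and extension fees are not modeled. It does not replace 37 CFR 1.136(a) or MPEP § 706.07.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping to the last day of the target month (assumption)."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_end(mailed, first_reply=None, advisory_mailed=None):
    """Return the date on which the shortened statutory period expires.

    Defaults to three months from mailing. If a first reply was filed within
    two months and the advisory action was mailed after the three-month date,
    the period ends on the advisory mailing date. Never later than six months.
    """
    three = add_months(mailed, 3)
    six = add_months(mailed, 6)
    end = three
    if (
        first_reply is not None
        and advisory_mailed is not None
        and first_reply <= add_months(mailed, 2)
        and advisory_mailed > three
    ):
        end = advisory_mailed
    return min(end, six)
```

For a hypothetical mailing date of `date(2022, 5, 2)`, a first reply on `date(2022, 6, 20)`, and an advisory action mailed on `date(2022, 8, 10)`, the sketch returns `date(2022, 8, 10)`; with no reply it returns `date(2022, 8, 2)`.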
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRYAN S BLANKENAGEL whose telephone number is (571) 270-0685. The examiner can normally be reached 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BRYAN S BLANKENAGEL/Primary Examiner, Art Unit 2658