Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 05/20/22 has been entered.
 
This office action is in response to correspondence filed 05/20/22 regarding application 16/633,792, in which claims 1-3, 8, and 11-16 were amended and claim 10 was cancelled. Claims 1-9 and 11-16 are pending and have been considered.


Non-Compliant Amendment
The claim amendments submitted 05/20/22 fail to comply with 37 CFR 1.121 because claims 14 and 15 are presented with the status “currently amended” but contain no markups or apparent amendments relative to the previous claims filed 02/04/22. In order to expedite prosecution, the examiner assumes these claims should have been presented with the status “previously presented”.

Claim Objections
In claim 16, line 6, should “acquire” be “acquiring”?


Response to Arguments
Applicant’s arguments on page 6 regarding the 35 U.S.C. 103 rejections based on Lovitt, Mitsufuji, Decanne, Mohideen, Mutagi, and Yu have been considered but are moot in view of the new grounds of rejection.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-8, 11-14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Lovitt (2016/0125879) in view of Basart et al. (2010/0020951), in further view of Mitsufuji et al. (2015/0058015).

Consider claim 1, Lovitt discloses a voice operation system comprising: 
an array microphone comprising a plurality of microphones that each converts voice vibration into voice information (microphone array, [0011] for speech and/or voice recognition, [0045]);
a storage medium that stores registered user information (user account information, [0012], stored on memory, [0041]);
a memory storing instructions (memory, [0041]), and 
a processor configured to execute the instructions (processors executing instructions, [0039]) to:
perform voice recognition on voice information and generate voice operation information (recognizing and performing the action command if the user is identified, above the probability threshold, [0025]);
acquire the registered user information (a template of audio streams for audio-based identification, [0028], from user account information on the server, [0012]);
identify the talker using the registered user information (performing template matching of audio streams for audio-based identification, [0028]).
Lovitt does not specifically mention user information in which a user and information indicating a direction of the user are related to each other;
calculating direction information on a talker;
identifying the talker using the calculated direction information. 
Basart discloses user information in which a user and information indicating a direction of the user are related to each other (speaker profile associated with a given position, [0057]);
calculating direction information on a talker (distance and direction information from microphone array, [0055]);
identifying the talker using the calculated direction information (identifying the speaker using the position and stored voice profile information, [0057-0058]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt by including user information in which a user and information indicating a direction of the user are related to each other;
calculating direction information on a talker;
and identifying the talker using the calculated direction information in order to reduce difficulty in knowing who is speaking, as suggested by Basart ([0004]).
Lovitt and Basart do not specifically mention a voice quality model, the voice quality model including a voice waveform or a frequency spectrum.
Mitsufuji discloses a voice quality model, the voice quality model including a voice waveform or a frequency spectrum (voice quality model uses spectral envelope provided by time frequency conversion units, i.e. “frequency spectrum”, [0128-0130]). 
Hence the prior art includes each element claimed, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference.
In combination, Lovitt and Basart perform the same function as they do separately of user recognition via speech. Mitsufuji performs the same function as it does separately of evaluating voice quality using a voice quality model including a voice waveform or a frequency spectrum.
Therefore one of ordinary skill in the art could have combined the elements as claimed by known methods, and in combination, each element merely performs the same function as it does separately.
The results of the combination would have been predictable and resulted in modifying the invention of Lovitt and Basart to include evaluating voice quality using a voice quality model including a voice waveform or a frequency spectrum, as disclosed by Mitsufuji, thereby improving the user recognition of Lovitt and Basart by additionally evaluating the quality of the voice signal, overcoming the difficulty of accurately identifying the speaker of the key phrase using the key phrase data alone, as discussed by Lovitt ([0014]). Therefore, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

Consider claim 16, Lovitt discloses a control method of a voice operation apparatus, the control method comprising: 
performing voice recognition on voice information into which voice vibration has been converted by each of a plurality of microphones of a microphone array (microphone array, [0011] for speech and/or voice recognition, [0045], recognizing and performing the action command if the user is identified, above the probability threshold, [0025]);
acquiring registered user information stored in a storage medium (a template of audio streams for audio-based identification, [0028], from user account information on the server, [0012]);
identifying the talker using the registered user information (performing template matching of audio streams for audio-based identification, [0028]).
Lovitt does not specifically mention calculating direction information on a talker;
information in which a user and information indicating a direction of the user are related to each other;
identifying the talker using the calculated direction information. 
Basart discloses calculating direction information on a talker (distance and direction information from microphone array, [0055]);
user information in which a user and information indicating a direction of the user are related to each other (speaker profile associated with a given position, [0057]);
identifying the talker using the calculated direction information (identifying the speaker using the position and stored voice profile information, [0057-0058]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt by calculating direction information on a talker;
using information in which a user and information indicating a direction of the user are related to each other; and identifying the talker using the calculated direction information for reasons similar to those for claim 1. 
Lovitt and Basart do not specifically mention a voice quality model including a waveform or a frequency spectrum.
Mitsufuji discloses a voice quality model including a waveform or a frequency spectrum (voice quality model uses spectral envelope provided by time frequency conversion units, i.e. “frequency spectrum”, [0128-0130]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt and Basart by including a voice quality model including a waveform or a frequency spectrum as taught by Mitsufuji for reasons similar to those for claim 1.

Consider claim 2, Lovitt discloses: the voice model of the user is registered in advance in registered user information (user profile, [0024]). 
Lovitt and Basart do not specifically mention a voice quality model.
Mitsufuji discloses a voice quality model (voice quality model, [0128-0130]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt and Basart by using Mitsufuji’s voice quality model as or in addition to the voice model in Lovitt for reasons similar to those for claim 1. 

Consider claim 3, Lovitt discloses the voice model is one of a plurality of voice models on a per-user basis in the registered user information (probability for a selected user of the one or more possible users, [0049], associated with a particular user profile, [0024]), and wherein the processor is further configured to execute the instructions to select the voice model in accordance with auxiliary information (based on other environmental sensor data collected, [0009]). 
Lovitt and Basart do not specifically mention a voice quality model.
Mitsufuji discloses a voice quality model (voice quality model, [0128-0130]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt and Basart by using Mitsufuji’s voice quality model as or in addition to the voice model in Lovitt and Basart for reasons similar to those for claim 1. 

Consider claim 4, Lovitt discloses the processor is further configured to execute the instructions to select the voice model on the per-user basis in the registered user information, and in accordance with the auxiliary information (server 112 contains user account information such as voice pattern data, image recognition data, etc., [0012]). 
Lovitt and Basart do not specifically mention a voice quality model.
Mitsufuji discloses a voice quality model (voice quality model, [0128-0130]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt and Basart by using Mitsufuji’s voice quality model as or in addition to the voice model in Lovitt and Basart for reasons similar to those for claim 1. 

Consider claim 5, Lovitt discloses the processor is further configured to execute the instructions to calculate a similarity between the voice information and the voice model and identify the talker based on the similarity (comparing the audio streams, [0028]). 
Lovitt and Basart do not specifically mention a voice quality model.
Mitsufuji discloses a voice quality model (voice quality model, [0128-0130]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt and Basart by using Mitsufuji’s voice quality model as or in addition to the voice model in Lovitt and Basart for reasons similar to those for claim 1. 

Consider claim 6, Lovitt discloses the processor is further configured to execute the instructions to identify the user having a highest similarity as the talker of the voice operation (the identified user has the highest probability of all possible users, [0024]). 

Consider claim 7, Lovitt discloses the registered user information has information unique to the user in association with the auxiliary information, and wherein the processor is further configured to execute the instructions to correct the similarity in accordance with the auxiliary information (if the analysis of the additional acoustic data indicates the user was not speaking before or after the utterance, the computing device may decrease the probability that the key phrase was spoken by the identified user, [0020], which is considered to “correct the similarity” of the acoustic match comparison to the model for the user, which is “information unique to the user”, [0028]). 

Consider claim 8, Lovitt discloses the registered user information has a correction value of the similarity on a per-user basis, and wherein the processor is further configured to execute the instructions to reflect a result of identifying the talker in the correction value and learn a correlation between the auxiliary information and the correction value (learning adjustments to probability in real time based on the audio history, [0031]).

Consider claim 11, Lovitt discloses a ranging sensor that acquires a distance to the talker of the voice operation as distance information, wherein the processor is further configured to execute the instructions to identify the talker by using the distance information (using a depth camera, [0011]). 

Consider claim 12, Lovitt discloses a clock that acquires utterance time of the voice operation as the time information, wherein the processor is further configured to execute the instructions to identify the talker by using the time information (determining the identified user was speaking within a window of time before and/or after detection of the key phrase, [0019], a “clock” being inherent for measuring a window of time).

Consider claim 13, Lovitt discloses a GPS device that acquires a position of the voice operation system as the position information, wherein the processor is further configured to execute the instructions to select the voice model in accordance with position information (location information via GPS, [0021]).
Lovitt and Basart do not specifically mention a voice quality model.
Mitsufuji discloses a voice quality model (voice quality model, [0128-0130]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt and Basart by using Mitsufuji’s voice quality model as or in addition to the voice model in Lovitt and Basart for reasons similar to those for claim 1. 

Consider claim 14, Lovitt discloses the registered user information further includes schedule information on the user, and wherein the processor is further configured to execute the instructions to identify the talker by further using the schedule information (schedule information, [0022]). 

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Lovitt (2016/0125879) in view of Basart et al. (2010/0020951), in further view of Mitsufuji et al. (2015/0058015), in further view of Decanne (2016/0072915).

Consider claim 9, Lovitt discloses the registered user information includes information unique to the user (user profile, [0024]), and wherein the processor is further configured to execute the instructions to correct the similarity in accordance with the information included in the voice operation information (if the analysis of the additional acoustic data indicates the user was not speaking before or after the utterance, the computing device may decrease the probability that the key phrase was spoken by the identified user, [0020], which is considered to “correct the similarity” of the acoustic match comparison to the model for the user, which is “information unique to the user”, [0028]).
Lovitt, Basart, and Mitsufuji do not specifically mention a keyword unique to the user.
Decanne discloses a keyword unique to the user (user-specific unique keyword, [0046]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt, Basart, and Mitsufuji by including a keyword unique to the user in order to better individualize content, as suggested by Decanne ([0003]).

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Lovitt (2016/0125879) in view of Basart et al. (2010/0020951), in further view of Mitsufuji et al. (2015/0058015), in further view of Mutagi et al. (10,418,033).

Consider claim 15, Lovitt, Basart, and Mitsufuji do not disclose, but Mutagi discloses, the registered user information has information customized in accordance with user preference on a per-user basis, wherein the processor is further configured to execute the instructions to perform an operation process in accordance with the user preference corresponding to the voice operation information, and wherein the voice operation system further comprises a speaker that provides a notification of a performance result by voice (custom output of weather results via TTS according to user preferences, Col 3 lines 4-26).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lovitt, Basart, and Mitsufuji such that the registered user information has information customized in accordance with user preference on a per-user basis, wherein the processor is further configured to execute the instructions to perform an operation process in accordance with the user preference corresponding to the voice operation information, and wherein the voice operation system further comprises a speaker that provides a notification of a performance result by voice in order to improve human computer interactions, as suggested by Mutagi (Col 1 lines 18-20).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias whose telephone number is 571/270-5135. The examiner can normally be reached on M-F 8:00 AM - 4:30 PM. The examiner’s fax number is 571/270-6135.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached on 571/272-7516.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).


/Jesse S Pullias/
Primary Examiner, Art Unit 2655                                                  08/09/22