Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.



DETAILED ACTION

NOTE as per MPEP: Applicant authorizes the Director to charge Deposit Account No. 506632 for any extension of time fees that are required to enter this amendment and to treat this communication as a request for any such required extension of time.

The following examiner’s amendment is based on the claim listing filed on 01/06/2021:
In the claims:
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in a telephone interview with Brett W. Scott on 08/24/2021.

Please replace the current claim set in its entirety with the following set as follows:

1. 	(Currently amended)	A network microphone device comprising:
one or more microphones;

one or more processors;
memory comprising tangible, non-transitory computer-readable media storing instructions executable by the one or more processors to cause the network microphone device to perform operations comprising:
receiving, via the one or more microphones, voice data indicating a voice input, wherein the received voice data includes a first portion representing an activation word corresponding to one of a plurality of voice services and a second portion representing a voice command, wherein the plurality of voice services are externally registered to a media playback system associated with the networked microphone device;
identifying, prior to performing speech recognition on the second portion of the received voice data representing the voice command, from among the plurality of voice services, a voice service to process the voice input, wherein the identifying comprises (i) determining a closest match of the first portion of the received voice data representing the activation word with corresponding activation word data stored in a recognition dataset on the network microphone device, (ii) determining a confidence score of the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data, and (iii) comparing the confidence score with a predetermined threshold score, wherein the predetermined threshold score has a first value if the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data is associated with a first voice service, and the predetermined threshold score has a second, different value if the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data is associated with a second voice service; 

transmitting, via the network interface, the first portion of the received voice data representing the activation word and the second portion of the received voice data representing the voice command to the selected voice service only if the confidence score is greater than or equal to the predetermined threshold score; 
receiving, from the identified voice service, an indication of whether the first portion of the received voice data representing the activation word was recognized by the identified voice service; and
in response to the received indication of whether the first portion of the received voice data representing the activation word was recognized by the identified voice service, updating the activation word data in the recognition dataset.

2.	(Canceled)

3.	(Canceled)

4.	(Canceled)

5.	(Currently amended)	The network microphone device of claim [[3]] 1, wherein the network microphone device is a first device of the media playback system, and wherein the instructions stored on the memory further include instructions for:
predetermined threshold score.

6.	(Previously presented)	The network microphone device of claim 5, wherein the confidence score is a first confidence score, and wherein the instructions stored on the memory further include instructions for:
receiving, via the network interface from the second device, an indication of a second confidence score, wherein the second confidence score is greater than the first confidence score; 
comparing the second confidence score with the predetermined threshold score; and 
transmitting, via the network interface, the first portion of the received voice data representing the activation word and the second portion of the received voice data representing the voice command to the identified voice service only if the second confidence score is greater than or equal to the predetermined threshold score.

7.	(Original)	The network microphone device of claim 5, further comprising:
a transducer configured to output audio, 
wherein the confidence score is a first confidence score, and wherein the instructions stored on the memory further include instructions for:

comparing the second confidence score with the predetermined threshold score; and 
outputting, via the transducer, a request for additional user voice input if the second confidence score is less than the predetermined threshold score.

8.	(Canceled)

9.	(Currently amended)	The network microphone device of claim [[3]] 1, wherein updating the activation word data in the recognition dataset comprises adjusting the predetermined threshold score.

10. 	(Currently amended)	A tangible, non-transitory computer-readable medium storing instructions that, when executed by one or more processors of a network microphone device, cause the network microphone device to perform operations comprising:
receiving, via one or more microphones of the network microphone device, voice data indicating a voice input, wherein the received voice data includes a first portion representing an activation word corresponding to one of a plurality of voice services and a second portion representing a voice command, wherein the plurality of voice services are externally registered to a media playback system associated with the networked microphone device;
identifying, prior to performing speech recognition on the second portion of the received voice data representing the voice command, from among the plurality of voice services, a voice service to process the voice input, wherein the identifying comprises (i) determining a closest match of the first portion of the received voice data representing the activation word with corresponding activation word data stored in a recognition dataset on the network microphone device, (ii) determining a confidence score of the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data, and (iii) comparing the confidence score with a predetermined threshold score, wherein the predetermined threshold score has a first value if the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data is associated with a first voice service, and the predetermined threshold score has a second, different value if the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data is associated with a second voice service; 
selecting, based on the determined closest match, the identified voice service and foregoing selection of another voice service; 
transmitting, via a network interface of the network microphone device, the first portion of the received voice data representing the activation word and the second portion of the received voice data representing the voice command to the selected voice service only if the confidence score is greater than or equal to the predetermined threshold score; 
receiving, from the identified voice service, an indication of whether the first portion of the received voice data representing the activation word was recognized by the identified voice service; and
in response to the received indication of whether the first portion of the received voice data representing the activation word was recognized by the identified voice service, updating the activation word data in the recognition dataset.



12.	(Canceled)

13.	(Canceled)	

14.	(Currently amended)	The tangible, non-transitory computer-readable medium of claim [[12]] 10, wherein the network microphone device is a first device of a plurality of devices in a media playback system, the instructions further including instructions for:
transmitting, via the network interface, the first portion of the received voice data representing the activation word and the second portion of the received voice data representing the voice command to a second device in the media playback system, wherein the second device is configured to further analyze the first portion of the received voice data representing the activation word and the second portion of the received voice data representing the voice command if the confidence score is less than the predetermined threshold score.

15.	(Previously presented)	The tangible, non-transitory computer-readable medium of claim 14, wherein the confidence score is a first confidence score, the instructions further including instructions for:
receiving, via the network interface from the second device, an indication of a second confidence score, wherein the second confidence score is greater than the first confidence score; 
comparing the second confidence score with the predetermined threshold score; and 


16.	(Original)	The tangible, non-transitory computer-readable medium of claim 14, wherein the confidence score is a first confidence score, the instructions further including instructions for:
receiving, via the network interface from the second device, an indication of a second confidence score, wherein the second confidence score is greater than the first confidence score; 
comparing the second confidence score with the predetermined threshold score; and 
outputting, via a transducer of the network microphone device, a request for additional user voice input if the second confidence score is less than the predetermined threshold score.

17.	(Canceled)	

18.	(Currently amended)	The tangible, non-transitory computer-readable medium of claim [[12]] 10, wherein updating the recognition data set comprises adjusting the predetermined threshold score.

19. 	(Currently amended)	A method of operating a network microphone device, the method comprising:

identifying, prior to performing speech recognition on the second portion of the received voice data representing the voice command, by the network microphone device from among the plurality of voice services, a voice service to process the voice input, wherein the identifying comprises (i) determining a closest match of the first portion of the received voice data representing the activation word with corresponding activation word data stored in a recognition dataset on the network microphone device, (ii) determining a confidence score of the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data, and (iii) comparing the confidence score with a predetermined threshold score, wherein the predetermined threshold score has a first value if the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data is associated with a first voice service, and the predetermined threshold score has a second, different value if the closest match of the first portion of the received voice data representing the activation word with the corresponding activation word data is associated with a second voice service; 
selecting, based on the determined closest match, the identified voice service and foregoing selection of another voice service; 
transmitting, via a network interface of the network microphone device, the first portion of the received voice data representing the activation word and the second portion of the received voice data representing the voice command to the selected voice service only if the confidence score is greater than or equal to the predetermined threshold score;
receiving, from the identified voice service, an indication of whether the first portion of the received voice data representing the activation word was recognized by the identified voice service; and
in response to the received indication of whether the first portion of the received voice data representing the activation word was recognized by the identified voice service, updating the activation word data in the recognition dataset.

20.	(Canceled)

21.	(Currently amended)		The network microphone device of claim [[4]] 1, wherein updating the activation word data in the recognition dataset comprises adjusting the first value of the predetermined threshold score.

22.	(Currently amended)		The tangible, non-transitory computer-readable medium of claim [[13]] 10, wherein updating the activation word data in the recognition dataset comprises adjusting the first value of the predetermined threshold score.

23.	(New)		The method of claim 19, wherein the network microphone device is a first device of the media playback system, and wherein the method further comprises:
transmitting, via the network interface of the network microphone device, the first portion of the received voice data representing the activation word and the second portion of the received 

24.	(New)		The method of claim 23, wherein the confidence score is a first confidence score, and wherein the method further comprises:
receiving, via the network interface of the network microphone device from the second device, an indication of a second confidence score, wherein the second confidence score is greater than the first confidence score; 
comparing the second confidence score with the predetermined threshold score; and 
transmitting, via the network interface of the network microphone device, the first portion of the received voice data representing the activation word and the second portion of the received voice data representing the voice command to the identified voice service only if the second confidence score is greater than or equal to the predetermined threshold score.

25.	(New)		The method of claim 23, wherein the confidence score is a first confidence score, and wherein the method further comprises:
receiving, via the network interface of the network microphone device from the second device, an indication of a second confidence score, wherein the second confidence score is greater than the first confidence score; 
comparing the second confidence score with the predetermined threshold score; and 


26.	(New)		The method of claim 19, wherein updating the recognition data set comprises adjusting the predetermined threshold score.

27.	(New)		The method of claim 19, wherein updating the activation word data in the recognition dataset comprises adjusting the first value of the predetermined threshold score.
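
For illustration only, the flow recited in independent claims 1, 10, and 19 (closest match, confidence score, per-service threshold, conditional transmission, and update on feedback) might be sketched as below. This is a minimal sketch and not the applicant's implementation: the names `ActivationEntry`, `Recognizer`, `handle`, and `feedback` are hypothetical, text strings stand in for activation word audio data, a string-similarity ratio stands in for the confidence score, and the direction of the threshold adjustment is an assumption that the claims do not specify.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class ActivationEntry:
    """Hypothetical record in the on-device recognition dataset."""
    service: str      # voice service associated with this activation word
    word: str         # stored activation word data (text stands in for audio)
    threshold: float  # threshold score; may differ from service to service


@dataclass
class Recognizer:
    entries: list
    sent: list = field(default_factory=list)  # stand-in for the network interface

    def identify(self, activation):
        """Return (closest entry, confidence score) for the activation portion."""
        scored = [
            (SequenceMatcher(None, activation.lower(), e.word.lower()).ratio(), e)
            for e in self.entries
        ]
        score, entry = max(scored, key=lambda pair: pair[0])
        return entry, score

    def handle(self, activation, command):
        """Forward both portions only if the score meets the service's threshold.

        The command portion is passed through untouched: no speech
        recognition is performed on it before the service is identified.
        """
        entry, score = self.identify(activation)
        if score >= entry.threshold:
            self.sent.append((entry.service, activation, command))
            return entry.service
        return None

    def feedback(self, service, recognized, step=0.05):
        """Adjust a service's threshold from its recognition indication.

        Assumption: raise the threshold after a rejected activation word,
        lower it after a confirmed one.
        """
        for e in self.entries:
            if e.service == service:
                delta = -step if recognized else step
                e.threshold = min(1.0, max(0.0, e.threshold + delta))
```

In this sketch, two entries with different `threshold` values correspond to the first and second values of the predetermined threshold score, and `feedback` corresponds to the threshold adjustment of claims 9, 18, 21, 22, 26, and 27.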



Allowable Subject Matter
Claims 1, 5-7, 9-10, 14-16, 18-19, and 21-27 are allowed.
The following is an examiner’s statement of reasons for allowance: 
Following the interview of 06/09/2021, applicant agreed to amend the claims to advance prosecution to allowance. After further review of all prior arguments and careful review of the claims as a whole, the examiner finds that the prior art, taken alone or in combination, fails to teach the claims as a whole, including: receiving, via the one or more microphones, voice data indicating a voice input, wherein the received voice data includes a first portion representing an activation word corresponding to one of a plurality of voice services and a second portion representing a voice command, wherein the plurality of voice services are externally registered to a media playback system associated with the networked microphone device; identifying, prior to performing speech recognition on the second portion of the received 
The above claims are deemed allowable given the precise combination of steps claimed as a whole, including at least the identifying, determining, comparing, and transmitting steps. The closest prior art combination teaches multiple models, engines, and isolated voice services (Google, Siri, Alexa) for diverse phone applications, e.g., music playback, command/control, and GPS, which use known probabilistic/confidence score model procedures for best matching user intent. The prior art does not teach the precise combination of steps as claimed. Even if individual elements exist piecewise in some prior art concepts, though not necessarily in well-known uses, the prior art does not suggest the claimed steps in combination. Therefore the prior art fails to teach the claims as a whole.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 


	Multi-model speech analysis

Phillips; Michael S. et al.	US 20110066634 A1
	Contextual ASR remote and local

Boesen; Peter Vincent	US 20170151930 A1
	Vehicle sensor based awareness

Shams; Nima Lahijani et al.	US 20170076212 A1
	User tracking with multimodal sensors

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL C COLUCCI whose telephone number is (571)270-1847.  The examiner can normally be reached on M-F 9 AM - 5 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached at (571)272-7453.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.



/MICHAEL COLUCCI/
Primary Examiner, Art Unit 2656
(571)-270-1847
Examiner FAX:  (571)-270-2847
Michael.Colucci@uspto.gov