Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION


Response to Arguments
Applicant's arguments filed 08/12/2022 have been fully considered but they are not persuasive. On pages 10-11 of the arguments, Applicant argues that the amendment is not taught. Examiner disagrees.
For instance, in Shariffi 0006, a server is used to generate scores that correlate directly to a type of pattern (e.g. a thermostat temperature inquiry versus a command to make a phone call). The score comparison takes place on the user device, although the information used for the comparison is generated by the server. The scores represent input patterns. Following the comparison of scores (patterns), once the type of pattern is established, the server can also be subsequently contacted for outbound transmission, such as to process a command as in 0035, i.e. hotword, wake up, ASR.
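The on-device comparison described above can be illustrated with a minimal sketch. It is offered for illustration only and is not taken from Shariffi or Koulomzin: the names PatternType, REFERENCE, classify_on_device and route, the example phrases, and the 0.5 threshold are all hypothetical assumptions. The sketch shows a device comparing match scores locally and contacting a server only when the matched pattern is of the analysis type.

```python
from enum import Enum


class PatternType(Enum):
    ANALYSIS = "analysis-type"  # wake word only; later speech needs server analysis
    COMMAND = "command-type"    # the utterance itself is treated as a command
    NONE = "no match"


# Hypothetical table standing in for comparison information supplied by a server.
REFERENCE = {
    "ok computer": PatternType.ANALYSIS,
    "play": PatternType.COMMAND,
}


def classify_on_device(scores, threshold=0.5):
    """Pick the best-scoring label locally; the threshold is an assumed value."""
    if not scores:
        return PatternType.NONE
    label, best = max(scores.items(), key=lambda item: item[1])
    if best < threshold or label not in REFERENCE:
        return PatternType.NONE
    return REFERENCE[label]


def route(scores):
    """Decide what the device does after the local comparison."""
    kind = classify_on_device(scores)
    if kind is PatternType.ANALYSIS:
        return "connect_to_server"  # first type: establish a server connection
    if kind is PatternType.COMMAND:
        return "execute_locally"    # second type: no speech analysis engine needed
    return "ignore"
```

In this sketch the server is contacted only on the first branch, which mirrors the distinction drawn above between the wake portion and the command portion of an input.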

	
	

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 21-41 are rejected under 35 U.S.C. 103 as being unpatentable over Koulomzin; Daniel G. US 20150248885 A1 (hereinafter Koulomzin) in view of Sharifi; Matthew US 20160104480 A1 (hereinafter Shariffi).
Re claims 21, 27, and 33, Koulomzin teaches
21. (New) A system, comprising: 
a remote server having a speech analysis engine, and the speech analysis engine being configured to perform natural language processing on audio; (detecting a sound signature for a hotword, otherwise processes audio via ASR or an alternative means such as NL processing, 0037, 0038 0069 0054 0077 with fig. 2a)
a first device having an activation trigger engine, and the activation trigger engine being configured to: receive an audio input, (activation being a keyword which can be any word e.g. “play” or “song”, microphone input and processor, detecting a sound signature for a hotword, otherwise processes audio via ASR or an alternative means such as NL processing, 0037, 0038 0069 0054 0077 with fig. 2a)
transmit at least a portion of the audio input to the remote server (remote server processes inputs, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)
wherein the remote server is further configured to detect a command from the audio input received from the first device.  (entire premise is to detect a command and intent, remote server processes inputs, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)
in response to determining that the pattern is the second type of pattern: determine that the audio input includes a command associated with the command-type activation trigger without using the speech analysis engine, and (execution and analysis can take place where a single word is both command and wakeup in common circumstances when a device is listening, such that the hotword detector detects raw audio input; see also 0077 for audio pattern signatures, such as a wakeup word, but with the purpose of recognizing the hotword itself as a command independent of ASR; here the system proceeds using the hotword detector if it can, e.g. “play” regarding a media player, where “play” can simultaneously be the wakeup word and the command, fig. 2a and fig. 3 with 0037, 0075; in this instance a separate ASR server happens to be present in fig. 2a at 206, which is not needed unless speech-to-text or ASR operations are warranted...)


While Koulomzin teaches hotword or audio detection prior to commands, it fails to teach
compare the audio input with a pattern, determine, on the device, the result of the comparison indicates that the audio input matches a first type of pattern or a second type of pattern, wherein the first type of pattern indicates the audio input includes an analysis-type activation trigger and does not include a command-type activation trigger, and the second type of pattern indicates the audio input includes a command-type activation trigger, and 
in response to determining that the pattern is the first type of pattern: establish a connection with the remote server,
(Shariffi 0006, a server is used to generate scores that correlate directly to a type of pattern (e.g. a thermostat temperature inquiry versus a command to make a phone call). The score comparison takes place on the user device, although the information used for the comparison is generated by the server. The scores represent input patterns. Following the comparison of scores (patterns), once the type of pattern is established, the server can also be subsequently contacted for outbound transmission, such as to process a command as in 0035, i.e. hotword, wake up, ASR… Regarding hotwords, in Shariffi a user can say a hotword, “Ok computer”, and the system can analyze this to determine the best application/device the user is attempting to access. Following this step, the system then processes any following command, e.g. “Ok Computer” (wake word, for instance)… “Call Alice” (command portion). Another example is “Ok computer, remind me to buy milk”. The command portion and the wake portion are treated differently, analogous to the claims. See 0027-0028 for the wake word processing (non-command portion) and 0035 for the command portion processing… This saves time and memory by not unnecessarily processing command portions until a wake word, and a matching device thereof, is found)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Koulomzin to incorporate the above claim limitations as taught by Shariffi, so as to score an initial hotword (wake word) to determine the correct, highest-scoring device and thereafter suspend operation of the remaining candidate devices. Once a wake word and a device are paired, the subsequent processing of a command can take place without all devices simultaneously processing high scores. That is, instead of devices A, B, C, and D each producing a score and all processing the command, only, say, device B will process the command, thereby expressly saving processing time and resources for a faster acquisition of user requests, and additionally preventing errors while maximizing model/system accuracy when learning/adapting to inputs.
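The device-selection rationale above can be illustrated with a minimal sketch. It is offered for illustration only and is not asserted to be the implementation of either reference: the Candidate class, the select_device function, and the example score values are hypothetical assumptions. The sketch shows several candidate devices scoring the same wake word, with only the highest-scoring device going on to process the command.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    hotword_score: float  # assumed confidence that this device heard the wake word


def select_device(candidates):
    """Return the winning device name and the names of the suspended devices."""
    if not candidates:
        raise ValueError("no candidate devices")
    winner = max(candidates, key=lambda c: c.hotword_score)
    suspended = [c.name for c in candidates if c is not winner]
    return winner.name, suspended


# Hypothetical scores for devices A, B, C, and D hearing the same wake word.
devices = [
    Candidate("A", 0.41),
    Candidate("B", 0.87),
    Candidate("C", 0.52),
    Candidate("D", 0.30),
]
winner, suspended = select_device(devices)
```

Under these assumed scores, only device B would continue to the command portion while A, C, and D stop processing, which is the resource saving the rationale relies on.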


Re claims 22, 28, and 34, Koulomzin teaches
22. (New) The system of claim 21, wherein the first device is further configured to: receive the command from the remote server; and execute the command.  (receive input, send to remote server which processes inputs to execute the user intent, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)

Re claims 23, 29, and 35, Koulomzin teaches
23. (New) The system of claim 21, wherein the remote server is further configured to: execute the command.  (receive input, send to remote server which processes inputs to execute the user intent, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)

Re claims 24, 30, and 36, Koulomzin teaches
24. (New) The system of claim 21, wherein the first device is further configured to: execute a command associated with the command-type activation trigger responsive to determining that the audio input includes the command-type activation trigger.  (hotword is executed e.g. “play”, overall it shall receive input, send to remote server which processes inputs to execute the user intent, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)


Re claims 25, 31, and 37, Koulomzin teaches
25. (New) The system of claim 21, wherein the first device determines the command-type activation trigger without using speech-to-text conversion.  (system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)


Re claims 26, 32, and 38, Koulomzin teaches
26. (New) The system of claim 21, wherein the first device includes a speech analysis engine, the speech analysis engine configured to: 
selectively operate in an inactive mode and an active mode; and (trigger shifts the system from inactive to active mode, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)

transition from the inactive mode to the active mode responsive to determining that the audio input includes a command-type activation trigger; and (trigger when detected shifts the system from inactive to active mode, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)
while in the active mode, perform natural language processing on the audio input. (trigger when detected shifts the system from inactive to active mode, system attempts to detect a sound signature for a hotword, but does not preclude other operations, for instance it otherwise processes audio via ASR or an alternative means such as NL processing, 0038 0069 0054 0077 with fig. 2a)
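The inactive-to-active transition mapped above can be illustrated with a minimal sketch. It is offered for illustration only and is not taken from Koulomzin: the SpeechAnalysisEngine class, its method names, and the whitespace tokenizer used as a stand-in for natural language processing are hypothetical assumptions.

```python
class SpeechAnalysisEngine:
    """Toy engine that analyzes input only after a command-type trigger."""

    def __init__(self):
        self.active = False  # the engine starts in the inactive mode

    def on_trigger(self, is_command_type_trigger):
        # Transition to the active mode only when a command-type trigger is detected.
        if is_command_type_trigger:
            self.active = True

    def process(self, utterance):
        # While inactive, no analysis is performed at all.
        if not self.active:
            return None
        # Placeholder for natural language processing: lowercase and tokenize.
        return utterance.lower().split()


engine = SpeechAnalysisEngine()
before = engine.process("Play Music")  # engine is still inactive here
engine.on_trigger(True)
after = engine.process("Play Music")   # engine is now active
```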

Re claims 39-41, Koulomzin teaches
39. (New) The system according to claim 21, wherein the second type of pattern indicates the audio input includes an analysis-type activation trigger and the command-type activation trigger (relating to instances where there are adjacent subsequent inputs at a later time, or to the instance where a single input is both wake and command in itself; execution and analysis can take place where a single word is both command and wakeup in common circumstances when a device is listening, such that the hotword detector detects raw audio input; see also 0077 for audio pattern signatures, such as a wakeup word, but with the purpose of recognizing the hotword itself as a command independent of ASR; here the system proceeds using the hotword detector if it can, e.g. “play” regarding a media player, where “play” can simultaneously be the wakeup word and the command, fig. 2a and fig. 3 with 0037, 0075; in this instance a separate ASR server happens to be present in fig. 2a at 206, which is not needed unless speech-to-text or ASR operations are warranted...)


THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

US 20180061419 A1	Melendo Casado; Diego et al.
Hotword detection device distinguish

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL COLUCCI whose telephone number is (571)270-1847.  The examiner can normally be reached on M-F 9 AM - 7 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders can be reached at (571)272-7516.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/MICHAEL COLUCCI/
Primary Examiner, Art Unit 2655
(571)-270-1847
Examiner FAX:  (571)-270-2847
Michael.Colucci@uspto.gov