DETAILED ACTION

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This communication is responsive to the applicant's amendment dated 07/13/2022.  The examiner recognized that applicant(s) amended claims 1, 4-7, 11 and 14-17 (see the amendment: pages 2-5).

Response to Arguments
Applicant's arguments filed on 07/13/2022 with respect to the claim rejection under 35 USC 103 have been fully considered but are moot in view of the new ground(s) of rejection, since the amended claims introduce new issues and/or change the scope of the claims. Accordingly, the response to the applicant’s arguments (see Remarks: pages 6-9) based on the newly amended claims is directed to the new claim rejection with the necessitated new ground(s) (see below). It is also noted that the previously cited references are still applicable to some amended claims for prior art rejection with necessitated new ground(s) (may include newly combined prior art teachings and/or claim interpretations) (see detailed rejection below).
In addition, in response to applicant’s arguments regarding claims 1 and 11 that “the references (Slifka in view of Van Os) fail to disclose enabling a pre-defined voice command pattern based on a current context of the client device where the pre-defined voice command pattern specifies a particular type of action and an identifier representing a plurality of candidate objects of the particular type of action” (Note: content in italic and/or bold font is emphasized by the applicant) (Remarks: page 6, paragraph 3 to page 9, paragraph 1), the examiner has a different view on the prior art teachings of the combined references and the claim interpretations.
It is noted that the claimed limitation of “current context of the client device” can properly be read on the prior art teachings of (i) status/results of ASR processes related to derived ‘intent’ or ‘a desired action’ or determined ‘user’ intent with a ‘matching’ result which is/are associated with a voice command pattern/model, such as a command + target of command, like ‘call mom’ to ‘activate telephone in a user’s device’ (Slifka: col. 8, lines 17-44), and/or (ii) ‘available voice commands or options’ displayed on ‘screen’ for ‘the user selected element’ (Van Os: p24, p28-29), based on the broadest reasonable interpretation of the claimed limitation in light of the specification (p71-p72).  Further, the claimed/argued limitation of “an identifier representing a plurality of candidate objects of the particular type of action” is properly read on the prior art teachings of ‘a target of the command’ as a contact in a voice command ‘call person’, ‘call mom’, or ‘call John’, wherein a person/contact (the target) represents a plurality of candidate persons/contacts (including ‘mom’, ‘John’, other ‘Johns’, or other persons as objects) or a plurality of candidate phone numbers (which may belong to one person) in the contact list/phone book for a phone call (particular type of action) (Slifka: col. 8, lines 17-44, Van Os: p17, 57, p72-p73), based on the broadest reasonable interpretation of the claimed limitation in light of the specification (p72).  Thus, it can be seen that the rejection with the combined teachings of the references fully covers and satisfies the claimed/argued limitation(s) (see detail below).  For the above reason, the applicant’s arguments are not persuasive.

Claim Rejections - 35 USC § 103
Claims 1-4, 6, 8, 11-14, 16 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over SLIFKA (US 9,224,387) in view of Van Os et al. (US 2010/0312547) hereinafter referenced as Van Os.
As per claim 1, SLIFKA discloses ‘targeted detection of regions in speech processing data streams’ (title), comprising:
receiving (or capturing), at data processing hardware residing on a client device (read on ‘CPU’/‘processor’ included in ‘device 202’ referred to one of ‘computers’, ‘cellular phones’, ‘PDAs’), a speech utterance (‘spoken utterance’ or ‘audio data’), (col. 2, line 47 to col. 3, line 55); 
transcribing (or recognizing), by the data processing hardware (including ‘ASR module’), the speech utterance (col. 4, lines 7-37); 
enabling (read on ‘interprets’, ‘processed’, and/or ‘annotate’), by the data processing hardware (same above), a pre-defined voice command pattern (one of ‘models’, such as ‘a command’ + ‘a target’) based on a current context (read on status/results of processes related to derived ‘intent’ or ‘a desired action’ or determined ‘user’ intent with ‘matching’ result, such as a command of ‘call’ followed by a command target of ‘mom’ which is determined/resulted to ‘activate a telephone in his/her device and to initiate a call…’, in a broad sense) of the client device (same above), the pre-defined voice command pattern specifying a particular type of action (read on ‘a desired action’ or ‘command’, such as ‘call’ as ‘command’ to ‘initiate a call with a contact’ or to ‘execute a phone call’) and an identifier (read on ‘a target of the command’, such as ‘mom’) representing a plurality of candidate objects (read on items in ‘contact list’ that inherently has multiple contacts as candidate objects, including ‘mom’ as ‘a contact’ and/or her ‘telephone number’) of the particular type of action (same above), wherein the pre-defined voice command pattern (such as ‘call mom’) is only enabled in the current context (such as when there is a ‘telephone number for “mom” in a contact list’) of the client device (same above) [and the particular type of action specified by the pre-defined voice command pattern is only available in the current context of the client device] (Note: the underlined portion(s) is/are newly amended limitation(s) by the applicant), (col. 8, lines 17-51);
semantically interpreting (‘semantic interpretation of’), by the data processing hardware (including ‘NLU unit’), the transcription (or transcribed ‘text’, recognized ‘words’/ ‘sentence’, or ‘ASR result’) of the speech utterance to determine if the transcription of the speech utterance matches the pre-defined voice command pattern (such as text ‘call mom’) (col 8, lines 17-51); 
when the transcription of the speech utterance matches the pre-defined voice command pattern (same above), determining, by the data processing hardware, that the speech utterance includes a voice command (such as ‘call mom’ or ‘play the who’) corresponding to the pre-defined voice command pattern (col. 8, lines 17-51, col. 13, lines 3-17).
SLIFKA does not expressly disclose the particular type of action specified by the pre-defined voice command pattern being “only available in the current context of the client device.”  However, the same/similar concept/feature is well known in the art as evidenced by Van Os who, in the same field of endeavor, discloses ‘contextual voice commands’ (title), comprising ‘voice commands’ being ‘contextual in inputs’ ‘that indicate different levels or types of context for the commands’ (p (paragraph) 17-p18), providing ‘choice of contextual voice commands and/or instructions to the user using audible and visual indications’ (read on “current context”) (p24), including displaying ‘available voice commands or options’ that the ‘user can select a data item or an element displayed’ (read on “only available in the current context”) on ‘screen’ of the ‘device’ (read on “client device”) (p28-p29), providing ‘a visual indication of the available contextual voice commands’ (p38) by receiving and/or detecting user ‘voice input’, such as ‘Call John’ as one of the ‘voice commands’ (read on pre-defined voice command pattern, i.e. a command/action + a target/object, and also read on being “only enabled in the current context of the client device”), displaying ‘multiple Johns’ as ‘the contact information for everyone named John (wherein the related action, such as ‘call’ one of the ‘Johns’, is also read on “only available in the current context”) listed in the users’ phone book’, and/or further displaying ‘the name(s) of corresponding phone number(s) for each John’, such as displaying ‘John Smith’ associated with ‘multiple numbers’ (wherein the related action, such as ‘call’ one of the numbers of John Smith, is also read on “only available in the current context”) on the ‘screen’ of the ‘device’ (p72).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of SLIFKA and Van Os together by providing a mechanism of enabling a pre-determined voice command pattern (such as in a pattern of action/command + object/target, like ‘call mom’ or ‘call John’ for calling a person in a contact list having multiple contacts) based on a current context of the device in the way that the command pattern is only enabled for displayed voice command(s) (such as visual indications of ‘call’ a person and/or ‘play’ a music) and the type of action is only available in the current context (process status/condition, or visual indications/options, such as visual indications corresponding to command(s) or command(s) with target(s) for a particular type of application, like a phone call to a person in a contact list, or a phone call to a particular person having multiple numbers in a contact list) of the user/client device, as claimed, for the purpose (motivation) of combining physical and voice inputs to implement contextual voice commands that control device operations across different contexts/applications on a device and/or accurately predicting the intent of the user from a single voice command (Van Os: abstract, p3).
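For illustration only, the claim 1 limitations as construed above (a pre-defined pattern of a command/action plus a target identifier, enabled only in a particular device context, against which a transcription is matched) can be sketched as follows. This is a hypothetical Python sketch and is not taken from SLIFKA, Van Os, or the application; the names `VoiceCommandPattern`, `enabled_patterns`, and `match_command`, and the use of a `contacts` entry to stand for the device context, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceCommandPattern:
    action: str            # particular type of action, e.g. "call"
    candidates: frozenset  # candidate objects the identifier stands for


def enabled_patterns(context: dict) -> list:
    """Return the patterns enabled for the given (hypothetical) device context."""
    patterns = []
    contacts = context.get("contacts", [])
    # Assumption: a "call <contact>" pattern is enabled only when the
    # context holds at least one contact to call.
    if contacts:
        names = frozenset(name.lower() for name in contacts)
        patterns.append(VoiceCommandPattern("call", names))
    return patterns


def match_command(transcription: str, context: dict) -> Optional[tuple]:
    """Return (action, object) if the transcription matches an enabled pattern."""
    words = transcription.lower().split()
    if len(words) < 2:
        return None
    action, target = words[0], " ".join(words[1:])
    for pattern in enabled_patterns(context):
        if action == pattern.action and target in pattern.candidates:
            return (pattern.action, target)
    return None
```

Under these assumptions, the same transcription ‘call mom’ matches when the context contains a contact named ‘Mom’ and does not match when the contact list is empty, which is the sense in which the pattern is "only enabled in the current context" in this sketch.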
As per claim 2 (depending on claim 1), SLIFKA in view of VAN OS further discloses “wherein transcribing the speech utterance comprises executing an automatic speech recognizer on the data processing hardware to transcribe audio features (represented by ‘feature vector(s)’) corresponding to the speech utterance into text”, (SLIFKA: col. 5, lines 1-60).
As per claim 3 (depending on claim 1), SLIFKA in view of VAN OS further discloses “after determining that the speech utterance includes the voice command corresponding to the pre-defined voice command pattern (same above), performing (or ‘execute’), by the data processing hardware (same above), the particular type of action specified by the pre-defined voice command pattern (same above)”, (SLIFKA: col 8, lines 32-51). 
As per claim 4 (depending on claim 3), SLIFKA in view of VAN OS further discloses “wherein the transcription of the speech utterance comprises one or more terms (such as ‘mom’, ‘John Smith’) explicitly identifying a respective one (‘target’ as ‘a contact’) of the candidate objects of the plurality of candidate objects (same above, such as ‘available Johns’ in ‘the user’s phone book’) of the particular type of action (same above, such as ‘command’ of ‘call’)”, (SLIFKA: col. 8, lines 32-51; Van Os: p72).
As per claim 6 (depending on claim 1), SLIFKA in view of VAN OS further discloses that “each respective candidate object of the plurality of candidate objects (persons in the ‘contact list’ or ‘phone book’) of the particular type of action specified by the predefined voice command pattern (same above) comprises an identifier (‘target of command’) for a corresponding contact name (read on ‘mom’ or ‘John Smith’ as an identity in the ‘contact list’/‘phone book’)”, (SLIFKA: col. 8, lines 32-51; Van Os: p72).
As per claim 8 (depending on claim 1), the rejection is based on the same reason described for claim 1, because it also reads on the limitations of claim 8.
As per claims 11-14, 16 and 18, they recite a system.  The rejection is based on the same reason described for claims 1-4, 6 and 8 respectively, because the claims recite/include the same/similar limitation(s) as claims 1-4, 6 and 8 respectively (also see SLIFKA: Fig. 2, col. 3, lines 29-51).

Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over SLIFKA in view of VAN OS as applied to claims 1 and 11, and further in view of NOSTRANT (US 9,542,956).
As per claim 7 (depending on claim 1), even though SLIFKA in view of VAN OS discloses “each respective candidate object of the plurality of candidate objects of the particular type of action specified by the predefined voice command pattern” (same above), SLIFKA in view of VAN OS does not expressly disclose that “the identifier for the object of the action specified by the pre-defined voice command pattern comprises an identifier for a city name”.  However, the same/similar concept/feature is well known in the art as evidenced by NOSTRANT who discloses ‘systems and methods for responding to human spoken audio’ (title), comprising converting ‘audio commands’ into ‘string of text’ through ‘speech to text’ (col. 3, lines 37-67), and translating a ‘spoken word command’ such as ‘Weather Los Angeles (read on identifier for a city name)’ into a text query/command (col. 6, lines 33-54).  Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of SLIFKA, VAN OS and NOSTRANT together by providing a mechanism of processing/recognizing a voice/spoken command with an identifier for a city name, as claimed, for the purpose (motivation) of handling domain specific sub-classes of command/query fulfillers and/or related responses (NOSTRANT: col. 9, lines 59-60, col. 6, lines 44-67).
As per claim 17 (depending on claim 11), the rejection is based on the same reason described for claim 7, because the claim recites the same/similar limitation(s) as claim 7.

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over SLIFKA in view of VAN OS as applied to claims 1 and 11, and further in view of BASYE et al. (US 2014/0163978) hereinafter referenced as BASYE. 
As per claim 9 (depending on claim 1), SLIFKA in view of VAN OS does not expressly disclose “prior to transcribing the speech utterance, detecting, by the data processing hardware, using a hotword detector, audio features indicative of a hotword in an initial portion of the speech utterance”.  However, the same/similar concept/feature is well known in the art as evidenced by BASYE who discloses ‘speech recognition power management’ (title), comprising: ‘keyword’ or ‘wakeword’ (read on hotword) detected by ‘100’ or a combined mechanism of ‘106’, ‘108’, ‘110’ and ‘120’ by using ‘acoustic features of speech’ such as ‘spectral slope’, ‘energy levels’, ‘signal-to-noise ratios’, for ‘first audio input’ (prior to) or ‘at least a portion’ of ‘audio input’/‘the obtained audio input’ including ‘speech’ (Fig. 4A, p (paragraph) 1, p10, p20-p28, p50, claim 1), and providing ‘speech recognition’ with ‘language model’ on a portion of ‘a voice command, or query’ also included in ‘the audio input’ (p60-p61, p65-p69).  Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of SLIFKA, VAN OS and BASYE together by providing a mechanism of detecting/recognizing a wakeword (hotword) using feature vector(s)/acoustic features of speech for a first audio input/portion of audio/speech prior to transcribing/recognizing a second audio input/portion of the audio/speech as a voice command/query, as claimed, for the purpose (motivation) of improving energy efficiency of the computing device/system (BASYE: p14).
As per claim 19 (depending on claim 11), the rejection is based on the same reason described for claim 9, because the claim recites the same/similar limitation(s) as claim 9.

Allowable Subject Matter
Claims 5 and 15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
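For illustration only, the timing rule stated above can be sketched as simple date arithmetic. This is a hypothetical Python sketch, not an official USPTO calculation: the function names `add_months` and `shortened_period_expiry` are assumptions, the six-month limit is measured here from the same mailing date passed in, and the sketch does not account for weekends, federal holidays, or extensions of time under 37 CFR 1.136(a).

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping to the last day of the target month."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_expiry(
    mailing: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """End of the shortened statutory period under the rule quoted above."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    expiry = three_months
    # A first reply within two months, followed by an advisory action mailed
    # after the three-month date, moves the expiry to the advisory mailing date.
    if (
        first_reply is not None
        and advisory_mailed is not None
        and first_reply <= add_months(mailing, 2)
        and advisory_mailed > three_months
    ):
        expiry = advisory_mailed
    # The period never runs past six months in this sketch.
    return min(expiry, six_months)
```

Any mailing date supplied to the function is the caller's input; the actual mailing date of this action should be taken from the Office action itself.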
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QI HAN whose telephone number is (571)272-7604.  The examiner can normally be reached on 9-19:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
QH/qh
September 21, 2022
/QI HAN/Primary Examiner, Art Unit 2659