DETAILED ACTION
This communication is in response to the amendments and arguments filed on 07/29/2022. Claims 1-20 are pending and have been examined.
Any previous objection/rejection not mentioned in this Office Action has been withdrawn by the Examiner.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner Note
The Examiner notes the presence of multiple conditional statements in present claim 1 with respect to determining a given speech command, either including the predefined guard phrase or without the guard phrase. Therefore, although the Examiner has mapped both conditional statements for compact prosecution, if further amendments or arguments are made, the Examiner is required to map only one of those conditional statements under Ex parte Schulhauser.

Response to Amendments and Arguments 
The Applicant has amended each of the independent claims. More specifically, claims 1 and 20 have been amended to recite, first, a transition without the use of the guard phrase followed by performing a command and, second, an option in which, if the guard phrase is detected, additional commands are activated and the command is performed. The same combination of references has been used for claims 1 and 20, with a different interpretation of the guard phrase as mapped below. With respect to claim 12, the amendments center on the transition of the interface in response to image data and the use of a user speech model. Hence, the Applicant’s arguments are moot in view of the new grounds of rejection, and further because the Applicant’s arguments amount to a general allegation of patentability over the currently cited references.
The Double Patenting Rejections have been withdrawn by the Examiner as a result of the current claim amendments.  The Examiner reserves the right to request a Terminal Disclaimer in the next round of prosecution based on any further amendments to the claims.


Claim Objections
Claim 1 is objected to because of the following informalities: in the “analyzing…” limitation, “whether any of the plurality of speech commands…” should be “whether any of the one or more speech commands…” because the plurality of speech commands includes speech commands that are not activated, and only the “one or more speech commands” are activated for the given interface. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 12-19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. The claims, as presently amended, require a transition into an interface mode based on image data and, as a result, activation of speech commands so that the captured audio data can be received and checked for correspondence to a voice command. The Applicant cites paragraphs [0024], [0027]-[0028], [0084], [0088], [0093], [0099], [0106], [0110]. However, none of these paragraphs shows support for the transition into a given interface mode in response to receiving image data. The Examiner notes there is support for the capture of images, eye tracking and gaze tracking by way of user input interfaces in order to aid in interpreting input data received in an interface mode (see [0082] as filed). However, nowhere is this connected to the transition into an interface mode based on image data which activates speech commands as currently claimed. The Applicant is advised to identify explicit support in the specification, citing the paragraph and content, or to cancel the new matter as noted.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 13 and 15 recite the limitation "of the plurality of speech commands" in the last limitation of the claim. There is insufficient antecedent basis for this limitation in the claim.
Claim 14 is rejected for its dependency on an indefinite base claim.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.



Claims 1-8, 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Han (US 8,650,036) in view of Monson (US 2013/0090930) in view of Newman (US 2014/0074481).

As to claim 1 Han teaches a method implemented by one or more processors of a computing device, the method comprising:
transitioning the computing device into a given interface mode in response to receiving sensor data captured by at least one sensor of the computing device (see col. 11, lines 30-33, where it is determined that the voice start command is the first voice start command and see col. 11, lines 51-54, where it is determined to be the second voice start command), wherein the sensor data does not include a predetermined guard phrase (e.g. Using the same citations as above, the Examiner notes that it is interpreted that the predetermined guard phrase is the first voice start command);
in response to the computing device being in a given interface mode (see col. 11, lines 56-59, where the second voice task mode is a mode in which the electronic apparatus is controlled, and where the voice start command is detected and checked if it does not correspond to a first voice start command):
activating one or more speech commands of a plurality of speech commands that are specific to the given interface mode (see col. 11, line 65 to col. 12, line 5, where once the start command is determined to be associated with the second mode, then subsequent commands are presented to the user for selection and performing the task),
wherein the given interface mode corresponds to a current state of a user interface of the computing device (see col. 11, lines 16-19, 53-55, where voice start command is detected and checked if it corresponds to a first or second voice start command),
wherein the given interface mode is one of multiple interface modes each corresponding to corresponding alternate states of the user interface (see Figure 6, two modes, first voice mode and second voice mode and see Figure 4 and 5, which shows two different interfaces based on the first and second modes.), and 
wherein the plurality of speech commands includes one or more additional speech commands that are not activated for the given interface mode (see col. 11, lines 39-49, where commands associated with the first mode are not activated as a result of the first voice start command not being received);
while the computing device is in the given interface mode and while the one or more speech commands are activated (see col. 12, lines 1-3, where a plurality of voice items are available for selection by user based on mode):
receiving audio data captured by at least one audio sensor of the computing device (see col. 15, lines 15-37 and col. 12, lines 17-10, where analysis occurs of the different second mode voice guide and if an error has occurred based on user input);
analyzing, based on the one or more speech commands being activated, the audio data to determine whether any of the (plurality: see objection above) one or more speech commands are included in the audio data (see col. 15, lines 15-37 and col. 12, lines 17-10, where analysis occurs of the different second mode voice guide and if an error has occurred);
in response to determining a given speech command, of the one or more speech commands, is included in the audio data: performing one or more actions, via the computing device, that correspond to the given speech command (see col. 12, lines 11-12, where command is performed with respect to the second voice guide information), and
in response to determining that the predetermined guard phrase is included in the audio data (see col. 11, lines 35-38, where first voice task mode is mode in which the electronic device is controlled and where voice start command is detected and checked if it corresponds to a first voice start command): 
activating the one or more additional speech commands for the given interface mode (see col. 11, lines 33-36 and 38-50, where voice start command is determined to be associated with the first mode and then subsequent commands are presented to the user for selection and performing the task), 
determining, based on activating the one or more additional speech commands, that the audio data includes a given additional speech command of the one or more additional speech commands (see col. 11, lines 43-49, where voice input is received corresponding to the voice items in first voice guide information), and 
performing, via the computing device and responsive to activating the one or more additional speech commands, one or more additional actions that correspond to the given additional speech command (see col. 11, lines 49-50, where command is performed).
However, Han does not specifically teach in response to determining that the audio is received within a predetermined period of time of transitioning the computing device into the given interface mode.
Monson does teach in response to determining that the audio is received within a predetermined period of time of transitioning the computing device into the given interface mode (see [0038] where determination is made as to whether the speech input of a context word is received after a spoken trigger word and [0030], where input of a trigger word causes the context menu to be presented which then permits context option selections).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).
However, Han in view of Monson does not specifically teach the “without a predefined guard phrase” and the transitioning.
Newman does teach without a predefined guard phrase (see [0061], where the user is allowed to issue a series of directive commands (i.e. speech commands) without having to repeat the attention command (i.e. guard phrase)),
and transitioning the computing device to a different interface mode (see [0061], where Tgate period is aborted once each direct command is received thereby changing the mode to require the entry of an attention command (see [0062]) (e.g. The Examiner interprets the mode here to be an attention mode vs non attention requiring mode)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han in view of Monson with determining “without a predetermined guard phrase” as taught by Newman in order to provide a convenience to users where the attention command need not be repeated for each directive command (see Newman [0061]).
As to claim 20, apparatus claim 20 and method claim 1 are related as apparatus and the method of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, claim 20 is similarly rejected under the same rationale as applied above with respect to method claim 1. Furthermore, Han discloses a non-transitory computer readable medium (see col. 15, lines 48-54). Furthermore, Han teaches a computing device including a memory and one or more processors configured to execute instructions stored in memory (see col. 5, lines 35-37, electronic apparatus and col. 15, lines 48-54, medium and program code).
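For illustration only, the two conditional branches of claim 1 as interpreted above can be sketched as follows. This is a minimal sketch of the claim language, not of the implementation of Han, Monson or Newman. The class name VoiceInterface, the guard phrase "ok device", the 5-second window and the example commands are all hypothetical, and the sketch assumes that the additional speech commands are activated only for the utterance that carries the guard phrase, which the claim does not specify.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

GUARD_PHRASE = "ok device"  # hypothetical guard phrase, not taken from the record


@dataclass
class VoiceInterface:
    # Commands activated per interface mode (usable without the guard phrase).
    mode_commands: Dict[str, Set[str]]
    # Additional commands that are not activated for the mode by default.
    additional_commands: Set[str]
    window_s: float = 5.0  # assumed "predetermined period of time"
    mode: Optional[str] = None
    entered_at: float = 0.0
    active: Set[str] = field(default_factory=set)

    def transition(self, mode: str, now: float) -> None:
        """Enter an interface mode and activate its mode-specific commands."""
        self.mode = mode
        self.entered_at = now
        self.active = set(self.mode_commands[mode])

    def handle(self, utterance: str, now: float) -> Optional[str]:
        """Return the command to perform, or None if nothing is performed."""
        text = utterance.lower().strip()
        if text.startswith(GUARD_PHRASE):
            # Second branch: the guard phrase activates the additional commands
            # (assumption: for this utterance only).
            allowed = self.active | self.additional_commands
            command = text[len(GUARD_PHRASE):].strip()
            return command if command in allowed else None
        # First branch: no guard phrase, so the command must be activated for
        # the mode and received within the window after the transition.
        if now - self.entered_at <= self.window_s and text in self.active:
            return text
        return None
```

In this sketch, a mode-specific command spoken within the window is performed without the guard phrase, while an additional command is performed only when the guard phrase precedes it.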

As to claim 2, Han in view of Monson in view of Newman teaches all of the limitations as in claim 1, above.
Furthermore, Monson teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are activated (see [0038], where after a trigger phrase is detected, a timer starts allowing for context words to be spoken): determining the predetermined period of time has lapsed (see [0038], a time begins to elapse); and in response to determining the predetermined period of time has lapsed, deactivating the one or more speech commands of the plurality of speech commands that are specific to the given interface mode (see [0038], where the context menu is removed if the context word is not recognized within a particular period of time).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time and deactivation as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).

As to claim 3, Han in view of Monson in view of Newman teaches all of the limitations as in claim 2, above.
Furthermore, Monson teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are deactivated (see [0038], where the context menu is removed if the context word is not recognized within a particular period of time).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time and deactivation as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).
Furthermore, Newman teaches in response to determining the given speech command, of the one or more speech commands, is included in the audio data without the predefined guard phrase, and in response to determining that the audio data is not received within the predetermined period of time (see [0059]-[0060], where when the Tgate expires the gate parameter is set to disabling and no further directive commands are allowed until the attention command is received): refraining from performing one or more of the actions, via the computing device, that correspond to the given speech command (see [0060], where it is described that no directive commands are allowed and therefore these commands cannot be performed).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han in view of Monson with determining “without a predetermined guard phrase” as taught by Newman in order to provide a convenience to users where the attention command need not be repeated for each directive command (see Newman [0061]).

As to claim 4, Han in view of Monson in view of Newman teaches all of the limitations as in claim 3, above.
Furthermore, Newman teaches subsequent to refraining from performing one or more of the actions, via the computing device, that correspond to the given speech command (see [0060], where it is described that no directive commands are allowed and therefore these commands cannot be performed): receiving additional audio data captured by the at least one audio sensor of the computing device, the additional audio data including at least the predefined guard phrase (see [0060], where until the attention command is received no further directive commands are allowed and where attention command is received and gate parameter set to enabling); and re-activating one or more speech commands of the plurality of speech commands that are specific to the given interface mode (see [0059], where once the attention command is spoken then directive commands are permitted).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han in view of Monson with the refraining and re-activating as taught by Newman in order to provide a convenience to users where the attention command need not be repeated for each directive command (see Newman [0061]).
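For illustration only, the sequence recited across claims 2-4 (deactivation once the predetermined period lapses, refraining from performing a command received after the lapse, and re-activation upon receipt of the guard phrase) can be sketched as a simple timed gate. This is a minimal sketch under stated assumptions and does not reproduce the Tgate handling of Newman or the context-menu timer of Monson. The class name CommandGate, the guard phrase "ok device" and the 5-second window are hypothetical.

```python
from typing import Iterable, Optional


class CommandGate:
    """Timed gate over a set of speech commands (illustrative only)."""

    def __init__(self, commands: Iterable[str], window_s: float = 5.0,
                 guard_phrase: str = "ok device") -> None:
        self.commands = set(commands)
        self.window_s = window_s          # assumed predetermined period of time
        self.guard_phrase = guard_phrase  # hypothetical guard phrase
        self.opened_at: Optional[float] = None  # None means deactivated

    def open(self, now: float) -> None:
        """Activate (or re-activate) the commands and restart the window."""
        self.opened_at = now

    def is_active(self, now: float) -> bool:
        """Deactivate the commands once the window has lapsed."""
        if self.opened_at is not None and now - self.opened_at > self.window_s:
            self.opened_at = None
        return self.opened_at is not None

    def handle(self, utterance: str, now: float) -> Optional[str]:
        """Return the command to perform, 'reactivated', or None (refrain)."""
        text = utterance.lower().strip()
        if text == self.guard_phrase:
            self.open(now)
            return "reactivated"
        if self.is_active(now) and text in self.commands:
            return text
        return None  # gate closed or unknown command: refrain from acting
```

Under these assumptions, a command spoken after the window lapses is ignored until the guard phrase is received, after which the same command is performed again.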

As to claim 5, Han in view of Monson in view of Newman teaches all of the limitations as in claim 1, above.
Furthermore, Han teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are activated: displaying a visual cue for one or more of the speech commands via a display of the computing device (see col. 11, lines 35-38, when first task mode is a mode in which the electronic apparatus is controlled and lines 39-47, where voice items concerning the first voice guide also shown and see Figure 4, icon 424 and Fig. 6, step 640).

As to claim 6, Han in view of Monson in view of Newman teaches all of the limitations as in claim 5, above.
Furthermore, Monson teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are activated (see [0029], where spoken trigger is spoken to activate speech recognition functionalities): determining the predetermined period of time has lapsed (see [0038], where time begins to elapse after spoken trigger is detected and context menu is presented up to a particular time interval); and in response to determining the predetermined period of time has lapsed, causing the visual cue for the one or more speech commands to be removed from the display of the computing device (see [0038], where if the context word is not recognized within the particular time interval, the process returns to 400 and context menu is removed from a display).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time and deactivation as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).

As to claim 7, Han in view of Monson in view of Newman teaches all of the limitations as in claim 1, above.
Furthermore, Han teaches wherein activating the one or more speech commands that are specific to the given interface mode comprises loading, at the computing device, at least one hotword process for the one or more speech commands (see col. 11, lines 14-30, where the system is loaded to start off with determining the voice start command in order to determine the mode); and
wherein analyzing, based on the one or more speech commands being activated, the audio data to determine whether any of the one or more speech commands are included in the audio data comprises:
analyzing the audio data using the at least one hotword process (see col. 11, lines 28-30, 50-54, where it is determined which type of voice start command is used in order to determine the one or more commands).

As to claim 8, Han in view of Monson in view of Newman teaches all of the limitations as in claim 1, above.
Furthermore, Han teaches wherein the sensor data captured by the at least one sensor of the computing device comprises preceding audio data captured by the at least one audio sensor of the computing device (col. 5, lines 45-47, microphone), the preceding audio data including at least the predefined guard phrase (see col. 11, lines 31-37, where the voice start command is the first voice start command, which is interpreted as the guard phrase) and an indication of the given interface mode (see col. 11, lines 33-35, 38-50, where the start command is determined to be associated with the first mode).

As to claim 10, Han in view of Monson in view of Newman teaches all of the limitations as in claim 1, above.
Furthermore, Han teaches wherein activating one or more of the speech commands is performed based on receiving additional audio data that includes the predetermined guard phrase while the computing device is in the given interface mode (see col. 11, lines 33-35, 38-50, where the start command is determined to be associated with the first mode then subsequent commands are presented to the user for selection and performing the task and where user provides voice input as a result and see col. 12, lines 32-37, where mode changes can occur based on input of the first voice start command).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Han in view of Monson in view of Newman, as applied in claim 1, above and further in view of Chi (US 2013/0018659).
As to claim 11, Han in view of Monson in view of Newman teaches all of the limitations as in claim 1.
However, Han in view of Monson in view of Newman does not specifically teach wherein the computing device is a head-mountable device.
Chi does teach wherein the computing device is a head-mountable device (see [0085], wearable computing device and see Figure 5A, glasses and [0028], HMD).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have substituted the device as taught by Han in view of Monson in view of Newman with the HMD as taught by Chi in order to provide a predictable result of providing a visual display which takes as input spoken commands which would benefit the system of Han (see KSR v. Teleflex).

Claims 12-13 and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Han (US 8,650,036) in view of Monson (US 2013/0090930) in view of Macho (US 2014/0122087) in view of Rabin (US 6,081,782).
As to claim 12, Han teaches a computing device including memory and one or more processors configured to execute instructions stored in memory  (see col. 5, lines 35-37, electronic apparatus and col. 15, lines 48-54, medium and program code), the instructions comprising instructions to: 
transition the computing device into a given interface mode in response to receiving [[image data capturing a user of the computing device, the image data being generated based on output of at least one sensor]] of the computing device (see col. 11, lines 30-33, where it is determined that the voice start command is the first voice start command and see col. 11, lines 51-54, where it is determined to be the second voice start command); 
in response to the computing device being in the given interface mode (see col. 11, lines 35-38, where the first voice task mode is a mode in which the electronic apparatus is controlled, and where the voice start command is detected and checked if it corresponds to a first voice start command):
activate one or more speech commands that are specific to the given interface mode (see col. 11, lines 33-35, 38-50, where once the start command is determined to be associated with the first mode, then subsequent commands are presented to the user for selection and performing the task),
wherein the given interface mode corresponds to a current state of a user interface of the computing device  (see col. 11, lines 16-19, where voice start command is detected and checked if it corresponds to a first voice start command), and 
wherein the given interface mode is one of multiple interface modes each corresponding to corresponding alternate states of the user interface (see Figure 6, two modes, first voice mode and second voice mode and see Figure 4 and 5, which shows two different interfaces based on the first and second modes.); 
while the computing device is in the given interface mode and while the one or more speech commands are activated (see col. 11, lines 39-43, where a plurality of voice items are available for selection by user based on mode):
receive audio data captured by at least one audio sensor of the computing device (see col. 11, line 48, where voice is input related to channel up/down);
determine, based on processing the audio data and based on the one or more speech commands being activated (see col. 11, lines 40-50, where once the first voice start command received then further command can be provided),
analyzing the audio data to determine whether the audio data includes any of the one or more speech commands, and determining, based on the analyzing, that the audio data includes the given speech command of the one or more speech commands, and (see col. 11, lines 43-49, where it is determined, based on commands associated with the first mode, whether a user input corresponds, and see col. 12, line 54 to col. 13, line 15, where the system determines whether the command matches and, if not, an error is displayed),
perform one or more actions, via the computing device, that correspond to the given speech command (see col. 11, lines 49-50, where command is performed).
However, Han does not specifically disclose in response to determining that the audio is received within a predetermined period of time of transitioning the computing device into the given interface mode.
Monson does teach in response to determining that the audio data indicates that the user of the computing device spoke the given speech command, of the one or more speech commands, and in response to determining that the audio data is received within a predetermined period of time of transitioning the computing device into the given interface mode (see [0038] where determination is made as to whether the speech input of a context word is received after a spoken trigger word and [0030], where input of a trigger word causes the context menu to be presented which then permits context option selections) (e.g. the user specific determination is taught below in the further combination).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).
However, Han in view of Monson does not specifically disclose image data capturing a user of the computing device, the image data being generated based on output of at least one sensor.
Macho does teach transition the computing device into a given interface mode in response to receiving image data capturing a user of the computing device, the image data being generated based on output of at least one sensor of the computing device (see [0036], where image of the user face is captured and if identification data is detected then communication device is activated to accept further speech or voice commands).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han in view of Monson with the image data as taught by Macho in order to prevent a reduction in battery life (see Macho [0003]).
However, Han in view of Monson in view of Macho do not specifically disclose analyzing, based on determining that the audio data includes the given speech command, the audio data using a speech model generated for the user based on one or more previous instances of audio data capturing the user speaking the given speech command.
Rabin does teach that the audio data indicates that the user of the computing device spoke a given speech command of the one or more speech commands, wherein determining that the audio data indicates that the user of the computing device spoke the given speech command of the one or more speech commands includes: analyzing, based on determining that the audio data includes the given speech command, the audio data using a speech model generated for the user based on one or more previous instances of audio data capturing the user speaking the given speech command (see Figure 4, where user is enrolled by user speaking a command and collecting a sample in step 404 which constructs the model and see Figure 5, where user is identified and newly created models of the command are compared to prestored command related models to determine a match and when verification is successful then action executed).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han in view of Monson in view of Macho with the user model as taught by Rabin in order to perform both functions of identity verification and command execution using a single command (see Rabin, col. 1, lines 5-11).

As to claim 13, Han in view of Monson in view of Macho in view of Rabin teach all of the limitations as in claim 12, above.
Furthermore, Monson teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are activated (see [0038], where after a trigger phrase is detected, a timer starts allowing for context words to be spoken): determining the predetermined period of time has lapsed (see [0038], a time begins to elapse); and in response to determining the predetermined period of time has lapsed, deactivating the one or more speech commands of the plurality speech commands that are specific to the given interface mode (see [0038], where the context menu is removed if the context word is not recognized within a particular period of time).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time and deactivation as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).

As to claim 16, Han in view of Monson in view of Macho in view of Rabin teach all of the limitations as in claim 12, above.
 Furthermore, Han teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are activated: displaying a visual cue for one or more of the speech commands via a display of the computing device (see col. 11, lines 35-38, when first task mode is a mode in which the electronic apparatus is controlled and lines 39-47, where voice items concerning the first voice guide also shown and see Figure 4, icon 424 and Fig. 6, step 640).

As to claim 17, Han in view of Monson in view of Macho in view of Rabin teach all of the limitations as in claim 16, above.
Furthermore, Monson teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are activated (see [0029], where a spoken trigger is spoken to activate speech recognition functionalities): determining the predetermined period of time has lapsed (see [0038], where time begins to elapse after the spoken trigger is detected and the context menu is presented up to a particular time interval); and in response to determining the predetermined period of time has lapsed, causing the visual cue for the one or more speech commands to be removed from the display of the computing device (see [0038], where if the context word is not recognized within the particular time interval, the process returns to 400 and the context menu is removed from a display).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time and deactivation as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).

As to claim 18, Han in view of Monson in view of Macho in view of Rabin teach all of the limitations as in claim 12, above.
Furthermore, Han teaches wherein activating the one or more speech commands that are specific to the given interface mode comprises loading, at the computing device, at least one hotword process for the one or more speech commands (see col. 11, lines 14-30, where the system is loaded to start off by determining the voice start command in order to determine the mode); and
wherein analyzing, based on the one or more speech commands being activated, the audio data to determine whether any of the one or more speech commands are included in the audio data comprises:
analyzing the audio data using the at least one hotword process (see col. 11, lines 28-30, 50-54, where it is determined which type of voice start command is used in order to determine the one or more commands).

As to claim 19, Han in view of Monson in view of Macho in view of Rabin teach all of the limitations as in claim 12, above.
Furthermore, Han teaches wherein the sensor data captured by the at least one sensor of the computing device comprises preceding audio data captured by the at least one audio sensor of the computing device (see col. 5, lines 45-47, microphone), the preceding audio data including at least the predefined guard phrase (see col. 11, lines 31-37, where the voice start command is determined to be the first voice start command, which is interpreted as the guard phrase) and an indication of the given interface mode (see col. 11, lines 33-35, 38-50, where the start command is determined to be associated with the first mode).

Claims 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Han (US 8,650,036) in view of Monson (US 2013/0090930) in view of Macho (US 2014/0122087) in view of Rabin (US 6,081,782), as applied to claim 13 above, and further in view of Newman (US 2014/0074481).
As to claim 14, Han in view of Monson in view of Macho in view of Rabin teach all of the limitations as in claim 13, above.
Furthermore, Monson teaches further comprising: while the computing device is in the given interface mode and while the one or more speech commands are deactivated (see [0038], where the context menu is removed if the context word is not recognized within a particular period of time).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han with the predetermined period of time and deactivation as taught by Monson in order to prevent a cumbersome experience by allowing the user to switch between different contexts and/or interfaces (see Monson [0001], [0014]).
However, Han in view of Monson in view of Macho in view of Rabin do not specifically disclose in response to determining the given speech command, of the one or more speech commands, is included in the audio data without the predefined guard phrase, and in response to determining that the audio data is not received within the predetermined period of time: refraining from performing one or more of the actions, via the computing device, that correspond to the given speech command.
Newman does teach in response to determining the given speech command, of the one or more speech commands, is included in the audio data without the predefined guard phrase, and in response to determining that the audio data is not received within the predetermined period of time (see [0059]-[0060], where when the Tgate expires the gate parameter is set to disabling and no further directive commands are allowed until the attention command is received): refraining from performing one or more of the actions, via the computing device, that correspond to the given speech command (see [0060], where it is described that no directive commands are allowed and therefore these commands cannot be performed).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han in view of Monson in view of Macho in view of Rabin with determining “without a predetermined guard phrase” as taught by Newman in order to provide a convenience to users where the attention command need not be repeated for each directive command (see Newman [0061]).

As to claim 15, Han in view of Monson in view of Macho in view of Rabin teach all of the limitations as in claim 14, above.
However, Han in view of Monson in view of Macho in view of Rabin do not specifically disclose subsequent to refraining from performing one or more of the actions, via the computing device, that correspond to the given speech command: receiving additional audio data captured by the at least one audio sensor of the computing device, the additional audio data including at least the predefined guard phrase; and re-activating one or more speech of the of the plurality of speech commands that are specific to the given interface mode.
Newman does disclose subsequent to refraining from performing one or more of the actions, via the computing device, that correspond to the given speech command (see [0060], where it is described that no directive commands are allowed and therefore these commands cannot be performed): receiving additional audio data captured by the at least one audio sensor of the computing device, the additional audio data including at least the predefined guard phrase (see [0060], where until the attention command is received no further directive commands are allowed and where attention command is received and gate parameter set to enabling); and re-activating one or more speech of the of the plurality of speech commands that are specific to the given interface mode (see [0059], where once the attention command is spoken then directive commands are permitted).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the voice command as taught by Han in view of Monson in view of Macho in view of Rabin with the refraining and re-activating as taught by Newman in order to provide a convenience to users where the attention command need not be repeated for each directive command (see Newman [0061]).

Allowable Subject Matter
Claim 9 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The currently cited prior art of record speaks of a single time-out duration for the interface. The prior art of record does not specifically disclose “wherein the predetermined period of time is specific to the given interface mode, and wherein a different predetermined period of time is associated with at least one of the multiple interface modes other than the given interface mode”.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PARAS D SHAH whose telephone number is (571)270-1650.  The examiner can normally be reached on Monday-Thursday 7:30AM-3PM, 5PM-7PM (EST), Friday 8AM-noon (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached on 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Paras D Shah/ Primary Examiner, Art Unit 2659

09/11/2022