DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 3/25/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Status of Claims
Claims 23-29 are cancelled and claims 31-37 are added, leaving claims 1-22 and 30-37 pending in this application.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-6, 12-17 and 30-32 are rejected under 35 U.S.C. 103 as being unpatentable over Hart et al. (U.S. Patent Application Publication 2014/0249817) in view of Kim (U.S. Patent Application Publication 2019/0272459).
As per claims 1 and 13, Hart et al. discloses:
A device (Figure 1, item 106 and Paragraph [0016] – voice controlled device) comprising:
 a memory configured to store instructions (Figure 1, item 110 and Paragraph [0018] – memory storing applications); and
one or more processors (Figure 1, item 108 and Paragraph [0018] – the voice controlled device has a processor) configured to execute the instructions to: 
receive, via one or more microphones (Paragraphs [0017-0018] – The voice controlled device has microphones), a keyword and a first command spoken by a first user (Paragraph [0025] – the wake up phrase and a command are uttered by a user); 
subsequent to receiving the first command, receive a second command via the one or more microphones without an intervening receipt of the keyword (Paragraph [0044] – the same user utters a subsequent command); and 
based at least in part on determining that the second command is spoken by the first user and that the first user directed the second command to the device, process the second command (Paragraphs [0045] & [0063] – based on the user being positively identified as the same user, the command is executed).
Hart et al. fails to disclose:
based on determining that the first user looked towards the device while speaking the second command, determine that the first user directed the second command to the device;
However, Kim in the same field of endeavor teaches:
based on determining that the first user looked towards the device while speaking the second command, determine that the first user directed the second command to the device (Paragraph [0084] – the system determines which device the user is gazing towards and delivers the command to that device);
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the device and method of Hart et al. with the gaze direction determination capabilities of Kim because it is a case of combining prior art elements according to known methods to yield predictable results.
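For illustration only, the decision logic recited in claims 1 and 13, as mapped above, can be sketched as follows. This is a minimal sketch and is not taken from Hart et al. or Kim. The names `Utterance` and `FollowUpGate`, and the assumption that speaker identification and gaze detection results are already available as simple fields, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    speaker_id: str         # assumed output of a speaker-identification step
    has_keyword: bool       # True if the keyword precedes the command
    gazing_at_device: bool  # assumed output of a gaze-detection step
    command: str


class FollowUpGate:
    """Decides whether a received command should be processed."""

    def __init__(self) -> None:
        # User who most recently spoke the keyword, if any.
        self.active_user: Optional[str] = None

    def should_process(self, u: Utterance) -> bool:
        if u.has_keyword:
            # Keyword plus command: remember the speaker and process.
            self.active_user = u.speaker_id
            return True
        # No keyword: process only if the same user spoke the command
        # and looked towards the device while speaking it.
        return (
            self.active_user is not None
            and u.speaker_id == self.active_user
            and u.gazing_at_device
        )
```

In this sketch, a follow-up command without the keyword is processed only when both conditions hold, namely the same speaker and gaze directed at the device.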


As per claim 5, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. Hart et al. in the combination further discloses:
determine that the keyword is spoken by the first user based on voice recognition, facial recognition, or other biometric recognition (Paragraphs [0014] & [0025-0027] – the device recognizes the wake phrase and command via speech recognition and speaker recognition).

As per claim 6, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. Hart et al. in the combination further discloses:
determine that the keyword is spoken by the first user based on a direction of arrival analysis associated with the keyword (Paragraphs [0014] & [0059] – the user can be identified by location, and beamforming techniques are direction of arrival calculations).

As per claim 12, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. Hart et al. in the combination further discloses:
the one or more processors are implemented in an audio device, and wherein the audio device includes a wireless speaker and voice activated device with an integrated assistant application (Figures 1 & 6 and Paragraphs [0076-0082]).

As per claim 14, the combination of Hart et al. and Kim teaches all of the limitations of claim 13 above. Hart et al. in the combination further discloses:
receiving an audio signal from the one or more microphones; determining that a first portion of the audio signal corresponds to the keyword spoken by the first user; and determining that a second portion of the audio signal corresponds to the first command spoken by the first user (Figure 4 and Paragraphs [0012-0014] & [0025-0027]).

As per claim 15, the combination of Hart et al. and Kim teaches all of the limitations of claim 14 above. Hart et al. in the combination further discloses:
determining first voice characteristics indicated by the first portion of the audio signal; and generating, based on the first voice characteristics, a speech model associated with the first user (Figure 4 and Paragraphs [0012-0014] & [0025-0027]).

As per claim 16, the combination of Hart et al. and Kim teaches all of the limitations of claim 14 above. Hart et al. in the combination further discloses:
determining first voice characteristics indicated by the first portion of the audio signal; and in response to determining that the first voice characteristics match a speech model associated with the first user, determining that the keyword is spoken by the first user (Figure 4 and Paragraphs [0012-0014] & [0025-0027]).

As per claim 17, the combination of Hart et al. and Kim teaches all of the limitations of claim 14 above. Hart et al. in the combination further discloses:
determining that a third portion of the audio signal corresponds to the second command; determining second voice characteristics indicated by the third portion of the audio signal; and determining that the second command is spoken by the first user in response to determining that the second voice characteristics match a speech model associated with the first user (Figure 4 and Paragraphs [0012-0014] & [0025-0027]).

As per claim 30, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. Hart et al. in the combination further discloses:
the one or more processors are integrated into at least one of a voice activated device, a wireless speaker and voice activated device, a portable electronic device, a car, a vehicle, a computing device, a communication device, an internet-of-things (IoT) device, a virtual reality (VR) device, or a combination thereof (Figures 1 & 6 and Paragraphs [0076-0082]).

As per claim 31, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. Kim in the combination further discloses:
the determination that the first user looked towards the device while speaking the second command includes a determination that the first user looked toward at least one of the one or more microphones while speaking the second command (Paragraph [0084] – the system determines which device the user is gazing towards and delivers the command to that device. Since the device includes a camera, this meets the limitation. While Kim discusses external devices, it would have been obvious to one having ordinary skill in the art at the time the invention was made to apply it to the command device, since it has been held that forming in one piece an article which has formerly been formed in two pieces and put together involves only routine skill in the art. Howard v. Detroit Stove Works, 150 U.S. 164 (1893).).

As per claim 32, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. Kim in the combination further discloses:
the determination that the first user looked towards the device while speaking the second command includes a determination that the first user looked toward at least one of the one or more microphones while speaking the second command (Paragraph [0084] – the system determines which device the user is gazing towards and delivers the command to that device. Since the device includes a microphone, this meets the limitation. While Kim discusses external devices, it would have been obvious to one having ordinary skill in the art at the time the invention was made to apply it to the command device, since it has been held that forming in one piece an article which has formerly been formed in two pieces and put together involves only routine skill in the art. Howard v. Detroit Stove Works, 150 U.S. 164 (1893).).

Claims 2, 10 and 18-21 are rejected under 35 U.S.C. 103 as being unpatentable over Hart et al. (U.S. Patent Application Publication 2014/0249817) and Kim (U.S. Patent Application Publication 2019/0272459) in view of Parker et al. (U.S. Patent Application Publication 2015/0106089).
As per claim 2, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. The combination fails to disclose:
process the second command based at least in part on determining that a conversation mode is enabled.
However, Parker et al. in the same field of endeavor teaches:
process the second command based at least in part on determining that a conversation mode is enabled (Paragraphs [0025], [0030-0031] & [0036]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the device of Hart et al. and Kim with the conversation mode of Parker et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

As per claim 10, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. The combination fails to disclose:
the one or more processors are included in an integrated circuit.
However, Parker et al. in the same field of endeavor teaches:
the one or more processors are included in an integrated circuit (Paragraph [0087]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the device of Hart et al. and Kim with the integrated circuit of Parker et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

As per claim 18, the combination of Hart et al. and Kim teaches all of the limitations of claim 13 above. The combination fails to disclose:
in response to determining that the keyword is spoken by the first user, initiating a first conversation session associated with the first user.
However, Parker et al. in the same field of endeavor teaches:
in response to determining that the keyword is spoken by the first user, initiating a first conversation session associated with the first user (Paragraphs [0021-0036]).


As per claim 19, the combination of Hart et al., Kim and Parker et al. teaches all of the limitations of claim 18 above. Parker et al. in the combination further discloses:
processing the second command based on determining that the second command is received during the first conversation session (Paragraphs [0021-0036]).

As per claim 20, the combination of Hart et al., Kim and Parker et al. teaches all of the limitations of claim 18 above. Parker et al. in the combination further discloses:
in response to receiving the keyword spoken by a second user during the first conversation session: ending the first conversation session associated with the first user; and initiating a second conversation session associated with the second user (Paragraphs [0021-0036]).

As per claim 21, the combination of Hart et al., Kim and Parker et al. teaches all of the limitations of claim 18 above. Parker et al. in the combination further discloses:
in response to receiving an end session command spoken by the first user, ending the first conversation session associated with the first user (Paragraphs [0021-0036]).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Hart et al. (U.S. Patent Application Publication 2014/0249817) and Kim (U.S. Patent Application Publication 2019/0272459) in view of Olson et al. (U.S. Patent Application Publication 2020/0312318).
As per claim 3, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. The combination fails to disclose:
process the second command based at least in part on determining that the second command is received within a threshold duration of receiving the first command.
However, Olson et al. in the same field of endeavor teaches:
process the second command based at least in part on determining that the second command is received within a threshold duration of receiving the first command (Paragraph [0064] – the system has a no-speak timeout window).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the devices of Hart et al. and Kim with the timeout capabilities of Olson et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Hart et al. (U.S. Patent Application Publication 2014/0249817) and Kim (U.S. Patent Application Publication 2019/0272459) in view of Blanksteen et al. (U.S. Patent 9,098,467).
As per claim 4, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. The combination fails to disclose:
subsequent to receiving the second command, receive, via the one or more microphones, the keyword spoken by a second user; receive, via the one or more microphones, a third command spoken by the first user; and based on determining that the third command spoken by the first user is received subsequent to receiving the keyword spoken by the second user without an intervening receipt of the keyword spoken by the first user, refrain from processing the third command.
However, Blanksteen et al. in the same field of endeavor teaches:
subsequent to receiving the second command, receive, via the one or more microphones, the keyword spoken by a second user; receive, via the one or more microphones, a third command spoken by the first user; and based on determining that the third command spoken by the first user is received subsequent to receiving the keyword spoken by the second user without an intervening receipt of the keyword spoken by the first user, refrain from processing the third command (Figures 2 & 4, claims 16-20 and Column 7, lines 4-67 & Column 8, lines 42-56).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the devices of Hart et al. and Kim with the user identification capabilities of Blanksteen et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Hart et al. (U.S. Patent Application Publication 2014/0249817), Kim (U.S. Patent Application Publication 2019/0272459) and Blanksteen et al. (U.S. Patent 9,098,467) in view of Liu et al. (U.S. Patent 10,957,329).
As per claim 7, the combination of Hart et al., Kim and Blanksteen et al. teaches all of the limitations of claim 4 above. The combination fails to disclose:
wherein receiving the keyword from the first user initiates a first session with the first user, wherein receiving the keyword from the second user initiates a second session with the second user, and wherein initiating the second session ends the first session.
However, Liu et al. in the same field of endeavor teaches:
wherein receiving the keyword from the first user initiates a first session with the first user, wherein receiving the keyword from the second user initiates a second session with the second user, and wherein initiating the second session ends the first session (Figures 6A & 6B, Claim 11 and Column 29, line 14 – Column 30, line 28 – different users have personalized keywords for each assistant system and changing assistant systems ends a session).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the devices of Hart et al., Kim and Blanksteen et al. with the session switching capabilities of Liu et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

As per claim 8, the combination of Hart et al., Kim, Blanksteen et al. and Liu et al. teaches all of the limitations of claim 7 above. Blanksteen et al. in the combination further discloses:
wherein the first session is associated with a period of time when the first session is scheduled to be active, wherein the keyword is received from the second user during the period of time, and wherein the third command is received during the period of time (Figure 2 and Column 7, lines 33-67).

Claims 9, 11 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Hart et al. (U.S. Patent Application Publication 2014/0249817) and Kim (U.S. Patent Application Publication 2019/0272459) in view of Mamkina et al. (U.S. Patent 10,032,451).
As per claim 9, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. The combination fails to disclose:
a biometric sensor coupled to the one or more processors, wherein the one or more processors are further configured to: receive biometric input from the biometric sensor; and determine that the keyword is spoken by the first user based on the biometric input.
However, Mamkina et al. in the same field of endeavor teaches:
a biometric sensor coupled to the one or more processors, wherein the one or more processors are further configured to: receive biometric input from the biometric sensor; and determine that the keyword is spoken by the first user based on the biometric input (Column 28, line 21 – Column 30, line 62 & Column 35, lines 45-63).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the devices of Hart et al. and Kim with the image and biometric capabilities of Mamkina et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

As per claim 11, the combination of Hart et al. and Kim teaches all of the limitations of claim 1 above. The combination fails to disclose:
the one or more processors are included in a vehicle.
However, Mamkina et al. in the same field of endeavor teaches:
the one or more processors are included in a vehicle (Figure 12, item 110e and Column 30, lines 9-38 & Column 36, lines 42-64).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the devices of Hart et al. and Kim with the vehicular capabilities of Mamkina et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

As per claim 22, the combination of Hart et al. and Kim teaches all of the limitations of claim 13 above. The combination fails to disclose:
receiving, via an image sensor, an image of the first user; and in response to determining that the image of the first user is received concurrently with receiving the keyword, determining that the keyword is spoken by the first user.
However, Mamkina et al. in the same field of endeavor teaches:
receiving, via an image sensor, an image of the first user; and in response to determining that the image of the first user is received concurrently with receiving the keyword, determining that the keyword is spoken by the first user (Column 28, line 21 – Column 30, line 62 & Column 35, lines 45-63).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the devices of Hart et al. and Kim with the image and biometric capabilities of Mamkina et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

Claim 33 is rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (U.S. Patent 10,957,329) in view of Blanksteen et al. (U.S. Patent 9,098,467).
As per claim 33, Liu et al. discloses:
A device (Figure 11) comprising: 
a memory configured to store instructions (Figure 11, item 1104 and Column 44, lines 21-57); and 
one or more processors configured to execute the instructions (Figure 11, item 1102 and Column 44, lines 21-57) to: 
receive, via one or more microphones (Figure 11, item 1108 and Column 45, lines 16-37), a keyword and a first command spoken by a first user; 
initiate a first conversation session with the first user based on receipt of the keyword and the first command from the first user (Figures 6A & 6B, Claim 11 and Column 29, line 14 – Column 30, line 28 – different users have personalized keywords for each assistant system and changing assistant systems ends a session); 
receive, via the one or more microphones, the keyword and a second command spoken by a second user while the first conversation session with the first user is ongoing (Figures 6A & 6B, Claim 11 and Column 29, line 14 – Column 30, line 28 – different users have personalized keywords for each assistant system and changing assistant systems ends a session); 
terminate the first conversation session with the first user and initiate a second conversation session with the second user based on receipt of the keyword and the second command from the second user (Figures 6A & 6B, Claim 11 and Column 29, line 14 – Column 30, line 28 – different users have personalized keywords for each assistant system and changing assistant systems ends a session); and
Liu et al. fails to disclose but Blanksteen et al. in the same field of endeavor discloses: 
while the second conversation session with the second user is ongoing: 
process a third command that is received, via the one or more microphones, from the second user without an intervening receipt of the keyword; and 
refrain from processing a fourth command that is received, via the one or more microphones, from the first user (Figures 2 & 4, claims 16-20 and Column 7, lines 4-67 & Column 8, lines 42-56).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the device of Liu et al. with the user identification capabilities of Blanksteen et al. because it is a case of combining prior art elements according to known methods to yield predictable results.
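For illustration only, the session behavior recited in claim 33, as mapped above, can be sketched as follows. This is a minimal sketch and is not taken from Liu et al. or Blanksteen et al. The class name `ConversationSessions`, its fields, and the assumption that each utterance arrives with an already identified speaker are hypothetical.

```python
from typing import List, Optional, Tuple


class ConversationSessions:
    """Tracks a single ongoing conversation session at a time."""

    def __init__(self) -> None:
        self.session_user: Optional[str] = None
        self.processed: List[Tuple[str, str]] = []

    def on_utterance(self, speaker: str, command: str, has_keyword: bool) -> bool:
        if has_keyword:
            # A keyword from any user terminates the ongoing session (if any)
            # and initiates a new session with that user.
            self.session_user = speaker
        elif speaker != self.session_user:
            # No keyword and not the session owner: refrain from processing.
            return False
        self.processed.append((speaker, command))
        return True
```

In this sketch, a third command from the second user is processed without the keyword while the second session is ongoing, and a fourth command from the first user is not processed.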

As per claim 36, Liu et al. discloses:
A device (Figure 11) comprising: 
a memory configured to store instructions (Figure 11, item 1104 and Column 44, lines 21-57); and 
one or more processors configured to execute the instructions (Figure 11, item 1102 and Column 44, lines 21-57) to: 
initiate a first conversation session with a first user based on receiving, via one or more microphones, a keyword from the first user, wherein commands received from the first user via one or more microphones during the first conversation session are processed (Figures 6A & 6B, Claim 11 and Column 29, line 14 – Column 30, line 28 – different users have personalized keywords for each assistant system and changing assistant systems ends a session); and 
terminate the first conversation session with the first user and initiate a second conversation session with the second user based on receiving, via the one or more microphones, the keyword from the second user during the first conversation session, wherein commands received from the second user via the one or more microphones during the second conversation session are processed (Figures 6A & 6B, Claim 11 and Column 29, line 14 – Column 30, line 28 – different users have personalized keywords for each assistant system and changing assistant systems ends a session).
Liu et al. fails to disclose but Blanksteen et al. in the same field of endeavor teaches:
commands received from a second user via the one or more microphones during the first conversation session are not processed (Figures 2 & 4, claims 16-20 and Column 7, lines 4-67 & Column 8, lines 42-56);
commands received from the first user via the one or more microphones during the second conversation session are not processed (Figures 2 & 4, claims 16-20 and Column 7, lines 4-67 & Column 8, lines 42-56).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the device of Liu et al. with the user identification capabilities of Blanksteen et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

Claims 34-35 and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (U.S. Patent 10,957,329) and Blanksteen et al. (U.S. Patent 9,098,467) in view of Hart et al. (U.S. Patent Application Publication 2014/0249817).
As per claim 34, the combination of Liu et al. and Blanksteen et al. teaches all of the limitations of claim 33 above. The combination fails to disclose but Hart et al. in the same field of endeavor teaches:
determine that the fourth command is received from the first user based on a first direction of arrival analysis of the fourth command; and determine that the third command is received from the second user based on a second direction of arrival analysis of the third command (Paragraphs [0014] & [0059] – the user can be identified by location and beamforming techniques are direction of arrival calculations).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the device of Liu et al. and Blanksteen et al. with the directional capabilities of Hart et al. because it is a case of combining prior art elements according to known methods to yield predictable results.

As per claim 35, the combination of Liu et al., Blanksteen et al. and Hart et al. teaches all of the limitations of claim 34 above. Hart et al. in the combination further discloses:
the first direction of arrival analysis corresponds to a first location of the first user and the second direction of arrival analysis corresponds to a second location of the second user (Paragraphs [0014] & [0059] – the user can be identified by location and beamforming techniques are direction of arrival calculations).

As per claim 37, the combination of Liu et al. and Blanksteen et al. teaches all of the limitations of claim 36 above. The combination fails to disclose but Hart et al. in the same field of endeavor teaches:
distinguish between speakers of received commands based on direction of arrival analyses of the received commands (Paragraphs [0014] & [0059] – the user can be identified by location and beamforming techniques are direction of arrival calculations).

Response to Arguments
Applicant’s arguments, see pages 9-10, filed 7/13/2021, with respect to the rejection of claims 1 and 13 under 35 U.S.C. 102 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Kim.

Examiner Notes
The Examiner cites particular columns and line numbers in the references as applied to the claims above for the convenience of the Applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or as disclosed by the Examiner.
Communications via Internet e-mail are at the discretion of the applicant and require written authorization. Should the Applicant wish to communicate via e-mail, including the following paragraph in their response will allow the Examiner to do so:
“Recognizing that Internet communications are not secure, I hereby authorize the USPTO to communicate with me concerning any subject matter of this application by electronic mail. I understand that a copy of these communications will be made of record in the application file.”
Should e-mail communication be desired, the Examiner can be reached at Edwin.Leland@USPTO.gov

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
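A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.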
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWIN S LELAND III whose telephone number is (571)270-5678.  The examiner can normally be reached on 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy Goddard can be reached on (571) 272-7773.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/EDWIN S LELAND III/
Primary Examiner, Art Unit 2677