DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/21/2021 has been entered.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Response to Amendment
This communication is responsive to the applicant’s amendment dated 12/21/2020.  The applicant(s) amended claims 1, 9, and 16.

Response to Arguments
Applicant's arguments with respect to claims 1, 9, and 16 have been considered but are moot in view of the new ground(s) of rejection because the arguments pertain to the newly amended limitations.

Regarding claim 1, the Applicant argues that “In particular, the voiceprint of Zeljkovic is created using voice samples that the user provides to the mobile device 102 during enrollment, whereas the phrase key of Claim 1 is randomly selected and provided to a user via a virtual assistant device (VAD).” (Remarks: pg. 12) The Examiner respectfully disagrees.
Upon further consideration of the prior art, Zeljkovic teaches a randomly selected phrase key that is provided to a user via a virtual assistant device (par. 0081; ‘At operation 714, the authentication server 106 generates a sample phrase message including a sample phrase for use in authenticating the user via voice recognition utilizing the voice print created during the voice enrollment procedure 600. In some embodiments, the sample phrase includes a random phrase.’).

Regarding claim 1, the Applicant argues, “Zeljkovic does not disclose or suggest a server configured to determine that parsed data matches a phrase key that is randomly selected and provided to a user via the VAD, where a phrase spoken by the user is based on the phrase key.” (Remarks: pg. 12) The Examiner respectfully disagrees.


Regarding claim 1, the Applicant argues, “Giles does not disclose or suggest that a server receives, from a target server, a set of user credentials that authenticates a target service with a personal device.” (Remarks: pg. 13) The Examiner respectfully disagrees.
Upon further consideration of the prior art, the services library of Gilles establishes a relationship with the personal voice assistant (par. 0028; ‘At 264, services library 108 may confirm and/or establish an authenticated symmetrical relationship between personal voice assistant 106 that serves user 100 and the targeted voice-actuable service 110. This established relationship may thereafter be referenced whenever user 100 invokes personal voice assistant 106 to access the targeted service 110, without user 100 having to provide credentials.’). The combination of Zeljkovic and Gilles collectively teaches a server receiving, from a target server, a set of user credentials that authenticates a target service with a personal device.



Regarding claim 9, the Applicant argues, “Additionally, generating a speakable credential does not disclose or suggest a virtual assistant device that is configured to receive, from a server via a target server associated with a target service, a set of user credentials that authenticates a personal device to the target service after a user verbally recites a phrase key.” (Remarks: pg. 15) The Examiner respectfully disagrees.
As noted above, the established relationship between the personal voice assistant and the targeted voice-actuable service reads on receiving, via the services library server, the user credentials that authenticate the personal device to the target service (Gilles: par. 0028; ‘At 264, services library 108 may confirm and/or establish an authenticated symmetrical relationship between personal voice assistant 106 that serves user 100 and the targeted voice-actuable service 110. This established relationship may thereafter be referenced whenever user 100 invokes personal voice assistant 106 to access the targeted service 110, without user 100 having to provide credentials.’).

Claim Rejections - 35 USC § 103
Claims 1, 6-10, 12, 15, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic et al. (US 20130097682 A1) in view of Gilles et al. (EP 3396667 A1).

Regarding claims 1 and 16, Zeljkovic teaches:
“A server” (par. 0004; ‘authentication server’) comprising:
“a communication interface configured to communicate with a virtual assistant device (VAD)” (par. 0116; ‘The user interface devices 1206 may include, but are not limited to, computers, servers, personal digital assistants, telephones (e.g., cellular, IP, or landline), or any suitable computing devices.’); and
“at least one processor coupled to the communication interface” (par. 0116; ‘In one embodiment, the I/O devices 1208 are operatively connected to an I/O controller (not shown) that enables communication with the processing unit 1202 via the system bus 1212.’), the at least one processor configured to:
“receive, via the communication interface from the target service, a file generated by the personal device that includes parsed data based on speech recognition processing of a phrase spoken by a user” (par. 0044; ‘In some embodiments, the speech recognition engine is stored in the mobile device 102.’; par. 0045; ‘The speech recognition engine may include one or more types of statistical models to process speech including, for example an acoustic model and a language model. The speech recognition may utilize more than one acoustic model and/or one or more language model. In some embodiments, the speech recognition engine is configured to receive audio, process the audio, and output a sequence of sub-word units or words.’; par. 0082; ‘At operation 716, the MFA application 120 receives speech input from the user. At operation 718, the MFA application 120 generates a sample phrase input message including the speech input provided by the user and sends the sample phrase input message to the authentication server 106.’);
“[a phrase key] that is randomly selected and provided to the user via the VAD, wherein the phrase spoken by the user is based on the phrase key” (par. 0081; ‘At operation 714, the authentication server 106 generates a sample phrase message including a sample phrase for use in authenticating the user via voice recognition utilizing the voice print created during the voice enrollment procedure 600. In some embodiments, the sample phrase includes a random phrase.’; par. 0083; ‘The speech server 114 receives the speech input and compares the speech input to the voice print. At operation 722, the speech server 114 generates a speech recognition response including a result of the comparison indicating whether or not the speech input matches the voice print and sends the speech recognition response to the authentication server 106.’);
“send, via the communication interface, a first notification to the target service, upon a determination that the parsed data matches a phrase key” (par. 0083; ‘At operation 722, the speech server 114 generates a speech recognition response including a result of the comparison indicating whether or not the speech input matches the voice print and sends the speech recognition response to the authentication server 106.’);
“receive, via the communication interface, a set of user credentials from the target service” (par. 0083; ‘At operation 724, the authentication server 106 generates an authentication result notification indicating whether or not the user is authenticated and sends the authentication result notification message to the provider server 108.’); and
“send, via the communication interface, the set of user credentials to the VAD” (par. 0084; ‘At operation 726, the authentication server 106 generates an end message
However, Zeljkovic does not expressly teach:
“receive via the communication interface, an indication for authenticating a target service on the VAD, wherein the target service is currently authenticated with a personal device based on a set of user credentials”;
“receive, via the communication interface from a target server associated with the target service, a file generated by the personal device that includes parsed data based on speech recognition processing of a phrase spoken by a user”;
“in response to determining that the parsed data matches the phrase key, send, via the communication interface, a first notification to the target server”;
“receive, via the communication interface from the target server, the set of user credentials that authenticates the target service with the personal device”; and
“send, via the communication interface, the set of user credentials to the VAD for authenticating the target service by the VAD.”
Gilles teaches:
 “receive via the communication interface, an indication for authenticating a target service on the VAD, wherein the target service is currently authenticated with a personal device based on a set of user credentials” (par. 0004; ‘In some instances, the user may be able to extend or link the pizza delivery app to a personal voice assistant that 
“receive, via the communication interface from a target server associated with the target service, a file generated by the personal device that includes parsed data based on speech recognition processing of a phrase spoken by a user” (par. 0024; ‘At 128, services library 108 may provide (e.g., transmit) the speakable credential to client device 104, e.g., over one or more computing networks (not depicted).’);
“in response to determining that the parsed data matches the phrase key, send, via the communication interface, a first notification to the target server” (par. 0026; ‘At 244, personal voice assistant 106 may invoke the targeted voice-actuable service 110 at services library 108.’);
“receive, via the communication interface from the target server, the set of user credentials that authenticates the target service with the personal device” (par. 0028; ‘At 264, services library 108 may confirm and/or establish an authenticated symmetrical relationship between personal voice assistant 106 that serves user 100 and the targeted voice-actuable service 110. This established relationship may thereafter be referenced
“send, via the communication interface, the set of user credentials to the VAD for authenticating the target service by the VAD” (par. 0023; ‘Referring now to Fig. 2B , at 252, the user utters aloud the speakable credential to standalone interactive speaker 102. At 254, the local instance of personal voice assistant 106A relays the spoken utterance to cloud-based personal voice assistant 106, which then attempts to invoke the targeted service with the spoken credential at 256. At this point, personal voice assistant 106 may have (e.g., by way of a natural language processor) converted the spoken utterance from user 100 into textual content.’; See also par. 0028).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zeljkovic’s authentication server by incorporating Gilles’ personal assistant authentication methods in order to extend or link apps/services to personal voice assistants. (Gilles: par. 0004)
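
For illustration only, the sequence that the above mapping relies on (a randomly selected phrase, a comparison of the parsed speech data against it, and the relay of credentials on a match; see Zeljkovic pars. 0081-0083 and Gilles par. 0028) might be sketched as follows. This is a minimal sketch under stated assumptions and is not code from either reference or from the application: the names PHRASE_KEYS, select_phrase_key, and handle_parsed_file, the contents of the phrase pool, and the exact-match comparison are all hypothetical.

```python
import secrets

# Hypothetical phrase pool; neither reference specifies its contents.
PHRASE_KEYS = ["blue river seven", "quiet maple window", "silver train morning"]


def select_phrase_key(pool=PHRASE_KEYS):
    """Pick a phrase key at random (cf. Zeljkovic par. 0081, 'random phrase')."""
    return secrets.choice(pool)


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def handle_parsed_file(parsed_data, phrase_key, request_credentials, send_to_vad):
    """Compare parsed speech data to the phrase key; relay credentials on a match.

    request_credentials and send_to_vad are caller-supplied callables standing in
    for the target server and the VAD. Both are assumptions of this sketch.
    """
    if normalize(parsed_data) != normalize(phrase_key):
        return False  # no match: nothing is requested or forwarded
    credentials = request_credentials()  # notify target server, receive credentials
    send_to_vad(credentials)             # forward the credentials to the VAD
    return True
```

In this sketch the credentials are requested only after the comparison succeeds, which mirrors the ordering of the notification and credential steps in the limitations quoted above.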


Regarding claim 6 (dep. on claim 1), the combination of Zeljkovic in view of Gilles further teaches:
“request the set of user credentials via the first notification” (Z: par. 0080; ‘If, however, the device ID matches the device ID stored in the user profile associated with a user ID entered to initiate the authentication request, the MFA application 120 may display a prompt for the user's PIN. Other authentication credentials such as a password may be used in lieu of or in addition to the PIN.’).

Regarding claim 7 (dep. on claim 1), the combination of Zeljkovic in view of Gilles further teaches:
“wherein the set of user credentials comprise a personalization profile of the user” (Z: par. 0063; ‘From operation 306, the pre-registration procedure 300 proceeds to operation 308, wherein the authentication server 106 creates the user profile including the information provided by the user via the user interface. The authentication server 106 then saves the user profile.’).

Regarding claim 8 (dep. on claim 1), the combination of Zeljkovic in view of Gilles further teaches:
“wherein the received file further comprises a user identification” (Z: par. 0041; ‘The web pages may include a web application or other user interface to facilitate user login, such as by a user name or user identification ("ID") and password to access the services.’; G: par. 0005; ‘Additionally, existing linking techniques may require that user-identifiable information pass through a personal voice assistant in order to link the personal voice assistant to a service.’).

Regarding claim 9, Zeljkovic further teaches:
“a transceiver configured to communicate with a server” (par. 0099; ‘The communications component 1120, in some embodiments, includes one or more transceivers each configured to communicate over the same or a different wireless technology standard.’);

“at least one processor coupled to the transceiver and the speaker” (par. 0116; ‘In one embodiment, the I/O devices 1208 are operatively connected to an I/O controller (not shown) that enables communication with the processing unit 1202 via the system bus 1212.’), the at least one processor is configured to:
“notify the user to verbally recite a phrase key that is randomly selected to the personal device” (par. 0034; ‘In some embodiments, such as the embodiment illustrated and described with reference to FIG. 9, the MFA service client application 120 prompts a user to speak a phrase for use in authenticating a user in accordance with one aspects of the MFA service.’; par. 0081; ‘At operation 714, the authentication server 106 generates a sample phrase message including a sample phrase for use in authenticating the user via voice recognition utilizing the voice print created during the voice enrollment procedure 600. In some embodiments, the sample phrase includes a random phrase.’); and
“receive, from the server via each of the one or more target services, the set of user credentials granting the VAD access to the user account associated with each of the one or more target services, as a result of the verbal recitation matching the phrase key” (par. 0083; ‘At operation 724, the authentication server 106 generates an authentication result notification indicating whether or not the user is authenticated and sends the authentication result notification message to the provider server 108.’; par. 
However, Zeljkovic does not expressly teach:
“receive, via the transceiver, a request to authenticate a target service on the VAD, wherein the target service is associated with a user account and currently authenticated with a personal device by on a set of user credentials”;
“receive, from the server via a target server associated with the target service, the set of user credentials that authenticate the personal device to the target service after the user verbally recites the phrase key”; and
“use the set of user credentials to access the user account associated with the target service as a result of the verbal recitation matching the phrase key.”
Gilles teaches:
“receive, via the transceiver, a request to authenticate a target service on the VAD, wherein the target service is associated with a user account and currently authenticated with a personal device by on a set of user credentials” (par. 0004; ‘In some instances, the user may be able to extend or link the pizza delivery app to a personal voice assistant that executes, for example, on one or more computing devices of an "ecosystem" of coordinated devices associated with the user (e.g., a smart phone, smart watch, tablet, interactive standalone speaker, etc.).’; par. 0015; ‘In some 
“receive, from the server via a target server associated with the target service, the set of user credentials that authenticate the personal device to the target service after the user verbally recites the phrase key” (par. 0019; ‘Also at 124, authentication backend 114 may generate a speakable credential, which as mentioned above may be any word, phrase, sequence of one or alphanumeric characters, symbols, etc., that is capable of being spoken aloud.’; par. 0022; ‘At 122, services library 108 may provide the credential to authentication backend 114, e.g., over one or more computing networks (not depicted).’); and
“use the set of user credentials to access the user account associated with the target service as a result of the verbal recitation matching the phrase key” (par. 0023; ‘Referring now to Fig. 2B, at 252, the user utters aloud the speakable credential to standalone interactive speaker 102. At 254, the local instance of personal voice assistant 106A relays the spoken utterance to cloud-based personal voice assistant 106, which then attempts to invoke the targeted service with the spoken credential at 256. At this point, personal voice assistant 106 may have (e.g., by way of a natural language processor) converted the spoken utterance from user 100 into textual content.’; par. 0024; ‘At 266, services library 108 may return some indication of success to personal voice assistant 106, which in turn relays the success indication to the instance of personal voice assistant 106A operating on standalone interactive speaker
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zeljkovic’s authentication server by incorporating Gilles’ personal assistant authentication methods in order to extend or link apps/services to personal voice assistants. (Gilles: par. 0004)

Regarding claim 10 (dep. on claim 9), the combination of Zeljkovic in view of Gilles further teaches:
“broadcast, via the speaker, the phrase key to the user” (Z: par. 0103; ‘Audio capabilities for the mobile device 1100 may be provided by an audio I/O component 1128 that includes a speaker for the output of audio signals and a microphone to collect audio signals. In particular, the audio I/O component 1128 facilitate speech input via the microphone for voice samples used in the creation of a voice print and for recording spoken phrases for authentication purposes.’).

Regarding claim 12 (dep. on claim 9), the combination of Zeljkovic in view of Gilles further teaches:
“receive via the transceiver the phrase key from the server” (Z: par. 0005; ‘selecting a sample phrase for use to authenticate the user to access the service, sending the sample phrase to the computing device’).

Regarding claim 15 (dep. on claim 9), the combination of Zeljkovic in view of Gilles further teaches:
“after receiving the set of user credentials, to configure the user account associated with the target service on the VAD, based on the received set of user credentials, wherein the set of user credentials comprise a user personalization profile to personalize the target service on the VAD” (Z: par. 0063; ‘From operation 306, the pre-registration procedure 300 proceeds to operation 308, wherein the authentication server 106 creates the user profile including the information provided by the user via the user interface. The authentication server 106 then saves the user profile.’; G: par. 0004; ‘By logging into their profile, the user can order pizza, and payment information and other details (e.g., their address) may be already saved within the app, so that the user need not provide it again.’).
Regarding claim 18 (dep. on claim 16), the combination of Zeljkovic in view of Gilles further teaches:
“receiving the phrase key from the VAD” (Z: par. 0005; ‘selecting a sample phrase for use to authenticate the user to access the service, sending the sample phrase to the computing device’).

Claims 2, 3, 11, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic in view of Gilles, further in view of Visser et al. (US 20150301796 A1).

Regarding claims 2 (dep. on claim 1), 11 (dep. on claim 9), and 17 (dep. on claim 16), the combination of Zeljkovic in view of Gilles further teaches:
“wherein the at least one processor is further configured to: select the phrase key from a plurality of phrase keys” (Z: par. 0081; ‘In some embodiments, the sample phrase message includes instructions for the MFA application 120 to prompt the user to speak the multiple phrases in a particular order.’ See also par. 0005); and
“send, via the communication interface, the phrase key to the VAD” (Z: par. 0005; ‘selecting a sample phrase for use to authenticate the user to access the service, sending the sample phrase to the computing device’).
However, Zeljkovic and Gilles do not expressly teach:
“wherein: the phrase key is in a language based on a language indicator received via the communication interface from the VAD, and the phrase key is operably spoken by the user.”
In a similar field of endeavor (speaker verification), Visser teaches:
“wherein: the phrase key is in a language based on a language indicator received via the communication interface from the VAD, and the phrase key is operably spoken by the user” (par. 0107; ‘The enrollment GUI 184 may include an option to select a particular language. The enrollment GUI 184 may include an enrollment phrase. The enrollment module 108 may receive an enrollment phrase audio signal 130 and a selection of the particular language.’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zeljkovic’s (in view of Gilles) sample phrase selection by incorporating Visser’s enrollment GUI so that the phrase 

Regarding claims 3 (dep. on claim 1) and 20 (dep. on claim 16), the combination of Zeljkovic in view of Gilles and Visser further teaches:
“wherein the at least one processor is further configured to: analyze the parsed data to identify whether the parsed data is within a predetermined threshold to the phrase key, which corresponds to the determination that the parsed data matches the phrase key, wherein the predetermined threshold is based on at least one of an error coefficient of the speech recognition processing, an error rate, or a linguistic analysis of the parsed data” (V: par. 0083; ‘The testing module 110 may determine that the test phrase audio signal 134 satisfies the verification criterion 144 in response to determining that the confidence level satisfies the confidence level threshold 160.’; par. 0084; ‘For example, a higher confidence level threshold 160 may increase a false alarm rate and decrease a miss rate.’).
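
By way of illustration, a threshold comparison of the kind discussed above (parsed data treated as matching the phrase key when a score satisfies a predetermined threshold) might look like the following. This is a hedged sketch, not Visser's method: the function name matches_within_threshold, the default threshold of 0.85, and the use of a character-level similarity ratio in place of a recognizer's confidence level are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def matches_within_threshold(parsed_data, phrase_key, threshold=0.85):
    """Return True when the parsed text is close enough to the phrase key.

    The similarity ratio (0.0 to 1.0) stands in for a confidence level; the
    threshold value is hypothetical and would be tuned to the error rate of
    the speech recognition processing.
    """
    # Normalize case and whitespace before comparing.
    a = " ".join(parsed_data.lower().split())
    b = " ".join(phrase_key.lower().split())
    similarity = SequenceMatcher(None, a, b).ratio()
    return similarity >= threshold
```

Raising the threshold makes the comparison stricter and lowering it makes the comparison more tolerant of recognition errors, which is the trade-off that the cited passage of Visser (par. 0084) describes for its confidence level threshold.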

Claims 4, 5, 13, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic and Gilles, further in view of Erhart et al. (US 20170286651 A1).

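Regarding claims 4 (dep. on claim 1) and 19 (dep. on claim 16), Zeljkovic and Gilles do not expressly teach:
“select a set of instructions, wherein the set of instructions comprise audible directions for the user”;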
“transmit, via the communications interface, the set of instructions with the phrase key to the VAD”;
“determine, whether the received file further comprises movement data, from a set of sensors, configured to detect movement and direction of movement”;
“in response to determining that the received file comprises movement data, compare the set of instructions to the movement data”; and
“send via the communication interface, a second notification to the target server that the movement data matches the set of instructions.”
In a similar field of endeavor (authentication), Erhart teaches:
“select a set of instructions, wherein the set of instructions comprise audible directions for the user” (par. 0191; ‘Robot 102 may prompt user 302 to provide their pass-gesture, such as a certain movement of their entire body or a portion thereof previously established as a pass-gesture.’);
“transmit, via the communications interface, the set of instructions with the phrase key to the VAD” (par. 0191; ‘In another embodiment, a spoken word or phrase may provide a spoken password.’);
“determine, whether the received file further comprises movement data, from a set of sensors, configured to detect movement and direction of movement” (par. 0191; ‘Robot 102 may prompt user 302 to provide their pass -gesture, such as a certain movement of their entire body or a portion thereof previously established as a pass -
“in response to determining that the received file comprises movement data, compare the set of instructions to the movement data” (par. 0192; ‘Authentication may be a private expression (e.g., password, phrase, gesture, biometric, etc.) of user 302 or an attribute of user 302.’); and
“send via the communication interface, a second notification to the target server that the movement data matches the set of instructions” (par. 0013; ‘3. In addition to movement interaction sequences, the robot may be a local agent to accept and authenticate other forms of voice, visual, and interaction multi-factor authentication. Notifications can be provided by robots, agents, and customers as authentication and testing take place.’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zeljkovic’s (in view of Gilles) user authentication method by incorporating Erhart’s authentication through gesture and passphrase in order to provide security for interactions between contact centers and robots and customers. (Erhart: par. 0005)
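
As an illustration of the movement-related limitations addressed above (determining whether a received file carries movement data and, if so, comparing it to the set of instructions), a minimal sketch is given below. It is not taken from Erhart or from the application: the function name check_movement, the dictionary key "movement_data", the string labels for movements, and the ordered exact-sequence comparison are all hypothetical choices made for the example.

```python
def check_movement(received_file, instructions):
    """Classify a received file against the instructed movement sequence.

    received_file is assumed to be a dict that may hold a "movement_data" list
    of movement labels; instructions is the ordered list the user was given.
    Returns "no_movement_data", "match", or "mismatch".
    """
    movement = received_file.get("movement_data")
    if not movement:
        return "no_movement_data"  # file carried no sensor movement data
    expected = [step.strip().lower() for step in instructions]
    observed = [step.strip().lower() for step in movement]
    # A match requires the same movements in the same order.
    return "match" if observed == expected else "mismatch"
```

Under these assumptions, a "match" result would correspond to the point at which a second notification is sent to the target server.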

Regarding claim 5 (dep. on claim 1), the combination of Zeljkovic in view of Gilles and Erhart further teaches: 
“receive, via the communication interface a set of instructions, wherein the set of instructions comprise audible directions for the user from the VAD” (E: par. 0191; ‘Robot 
“determine, whether the received file further comprises movement data, from a set of sensors, configured to detect movement and direction of movement” (E: par. 0191; ‘Robot 102 may prompt user 302 to provide their pass-gesture, such as a certain movement of their entire body or a portion thereof previously established as a pass-gesture. For example, user 302 may wave their right hand, move their eyes up and down, and clap twice, as a pass-gesture.’);
“in response to determining that the received file comprises movement data, compare the set of instructions to the movement data” (E: par. 0192; ‘Authentication may be a private expression (e.g., password, phrase, gesture, biometric, etc.) of user 302 or an attribute of user 302.’); and
“send, via the communication interface, a second notification to the target server that the movement data matches the set of instructions” (E: par. 0013; ‘3. In addition to movement interaction sequences, the robot may be a local agent to accept and authenticate other forms of voice, visual, and interaction multi-factor authentication. Notifications can be provided by robots, agents, and customers as authentication and testing take place.’).

Regarding claim 13 (dep. on claim 9), the combination of Zeljkovic in view of Gilles and Erhart further teaches:
“select a set of instructions, wherein the set of instructions comprise directions for the user” (E: par. 0191; ‘Robot 102 may prompt user 302 to provide their pass -gesture, 
“transmit, via the transceiver, the set of instructions to the server” (Z: par. 0069; ‘The device enrollment procedure 500 begins and proceeds to operation 502, wherein the MFA service client application 120 prompts the user for a something one knows authentication credential, for example, a PIN or password the user chose during the pre-registration procedure 300.’; E: par. 0191; ‘In another embodiment, a spoken word or phrase may provide a spoken password.’); and
“notify the user to follow instructions on the personal device wherein, the personal device is configured to detect movement and direction of movement” (E: par. 0191; ‘Robot 102 may prompt user 302 to provide their pass-gesture, such as a certain movement of their entire body or a portion thereof previously established as a pass-gesture. For example, user 302 may wave their right hand, move their eyes up and down, and clap twice, as a pass-gesture.’).

Regarding claim 14 (dep. on claim 9), the combination of Zeljkovic in view of Gilles and Erhart further teaches:
“receive, via the transceiver, a set of instructions, from the server associated with the VAD, wherein the set of instructions comprise directions for the user” (E: par. 0191; ‘Robot 102 may prompt user 302 to provide their pass-gesture, such as a certain movement of their entire body or a portion thereof previously established as a pass-gesture.’); and


Conclusion
Other pertinent prior art is listed in the PTO-892 for consideration.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA whose telephone number is (571)270-3191.  The examiner can normally be reached from 10 am to 6 pm EST, Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571) 272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for 


MARK VILLENA
Examiner
Art Unit 2658



/MARK VILLENA/Examiner, Art Unit 2658