DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
This action is responsive to the communication filed on 02/21/2022. Claims 1-7 and 9-20 are pending and being considered. Claims 1, 19 and 20 are independent. Claim 8 is cancelled. Claims 1, 19 and 20 are amended. Claims 1-7 and 9-20 are rejected.

Response to Arguments/Remarks
Regarding claims 1 and 19, applicant’s arguments/remarks filed on 02/21/2022 have been fully considered but they are not persuasive.
Applicant’s Arguments/Remarks:
Regarding independent claim 1, Applicant argues that the cited prior art, Zeljkovic et al. (US 2013/0097682 A1) in view of Chao et al. (US 2019/0318724 A1), fails to teach the limitation “retrieving, by a speech module, from a model storage, a speech model particularly trained for a user associated with the user identifier, the speech model including a machine learning model”, as recited in claim 1. The examiner acknowledges Applicant’s perspective but respectfully disagrees for the following reason(s):
In response to Applicant's argument that Zeljkovic fails to teach the limitation “retrieving, by a speech module, from a model storage, a speech model particularly trained for a user associated with the user identifier” as recited in claim 1, the examiner respectfully disagrees. Zeljkovic (Fig. 1 and Para. [0043]) discloses that the speech server 114 is configured to provide speech recognition and/or speaker verification services for the MFA service. In particular, the speech server 114 is configured to … perform the speech recognition service to determine whether a speech sample included in the request matches a voice print previously established for a user that is allegedly associated with the speech sample. Further, as disclosed in FIG. 6 and associated Para. [0074], the voice print is created using a voice sample of the user (step 612) and is saved in association with the created user profile (step 614 and Para. [0006]). As further disclosed in Para. [0061], a pre-registration procedure creates the user profile using personal user information that includes, but is not limited to, a user ID, a mobile telephone number, a device ID … and a voice passphrase. Therefore, under the broadest reasonable interpretation (BRI), Zeljkovic teaches the recited limitation “retrieving, by a speech module, from a model storage, a speech model particularly trained for a user associated with the user identifier”, as described above.
However, Zeljkovic fails to explicitly disclose “the speech model including a machine learning model”. Chao (Para. [0067]) discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model, and, as disclosed in Para. [0047], the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models.
Further, Zeljkovic and Chao are analogous art in the same field of endeavor, as both are directed to performing speech authentication/recognition based on received user audio data (such as a spoken utterance, phrase, etc.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Chao’ into the teachings of ‘Zeljkovic’, with a motivation to execute the retrieved speech model to generate a voice match result based on the processed audio data, the speech model including a machine learning model, as taught by Chao, in order to recognize various characteristics of the spoken utterance of a user captured in the audio data, such as the sounds produced (e.g., phonemes) by the spoken utterance, the order of the produced sounds, rhythm of speech, intonation, etc. (Chao, Para. [0014]).
Therefore, under BRI, Zeljkovic in view of Chao teaches the claimed limitation(s) discussed above for independent claim 1, and the examiner maintains the rejection of independent claim 1 set forth in the previous non-final rejection.
Regarding independent claim 19, the claim recites limitations similar to those discussed above for independent claim 1. Therefore, independent claim 19 also remains rejected under 35 U.S.C. 103 for the same reason(s) as discussed above for independent claim 1. The Examiner suggests further amending independent claims 1 and 19 to overcome the current rejection(s) under 35 U.S.C. 103.
Dependent claims 2-7 and 9-18 fall together accordingly, since the cited prior art discloses the limitation(s) as stated above.
Regarding independent claim 20, the applicant’s arguments/remarks filed on 02/21/2022 have been fully considered and are rendered moot in view of the new ground(s) of rejection outlined below. The argument(s) do not apply to the art currently being relied upon.
Further, the applicant has not provided any arguments/remarks addressing the claim interpretation under 35 U.S.C. 112(f). Therefore, the examiner maintains the 35 U.S.C. 112(f) interpretation as set forth below.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 
The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
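For illustration only, the three-prong test quoted above can be restated as a simple conjunction. The following Python sketch is not part of MPEP § 2181; the class name, field names, and function name are hypothetical, and each field stands for a determination made from the claim language rather than something that can be computed automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitationFacts:
    """Hypothetical record of findings about one claim limitation."""
    uses_means_or_step: bool               # the word "means" or "step" appears
    uses_generic_placeholder: bool         # e.g., a nonce term such as "module"
    modified_by_functional_language: bool  # e.g., "for", "configured to"
    recites_sufficient_structure: bool     # structure, material, or acts recited


def interpreted_under_112f(facts: LimitationFacts) -> bool:
    """Return True when all three prongs (A), (B), and (C) are met."""
    prong_a = facts.uses_means_or_step or facts.uses_generic_placeholder
    prong_b = facts.modified_by_functional_language
    prong_c = not facts.recites_sufficient_structure
    return prong_a and prong_b and prong_c


# A limitation using a generic placeholder with functional language and no
# sufficient structure meets all three prongs.
placeholder_only = LimitationFacts(False, True, True, False)
# The same limitation with sufficient structure recited fails prong (C).
with_structure = LimitationFacts(False, True, True, True)
```

Under this restatement, the presumptions described above only determine which finding must be rebutted; the outcome still turns on whether sufficient structure is recited.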
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) use(s) a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “receiving, by a communication module…”, “transmitting, by the communication module, to a speech module…” and “authenticating, by an authentication module…” in claim 19.
Because these claim limitation(s) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
In claim 19, the examiner finds that the limitations “receiving, by a communication module…”, “transmitting, by the communication module, to a speech module…” and “authenticating, by an authentication module…” have support in specification Paragraph [034] and/or as depicted in Fig. 5 and associated Paragraphs [077 and 082]. Therefore, claim 19 only invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, and is not rejected under 35 U.S.C. 112(b).
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 U.S.C. 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 7, 9-12, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic et al. (US 2013/0097682 A1), hereinafter “Zeljkovic”, in view of Chao, Pu-sen et al. (US 2019/0318724 A1; filed on May 7, 2018), hereinafter “Chao”.

Regarding Claim 1, Zeljkovic teaches a system comprising: one or more memory units storing instructions (Zeljkovic, Para. [0109 and 0113], discloses a computer system comprising memory storing instructions); and 
one or more processors that execute the instructions to perform operations comprising: receiving user data comprising a user identifier, audio data having a first data format, and a client device identifier (Zeljkovic, Figure 8 and Para. [0061, 0076, 0087, 0109 and 0113], discloses a processor to execute the instructions to perform operations comprising receiving user profile such as user ID, audio signal having spoken utterances (i.e., first data format) and a name of a service (i.e., a client device identifier, as disclosed in Para. [0025])); 
generating processed audio data based on the received audio data (Zeljkovic, Para. [0029 and 0076], discloses to generate words (i.e., processed audio data) based on the audio signal), the processed audio data having a second data format (Zeljkovic, Para. [0029 and 0076], further discloses the generated words have a format useable by a mobile device); 
retrieving, by a speech module, from a model storage, a speech model particularly trained for a user associated with the user identifier (Zeljkovic, Fig. 1 and Para. [0043], discloses that the speech server 114 is configured to provide speech recognition and/or speaker verification services for the MFA service. In particular, the speech server 114 is configured to … perform the speech recognition service to determine whether a speech sample included in the request matches a voice print previously established for a user that is allegedly associated with the speech sample, and/or as disclosed in FIG. 6 and associated Para. [0074], wherein the voice print is created using a voice sample of the user (in step 612) and is saved in association with the created user profile (in step 614 and Para. [0006]), and as disclosed in Para. [0061], the pre-registration procedure creates the user profile using personal user information that includes, but is not limited to, a user ID, a mobile telephone number, a device ID … and a voice passphrase); 
executing the retrieved speech model to generate a voice match result based on the processed audio data (Zeljkovic, Fig. 1 and Para. [0043], discloses that the speech server 114 is configured to provide speech recognition and/or speaker verification services for the MFA service. In particular, the speech server 114 is configured to … perform the speech recognition service to determine whether a speech sample included in the request matches a voice print previously established for a user that is allegedly associated with the speech sample, and as disclosed in Para. [0004], receiving (at the computing device from the server) a message identifying whether access to the service has been granted or denied based on the voice print matching); 
authenticating, by an authentication module, a user based on the voice match result (Zeljkovic, Fig. 1 and Para. [0029-0031, 0038 and 0081], discloses an authentication module to authenticate a user based on the voice print match); and 
transmitting to a client device associated with the client device identifier, a client notification comprising a result of the authentication (Zeljkovic, Para. [0004], discloses that the steps for operating a computing device during an authentication procedure include establishing a data session with an authentication server for authenticating a user to utilize a service, receiving a sample phrase over the data session, and presenting a voiceprint matching prompt on a display of the computing device. The voiceprint matching prompt includes a request for a user to speak the sample phrase. The process further includes receiving a speech input of the sample phrase, transmitting the speech input to the authentication server via the data session, and receiving (at the computing device) a message identifying whether access to the service has been granted or denied; see also Para. [0005], which discloses receiving (at the computing device) a speech recognition and/or speaker verification response from the speech server computer. The speech recognition and/or speaker verification response includes an indication of whether or not the speech input (of the sample phrase as provided by the user at the computing device) matches a (pre-established) voice print).
Zeljkovic fails to explicitly disclose but Chao teaches retrieving, by a speech module, from a model storage, a speech model associated with the user identifier, the speech model including a machine learning model (Chao, Para. [0067], discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model, and as disclosed in Para. [0047], wherein the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models);
executing the retrieved speech model to generate a voice match result based on the processed audio data (Chao, Para. [0067], discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model to employ for processing spoken utterances from the user 202, and/or see also Para. [0047], wherein the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models for determining text (or other semantic representation) that corresponds to a spoken utterance embodied in audio data);
Zeljkovic and Chao are analogous art in the same field of endeavor, as both are directed to performing speech authentication/recognition based on received user audio data (such as a spoken utterance, phrase, etc.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Chao’ into the teachings of ‘Zeljkovic’, with a motivation to execute the retrieved speech model to generate a voice match result based on the processed audio data, the speech model including a machine learning model, as taught by Chao, in order to recognize various characteristics of the spoken utterance of a user captured in the audio data, such as the sounds produced (e.g., phonemes) by the spoken utterance, the order of the produced sounds, rhythm of speech, intonation, etc. (Chao, Para. [0014]).
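For orientation only, the sequence of operations recited in claim 1 (receive user data, process the audio, retrieve a per-user speech model, execute it, authenticate, notify) can be sketched as follows. This Python sketch is an illustrative assumption and is taken from neither the application nor the cited references: every name is hypothetical, the byte-to-float conversion merely stands in for a change of data format, and the threshold function is a placeholder for a trained machine learning model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class UserData:
    user_id: str            # user identifier
    audio: bytes            # audio data in a first data format
    client_device_id: str   # client device identifier


# Hypothetical stand-in for a per-user speech model: samples -> match / no match.
SpeechModel = Callable[[List[float]], bool]


def process_audio(audio: bytes) -> List[float]:
    """Convert raw bytes to normalized floats (a second data format)."""
    return [b / 255.0 for b in audio]


def authenticate(data: UserData, model_storage: Dict[str, SpeechModel]) -> dict:
    """Run the claim 1 sequence and return a client notification."""
    processed = process_audio(data.audio)
    model = model_storage.get(data.user_id)   # retrieve the user's model
    voice_match = model(processed) if model is not None else False
    return {
        "client_device_id": data.client_device_id,
        "authenticated": bool(voice_match),
    }


# Placeholder "model": matches when the mean sample value exceeds 0.5.
storage = {"alice": lambda samples: sum(samples) / len(samples) > 0.5}

accepted = authenticate(UserData("alice", bytes([200, 220, 240]), "door-1"), storage)
rejected = authenticate(UserData("alice", bytes([10, 20, 30]), "door-1"), storage)
unknown = authenticate(UserData("bob", bytes([200, 220, 240]), "door-1"), storage)
```

Keying the model storage by user identifier mirrors the claim language of a speech model associated with the user identifier; a user with no stored model is simply not authenticated in this sketch.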

Regarding Claim 2, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches: the user data comprises biometric data, and the authenticating of the user is based on biometric data (Zeljkovic, Para. [0028 and 0030] and Claim 11, discloses that the user profile comprising voice print (i.e., biometric data) and the authentication of the user is based on the voice print).  

Regarding Claim 3, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches the operations further comprising: transmitting, to a user device associated with the user identifier, a user notification comprising the result of the authentication (Zeljkovic, Para. [0025, 0029, 0068 and 0083], discloses transmitting to a mobile device associated with the user ID, a notification message including a valid token comprising authentication result notification message (i.e., the result of the authentication)).  

Regarding Claim 4, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches the operations further comprising: spinning up, in response to the received user data, a container instance, wherein the container instance performs, using the one or more processors, at least one of (Zeljkovic, Fig. 6 and associated Para. [0074], discloses a voice enrollment procedure 600 that proceeds to operation 612, wherein a voice print (i.e., instance) is created (i.e., spinning up) using the user’s voice sample (i.e., user data). From operation 612, the voice enrollment procedure 600 proceeds to operation 614, wherein the voice print is saved in association with the user's profile, and as disclosed in Para. [0006], wherein a pre-registration procedure is utilized to create a user profile for a user of a multi-factor authentication ("MFA") service, and as disclosed in Para. [0094], a processor 1104 for processing data and/or executing computer-executable instructions stored in memory): 
processing the audio data, transmitting the processed audio data (Zeljkovic, Para. [0004], discloses transmitting the speech input (i.e., the processed audio data) to the authentication server via the data session; see also Para. [0005], which discloses receiving a speech input of the sample phrase as provided by the user at the computing device, generating a speech recognition and/or speaker verification request that includes the speech input, and sending the speech recognition and/or speaker verification request to a speech server computer), receiving the voice match result, authenticating the user, and transmitting the client notification (Zeljkovic, Para. [0005], discloses receiving a speech recognition and/or speaker verification response from the speech server computer. The speech recognition and/or speaker verification response includes an indication of whether or not the speech input matches a voice print. The method further includes generating an authentication result notification that includes the indication of whether or not the speech input matches the voice print).

Regarding Claim 7, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches that generating, by the speech module, the voice match result is further based on a phrase identified in the audio data (Zeljkovic, Para. [0029 and 0034], discloses that the determination is based on a phrase identified in the audio signal; see also Para. [0086] and/or claim 9).

Regarding Claim 8, (Cancelled).

Regarding Claim 9, Zeljkovic as modified by Chao teaches the system of claim 8, wherein Zeljkovic further teaches the operations further comprising: spinning up a container instance, wherein the container instance performs, using the one or more processors, at least one of (Zeljkovic, Fig. 6 and associated Para. [0074], discloses a voice enrollment procedure 600 that proceeds to operation 612, wherein a voice print (i.e., instance) is created (i.e., spinning up) using the user’s voice sample (i.e., user data). From operation 612, the voice enrollment procedure 600 proceeds to operation 614, wherein the voice print is saved in association with the user's profile, and as disclosed in Para. [0006], wherein a pre-registration procedure is utilized to create a user profile for a user of a multi-factor authentication ("MFA") service, and as disclosed in Para. [0094], a processor 1104 for processing data and/or executing computer-executable instructions stored in memory): (Zeljkovic, Para. [0005], discloses receiving a speech recognition and/or speaker verification response from the speech server computer. The speech recognition and/or speaker verification response includes an indication of whether or not the speech input matches a voice print. The method further includes generating an authentication result notification that includes the indication of whether or not the speech input matches the voice print).
However, Zeljkovic fails to explicitly disclose but Chao further teaches retrieving a speech model (Chao, Para. [0067], discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model);
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Chao’ into the teachings of ‘Zeljkovic’, with a motivation to retrieve a speech model, as taught by Chao, in order to determine a language for speech recognition of a spoken utterance (Chao, Para. [0003]).

Regarding Claim 10, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches the operations further comprising: updating the speech module based on the voice match result and the processed audio data (Zeljkovic, Para. [0043 and 0045], discloses to associate (i.e., update) a speech sample to the speech recognition engine based on the voice print match and the words).  

Regarding Claim 11, Zeljkovic as modified by Chao teaches the system of claim 8, wherein Zeljkovic fails to explicitly disclose but Chao further teaches the speech model is a machine learning model (Chao, Para. [0047], discloses that the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models).  
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Chao’ into the teachings of ‘Zeljkovic’, with a motivation to use a machine learning model as the speech model, as taught by Chao, in order to determine text (or other semantic representation) that corresponds to a spoken utterance embodied in audio data (Chao, Para. [0047]).

Regarding Claim 12, Zeljkovic as modified by Chao teaches the system of claim 11, wherein Zeljkovic fails to explicitly disclose but Chao further teaches the machine learning model comprises at least one of a recurrent neural network model, a hidden Markov model, a discriminative learning model, a Bayesian learning model, a structured sequence learning model, or an adaptive learning model (Chao, Para. [0047], discloses that the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models).  
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Chao’ into the teachings of ‘Zeljkovic’, with a motivation to have the machine learning model comprise at least one of a recurrent neural network model, as taught by Chao, in order to determine text (or other semantic representation) that corresponds to a spoken utterance embodied in audio data (Chao, Para. [0047]).

Regarding Claim 18, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches the client device is associated with an access point (Zeljkovic, Fig. 1 and Para. [0041], discloses that the provider server is associated with an online service that provides access to sensitive information for online shopping (i.e., access point)).  

Regarding Claim 19, Zeljkovic teaches a computer-implemented method comprising: receiving, by a communication module, user data comprising a user identifier, audio data having a first data format, and a client device identifier (Zeljkovic, Fig. 8 and Para. [0061, 0076, 0087, 0109 and 0113] and Claim 18, discloses a computer-implemented method in which a processor receives a user profile including a user ID, an audio signal having spoken utterances (i.e., first data format) and a name of a service (i.e., client device identifier, as disclosed in Para. [0025])); 
processing, by the communication module, the audio data (Zeljkovic, Para. [0029 and 0076], discloses to generate words (i.e., processed audio data) based on the audio signal), the resulting processed audio data having a second data format (Zeljkovic, Para. [0029 and 0076], further discloses the generated words have a format useable by a mobile device); 
transmitting, by the communication module, to a speech module, the processed audio data (Zeljkovic, Para. [0029, 0043 and 0045], discloses transmitting the generated words to a speech server 114 which is configured to provide a speech recognition engine (i.e., speech module)) and an instruction to determine whether voice data of the processed audio data matches a voice pattern associated with the user identifier (Zeljkovic, Para. [0043-0046], discloses the words and a request (i.e., instruction) to determine whether the audio signal of the words matches a voice print associated with the user ID); 
retrieving, by the speech module, from a model storage, a speech model particularly trained for a user associated with the user identifier (Zeljkovic, Fig. 1 and Para. [0043], discloses that the speech server 114 is configured to provide speech recognition and/or speaker verification services for the MFA service. In particular, the speech server 114 is configured to … perform the speech recognition service to determine whether a speech sample included in the request matches a voice print previously established for a user that is allegedly associated with the speech sample, and/or as disclosed in FIG. 6 and associated Para. [0074], wherein the voice print is created using a voice sample of the user (in step 612) and is saved in association with the created user profile (in step 614 and Para. [0006]), and as disclosed in Para. [0061], the pre-registration procedure creates the user profile using personal user information that includes, but is not limited to, a user ID, a mobile telephone number, a device ID … and a voice passphrase); 
executing the retrieved speech model to generate a voice match result based on the processed audio data (Zeljkovic, Fig. 1 and Para. [0043], discloses that the speech server 114 is configured to provide speech recognition and/or speaker verification services for the MFA service. In particular, the speech server 114 is configured to … perform the speech recognition service to determine whether a speech sample included in the request matches a voice print previously established for a user that is allegedly associated with the speech sample, and as disclosed in Para. [0004], receiving (at the computing device from the server) a message identifying whether access to the service has been granted or denied based on the voice print matching); 
transmitting, by the speech module, the voice match result (Zeljkovic, Fig. 1 and Para. [0073], discloses sending, by the speech server 114, which is configured to provide a speech recognition engine, the voice print match to the mobile device); 
receiving, at the communication module, from the speech module, the voice match result (Zeljkovic, Fig. 1 and Para. [0029-0030], discloses receiving by the processor from the speech recognition engine a voice print match); 
authenticating, by an authentication module, a user based on the voice match result (Zeljkovic, Fig. 1 and Para. [0029-0031, 0038 and 0081], discloses an authentication module to authenticate a user based on the voice print match); and 
transmitting, by the communication module, to a client device associated with the client device identifier, a client notification comprising a result of the authentication (Zeljkovic, Para. [0004], discloses the steps for operating a computing device during an authentication procedure includes establishing a data session with an authentication server for authenticating a user to utilize a service, receiving a sample phrase over the data session, and presenting a voiceprint matching prompt on a display of the computing device. The voiceprint matching prompt includes a request for a user to speak the sample phrase. The process further includes receiving a speech input of the sample phrase, transmitting the speech input to the authentication server via the data session, and receiving (at the computing device) a message identifying whether access to the service has been granted or denied, or see also Para. [0005], discloses to receive (at the computing device) a speech recognition and/or speaker verification response from the speech server computer. The speech recognition and/or speaker verification response includes an indication of whether or not the speech input (of the sample phrase as provided by the user at the computing device) matches a (pre-established) voice print, or see also Para. [0083]).  
Zeljkovic fails to explicitly disclose but Chao teaches retrieving, by the speech module, from a model storage, a speech model associated with the user identifier, the speech model including a machine learning model (Chao, Para. [0067], discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model, and as disclosed in Para. [0047], wherein the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models);
executing the retrieved speech model to generate a voice match result based on the processed audio data (Chao, Para. [0067], discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model to employ for processing spoken utterances from the user 202, and/or see also Para. [0047], wherein the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models for determining text (or other semantic representation) that corresponds to a spoken utterance embodied in audio data);
Zeljkovic and Chao are analogous art in the same field of endeavor, as both pertain to and are directed to performing speech authentication/recognition based on the received user audio data (such as a spoken utterance, phrase, etc.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Chao’ into the teachings of ‘Zeljkovic’, with a motivation to execute the retrieved speech model to generate a voice match result based on the processed audio data, the speech model including a machine learning model, as taught by Chao, in order to recognize various characteristics of the spoken utterance of a user captured in the audio data, such as the sounds produced (e.g., phonemes) by the spoken utterance, the order of the produced sounds, rhythm of speech, intonation, etc.; Chao, Para. [0014].

Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic in view of Chao, as applied above, and further in view of Di Mambro; Germano et al. (US 2007/0185718 A1), hereinafter (Di Mambro).

Regarding Claim 5, Zeljkovic as modified by Di Mambro teaches the system of claim 4, wherein Zeljkovic further teaches the operations further comprising: terminating the container instance based on the voice match result (Zeljkovic, Para. [0073], discloses that the MFA service client application 120 sends the spoken answer to the authentication server 106, which utilizes the speech server 114 to determine if the spoken answer matches the answer provided in the user profile created during the pre-registration procedure 300. In any case, if an incorrect answer was received at operation 604, the voice enrollment procedure 600 proceeds to operation 608. The voice enrollment procedure 600 ends (i.e., terminates) at operation 608), 
However Zeljkovic as modified by Chao fails to explicitly disclose but Di Mambro further teaches the terminating comprising destroying the user data (Di Mambro, Para. [0041-0042], discloses that the profile management module 420 can communicate with the authentication servlet 420 for evaluating one or more user profiles stored in the voice print database 140. Wherein the authentication servlet 420 can also be communicatively coupled to a verification module 430 for authorizing a user (see Fig. 4 and Para. [0037]).The profile management module 420 can create, update and delete user profiles from the voice print database 140, which is also communicatively coupled with the profile management module 420 (see Fig. 4)).  
Zeljkovic, Chao and Di Mambro are analogous art in the same field of endeavor, as all pertain to and are directed to performing speech authentication/recognition based on the received user audio data (such as a spoken utterance, phrase, etc.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Di Mambro’ into the teachings of ‘Zeljkovic’ as modified by ‘Chao’, with a motivation to create, update and/or delete user profiles, in order to grant an authenticated user of a mobile device access to one or more resources available to the mobile device; Di Mambro, Para. [0034 and 0041].

Regarding Claim 6, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches the operations further comprising: updating a log based on the user identifier, the client identifier, and the voice match result (Zeljkovic, Para. [0040] and Claim 18, discloses populating a speech database (i.e., log) based on the user ID, name of the service and the voice print match); and
However Zeljkovic as modified by Chao fails to explicitly disclose but Di Mambro from the same field of technology teaches destroying the user data (Di Mambro, Para. [0041], discloses that the profile management module 420 can create, update and delete user profiles 142, as disclosed in Fig. 2).  
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Di Mambro’ into the teachings of ‘Zeljkovic’ as modified by ‘Chao’, with a motivation to create, update and/or delete user profiles, in order to grant an authenticated user of a mobile device access to one or more resources available to the mobile device; Di Mambro, Para. [0034 and 0041].

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic in view of Chao, as applied above, and further in view of YOON; Seokhyun et al. (US 2016/0253669 A1), hereinafter (Yoon).

Regarding Claim 13, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic as modified by Chao fails to explicitly disclose but Yoon teaches the operations further comprising: determining, based on the user data and the voice match result, a payment method (Yoon, Para. [0077 and 0177], discloses to determine an electronic card (i.e., payment method) based on the card number (i.e., user data) and the corresponding voice instruction); 
sending a payment request to a financial institution (Yoon, Para. [0146-0147, 0150 and 0174], discloses to transmit a payment request to a financial server of a financial institution); and
receiving a payment notification from the financial institution (Yoon, Para. [0174], discloses to receive a payment notification from a financial server of a financial institution), the payment notification comprising a payment request result (Yoon, Para. [0174], discloses that the payment notification comprising a payment failed cause (i.e., a payment request result)), and wherein the client notification comprises the payment request result (Yoon, Para. [0174], discloses that the notification from the POS device (i.e., client device) comprises payment failed cause).  
Zeljkovic, Chao and Yoon are analogous art in the same field of endeavor, as all pertain to and are directed to performing user authentication/recognition based on the received user authentication information.
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Yoon’ into the teachings of ‘Zeljkovic’ as modified by ‘Chao’, with a motivation to provide determining, based on the user data and the voice match result, a payment method; sending a payment request to a financial institution; and receiving a payment notification from the financial institution, the payment notification comprising a payment request result, and wherein the client notification comprises the payment request result, as taught by Yoon, in order to gain the advantage of reducing the additional authentication process; Yoon, Para. [0196].

Claims 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic in view of Chao, as applied above, and further in view of Fung, D et al. (US 2011/0099108 A1), hereinafter (Fung).

Regarding Claim 14, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic further teaches the operations further comprising: receiving, by a storage service, the user data from a user device (Zeljkovic, Para. [0039], discloses receiving, by a database 128 (i.e., storage service), the user profile from a mobile device);
receiving, from the authentication service, a request for the user data (Zeljkovic, Fig. 1 and Para. [0080], discloses to receive from the authentication module via an authentication server 106 an authentication request for the user profile); 
transmitting, to the authentication service, the user data based on the request (Zeljkovic, Para. [0078 and 0080], discloses to send to the authentication module via the authentication server 106, the user profile based on the authentication request); 
receiving, from the authentication service, a notification stating that the user data has been received by the authentication service (Zeljkovic, Para. [0091-0092], discloses to receive from the authentication module via the authentication server 106 an authentication complete prompt indicating (i.e., notification stating) that the user authentication attempt was successful (i.e., the user data has been received by the authentication service)); 
Zeljkovic as modified by Chao fails to explicitly disclose but Fung teaches transmitting an alert from the storage service to the authentication service (Fung, Para. [0056] and Claim 1; discloses to provide a merchant identification signal (i.e., alert) from the database (i.e., storage service) to the financial institution (i.e., the authentication service)), the alert comprising the user identifier (Fung, Claim 1; discloses that the merchant identification signal comprises the merchant identification (i.e., user identifier)); and destroying the user data (Fung, Para. [0079 and 0184], discloses to initiate a decline confirmation (i.e., destroying) for the user login data (i.e., user data) by deleting the login data).
Zeljkovic, Chao and Fung are analogous art in the same field of endeavor, as all pertain to and are directed to performing user authentication/recognition based on the received user information.
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Fung’ into the teachings of ‘Zeljkovic’ as modified by ‘Chao’, with a motivation to provide an enhanced security system by transmitting an alert from the storage service to the authentication module, the alert comprising the user identifier, and by further destroying the user data, in order to gain the advantage of preventing an unauthorized user from impersonating a user, and further preventing attackers from discovering the user’s true identity; Fung, Para. [0142 and 0153].
  
Regarding Claim 15, Zeljkovic as modified by Chao in view of Fung teaches the system of claim 14, wherein Zeljkovic as modified by Chao fails to explicitly disclose but Fung further teaches destroying the user data comprises permanently deleting the user data and deleting file pointers associated with the user data (Fung, Para. [0079, 0184, 0187 and 0228], discloses that declining confirmation for the user login data comprises completely deleting the user login data and deleting file icons (i.e., file pointers) associated with the user login data).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Fung’ into the teachings of ‘Zeljkovic’ as modified by ‘Chao’, with a motivation to provide that destroying the user data comprises permanently deleting the user data and deleting file pointers associated with the user data, in order to gain the advantage of preventing attackers from discovering the user’s true identity; Fung, Para. [0153].

Claims 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Zeljkovic in view of Chao, as applied above, and further in view of Soyannwo; Olusanya Temitope et al. (US 9,774,998 B1), hereinafter (Soyannwo).

Regarding Claim 16, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic as modified by Chao fails to explicitly disclose but Soyannwo teaches the system is located at a cloud service (Soyannwo, Col. 7 (lines 4-7), discloses that the server 206 is located at a cloud service).  
Zeljkovic, Chao and Soyannwo are analogous art in the same field of endeavor, as all pertain to and are directed to performing user authentication/recognition based on the received user information.
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Soyannwo’ into the teachings of ‘Zeljkovic’ as modified by ‘Chao’, with a motivation to locate the system at a cloud service, as taught by Soyannwo, in order to gain the advantage of improved performance of computing systems; Soyannwo, Col. 1 (lines 64-65).

Regarding Claim 17, Zeljkovic as modified by Chao teaches the system of claim 1, wherein Zeljkovic as modified by Chao fails to explicitly disclose but Soyannwo teaches: the authentication module is located at a first cloud service (Soyannwo, Fig. 5 depicts that the authentication module is located at a first cloud service); and the speech module is located at a second cloud service (Soyannwo, Fig. 5 and Col. 13 (lines 1-13) - Col. 14 (lines 28-34), discloses that the speech recognition performed by the computing device (i.e., speech module) is located at a second cloud service).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Soyannwo’ into the teachings of ‘Zeljkovic’ as modified by ‘Chao’, with a motivation to locate the authentication module at a first cloud service and the speech module at a second cloud service, as taught by Soyannwo, in order to gain the advantage of improved performance of computing systems; Soyannwo, Col. 1 (lines 64-65).

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Di Mambro; Germano et al. (US 2007/0185718 A1), hereinafter (Di Mambro), in view of Zeljkovic et al. (US 2013/0097682 A1), hereinafter (Zeljkovic), and further in view of Chao; Pu-sen et al. (US 2019/0318724 A1; filed on May 7, 2018), hereinafter (Chao).

Regarding Claim 20, Di Mambro teaches a computer system comprising: (Di Mambro, Para. [0019], discloses a number of suitable processors, controllers, units, or the like that carry out a pre-programmed or programmed set of instructions and/or as illustrated in Fig. 9, a voice processor 944):
receiving, at an authentication system, user data comprising a user identifier, audio data having a first data format, and a client device identifier (Di Mambro, Fig. 4 and Para. [0040], discloses that the application 410 can create a user profile 142 (See FIG. 2) which includes the pass phrase (144), the biometric voice print (146), and the IMEI (148). Upon speaking the password phrase (spoken utterance, which represents a first data format), the J2ME application 410 can send the user profile 142 (See Fig. 2) to the verification/authentication server, and/or see also Fig. 5 and Para. [0043], discloses that, at step 502, the user is prompted to record his voice for voice print creation. The user can submit a particular phrase (i.e., password phrase) that the user will recite during voice authorization. At step 503, the user records their voice using the provided application (410). At step 504, the user can enter in a PIN number. Again, the PIN number may be required if the application cannot retrieve an IMEI number from the device. If the application 410 can access the IMEI, then the PIN number may not be required);
spinning up, by the authentication service, a first container instance (Di Mambro, Fig. 5 and Para. [0029 and 0043], discloses the steps (501-509), performed by the authentication server, of creating (spinning up) the user’s profile (i.e., a first instance) and storing the newly created user profile on a voice print database 140 (i.e., container, which contains the created user’s profiles (i.e., instances))); 
processing, by the first container instance, the audio data, the processed audio data having a second data format (Di Mambro, Para. [0040], discloses that the J2ME application 410 can perform voice processing on the spoken utterance (i.e. first format of the pass phrase) and encode one or more features of the biometric voice prior to create the user profile); 
spinning up, by the authentication service, a second container instance (Di Mambro, Figs. 4 and 5 and Para. [0029 and 0043], discloses that the authentication server creates and stores plurality of user profiles 142 (i.e., second instance) on a voice print database 140 (i.e., container which contains the created instances)); 
transmitting, from the first container instance, to the second container instance, the processed audio data, the user identifier, and the client device identifier (Di Mambro, Para. [0038-0040], discloses that upon speaking the password phrase, the J2ME application 410 can send the user profile 142 (which includes the pass phrase (144), the biometric voice print (146), and the IMEI/device identifier (148), see Fig. 2) to the verification server. In one arrangement, the J2ME application 410 can perform voice processing on the spoken utterance (i.e. pass phrase) and encode one or more features of the biometric voice prior to creating the user profile and sending it to the verification module 430, and as further disclosed in Para. [0037], wherein the verification module 430 can communicate information to and from the application 410 via JNI 414, which provides an interface to transport data from one format to another while preserving structural aspects of the code and data);
Di Mambro, Para. [0024, 0042 and 0047], discloses that the profile management module 420 can communicate with the voice print database 140 over a Java Database Connectivity (JDBC) 416 interface. The JDBC 416 can provide data access for retrieving and storing data from the voice print database 140. For example, the voice print database 140 can be a relational database composed of tables which can be indexed be row column formatting as is known in the art. The JDBC 140 provides a structured query language locating data headers and fields within the voice print database 140. The profile management module 420 can parse the user profile for the biometric voice print and compare the biometric voice print with other voice prints in the voice print database 140. In one arrangement, biometric voiceprints can be stored using the mobile handsets' IMEI number for indexing. Notably, the voice print database 140 includes one or more reference voice prints from multiple users having a registered voice print. Upon determining a match with a voice print, the profile management module 420 can grant access to the user to one or more resource); and
transmitting, by the second container instance, to a client device associated with the client device identifier, a client notification comprising a result of the authentication (Di Mambro, Para. [0044] and Fig. 6 (steps 614-620), discloses that after the authentication server uses the verification module to verify the user’s recorded voice against his stored voiceprint, the authentication server responds back to the user (i.e., sends notification) whether the authentication is successful or unsuccessful, wherein the steps performed in Fig. 6 are practiced by utilizing the Fig. 4 components, and as disclosed in Para. [0033], wherein the FIG. 4 refers to a client-server based architecture).
Di Mambro fails to explicitly disclose but Zeljkovic teaches one or more memory units storing instructions (Zeljkovic, Para. [0109 and 0113], discloses a computer system comprising memory storing instructions); and
retrieving, by the second container instance, from a model storage, a  associated with the user identifier (Zeljkovic, Fig. 1 and Para. [0043], discloses that the speech server 114 is configured to provide speech recognition and/or speaker verification services for the MFA service. In particular, the speech server 114 is configured to … perform the speech recognition service to determine whether a speech sample included in the request matches a voice print previously established for a user that is allegedly associated with the speech sample, and/or as disclosed in FIG. 6 and associated Para. [0074], wherein the voice print is created by using voice sample of the user (in step 612), and saves the created voice print in association with the created user profile (in step 614 and Para. [0006]), and as disclosed in Para. [0061], the pre-registration procedure to create the user profile using the personal user information that includes, but is not limited to, a user ID, a mobile telephone number, a device ID … and a voice passphrase);
Di Mambro and Zeljkovic are analogous art in the same field of endeavor, as both pertain to and are directed to performing speech authentication and/or recognition based on the user profile and the received user audio data (such as a spoken utterance, phrase, etc.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Zeljkovic’ into the teachings of ‘Di Mambro’, with a motivation to retrieve and execute a speech model particularly trained for a user associated with the user identifier, as taught by Zeljkovic, in order to recognize and verify a speech associated with a speaker (or user); Chao, Para. [0005].
Di Mambro as modified by Zeljkovic fails to explicitly disclose but Chao teaches retrieving, by the second container instance, from a model storage, a machine learning model particularly trained for a user associated with the user identifier (Chao, Para. [0067], discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model, and as disclosed in Para. [0047], wherein the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models);
executing the machine learning model to determine a voice match result based on the processed audio data and the user identifier (Chao, Para. [0067], discloses that the assistant device 206 can store or access a table 210, which can provide one or more user profiles (e.g., "1," "2," etc.) for selecting a speech recognition model to employ for processing spoken utterances from the user 202, and/or as disclosed in Para. [0047], wherein the speech recognition models 136 each include one or more machine learning models (e.g., neural network models) and/or statistical models for determining text (or other semantic representation) that corresponds to a spoken utterance embodied in audio data);
Di Mambro, Zeljkovic and Chao are analogous art in the same field of endeavor, as all pertain to and are directed to performing speech authentication and/or recognition based on the user profile and the received user audio data (such as a spoken utterance, phrase, etc.).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to implement the teachings of ‘Chao’ into the teachings of ‘Di Mambro’ as modified by ‘Zeljkovic’, with a motivation to retrieve and execute the machine learning model to determine a voice match result based on the processed audio data and the user identifier, as taught by Chao, in order to recognize various characteristics of the spoken utterance of a user captured in the audio data, such as the sounds produced (e.g., phonemes) by the spoken utterance, the order of the produced sounds, rhythm of speech, intonation, etc.; Chao, Para. [0014].

Conclusion

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALI CHEEMA, whose telephone number is 571-272-1239. The examiner can normally be reached on 8AM-4PM (EST) Monday-Friday. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jorge L. Ortiz-Criado can be reached on 571-272-7624.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALI CHEEMA/
Examiner, Art Unit 2496

/JORGE L ORTIZ CRIADO/
Supervisory Patent Examiner, Art Unit 2496