DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 13 June 2022 in reference to application 17/052,736.  Claims 1-14 and 16 are pending and have been examined.

Response to Amendment
The amendment filed 13 June 2022 has been accepted and considered in this office action.  Claims 1, 4-8, and 11-14 have been amended, claim 15 has been cancelled, and claim 16 has been added.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-14 and 16 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-14 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Bocklet et al. (US PAP 2017/0200451) in view of Khoury et al. (US PAP 2018/0254046) and further in view of Hardt et al. (US Patent 10,735,411).

Consider claim 1, Bocklet teaches a device for authenticating a voice input provided from a user (abstract), the device comprising: 
a microphone configured to receive the voice input (0030-31, receiving utterance using microphone 201); 
a memory configured to store one or more instructions (0018, RAM, ROM); and 
a processor configured to execute the one or more instructions (0018, processor), 
wherein the processor is further configured to execute the one or more instructions to:
 obtain, from the voice input, signal characteristic data representing signal characteristics of the voice input (0032-33, feature extraction of input utterance, i.e. MFCCs);
 apply the obtained signal characteristic data to a first model configured to determine whether an attribute of the voice input corresponds to a voice uttered by a person and a voice output by an apparatus (0034-36, 0020-22 features fed to classifier module, which classifies signal as live or replay); and
 based on the attribute of the voice input corresponding to the voice uttered by the person, apply the voice input to a second learning model to authenticate the voice input (0035, if signal is live, further authentication to identify the user).
Bocklet does not specifically teach applying the obtained signal characteristic data to a first learning model.
In the same field of detecting replay attacks, Khoury teaches applying the obtained signal characteristic data to a first learning model (0045-47, applying features to a Deep Neural Network to determine if voice is a playback spoof or live).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use a DNN for classification as taught by Khoury in the system of Bocklet in order to increase accuracy and thus create a more secure authentication (Khoury 0002).
Bocklet and Khoury do not specifically teach obtain context information including at least one of device state information, user's device usage history information, and user schedule information; and 
apply the voice input and the context information to a second learning model to authenticate the voice input.
In the same field of user authentication, Hardt teaches obtain context information including at least one of device state information, user's device usage history information, and user schedule information (col 9 line 65 to col 10 line 25, meeting schedules and meeting history for users to be authenticated); and
apply the voice input and the context information to a second learning model (col 9 lines 5-15, neural network for authentication) to authenticate the voice input (col 9 line 65 to col 10 line 25, changing confidence scores for authentication based on meeting schedules and meeting history for users to be authenticated).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use usage history to bias authentication as taught by Hardt in the system of Bocklet and Khoury in order to more accurately authenticate a user.
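
For illustration only, the order of operations recited in claim 1 (feature extraction, a first model that gates on live versus replayed speech, then a second model that receives the voice input together with context information) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Bocklet, Khoury, Hardt, or the application: the names authenticate_voice, AuthResult, toy_features, toy_liveness, and toy_speaker are hypothetical, and the toy stand-in functions are placeholders, not trained learning models.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

Features = Sequence[float]
Context = Mapping[str, object]


@dataclass
class AuthResult:
    live: bool           # first model judged the input to be uttered by a person
    authenticated: bool  # second model accepted the voice input


def authenticate_voice(
    voice_input: Sequence[float],
    context: Context,
    extract_features: Callable[[Sequence[float]], Features],
    first_model: Callable[[Features], bool],
    second_model: Callable[[Sequence[float], Context], bool],
) -> AuthResult:
    """Hypothetical two-stage flow: liveness gate, then context-aware authentication."""
    features = extract_features(voice_input)
    if not first_model(features):
        # Input attributed to an apparatus (e.g. a replay): stop before stage two.
        return AuthResult(live=False, authenticated=False)
    # Only live input reaches the second model, together with the context information.
    return AuthResult(live=True, authenticated=second_model(voice_input, context))


# Toy stand-ins (assumptions for the example only, not real classifiers).
def toy_features(samples: Sequence[float]) -> Features:
    return [sum(s * s for s in samples) / len(samples)]  # mean power


def toy_liveness(features: Features) -> bool:
    return features[0] > 0.01


def toy_speaker(samples: Sequence[float], context: Context) -> bool:
    return bool(context.get("meeting_scheduled"))


result = authenticate_voice(
    [0.5, -0.5], {"meeting_scheduled": True}, toy_features, toy_liveness, toy_speaker
)
```

In this sketch the second model is never invoked when the first model attributes the input to an apparatus, which mirrors the "based on the attribute of the voice input corresponding to the voice uttered by the person" condition in the claim.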

Consider claim 2, Bocklet teaches the device of claim 1, wherein the signal characteristic data comprises information about a per-frequency cumulative power of the voice input (0032-33, using Mel-frequency cepstral coefficients, which are a measure of per-frequency cumulative power).
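
For illustration only, one possible reading of "per-frequency cumulative power" is the power in each frequency bin summed over successive frames of the signal. The sketch below computes that quantity with a naive DFT. It is an assumption made for the example and is not the feature extraction of Bocklet (which uses MFCCs) or the computation disclosed in the application; the function names frame_power_spectrum and cumulative_power_per_frequency are hypothetical.

```python
import cmath
from typing import List, Sequence


def frame_power_spectrum(frame: Sequence[float]) -> List[float]:
    """Power |X[k]|^2 for bins k = 0..N/2 of a naive DFT of one frame."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        x_k = sum(
            frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        powers.append(abs(x_k) ** 2)
    return powers


def cumulative_power_per_frequency(
    samples: Sequence[float], frame_len: int
) -> List[float]:
    """Sum each frequency bin's power over non-overlapping frames."""
    totals = [0.0] * (frame_len // 2 + 1)
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        spectrum = frame_power_spectrum(samples[start:start + frame_len])
        for k, power in enumerate(spectrum):
            totals[k] += power
    return totals


# Two frames of a sinusoid that falls exactly on bin 1 of a 4-point DFT.
totals = cumulative_power_per_frequency([1.0, 0.0, -1.0, 0.0] * 2, frame_len=4)
```

Here the accumulated power concentrates in bin 1, since each frame is a single cycle of a cosine; a real system would typically use overlapping, windowed frames and an FFT rather than this naive transform.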

Consider claim 3, Bocklet and Khoury teach the device of claim 2, wherein the first learning model is trained to determine the attribute of the voice input differently according to per-frequency cumulative powers of the voice uttered by the person and the voice output by the apparatus (Bocklet 0039-40, figure 3, training the model based on extracted MFCCs in order to recognize the difference between live and replay; Khoury 0045-47, applying features to a Deep Neural Network trained to determine if voice is a playback spoof or live).

Consider claim 4, Hardt teaches the device of claim 1, wherein the processor is further configured to execute the one or more instructions to: 
obtain a voice input pattern of the user (col 9 line 65 to col 10 line 25, meeting schedules and meeting history for users to be authenticated); 
wherein the voice input pattern is determined based on the voice input which is input by the user or a situation in which the voice input is input (col 9 line 65 to col 10 line 25, meeting schedules and meeting history for user trying to be authenticated); 
apply the voice input pattern to the second learning model configured to authenticate the voice input (col 9 line 65 to col 10 line 25, changing confidence scores for authentication based on meeting schedules and meeting history for users to be authenticated).

Consider claim 5, Hardt teaches the device of claim 4, wherein the voice input pattern comprises a user's usage behavior of inputting voice (col 9 line 65 to col 10 line 25, meeting schedules and meeting history for user).

Consider claim 6, Bocklet and Khoury teach the device of claim 1, but do not specifically teach wherein the context information comprises the device state information, and the device state information comprises information of at least one of an operation mode of the device, a location of the device, a communication module activation state of the device, and a network connection state of the device.
In the same field of user authentication, Hardt teaches wherein the context information comprises the device state information, and the device state information comprises information of at least one of an operation mode of the device, a location of the device, a communication module activation state of the device, and a network connection state of the device (col 7 lines 38-55, user device location used as context, col 8 line 60- col 8 line 45 using location to bias authentication).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use location to bias authentication as taught by Hardt in the system of Bocklet and Khoury in order to more accurately authenticate a user.

Consider claim 7, Hardt teaches the device of claim 1, wherein the context information comprises the user's device usage history information, and the user's device usage history information comprises at least one of an application usage history, a user's call history, a user's text history, and a usage frequency of a voice recognition function (col 9 line 65 to col 10 line 25, meeting schedules and meeting history for user trying to be authenticated, i.e. call or application history).

Consider claim 8, Bocklet teaches a method of authenticating a voice input provided from a user (abstract), the method comprising: 
receiving the voice input (0030-31, receiving utterance using microphone 201); 
obtaining, from the voice input, signal characteristic data representing signal characteristics of the voice input (0032-33, feature extraction of input utterance, i.e. MFCCs), and 
authenticating the voice input by applying the obtained signal characteristic data to a first model configured to determine whether an attribute of the voice input corresponds to a voice uttered by a person and a voice output by an apparatus (0034-36, 0020-22 features fed to classifier module, which classifies signal as live or replay); and
based on the attribute of the voice input corresponding to the voice uttered by the person, apply the voice input to a second learning model (0035, if signal is live, further authentication to identify the user).
Bocklet does not specifically teach applying the obtained signal characteristic data to a first learning model.
In the same field of detecting replay attacks, Khoury teaches applying the obtained signal characteristic data to a first learning model (0045-47, applying features to a Deep Neural Network to determine if voice is a playback spoof or live).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use a DNN for classification as taught by Khoury in the system of Bocklet in order to increase accuracy and thus create a more secure authentication (Khoury 0002).
Bocklet and Khoury do not specifically teach obtain context information including at least one of device state information, user's device usage history information, and user schedule information; and 
apply the voice input and the context information to a second learning model.
In the same field of user authentication, Hardt teaches obtain context information including at least one of device state information, user's device usage history information, and user schedule information (col 9 line 65 to col 10 line 25, meeting schedules and meeting history for users to be authenticated); and
apply the voice input and the context information to a second learning model (col 9 lines 5-15, neural network for authentication; col 9 line 65 to col 10 line 25, changing confidence scores for authentication based on meeting schedules and meeting history for users to be authenticated).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use usage history to bias authentication as taught by Hardt in the system of Bocklet and Khoury in order to more accurately authenticate a user.

Claim 9 contains similar limitations as claim 2 and is therefore rejected for the same reasons.

Claim 10 contains similar limitations as claim 3 and is therefore rejected for the same reasons.

Claim 11 contains similar limitations as claim 4 and is therefore rejected for the same reasons.

Claim 12 contains similar limitations as claim 5 and is therefore rejected for the same reasons.

Claim 13 contains similar limitations as claim 6 and is therefore rejected for the same reasons.

Claim 14 contains similar limitations as claim 7 and is therefore rejected for the same reasons.

Consider claim 16, Bocklet teaches a non-transitory computer readable medium configured to store instructions, wherein execution of the instructions by one or more processors of a computer (0018, RAM, ROM and processor) are configured to cause the computer to:
 obtain, from the voice input, signal characteristic data representing signal characteristics of the voice input (0032-33, feature extraction of input utterance, i.e. MFCCs);
 apply the obtained signal characteristic data to a first model configured to determine whether an attribute of the voice input corresponds to a voice uttered by a person and a voice output by an apparatus (0034-36, 0020-22 features fed to classifier module, which classifies signal as live or replay); and
 based on the attribute of the voice input corresponding to the voice uttered by the person, apply the voice input to a second learning model to authenticate the voice input (0035, if signal is live, further authentication to identify the user).
Bocklet does not specifically teach applying the obtained signal characteristic data to a first learning model.
In the same field of detecting replay attacks, Khoury teaches applying the obtained signal characteristic data to a first learning model (0045-47, applying features to a Deep Neural Network to determine if voice is a playback spoof or live).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use a DNN for classification as taught by Khoury in the system of Bocklet in order to increase accuracy and thus create a more secure authentication (Khoury 0002).
Bocklet and Khoury do not specifically teach obtain context information including at least one of device state information, user's device usage history information, and user schedule information; and 
apply the voice input and the context information to a second learning model to authenticate the voice input.
In the same field of user authentication, Hardt teaches obtain context information including at least one of device state information, user's device usage history information, and user schedule information (col 9 line 65 to col 10 line 25, meeting schedules and meeting history for users to be authenticated); and
apply the voice input and the context information to a second learning model (col 9 lines 5-15, neural network for authentication) to authenticate the voice input (col 9 line 65 to col 10 line 25, changing confidence scores for authentication based on meeting schedules and meeting history for users to be authenticated).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use usage history to bias authentication as taught by Hardt in the system of Bocklet and Khoury in order to more accurately authenticate a user.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached on (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655



/DOUGLAS GODBOLD/Primary Examiner, Art Unit 2655