DETAILED ACTION

Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
2.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant’s submission filed on 01/04/2021 has been entered.

Response to Amendments/Arguments
3.	With respect to the 35 U.S.C. 103 rejection, the arguments have been fully considered but are moot in view of the new ground(s) of rejection.

Claim Rejections - 35 USC § 103
4.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

5.	Claims 1, 4, 6-8, 11, 13-15, 18, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Qian et al. (US 2018/0025733 A1) in view of Oku (US 2011/0052139 A1) and Han et al. (US 2018/0276454 A1).

	With respect to Claim 1, Qian et al. disclose 

 	 A computer-implemented method for control of a smart display device based on characteristics, the method comprising: 
 	receiving an image from a light capture device associated with the smart display device (Qian et al. [0034] a camera that gathers one or more images and provides input related thereto to the processor 122. The camera may be a thermal imaging camera, a digital camera such as a webcam, a three-dimensional (3D) camera, and/or a camera otherwise integrated into the system 100 and controllable by the processor 122 to gather pictures/images and/or video, [0037] Fig. 2 shows a notebook computer and/or convertible computer 202, a desktop computer 204, a wearable device 206 such as a smart watch, a smart television (TV) 208, a smart phone 210, a tablet computer 212);
 	determining, whether to activate voice recognition of a recording device associated with the smart display device based on a face being in the image (Qian et al. [0041] the sensor 314 may be a camera images from which are analyzed by the processor employing face recognition to determine whether a particular person is recognized and based on the size of the image of the face, whether the person is within a proximity threshold of the device, [0004] automatically activate a voice assistant module (VAM) of the device responsive to the determination based on the proximity signal that the user is proximate to the device, [0040] The device 300 may include a display 304 such as a touch-sensitive display. The device 300 also includes one or more processors 306 configured to execute one or more voice assistant modules (VAM) 308 for purposes of sending data from one or more microphones 310 or the headphone microphone 303 to the VAM 308 to execute voice recognition on the microphone data and to return programmatically defined response over one or more speakers 312); 
 	in response to determining the face being in the image and the portion of the image occupied by the face, activating the voice recognition of the recording device associated with the smart display device (Qian et al. [0041] the sensor 314 may be a camera images from which are analyzed by the processor employing face recognition to determine whether a particular person is recognized and based on the size of the image of the face, whether the person is within a proximity threshold of the device, [0004] automatically activate a voice assistant module (VAM) of the device responsive to the determination based on the proximity signal that the user is proximate to the device, [0040] The device 300 may include a display 304 such as a touch-sensitive display. The device 300 also includes one or more processors 306 configured to execute one or more voice assistant modules (VAM) 308 for purposes of sending data from one or more microphones 310 or the headphone microphone 303 to the VAM 308 to execute voice recognition on the microphone data and to return programmatically defined response over one or more speakers 312.)
	The Examiner notes that Qian et al. disclose a method of activating the voice recognition based on the size of the image of the face. From the size of the image of the face, Qian et al. determine whether the person is within a proximity threshold of the device. Qian et al. do not teach a threshold for the size of the image of the face. Qian et al. fail to explicitly teach
 	by utilizing a machine learning model to analyze the image,  
 	wherein the machine learning model is trained locally by the smart display device,
 	determining whether a portion of the image occupied by the face is greater than a threshold portion; and exceeding the threshold portion. 
 	However, Oku teaches 
 	determining whether a portion of the image occupied by the face is greater than a threshold portion;  and exceeding the threshold portion (Oku [0083] In cases in which a face that has been detected through face detection processing in larger than a specified size (for example, the area that the face occupies in the image is 30% or 50%), it is assumed that the operator will want to collect the voice emanating chiefly from said person and thus, control-method decision part 111 will output to control-switching part 112 a command to implement enhancement processing of the audio band corresponding to that person’s voice.)
 	Qian et al. and Oku are analogous art because they are from a similar field of endeavor in signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face, as taught by Qian et al., using the teaching of a threshold on the size of the image of the face, as taught by Oku, for the benefit of starting to process the audio corresponding to that person’s voice (Oku [0083] In cases in which a face that has been detected through face detection processing in larger than a specified size (for example, the area that the face occupies in the image is 30% or 50%), it is assumed that the operator will want to collect the voice emanating chiefly from said person and thus, control-method decision part 111 will output to control-switching part 112 a command to implement enhancement processing of the audio band corresponding to that person’s voice.)
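 	For illustration only, the combined teaching discussed above (a face-size test as in Qian et al. [0041], applied against an area threshold as in Oku [0083]) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from either reference: the names FaceBox, face_area_fraction and should_activate_voice_recognition are hypothetical, the face is assumed to be reported as a pixel bounding box, and the 0.30 default merely echoes the 30% example given in Oku [0083].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FaceBox:
    """Hypothetical bounding box of a detected face, in pixels."""
    width: int
    height: int


def face_area_fraction(face: FaceBox, image_width: int, image_height: int) -> float:
    """Return the fraction of the image area covered by the face box."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    return (face.width * face.height) / (image_width * image_height)


def should_activate_voice_recognition(
    face: Optional[FaceBox],
    image_width: int,
    image_height: int,
    threshold: float = 0.30,  # assumed value, echoing the 30% example in Oku [0083]
) -> bool:
    """Activate only when a face is present and exceeds the threshold portion."""
    if face is None:
        # No face in the image: never activate.
        return False
    return face_area_fraction(face, image_width, image_height) > threshold
```

 	Under these assumptions, a 400 x 300 pixel face in a 640 x 480 image exceeds the 30% threshold, whereas a 100 x 100 pixel face in the same image does not.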
	Qian et al. in view of Oku fail to explicitly teach
 	by utilizing a machine learning model to analyze the image,  
 	wherein the machine learning model is trained locally by the smart display device,
 	However, Han et al. teach
 	by utilizing a machine learning model to analyze the image (Han et al. [0028] The feature extractor may include a trained neural network and the method may further include training the neural network to extract face image features using the training images and determining the reference image from the training images, [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device) and, 
 	wherein the machine learning model is trained locally by the smart display device (Han et al. [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device); 
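 	Qian et al., Oku and Han et al. are analogous art because they are from a similar field of endeavor in signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face, as taught by Qian et al., using the teaching of a threshold on the size of the image of the face, as taught by Oku, for the benefit of starting to process the audio corresponding to that person’s voice, and using the teaching of a neural network model trained at the user device, as taught by Han et al., for the benefit of verifying a valid user in user log-in, payment services, and/or access control (Han et al. [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device.)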

 	With respect to Claim 4, Qian et al. in view of Oku and Han et al. teach 
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises:
 	 determining a distance from the face in the image to the smart display device (Oku [0068] In the case in which the face has been detected, determination results output part 913 will output the size and the location and the distance from lens part 3 to the face, estimated from the size of the face, taking the input image of the face that has been detected as the standard.)
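 	For illustration only, one common way to estimate distance from apparent face size, as Oku [0068] describes in general terms, is a pinhole-camera approximation. This is a minimal sketch under stated assumptions: Oku does not recite this formula, and the function name, the focal length expressed in pixels and the assumed real-world face width are all hypothetical.

```python
def estimate_face_distance_m(
    face_width_px: float,
    focal_length_px: float,
    real_face_width_m: float = 0.16,  # assumed typical face width, not from Oku
) -> float:
    """Estimate the lens-to-face distance with a pinhole-camera model.

    distance = focal_length_px * real_face_width_m / face_width_px
    """
    if face_width_px <= 0 or focal_length_px <= 0 or real_face_width_m <= 0:
        raise ValueError("all inputs must be positive")
    return focal_length_px * real_face_width_m / face_width_px
```

 	Under this model a face that appears twice as wide in the image is estimated to be half as far from the lens, which is consistent with using face size as a proxy for proximity.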

 	With respect to Claim 6, Qian et al. in view of Oku and Han et al. teach 
 	further comprising: 
 	identifying a user associated with the face (Han et al. [0092] the computing apparatus 120 detects a face region 140 in the obtained face image, and extracts a feature from the face region 140 using a feature extractor. The computing apparatus 120 determines the user authentication to be a success or a failure based on a result of comparing the extracted feature and the example registered feature, of a valid user, registered in a face registering process, e.g., of the same or different computing apparatus 120.)

 	With respect to Claim 7, Qian et al. in view of Oku and Han et al. teach 
 	wherein identifying the user associated with the face is based on a comparison of the face present in the image with a face of the user present in a local model (Han et al. [0092] the computing apparatus 120 detects a face region 140 in the obtained face image, and extracts a feature from the face region 140 using a feature extractor. The computing apparatus 120 determines the user authentication to be a success or a failure based on a result of comparing the extracted feature and the example registered feature, of a valid user, registered in a face registering process, e.g., of the same or different computing apparatus 120, [0028] The feature extractor may include a trained neural network and the method may further include training the neural network to extract face image features using the training images and determining the reference image from the training images, [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device.)

	With respect to Claim 8, Qian et al. disclose
 	 A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations (Qian et al. [0003] a processor, a microphone accessible to the processor, and storage accessible to the processor), the operations comprising: 
 	receiving an image from a light capture device associated with a smart display device (Qian et al. [0034] a camera that gathers one or more images and provides input related thereto to the processor 122. The camera may be a thermal imaging camera, a digital camera such as a webcam, a three-dimensional (3D) camera, and/or a camera otherwise integrated into the system 100 and controllable by the processor 122 to gather pictures/images and/or video, [0037] Fig. 2 shows a notebook computer and/or convertible computer 202, a desktop computer 204, a wearable device 206 such as a smart watch, a smart television (TV) 208, a smart phone 210, a tablet computer 212);
 	determining, whether to activate voice recognition of a recording device associated with the smart display device based on a face being in the image (Qian et al. [0041] the sensor 314 may be a camera images from which are analyzed by the processor employing face recognition to determine whether a particular person is recognized and based on the size of the image of the face, whether the person is within a proximity threshold of the device, [0004] automatically activate a voice assistant module (VAM) of the device responsive to the determination based on the proximity signal that the user is proximate to the device, [0040] The device 300 may include a display 304 such as a touch-sensitive display. The device 300 also includes one or more processors 306 configured to execute one or more voice assistant modules (VAM) 308 for purposes of sending data from one or more microphones 310 or the headphone microphone 303 to the VAM 308 to execute voice recognition on the microphone data and to return programmatically defined response over one or more speakers 312), 
 	in response to determining the face being in the image and the portion of the image occupied by the face, activating the recording device associated with the smart display device (Qian et al. [0041] the sensor 314 may be a camera images from which are analyzed by the processor employing face recognition to determine whether a particular person is recognized and based on the size of the image of the face, whether the person is within a proximity threshold of the device, [0004] automatically activate a voice assistant module (VAM) of the device responsive to the determination based on the proximity signal that the user is proximate to the device, [0040] The device 300 may include a display 304 such as a touch-sensitive display. The device 300 also includes one or more processors 306 configured to execute one or more voice assistant modules (VAM) 308 for purposes of sending data from one or more microphones 310 or the headphone microphone 303 to the VAM 308 to execute voice recognition on the microphone data and to return programmatically defined response over one or more speakers 312.)
	The Examiner notes that Qian et al. disclose a method of activating the voice recognition based on the size of the image of the face. From the size of the image of the face, Qian et al. determine whether the person is within a proximity threshold of the device. Qian et al. do not teach a threshold for the size of the image of the face. Qian et al. fail to explicitly teach
	by utilizing a machine learning model to analyze the image
 	wherein the machine learning model is trained locally by the smart display device, 
 	determining whether a portion of the image occupied by the face is greater than a threshold portion; and exceeding the threshold portion
 	However, Oku teaches 
 	determining whether a portion of the image occupied by the face is greater than a threshold portion;  and exceeding the threshold portion (Oku [0083] In cases in which a face that has been detected through face detection processing in larger than a specified size (for example, the area that the face occupies in the image is 30% or 50%), it is assumed that the operator will want to collect the voice emanating chiefly from said person and thus, control-method decision part 111 will output to control-switching part 112 a command to implement enhancement processing of the audio band corresponding to that person’s voice.)
 	Qian et al. and Oku are analogous art because they are from a similar field of endeavor in signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face, as taught by Qian et al., using the teaching of a threshold on the size of the image of the face, as taught by Oku, for the benefit of starting to process the audio corresponding to that person’s voice (Oku [0083] In cases in which a face that has been detected through face detection processing in larger than a specified size (for example, the area that the face occupies in the image is 30% or 50%), it is assumed that the operator will want to collect the voice emanating chiefly from said person and thus, control-method decision part 111 will output to control-switching part 112 a command to implement enhancement processing of the audio band corresponding to that person’s voice.)
	Qian et al. in view of Oku fail to explicitly teach
 	by utilizing a machine learning model to analyze the image,  
 	wherein the machine learning model is trained locally by the smart display device,
 	However, Han et al. teach
 	by utilizing a machine learning model to analyze the image (Han et al. [0028] The feature extractor may include a trained neural network and the method may further include training the neural network to extract face image features using the training images and determining the reference image from the training images, [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device) and, 
 	wherein the machine learning model is trained locally by the smart display device (Han et al. [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device); 
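 	Qian et al., Oku and Han et al. are analogous art because they are from a similar field of endeavor in signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face, as taught by Qian et al., using the teaching of a threshold on the size of the image of the face, as taught by Oku, for the benefit of starting to process the audio corresponding to that person’s voice, and using the teaching of a neural network model trained at the user device, as taught by Han et al., for the benefit of verifying a valid user in user log-in, payment services, and/or access control (Han et al. [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device.)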

 	With respect to Claim 11, Qian et al. in view of Oku and Han et al. teach 
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises:
 	 determining a distance from the face in the image to the smart display device (Oku [0068] In the case in which the face has been detected, determination results output part 913 will output the size and the location and the distance from lens part 3 to the face, estimated from the size of the face, taking the input image of the face that has been detected as the standard.)

 	With respect to Claim 13, Qian et al. in view of Oku and Han et al. teach 
 	further comprising: 
 	identifying a user associated with the face (Han et al. [0092] the computing apparatus 120 detects a face region 140 in the obtained face image, and extracts a feature from the face region 140 using a feature extractor. The computing apparatus 120 determines the user authentication to be a success or a failure based on a result of comparing the extracted feature and the example registered feature, of a valid user, registered in a face registering process, e.g., of the same or different computing apparatus 120.)

	With respect to Claim 14, Qian et al. in view of Oku and Han et al. teach 
 	wherein identifying the user associated with the face is based on a comparison of the face present in the image with a face of the user present in a local model (Han et al. [0092] the computing apparatus 120 detects a face region 140 in the obtained face image, and extracts a feature from the face region 140 using a feature extractor. The computing apparatus 120 determines the user authentication to be a success or a failure based on a result of comparing the extracted feature and the example registered feature, of a valid user, registered in a face registering process, e.g., of the same or different computing apparatus 120, [0028] The feature extractor may include a trained neural network and the method may further include training the neural network to extract face image features using the training images and determining the reference image from the training images, [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification in included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device.)

	With respect to Claim 15, Qian et al. disclose
 	A system, comprising: 
 	a processor (Qian et al. [0003] a processor, a microphone accessible to the processor, and storage accessible to the processor); and 
 	(Qian et al. [0003] a processor, a microphone accessible to the processor, and storage accessible to the processor, [0020] a processor can access information over its input lines from data storage, such as the computer readable storage medium), the operations including: 
 	receiving an image from a light capture device associated with a smart display device (Qian et al. [0034] a camera that gathers one or more images and provides input related thereto to the processor 122. The camera may be a thermal imaging camera, a digital camera such as a webcam, a three-dimensional (3D) camera, and/or a camera otherwise integrated into the system 100 and controllable by the processor 122 to gather pictures/images and/or video, [0037] Fig. 2 shows a notebook computer and/or convertible computer 202, a desktop computer 204, a wearable device 206 such as a smart watch, a smart television (TV) 208, a smart phone 210, a tablet computer 212);
 	determining, whether to activate voice recognition of a recording device associated with the smart display device based on a face being in the image (Qian et al. [0041] the sensor 314 may be a camera images from which are analyzed by the processor employing face recognition to determine whether a particular person is recognized and based on the size of the image of the face, whether the person is within a proximity threshold of the device, [0004] automatically activate a voice assistant module (VAM) of the device responsive to the determination based on the proximity signal that the user is proximate to the device, [0040] The device 300 may include a display 304 such as a touch-sensitive display. The device 300 also includes one or more processors 306 configured to execute one or more voice assistant modules (VAM) 308 for purposes of sending data from one or more microphones 310 or the headphone microphone 303 to the VAM 308 to execute voice recognition on the microphone data and to return programmatically defined response over one or more speakers 312), 
 	in response to determining the face being in the image and the portion of the image occupied by the face, activating the recording device associated with the smart display device (Qian et al. [0041] the sensor 314 may be a camera images from which are analyzed by the processor employing face recognition to determine whether a particular person is recognized and based on the size of the image of the face, whether the person is within a proximity threshold of the device, [0004] automatically activate a voice assistant module (VAM) of the device responsive to the determination based on the proximity signal that the user is proximate to the device, [0040] The device 300 may include a display 304 such as a touch-sensitive display. The device 300 also includes one or more processors 306 configured to execute one or more voice assistant modules (VAM) 308 for purposes of sending data from one or more microphones 310 or the headphone microphone 303 to the VAM 308 to execute voice recognition on the microphone data and to return programmatically defined response over one or more speakers 312.)
	The Examiner notes that Qian et al. disclose a method of activating the voice recognition based on the size of the image of the face. From the size of the image of the face, Qian et al. determine whether the person is within a proximity threshold of the device. Qian et al. do not teach a threshold for the size of the image of the face. Qian et al. fail to explicitly teach
	by utilizing a machine learning model to analyze the image
 	wherein the machine learning model is trained locally by the smart display device, 
 	determining whether a portion of the image occupied by the face is greater than a threshold portion; and exceeding the threshold portion
 	However, Oku teaches
determining whether a portion of the image occupied by the face is greater than a threshold portion;  and exceeding the threshold portion (Oku [0083] In cases in which a face that has been detected through face detection processing in larger than a specified size (for example, the area that the face occupies in the image is 30% or 50%), it is assumed that the operator will want to collect the voice emanating chiefly from said person and thus, control-method decision part 111 will output to control-switching part 112 a command to implement enhancement processing of the audio band corresponding to that person’s voice.)
 	Qian et al. and Oku are analogous art because they are from a similar field of endeavor in signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face, as taught by Qian et al., using the teaching of a threshold on the size of the image of the face, as taught by Oku, for the benefit of starting to process the audio corresponding to that person’s voice (Oku [0083] In cases in which a face that has been detected through face detection processing in larger than a specified size (for example, the area that the face occupies in the image is 30% or 50%), it is assumed that the operator will want to collect the voice emanating chiefly from said person and thus, control-method decision part 111 will output to control-switching part 112 a command to implement enhancement processing of the audio band corresponding to that person’s voice.)
	Qian et al. in view of Oku fail to explicitly teach
 	by utilizing a machine learning model to analyze the image,  
 	wherein the machine learning model is trained locally by the smart display device,
	However, Han et al. teach 
 	by utilizing a machine learning model to analyze the image (Han et al. [0028] The feature extractor may include a trained neural network and the method may further include training the neural network to extract face image features using the training images and determining the reference image from the training images, [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification is included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device) and, 
 	wherein the machine learning model is trained locally by the smart display device (Han et al. [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification is included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device); 
 	Qian et al., Oku and Han et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face as taught by Qian et al., using teaching of the threshold in the size of image of the face as taught by Oku for the benefit of starting processing the audio corresponding to that person’s voice, using teaching of neural network model trained at the user device as taught by Han et al. for the benefit of verifying a valid user in user log-in, payment service and/or access control (Han et al. [0099] The facial verification apparatus may be configured to perform one or more or all neural network verification trainings, registration operations herein without verification...Any of the facial verification apparatus may train the extractor or verification models or neural networks, or still another facial verification apparatus may perform the training, [0089] A facial verification refers to a verification method used to determine whether a user is a valid user based on face information of the user, and verify a valid user in, for example, user log-in, payment services, and access control. Referring to Fig. 1A, a facial verification apparatus configured to perform such a facial verification is included in, or represented by, a computing apparatus 120. The computing apparatus 120 includes, for example, a smartphone, a wearable device, a tablet computer, a netbook, a laptop computer...and a vehicle start device.)

 	With respect to Claim 18, Qian et al. in view of Oku and Han et al. teach 
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises:
 	 determining a distance from the face in the image to the smart display device (Oku [0068] In the case in which the face has been detected, determination results output part 913 will output the size and the location and the distance from lens part 3 to the face, estimated from the size of the face, taking the input image of the face that has been detected as the standard.) 
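For illustration only, estimating the distance to a face from its apparent size, as described in the Oku passage cited above, can be sketched with a pinhole-camera relation. This is a minimal sketch under stated assumptions and is not Oku's disclosed method: the function name, the focal length expressed in pixels, and the assumed real-world face width of 0.16 m are all hypothetical.

```python
def estimate_face_distance_m(face_width_px, focal_length_px, real_face_width_m=0.16):
    """Estimate camera-to-face distance in meters from the face width in pixels.

    Uses the pinhole-camera relation: distance = f * W / w, where f is the
    focal length in pixels, W the real face width, and w the face width in
    the image. The 0.16 m default is an assumed placeholder value.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    if focal_length_px <= 0 or real_face_width_m <= 0:
        raise ValueError("focal length and real face width must be positive")
    return focal_length_px * real_face_width_m / face_width_px
```

Under these assumptions a face that appears half as wide in the image is estimated to be twice as far from the lens.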

 	With respect to Claim 20, Qian et al. in view of Oku and Han et al. teach 
 	further comprising: 
 	identifying a user associated with the face (Han et al. [0092] the computing apparatus 120 detects a face region 140 in the obtained face image, and extracts a feature from the face region 140 using a feature extractor. The computing apparatus 120 determines the user authentication to be a success or a failure based on a result of comparing the extracted feature and the example registered feature, of a valid user, registered in a face registering process, e.g., of the same or different computing apparatus 120.)

6.	Claims 3, 10, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Qian et al. (US 20180025733 A1) in view of Oku (US 2011/0052139 A1), Han et al. (US 2018/0276454 A1) and Park et al. (US 2018/0137862 A1).

 	With respect to Claim 3, Qian et al. in view of Oku and Han et al. teach all the limitations of Claim 1 upon which Claim 3 depends. Qian et al. in view of Oku and Han et al. fail to explicitly teach 
 	further comprising:
 	receiving a second image from the light capture device associated with the smart display device;
 	determining whether the face is in the second image; and
 	in response to determining that the face is not in the second image, deactivating the voice recognition of the recording device associated with the smart display device.  
	However, Park et al. teach 
 	further comprising: 
 	receiving a second image from the light capture device associated with the smart display device (Park et al. [0010] A method for controlling a mobile terminal according to an embodiment of the present invention can include detecting a face image through a low-power image sensors of detecting a subject based on black and white image sensor, [0114] The mobile terminal according to an embodiment of the present invention can further include a low-power image sensor for sensing an object using low-power); 
 	determining whether the face is in the second image (Park et al. [0010] A method for controlling a mobile terminal according to an embodiment of the present invention can include detecting a face image through a low-power image sensors...include continuously detecting the face image through the low-power image sensor after the execution of the voice recognition, determining whether or not the face image satisfies a preset condition, and terminating the voice recognition function according to the determination result, [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.); and 
 	in response to determining that the face is not in the second image, deactivating the voice recognition of the recording device associated with the smart display device (Park et al. [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.)   
 	Qian et al., Oku, Han et al. and Park et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face as taught by Qian et al., using teaching of the threshold in the size of image of the face as taught by Oku for the benefit of starting processing the audio corresponding to that person’s voice, using teaching of neural network model trained at the user device as taught by Han et al. for the benefit of verifying a valid user in user log-in, payment service and/or access control, using teaching of determining whether the face is in the image as taught by Park et al. for the benefit of deactivating the device (Park et al. [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.)
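For illustration only, the behavior described in the Park et al. passage cited above (continuing to check for the face while voice recognition runs, and ending voice recognition when the face is no longer found) can be sketched as a small state holder. This is a minimal sketch under stated assumptions, not Park et al.'s implementation: the class name, the `update` method, and the `max_missed_frames` parameter are hypothetical.

```python
class VoiceRecognitionGate:
    """Tracks whether voice recognition should be active, based on face presence.

    max_missed_frames is a hypothetical tolerance: with the default of 1,
    recognition is deactivated on the first image in which no face is found.
    """

    def __init__(self, max_missed_frames=1):
        if max_missed_frames < 1:
            raise ValueError("max_missed_frames must be at least 1")
        self.max_missed_frames = max_missed_frames
        self.active = False
        self._missed = 0

    def update(self, face_in_image):
        """Process one image result and return the new active state."""
        if face_in_image:
            # A face is present: (re)activate and reset the miss counter.
            self._missed = 0
            self.active = True
        elif self.active:
            # No face while active: count the miss and deactivate at the limit.
            self._missed += 1
            if self._missed >= self.max_missed_frames:
                self.active = False
        return self.active
```

Each call to `update` corresponds to one received image, so a first image with a face followed by a second image without one returns True and then False.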

 	With respect to Claim 10, Qian et al. in view of Oku and Han et al. teach all the limitations of Claim 8 upon which Claim 10 depends. Qian et al. in view of Oku and Han et al. fail to explicitly teach 
 	further comprising:
 	receiving a second image from the light capture device associated with the smart display device;
 	determining whether the face is in the second image; and
 	in response to determining that the face is not in the second image, deactivating the voice recognition of the recording device associated with the smart display device.  
	However, Park et al. teach 
 	further comprising: 
 	receiving a second image from the light capture device associated with the smart display device (Park et al. [0010] A method for controlling a mobile terminal according to an embodiment of the present invention can include detecting a face image through a low-power image sensors of detecting a subject based on black and white image sensor, [0114] The mobile terminal according to an embodiment of the present invention can further include a low-power image sensor for sensing an object using low-power); 
 	determining whether the face is in the second image (Park et al. [0010] A method for controlling a mobile terminal according to an embodiment of the present invention can include detecting a face image through a low-power image sensors...include continuously detecting the face image through the low-power image sensor after the execution of the voice recognition, determining whether or not the face image satisfies a preset condition, and terminating the voice recognition function according to the determination result, [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.); and 
 	in response to determining that the face is not in the second image, deactivating the voice recognition of the recording device associated with the smart display device (Park et al. [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.)   
 	Qian et al., Oku, Han et al. and Park et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face as taught by Qian et al., using teaching of the threshold in the size of image of the face as taught by Oku for the benefit of starting processing the audio corresponding to that person’s voice, using teaching of neural network model trained at the user device as taught by Han et al. for the benefit of verifying a valid user in user log-in, payment service and/or access control, using teaching of determining whether the face is in the image as taught by Park et al. for the benefit of deactivating the device (Park et al. [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.)

 	With respect to Claim 17, Qian et al. in view of Oku and Han et al. teach all the limitations of Claim 15 upon which Claim 17 depends. Qian et al. in view of Oku and Han et al. fail to explicitly teach 
 	further comprising:
 	receiving a second image from the light capture device associated with the smart display device;
 	determining whether the face is in the second image; and
 	in response to determining that the face is not in the second image, deactivating the voice recognition of the recording device associated with the smart display device.  
	However, Park et al. teach 
 	further comprising: 
 	receiving a second image from the light capture device associated with the smart display device (Park et al. [0010] A method for controlling a mobile terminal according to an embodiment of the present invention can include detecting a face image through a low-power image sensors of detecting a subject based on black and white image sensor, [0114] The mobile terminal according to an embodiment of the present invention can further include a low-power image sensor for sensing an object using low-power); 
 	determining whether the face is in the second image (Park et al. [0010] A method for controlling a mobile terminal according to an embodiment of the present invention can include detecting a face image through a low-power image sensors...include continuously detecting the face image through the low-power image sensor after the execution of the voice recognition, determining whether or not the face image satisfies a preset condition, and terminating the voice recognition function according to the determination result, [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.); and 
 	in response to determining that the face is not in the second image, deactivating the voice recognition of the recording device associated with the smart display device (Park et al. [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.)   
 	Qian et al., Oku, Han et al. and Park et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face as taught by Qian et al., using teaching of the threshold in the size of image of the face as taught by Oku for the benefit of starting processing the audio corresponding to that person’s voice, using teaching of neural network model trained at the user device as taught by Han et al. for the benefit of verifying a valid user in user log-in, payment service and/or access control, using teaching of determining whether the face is in the image as taught by Park et al. for the benefit of deactivating the device (Park et al. [0142] the low-power image sensor 200 according to an embodiment of the present invention can continuously detect the face image during the execution of the voice recognition function (S430). The low-power image sensor 200 can continue to detect the face image even when the voice recognition function is executed. That is, the low-power image sensor 200 can detect whether the user’s face exists in a surrounding area of the mobile terminal during the execution of the voice recognition function to determine an end time point of the voice recognition function, Fig. 5 elements S430, S440.)

7.	Claims 5, 12, 19 are rejected under 35 U.S.C. 103 as being unpatentable over Qian et al. (US 20180025733 A1) in view of Oku (US 2011/0052139 A1), Han et al. (US 2018/0276454 A1) and VanBlon et al. (US 2015/0154983 A1).

 	With respect to Claim 5, Qian et al. in view of Oku and Han et al. teach all the limitations of Claim 1 upon which Claim 5 depends. Qian et al. in view of Oku and Han et al. fail to explicitly teach 
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises: 
 	determining a gaze direction of the face in the image relative to the smart display device. 
	However, VanBlon et al. teach
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises: 
 	determining a gaze direction of the face in the image relative to the smart display device (VanBlon et al. [0008] a determination that the user’s eyes are not looking at the device or toward the device, [0038] Once an affirmative determination is made at diamond 202, the logic proceeds to decision diamond 204 where the logic determines (e.g., based on signals from a camera in communication with the device) whether the user’s mouth and/or eyes are indicative of the user providing audible input to the device (e.g. using lip reading software, eye tracking software, etc... one or more signals from a camera gathering images of a user and providing them to a processor of the device may be analyzed, examined, etc. by the device for whether the user’s eyes and even more particularly the user’s pupils are directed at, around, or toward the device (which may be determined using eye tracking software)). 
 	Qian et al., Oku, Han et al. and VanBlon et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face as taught by Qian et al., using teaching of the threshold in the size of image of the face as taught by Oku for the benefit of starting processing the audio corresponding to that person’s voice, using teaching of neural network model trained at the user device as taught by Han et al. for the benefit of verifying a valid user in user log-in, payment service and/or access control, using teaching of the tracking the user’s eye direction as taught by VanBlon et al. for the benefit of beginning or pausing processing audible input (VanBlon et al. Fig. 2 elements 200-222, [0008] a determination that the user’s eyes are not looking at the device or toward the device, [0038] Once an affirmative determination is made at diamond 202, the logic proceeds to decision diamond 204 where the logic determines (e.g., based on signals from a camera in communication with the device) whether the user’s mouth and/or eyes are indicative of the user providing audible input to the device (e.g. using lip reading software, eye tracking software, etc... one or more signals from a camera gathering images of a user and providing them to a processor of the device may be analyzed, examined, etc. by the device for whether the user’s eyes and even more particularly the user’s pupils are directed at, around, or toward the device (which may be determined using eye tracking software)). 
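For illustration only, the determination quoted from VanBlon et al. above (whether the user’s eyes are directed at or toward the device) can be sketched as an angle test between an estimated gaze vector and the direction from the eyes to the device. This is a minimal sketch under stated assumptions and is not VanBlon et al.'s eye tracking software: the function name, the vector inputs, and the 15-degree tolerance are hypothetical.

```python
import math


def is_gaze_toward_device(gaze_vector, eye_to_device_vector, max_angle_deg=15.0):
    """Return True when the gaze direction is within max_angle_deg of the device.

    Both vectors are 3-component sequences in the same coordinate frame.
    How they are obtained (e.g. from an eye tracker) is outside this sketch.
    """
    dot = sum(g * d for g, d in zip(gaze_vector, eye_to_device_vector))
    norm_gaze = math.sqrt(sum(g * g for g in gaze_vector))
    norm_device = math.sqrt(sum(d * d for d in eye_to_device_vector))
    if norm_gaze == 0 or norm_device == 0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_gaze * norm_device)))
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg
```

A True result would correspond to beginning or continuing to process audible input, and a False result to pausing it, in the sense of the stated rationale.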

 	With respect to Claim 12, Qian et al. in view of Oku and Han et al. teach all the limitations of Claim 8 upon which Claim 12 depends. Qian et al. in view of Oku and Han et al. fail to explicitly teach 
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises: 
 	determining a gaze direction of the face in the image relative to the smart display device. 
	However, VanBlon et al. teach
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises: 
 	determining a gaze direction of the face in the image relative to the smart display device (VanBlon et al. [0008] a determination that the user’s eyes are not looking at the device or toward the device, [0038] Once an affirmative determination is made at diamond 202, the logic proceeds to decision diamond 204 where the logic determines (e.g., based on signals from a camera in communication with the device) whether the user’s mouth and/or eyes are indicative of the user providing audible input to the device (e.g. using lip reading software, eye tracking software, etc... one or more signals from a camera gathering images of a user and providing them to a processor of the device may be analyzed, examined, etc. by the device for whether the user’s eyes and even more particularly the user’s pupils are directed at, around, or toward the device (which may be determined using eye tracking software)). 
 	Qian et al., Oku, Han et al. and VanBlon et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face as taught by Qian et al., using teaching of the threshold in the size of image of the face as taught by Oku for the benefit of starting processing the audio corresponding to that person’s voice, using teaching of neural network model trained at the user device as taught by Han et al. for the benefit of verifying a valid user in user log-in, payment service and/or access control, using teaching of the tracking the user’s eye direction as taught by VanBlon et al. for the benefit of beginning or pausing processing audible input (VanBlon et al. Fig. 2 elements 200-222, [0008] a determination that the user’s eyes are not looking at the device or toward the device, [0038] Once an affirmative determination is made at diamond 202, the logic proceeds to decision diamond 204 where the logic determines (e.g., based on signals from a camera in communication with the device) whether the user’s mouth and/or eyes are indicative of the user providing audible input to the device (e.g. using lip reading software, eye tracking software, etc... one or more signals from a camera gathering images of a user and providing them to a processor of the device may be analyzed, examined, etc. by the device for whether the user’s eyes and even more particularly the user’s pupils are directed at, around, or toward the device (which may be determined using eye tracking software)). 

 	With respect to Claim 19, Qian et al. in view of Oku and Han et al. teach all the limitations of Claim 15 upon which Claim 19 depends. Qian et al. in view of Oku and Han et al. fail to explicitly teach 
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises: 
 	determining a gaze direction of the face in the image relative to the smart display device. 
	However, VanBlon et al. teach
 	wherein determining whether to activate the voice recognition of the recording device associated with the smart display device further comprises: 
 	determining a gaze direction of the face in the image relative to the smart display device (VanBlon et al. [0008] a determination that the user’s eyes are not looking at the device or toward the device, [0038] Once an affirmative determination is made at diamond 202, the logic proceeds to decision diamond 204 where the logic determines (e.g., based on signals from a camera in communication with the device) whether the user’s mouth and/or eyes are indicative of the user providing audible input to the device (e.g. using lip reading software, eye tracking software, etc... one or more signals from a camera gathering images of a user and providing them to a processor of the device may be analyzed, examined, etc. by the device for whether the user’s eyes and even more particularly the user’s pupils are directed at, around, or toward the device (which may be determined using eye tracking software)). 
 	Qian et al., Oku, Han et al. and VanBlon et al. are analogous art because they are from a similar field of endeavor in the Signal Processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the steps of activating the voice recognition in response to detecting the size of the image of the face as taught by Qian et al., using teaching of the threshold in the size of image of the face as taught by Oku for the benefit of starting processing the audio corresponding to that person’s voice, using teaching of neural network model trained at the user device as taught by Han et al. for the benefit of verifying a valid user in user log-in, payment service and/or access control, using teaching of the tracking the user’s eye direction as taught by VanBlon et al. for the benefit of beginning or pausing processing audible input (VanBlon et al. Fig. 2 elements 200-222, [0008] a determination that the user’s eyes are not looking at the device or toward the device, [0038] Once an affirmative determination is made at diamond 202, the logic proceeds to decision diamond 204 where the logic determines (e.g., based on signals from a camera in communication with the device) whether the user’s mouth and/or eyes are indicative of the user providing audible input to the device (e.g. using lip reading software, eye tracking software, etc... one or more signals from a camera gathering images of a user and providing them to a processor of the device may be analyzed, examined, etc. by the device for whether the user’s eyes and even more particularly the user’s pupils are directed at, around, or toward the device (which may be determined using eye tracking software)). 

Conclusion
8. 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to THUYKHANH LE whose telephone number is (571)272-6429.  The examiner can normally be reached on Mon-Fri: 9am-5pm.
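Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.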
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL can be reached on 571-272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/THUYKHANH LE/Primary Examiner, Art Unit 2658