DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-8, 11-15, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Liu (PG-Pub. US 20210382542) in view of Naqvi (“Deep Learning-Based Gaze Detection System for Automobile Drivers Using a NIR Camera Sensor,” publication date: 02/03/2018).
Regarding claim 1:
Liu teaches: a method for waking up a device (¶ [0125] “FIG. 3A and FIG. 3B are a schematic flowchart of a screen wakeup method according to an embodiment of this application”), comprising: 
acquiring an environment image of a surrounding environment of a target device in real time (¶ [0129] “Step 302: Obtain a latest image frame in an image storage module as an input image for subsequent gaze identification or owner identification”);
recognizing a face region of a user in the environment image (¶ [0131] “Step 303: Determine whether there is a face image in the obtained latest image frame”);
and waking up the target device in a case of determining that the user is looking at the target device (¶ [0141] “Step 309: Execute the gaze identification network by using the face image as an input, to output a gaze probability”; ¶ [0144] “Step 310: Compare the output gaze probability with a preset gaze threshold, and determine whether the face is gazing at the screen. If the gaze probability is greater than the preset gaze threshold, it is determined that the face is gazing at the screen, which indicates that the person behind the face is attempting to wake up the device”; ¶ [0151] “Step 315: Wake up the screen, that is, adjust the screen state to the screen-on state”).
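The decision logic in the cited Steps 309, 310, and 315 of Liu can be pictured with a minimal sketch. It is illustrative only: the function name `should_wake`, the threshold value 0.5, and the sample probability are assumptions made here and do not appear in Liu, and the gaze identification network is represented only by its output probability.

```python
def should_wake(gaze_probability: float, gaze_threshold: float = 0.5) -> bool:
    """Return True when the gaze probability is greater than the preset threshold.

    The default threshold of 0.5 is a placeholder, not a value taken from Liu.
    """
    if not 0.0 <= gaze_probability <= 1.0:
        raise ValueError("gaze_probability must lie in [0, 1]")
    return gaze_probability > gaze_threshold


# Hypothetical usage: 0.83 stands in for the output of the gaze network.
screen_state = "off"
if should_wake(0.83):
    screen_state = "on"  # corresponds to adjusting the screen to the screen-on state
```

The comparison is strict ("greater than"), matching the wording quoted from ¶ [0144].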
Liu does not specifically teach acquiring a plurality of facial landmarks in the face region; acquiring a left eye image and a right eye image according to the facial landmarks; acquiring a left eye sight classification result and a right eye sight classification result according to the left eye image and the right eye image.
However, in a related field, Naqvi teaches: acquiring a plurality of facial landmarks in the face region (Section 4.1 “Overview of the proposed method”, FIG. 1, “…68 face landmarks are detected by the Dlib facial feature tracker [53] (steps (1) and (2) of Figure 1, and details are explained in Section 4.2)”);
acquiring a left eye image and a right eye image according to the facial landmarks (Section 4.1 “Overview of the proposed method”, FIG. 1, “…Then, the region-of-interest (ROI) images of face, left and right eye are obtained based on the corresponding face landmarks position (step (3) of Figure 1).”);
acquiring a left eye sight classification result and a right eye sight classification result according to the left eye image and the right eye image (Section 4.1 “Overview of the proposed method”, FIG. 1, “…Then, each set of feature values is normalized, and three distances are calculated by three sets of feature values (step (5) of Figure 1). Here, distance is calculated between the input set of feature values and that in each gaze zone. Finally, our system classifies the driver’s gaze zone based on score fusion of three distances (details are explained in Section 4.4.4).” Section 4.4.4 “Classifying Gaze Zones by Score Fusion of Three Distances”, “As explained in Section 4.4.1, after extracting three separate feature sets (three sets of 4096 features) from face, left eye, and right eye images (scheme 1), we normalized them to each other by min-max scaling. With the training data, we already saved the three (normalized) feature sets per each gaze zone of Figure 3a. Then, we can calculate three Euclidean distances between the three feature sets of inputs and the three saved on each gaze zone. After that, these three distances were combined based on score level fusion. Finally, one final score (distance) is obtained, and the gaze zone whose final score (distance) is smallest among 17 zones of Figure 3a is determined as the driver gazing region.” Section 4.2, “Facial landmarks are used to localize and represent salient regions of the face, such as eyes, eyebrows, nose, mouth and jawline. It can be successfully applied to various applications of face alignment, face swapping, and blink detection etc.”).
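The classification described in the quoted passages of Naqvi (normalize three feature sets, compute three Euclidean distances per gaze zone, fuse them, and pick the zone with the smallest final score) can be sketched as follows. This is a toy illustration under stated assumptions, not Naqvi's implementation: the fusion rule is taken here to be an equally weighted sum, the min-max scaling is applied within each feature vector, the feature vectors have three entries rather than 4096, and all names and values are hypothetical.

```python
import math

PARTS = ("face", "left_eye", "right_eye")


def min_max_scale(features):
    """Scale a feature vector to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(features), max(features)
    if hi == lo:
        return [0.0 for _ in features]
    return [(x - lo) / (hi - lo) for x in features]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_gaze_zone(inputs, zone_templates, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return (zone, score) for the zone whose fused distance is smallest.

    inputs: raw feature vectors keyed by "face", "left_eye", "right_eye".
    zone_templates: zone id -> already-normalized feature vectors per part.
    weights: assumed weighted-sum fusion; the actual fusion rule may differ.
    """
    normalized = {part: min_max_scale(inputs[part]) for part in PARTS}
    best_zone, best_score = None, math.inf
    for zone, saved in zone_templates.items():
        distances = [euclidean(normalized[p], saved[p]) for p in PARTS]
        score = sum(w * d for w, d in zip(weights, distances))
        if score < best_score:
            best_zone, best_score = zone, score
    return best_zone, best_score


# Toy data: two gaze zones instead of 17, three features instead of 4096.
templates = {
    1: {"face": [0.0, 0.5, 1.0], "left_eye": [0.0, 0.0, 1.0], "right_eye": [0.0, 0.5, 1.0]},
    2: {"face": [1.0, 0.5, 0.0], "left_eye": [1.0, 1.0, 0.0], "right_eye": [1.0, 0.5, 0.0]},
}
sample = {"face": [2, 4, 6], "left_eye": [1, 1, 3], "right_eye": [0, 5, 10]}
zone, score = classify_gaze_zone(sample, templates)
```

Note that in this scheme the left eye and right eye features contribute to a single fused gaze zone decision.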

Therefore, it would have been obvious to a person of ordinary skill in the art prior to the effective filing date of the claimed invention to have modified Liu to incorporate the teachings of Naqvi by including: acquiring a plurality of facial landmarks in the face region; acquiring a left eye image and a right eye image according to the facial landmarks; acquiring a left eye sight classification result and a right eye sight classification result according to the left eye image and the right eye image in order to determine gaze zones at which a user is looking.
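For illustration, the landmark-based acquisition of left and right eye images relied upon from Naqvi can be sketched as cropping a bounding box around the eye landmarks. The sketch is hypothetical: it assumes a 68-point layout in which 0-based indices 36-41 and 42-47 outline the two eyes, treats the image as a plain list of pixel rows, and uses a made-up margin and synthetic landmark coordinates. None of these specifics are taken from the cited references.

```python
# Assumed 0-based index ranges for the two eyes in a 68-point landmark layout.
RIGHT_EYE = range(36, 42)  # subject's right eye under the assumed convention
LEFT_EYE = range(42, 48)   # subject's left eye under the assumed convention


def eye_box(landmarks, indices, margin=2):
    """Return (left, top, right, bottom) enclosing the chosen landmarks."""
    xs = [landmarks[i][0] for i in indices]
    ys = [landmarks[i][1] for i in indices]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


def crop(image, box):
    """Crop a list-of-rows image to the box, clamping at the top-left edges."""
    left, top, right, bottom = box
    left, top = max(left, 0), max(top, 0)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Synthetic data: 68 landmarks with the eye points on one horizontal line.
landmarks = [(0, 0)] * 68
for i in RIGHT_EYE:
    landmarks[i] = (10 + (i - 36), 20)
for i in LEFT_EYE:
    landmarks[i] = (30 + (i - 42), 20)

image = [[0] * 50 for _ in range(40)]  # 40 rows by 50 columns of pixels
right_box = eye_box(landmarks, RIGHT_EYE)
left_box = eye_box(landmarks, LEFT_EYE)
right_eye_image = crop(image, right_box)
left_eye_image = crop(image, left_box)
```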
	

Regarding claim 4:
Liu in view of Naqvi teaches the method according to claim 1.
Naqvi further teaches: wherein the acquiring the left eye sight classification result and the right eye sight classification result according to the left eye image and the right eye image, comprises:
inputting the left eye image and the right eye image into a sight classification model respectively, to obtain the left eye sight classification result and the right eye sight classification result to be output by the sight classification model (Section 4.4.4. “Classifying Gaze Zones by Score Fusion of Three Distances”; “…The 17 outputs of the output layer in Figure 9 represent the 17 gaze regions of Figure 3a. If we use one CNN for gaze estimation, these 17 outputs can be used for the detection of final gaze position.”);
wherein the left eye sight classification result and the right eye sight classification result each comprise: looking up, looking down, looking left, looking right, looking forward, and closing an eye (see FIGS. 3a and 3b; the classification covers a wide variety of results such as looking up, left, right, and closing an eye; see zones 6, 17, 1, and 4).

Regarding claim 5:
Liu in view of Naqvi teaches the method according to claim 4.
Liu further teaches: wherein the determining that the user is looking at the target device according to the left eye sight classification result and the right eye sight classification result, comprises:
determining that the user is looking at the target device in a case that the left eye sight classification result and the right eye sight classification result are both looking forward (¶ [0015] “According to the screen wakeup method provided in this embodiment of this application, a probability value that a user corresponding to a first face image is gazing at the screen of the device may be determined by using the preconfigured neural network, to determine whether the user corresponding to the first face image is the user who is gazing at the screen of the device.” It is understood by a person having ordinary skill in the art that gazing at the screen means looking forward at it when facing it.).
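The claimed condition of claim 5 (both eye results are "looking forward") can be written as a short sketch over the six classes recited in claim 4. The enum and function names are hypothetical and chosen here only for illustration.

```python
from enum import Enum


class Sight(Enum):
    """The six sight classification results recited in claim 4."""
    UP = "looking up"
    DOWN = "looking down"
    LEFT = "looking left"
    RIGHT = "looking right"
    FORWARD = "looking forward"
    CLOSED = "closing an eye"


def is_looking_at_device(left: Sight, right: Sight) -> bool:
    """True only when both eyes are classified as looking forward."""
    return left is Sight.FORWARD and right is Sight.FORWARD


both_forward = is_looking_at_device(Sight.FORWARD, Sight.FORWARD)
one_closed = is_looking_at_device(Sight.FORWARD, Sight.CLOSED)
```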

Regarding claim 6:
Liu in view of Naqvi teaches the method according to claim 1.
Liu further teaches: wherein the acquiring the plurality of facial landmarks in the face region, and acquiring the left eye image and the right eye image according to the facial landmarks, comprises:
performing identity verification on the user according to the plurality of facial landmarks (FIG. 3B, ¶ [0147] “Step 313: Compare the calculated distance with a preset distance threshold, and determine whether the face image belongs to an owner”); and
Liu does not specifically teach: acquiring the left eye image and the right eye image according to the facial landmarks in a case of determining that the user is a pre-registered valid user for wake-up.
However, Liu in view of Naqvi teaches: acquiring the left eye image and the right eye image according to the facial landmarks in a case of determining that the user is a pre-registered valid user for wake-up (Naqvi already teaches acquiring the left and right eye images as applied in claim 1, and therefore by combining Naqvi’s teaching with the method of Liu, the left and right images are acquired from a registered user image).
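The combination applied to claim 6 (verify the owner first, then acquire the eye images) can be sketched as a simple gate. This is a sketch under assumptions: the quoted Step 313 of Liu only says that a calculated distance is compared with a preset distance threshold, so the use of feature vectors, the "distance at or below threshold means owner" direction, the threshold 0.6, and every name below are hypothetical.

```python
import math


def is_owner(face_features, owner_features, distance_threshold=0.6):
    """Assumed rule: the face belongs to the owner when the distance is small."""
    return math.dist(face_features, owner_features) <= distance_threshold


def acquire_eye_images_if_valid(face_features, owner_features, extract_eyes):
    """Call extract_eyes() only for a verified user; otherwise return None."""
    if not is_owner(face_features, owner_features):
        return None
    return extract_eyes()


# Hypothetical usage with two-dimensional stand-in feature vectors.
registered = [0.10, 0.20]
accepted = acquire_eye_images_if_valid(
    [0.10, 0.25], registered, lambda: ("left_eye_image", "right_eye_image")
)
rejected = acquire_eye_images_if_valid(
    [1.0, 1.0], registered, lambda: ("left_eye_image", "right_eye_image")
)
```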

Regarding claim 7:
Liu in view of Naqvi teaches the method according to claim 1.
Liu further teaches: wherein the target device is an intelligent speaker (¶ [0081] “In this application, the obtained target model/rule may be applied to different devices. For example, the obtained target model/rule may be applied to a terminal device, such as a mobile phone terminal, a tablet computer, a notebook computer, AR/VR, or a vehicle-mounted terminal.” A mobile phone can reasonably be regarded as an intelligent speaker);
the acquiring the environment image of the surrounding environment of the target device in real time, comprises:
acquiring the environment image of the surrounding environment of the target device in real time via at least one camera provided on the intelligent speaker (FIG. 6, camera 690, ¶ [0170] “For example, in an implementation, the camera 690 or an image processing channel corresponding to the camera 690 may be configured to obtain M image frames.”).

Regarding claim 8: the claim limitations are similar to those of claim 1; therefore, the claim is rejected in the same manner as applied above.
The claim differs from claim 1 in that it is an apparatus claim reciting structural elements such as a processor and a memory (see Liu, FIG. 6, processor 610 and memory 630).
Regarding claims 11-14: the claim limitations are similar to those of claims 4-7; therefore, the claims are rejected in the same manner as applied above.
Regarding claim 15: the claim limitations are similar to those of claim 1; therefore, the claim is rejected in the same manner as applied above.
The claim differs from claim 1 in that it recites a computer-readable medium storing instructions (see Liu, ¶ [0187] “This application further provides a computer-readable storage medium. The computer-readable storage medium stores instructions. When the instructions are run on a computer, the computer is enabled to perform steps in the screen wakeup method shown in FIG. 2 and FIG. 3A and FIG. 3B.”).
Regarding claims 18-20: the claim limitations are similar to those of claims 4-6; therefore, the claims are rejected in the same manner as applied above.
Allowable Subject Matter
Claims 2-3, 9-10, and 16-17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Regarding claim 2: Liu in view of Naqvi teaches the method according to claim 1. Although Liu further teaches ¶ [0132] “Step 304: Calculate a face box size, and obtain a largest face box and a face direction of the largest face box” and ¶ [0134] “In an example, an end-to-end multi-task network structure may be used to complete three tasks of face box positioning, face classification, and face direction classification,” the prior art fails to disclose, teach, or suggest inputting the environment image into a face bounding box detection model, to obtain coordinates of a plurality of face bounding boxes to be output by the face bounding box detection model, in the context of the claim as a whole.
Regarding claim 3: the claim depends from claim 2; therefore, it contains allowable subject matter for the same reasons.
Regarding claims 9-10 and 16-17: the claim limitations are similar to those of claims 2 and 3, respectively; therefore, the claims are objected to for the same reasons.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
YAMASAKI (US 20170279995) teaches: an information processing apparatus includes a controller that controls the information processing apparatus by making a transition of a state of the information processing apparatus in relation to power consumption of the information processing apparatus from a first state to a second state in which the power consumption is higher than that in the first state, and a line-of-sight detector that detects a line of sight toward the information processing apparatus (See FIGS. 3, 4, and 7). 
QIAN (CN 113626778) teaches: obtaining the image related to the environment around the electronic device, determining whether the eye feature in the obtained image satisfies the gaze condition, and if the eye feature satisfies the gaze condition.
Liu (CN 108509037) teaches: a method comprises: if the screen of the mobile terminal is in the off state, collecting the image by the camera; if there is a face image in the image, obtaining the eye image feature in the face image; according to the eye image feature, judging whether the sight of the user is towards the screen; if the sight of the user is towards the screen, then displaying the preset content on the screen. Thus, when the mobile terminal performs screen-off display, the preset content is displayed only when the line of sight of the user is towards the screen.
Sun (CN 106878559) teaches: obtaining current iris information of the user and matching the current iris information against pre-stored iris information; when the matching is successful, obtaining screen gaze information corresponding to the current iris information; and when the screen gaze information satisfies a preset gaze condition, switching the screen state of the current screen from a sleep state to a wakeup state.
Sun (CN 106484113) teaches: a device comprising an obtaining module for obtaining an eye image, the eye image comprising an image of the eyeball part of a human eye; a comparison module for comparing the eyeball image with a preset image, the preset image being an image of the eyeball part taken in advance for the user; and a driving module for waking up the screen when the similarity between the eyeball image and the preset image is greater than or equal to a preset value, and for not waking the screen when the similarity is less than the preset value.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WASSIM MAHROUKA whose telephone number is (571)272-2945. The examiner can normally be reached Monday-Thursday 7:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward Urban, can be reached at (571)272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WASSIM MAHROUKA/Examiner, Art Unit 2665
/EDWARD F URBAN/Supervisory Patent Examiner, Art Unit 2665