DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/26/2022 has been entered.
Response to Amendment
Applicant's amendments and remarks submitted 10/26/2022 have been entered and considered, but are not found convincing. Claims 1, 7, 9, 12, and 19 have been amended. Claims 3-4, 10, and 16-17 have been cancelled. In summary, claims 1-2, 5-9, 11-15, and 18-20 are pending in the application.
Response to Arguments Claim Rejections - 35 U.S.C. 103: 
Applicant's arguments with respect to the amended independent claim have been considered but are moot because the rejection has been modified to address the newly added limitations. The examiner now relies on Cao.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
1. Claims 1-2, 5-8, 11-14, 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Liu, Feng, et al. "Joint face alignment and 3d face reconstruction." European Conference on Computer Vision. Springer, Cham, 2016 (“Liu”) in view of RODRIGUEZ et al., U.S. Patent Application Publication No. 20160086017 (“RODRIGUEZ”), further in view of Cao, Chen, et al. "Facewarehouse: A 3d facial expression database for visual computing." IEEE Transactions on Visualization and Computer Graphics 20.3 (2013): 413-425 (“Cao”), further in view of Lee, Youn Joo, et al. "Single view-based 3D face reconstruction robust to self-occlusion." EURASIP Journal on Advances in Signal Processing 2012.1 (2012): 1-20 (“Lee”).
Regarding independent claim 1, Liu teaches a face pose estimation method (Fig. 1, “We view 2D landmarks are generated from a 3D face through 3D expression (fE) and pose (fP) deformation, and camera projection (fC) (top row). While conventional face alignment and 3D face reconstruction are two separate tasks and the latter requires the former as the input, this paper performs these two tasks jointly, i.e., reconstructing a pose-expression-normalized (PEN) 3D face and estimating visible/invisible landmarks (green/red points) from a 2D face image with arbitrary poses and expressions”; Abstract, “We present an approach to simultaneously solve the two problems of face alignment and 3D face reconstruction from an input 2D face image of arbitrary poses and expressions.”), comprising:

[Figure: Fig. 2 of Liu, flowchart of the proposed method]
acquiring a two-dimensional face image (see 3.1 Overview, as shown in Fig. 2 of Liu, “Here, we use a 3D-to-2D mapping matrix M to approximate the composite effect of expression and pose induced deformation and camera projection. Given an input 2D face image I, our goal is to simultaneously locate its”);
constructing a three-dimensional face model corresponding to the two-dimensional face image (Abstract, “We present an approach to simultaneously solve the two problems of face alignment and 3D face reconstruction from an input 2D face image of arbitrary poses and expressions. The proposed method iteratively and alternately applies two sets of cascaded regressors, one for updating 2D landmarks and the other for updating reconstructed pose-expression-normalized (PEN) 3D face shape.”); wherein the constructing of the three-dimensional face model comprises:
determining a projection mapping matrix from the three-dimensional average face model to the two-dimensional face image based on internal face feature points of the two-dimensional face image and the three-dimensional average face model (see 3.1 Overview”…Here, we use a 3D-to-2D mapping matrix M to approximate the composite effect of expression and pose induced deformation and camera projection. Given an input 2D face image I, our goal is to simultaneously locate its landmarks U and reconstruct its 3D face shape S. Note that, in some context, we also write the 3D face shape and the landmarks as column vectors: S = (x1, y1, z1, x2, y2, z2, · · · , xn, yn, zn)T, and U = (u1, v1, u2, v2, · · · , ul, vl)T, where ‘T’ is transpose operator. Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.”); wherein the internal face feature points include one or more of eyes, nose tip, mouth corner points, or eyebrows (Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.” Where Figure 2. Landmarks U including eyes, nose, eyebrows).
 constructing a first three-dimensional face model corresponding to the two- dimensional face image based on the projection mapping matrix and feature vectors of a three-dimensional feature face space (see 3.1 Overview, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.; see 3.3 Learning Landmark Regressors, last paragraph “We use 128-dim SIFT descriptors [24] as the local feature. The feature vector of h is a concatenation of the SIFT descriptors at all the l landmarks, i.e., a 128l-dim vector. If a landmark is invisible, no feature will be extracted, and its corresponding entries in h will be zero. It is worth mentioning that the regressors estimate the semantic positions of all landmarks including invisible landmarks.”; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility , “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform. 
As a result, the mapping matrix Mk is represented by a 2 × 4 matrix, and can be estimated as a least squares solution to the following fitting problem) and determining a face pose of the two-dimensional face image based on face feature points of the first three-dimensional face model and face feature points of the two- dimensional face image (see 3.1 Overview, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.; see 3.3 Learning Landmark Regressors, last paragraph “We use 128-dim SIFT descriptors [24] as the local feature. The feature vector of h is a concatenation of the SIFT descriptors at all the l landmarks, i.e., a 128l-dim vector. If a landmark is invisible, no feature will be extracted, and its corresponding entries in h will be zero. It is worth mentioning that the regressors estimate the semantic positions of all landmarks including invisible landmarks.”; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility , “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform. 
As a result, the mapping matrix Mk is represented by a 2 × 4 matrix, and can be estimated as a least squares solution to the following fitting problem”, where the 3D shape is projected onto the 2D image with a 3D-to-2D mapping matrix that is a composite effect of expression and pose). Liu does not specify whether projecting the 3D shape onto the 2D image determines a pose of the 2D image.
However, RODRIGUEZ teaches the 3D image is projected in 2D so as to generate at least one dataset representing a pose-rectified 2D projected image (¶0089 “In step E of FIG. 1, the textured 3D image is projected in 2D so as to generate at least one dataset representing a pose-rectified 2D projected image, in the visible and/or in the NIR range. Various projections could be considered. At first, deformations caused by the camera and/or by perspective are preferably compensated. Then, in one embodiment, the projection generates a frontal facial image, i.e. a 2D image as seen from a viewer in front of the user 100. It is also possible to generate a non-frontal facial image, or a plurality of 2D projections, such as for example one frontal facial projection and one another profile projection. Other projections could be considered, including cartographic projections, or projections that introduce deformations in order to magnify discriminative parts of the face, in particular the eyes, the upper half of the face, and reduce the size of more deformable parts of the face, such as the mouth. It is also possible to morph the head to a generic model before comparison, in order to facilitate comparison. Purely mathematical projections, such as projections onto not easily representable space, could also be considered” where a pose-rectified 2D projected image is considered a face pose of the two-dimensional face image) 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the joint face alignment and 3D face reconstruction method of Liu with projecting the 3D image data in 2D so as to generate data representing a pose-rectified 2D projected image, as seen in RODRIGUEZ, because this modification would make it possible to generate a non-frontal facial image, or a plurality of 2D projections, such as, for example, one frontal facial projection and one profile projection (¶0089 of RODRIGUEZ). Both Liu and RODRIGUEZ are understood to be silent on the remaining limitations of claim 1.
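For illustration only, the least-squares estimation of a 2 × 4 mapping matrix quoted above from Sect. 3.5 of Liu can be sketched as follows. This is a minimal sketch and not Liu's implementation: the function names (solve_linear, estimate_mapping_matrix, project), the homogeneous [x, y, z, 1] parameterization, and the use of normal equations solved by Gaussian elimination are assumptions made for the example.

```python
def solve_linear(a, b):
    """Solve a @ x = b for a square matrix a and a matrix of right-hand sides b.

    Both arguments are lists of rows. Uses Gauss-Jordan elimination with
    partial pivoting (an illustrative choice, not taken from Liu).
    """
    n = len(a)
    aug = [list(map(float, a[i])) + list(map(float, b[i])) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: landmark configuration is degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def estimate_mapping_matrix(points_3d, points_2d):
    """Return the 2x4 matrix M minimizing sum_i ||u_i - M [x_i, y_i, z_i, 1]^T||^2."""
    rows = [[x, y, z, 1.0] for x, y, z in points_3d]
    # Normal equations: (A^T A) M^T = A^T U, with A the n x 4 homogeneous points.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atu = [[sum(r[i] * u[k] for r, u in zip(rows, points_2d)) for k in range(2)]
           for i in range(4)]
    m_transposed = solve_linear(ata, atu)  # 4 x 2
    return [[m_transposed[i][k] for i in range(4)] for k in range(2)]  # 2 x 4


def project(mapping, point_3d):
    """Project one 3D point to 2D with a 2x4 mapping matrix."""
    homogeneous = (point_3d[0], point_3d[1], point_3d[2], 1.0)
    return tuple(sum(m * v for m, v in zip(row, homogeneous)) for row in mapping)
```

At least four non-coplanar 3D landmarks are needed for the system in this sketch to be solvable; with noisy landmarks the result is the best fit in the least-squares sense rather than an exact mapping.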
In the same field of endeavor, Cao teaches obtaining a plurality of three-dimensional face model samples ( see III. FaceWarehouse “In this section, we describe our pipeline for constructing FaceWarehouse and the techniques involved. We use Microsoft’s Kinect System to capture the geometry and texture information of various expressions of each subject. We register the frames from different views of the same expression to generate a smooth, low-noise depth map. The depth maps, together with the RGB images, are used to guide the deformation of a template mesh to generate the expression meshes. Once we obtain all the expression meshes of a single subject, we generate her individualspecific expression blendshapes. Fig. 1 shows the entire pipeline of processing one subject. Finally, the expression blendshapes from all subjects constitute our face database. As we have all the face models in a consistent topology, we can build a bilinear face model with two attributes, identity and expression.”; see Neutral expression. “ We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models.”; Face component transfer, “This application performs face component copy-and-paste to modify the expression in a facial image. It takes two images of the same person as input: one is the target photo that contains an undesirable expression and the other one is the reference image that contains the desired expression, such as smiling. 
As modifying expression causes global changes in one’s face, if we directly copy the local component from the reference image and paste/blend it to the target, the transferred component may not be compatible with the face contour or other components in the target image. A better approach proposed by Yang et al. [1] is to use 3D face models to guide the component transfer process. We follow this approach and use our bilinear face model to synthesize 3D face models matching the input images”); applying a dimensionality reduction algorithm to the plurality of three-dimensional face model samples to determine a three-dimensional average face model (B. Expression mesh and individual-specific blendshape generation, “…Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: V = F̄ + Σ_{i=1}^{l} a_i F_i, where F̄ is the average face, and F_i is the i-th PCA vector. Our goal is to compute the coefficients a_i to get the closest mesh in the PCA space. The first term corresponds to internal feature matching. C_j is the 3D position of the j-th feature point, while v_{i_j} is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation.”);
 determining a projection mapping matrix from the three-dimensional average face model to the two-dimensional face image based on internal face feature points of the two-dimensional face image and the three-dimensional average face model, wherein the internal face feature points include one or more of eyes, nose tip, mouth corner points, or eyebrows  (B. Expression mesh and individual-specific blendshape generation, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. 
Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: ..=1aiFi, where ¯F is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as …The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vi j is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mpro j is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh….”);
constructing a first three-dimensional face model corresponding to the two- dimensional face image based on the projection mapping matrix and feature vectors of a three-dimensional feature face space, wherein the constructing of the first three-dimensional face model comprises: performing contour feature point fitting on the first three-dimensional average face model based on face contour feature points of the two-dimensional face image(B. Expression mesh and individual-specific blendshape generation, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. 
Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: ..=1aiFi, where ¯F is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as …The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vi j is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mpro j is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh…..”)
Therefore, in combination with Liu and RODRIGUEZ, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the joint face alignment and 3D face reconstruction method of Liu so that the feature points are divided into two categories, namely the internal feature points (i.e., features on eyes, brows, nose and mouth) located inside the face region and the contour feature points, as seen in Cao, because this modification would make it easy to obtain the corresponding 3D positions from the depth map for the internal feature points and to classify all contour feature points in the image as 2D (B. Expression mesh and individual-specific blendshape generation, second paragraph of Cao). Liu, RODRIGUEZ and Cao are understood to be silent on the remaining limitations of claim 1.
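For illustration only, the average face and linear-combination model that Cao attributes to Blanz and Vetter's morphable model (an average face plus weighted leading PCA vectors) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, face shapes are plain lists of coordinates, and power iteration is substituted for a full PCA, so only the single leading PCA vector is recovered.

```python
import math


def mean_shape(samples):
    """Average face: the per-coordinate mean of all sample shape vectors."""
    count = len(samples)
    return [sum(s[d] for s in samples) / count for d in range(len(samples[0]))]


def leading_component(samples, iterations=200):
    """Return (mean, unit vector) for the leading PCA direction of the samples.

    Power iteration is an assumption of this sketch. It can fail if the start
    vector happens to be orthogonal to the leading direction.
    """
    mean = mean_shape(samples)
    centered = [[v - m for v, m in zip(s, mean)] for s in samples]
    dim = len(mean)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # Apply the (unnormalized) covariance implicitly: C v = sum_j (c_j . v) c_j
        coeffs = [sum(c * v for c, v in zip(row, vec)) for row in centered]
        nxt = [sum(k * row[d] for k, row in zip(coeffs, centered)) for d in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    return mean, vec


def synthesize(mean, components, coefficients):
    """Linear combination V = mean + sum_i a_i * F_i over the given PCA vectors."""
    shape = list(mean)
    for a, comp in zip(coefficients, components):
        shape = [s + a * c for s, c in zip(shape, comp)]
    return shape
```

With all coefficients set to zero, synthesize returns the average face itself, which corresponds to the starting point of the fitting described in the cited passages.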
In the same field of endeavor, Lee teaches wherein the constructing of the three-dimensional face model comprises:
applying a dimensionality reduction algorithm to the plurality of three-dimensional face model samples to determine a three-dimensional average face model (see Introduction, “Among the single view-based methods, shape-from-shading (SFS) is a traditional method for deriving a 3D facial shape from the brightness variations in a single image. However, SFS-based methods have impractical constraints because the Lambertian reflectance model and a known light source direction need to be assumed to produce accurate results [10-12]. Recently, several new techniques have been proposed to overcome this problem. These methods reconstruct a 3D face by modeling the relationships between the intensities and the depth information of the face using statistical learning techniques, such as principal component analysis (PCA), partial least squares, and canonical correlation analysis [13-16]”; see section  DMM and self-occlusion problem S3DMM, pages 2-3 of Lee, “In S3DMM, the geometry of a face is defined as a shape vector S = (X1, Y1, Z1, X2, ..., Yn, Zn)^T ∈ R^{3n}, which contains the X, Y, and Z-coordinates of n vertices. The original 3DMM generally uses a dense shape with thousands of vertices, whereas S3DMM uses a sparse shape with only dozens of vertices. In order to build a morphable shape model, S3DMM performs PCA on a training set of shape vectors Sj. The mean shape s0 and m shape variations si are then obtained, and a new shape S can be expressed as a linear combination of the mean shape s0 and the shape variations si as follows…”, where PCA (principal component analysis) is considered as a dimensionality reduction algorithm);
constructing a first three-dimensional face model corresponding to the two-dimensional face image based on the projection mapping matrix and feature vectors of a three-dimensional feature face space (see section  DMM and self-occlusion problem S3DMM, pages 2-3 of Lee, “In S3DMM, the geometry of a face is defined as a shape vector S = (X1, Y1, Z1, X2, ..., Yn, Zn)^T ∈ R^{3n}, which contains the X, Y, and Z-coordinates of n vertices. The original 3DMM generally uses a dense shape with thousands of vertices, whereas S3DMM uses a sparse shape with only dozens of vertices. In order to build a morphable shape model, S3DMM performs PCA on a training set of shape vectors Sj. The mean shape s0 and m shape variations si are then obtained, and a new shape S can be expressed as a linear combination of the mean shape s0 and the shape variations si as follows…. Given the 2D FFPs of an input face image, such as s2d = (x1, y1, x2, ..., yn)^T ∈ R^{2n}, the shape parameter β needs to be determined such that it minimizes the shape residual between the projected 3D facial shape generated by the shape parameter and the input 2D facial shape. The optimal shape and pose parameters (β, Rθ, T) are obtained from (2): …Where ~S is a 3 × n matrix that is reshaped from the 3n × 1 model shape vector S obtained using (1), ~s2d is a 2 × n matrix that is reshaped from the 2n × 1 input shape vector s2d, P is a 2 × 3 orthographic projection matrix, ~T is a 3 × n translation matrix consisting of n translation vectors T = (tx, ty, tz)^T, and Rθ is a 3 × 3 rotation matrix where the yaw angle is θ. Note that in this paper, we consider mainly yaw rotation because the self-occlusion caused by yaw rotation is relatively greater than that caused by pitch rotation, and tz is set to 0 because an orthographic projection is assumed…. As shown in Algorithm 1, the procedure for 3D model fitting is as follows. 
First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).”), wherein the constructing of the first three-dimensional face model comprises: 
performing contour feature point fitting on the first three-dimensional face model based on face contour feature points of the two-dimensional face image (see Self-occlusion problem, page 4 of Lee “….Figure 1 shows self-occlusion errors that occur after comparing the observed 2D FFPs with the ground-truth 2D FFPs of the facial contours. When detecting the facial contour FFPs in a half-profile view image, the visible FFPs laid on the visible facial region can be detected as those of the ground-truth facial contour, but the observed FFPs on the occluded facial region are located in the outline of the face because the occluded real facial contour cannot be observed, as shown in Figure 1a. Therefore, the 2D FFPs observed in a rotated face image have location errors, which are the differences between the observed FFPs and the occluded real FFPs, as shown in Figure 1c”; page 4, Proposed 3D face reconstruction method, Overall procedure of the proposed method. “Overall procedure of the proposed method The proposed 3D face reconstruction process starts with the localization of the FFPs in a given 2D face image. To detect self-occlusion in an input face, the head pose is estimated using a cylindrical head model-based method [23]. The estimated pose can then be used to determine which FFPs are self-occluded. Next, a sparse 3D facial shape is reconstructed using the model fitting process based on the selected visible FFPs. Subsequently, a dense 3D facial shape is interpolated from the reconstructed sparse 3D facial shape using the Thin Plate Spline (TPS) method [24,25]. Finally, the facial texture directly extracted from the input image is mapped onto the dense 3D facial shape. 
The overall procedure of the proposed method is shown in Figure 3.”); determining an error between the two-dimensional face image and the three-dimensional average face model;  in response to the error being less than a preset error, adopting the first three-dimensional average face model as the three-dimensional face model (see section  DMM and self-occlusion problem S3DMM, pages 2-3  of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).
Algorithm 1 3D Model Fitting [5]

[Images: media_image2.png and media_image3.png, Algorithm 1 (3D Model Fitting) of Lee]
” where, at step 5, if the shape residual error is less than the preset error ε, the shape parameter β at fixed (Rθ, T) is taken as the final shape parameter, which is considered as adopting the first three-dimensional average face model as the three-dimensional face model);
and in response to the error being greater than or equal to the preset error, determining a new projection mapping matrix from the fitted three- dimensional average face model to the two-dimensional face image and constructing the first three-dimensional face model based on the new projection mapping matrix (see section  DMM and self-occlusion problem S3DMM, pages 2-3  of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).”)
Algorithm 1 3D Model Fitting [5]

[Images: media_image2.png, media_image3.png]
” where, at step 5, if the shape residual error is not less than the preset error ε, meaning the error is greater than or equal to the preset error, the procedure returns to steps 3-4, where R and T of the matrix are updated, which is considered as the new projection mapping matrix, and the 3D facial shape is reconstructed with the shape parameter after R and T are updated, which is considered as determining a new projection mapping matrix from the fitted three-dimensional average face model to the two-dimensional face image and constructing the first three-dimensional face model based on the new projection mapping matrix)
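As an illustrative sketch only, the alternating update with a preset-error test described in the quoted Lee passage can be pictured as follows. The function name fit_sparse_shape, the parameters eps and max_iter, and the use of an affine 2 × 4 camera solved by ordinary least squares (in place of the QR-based rotation update that Lee mentions) are assumptions made for this sketch and do not appear in Lee.

```python
import numpy as np


def fit_sparse_shape(s2d, s0, basis, eps=1e-6, max_iter=50):
    """Alternate pose and shape updates until the residual is below eps.

    s2d:   (n, 2) input 2D feature points
    s0:    (n, 3) mean 3D shape
    basis: (m, n, 3) shape variations
    Returns (beta, P, residual), where P is an assumed 2x4 affine camera.
    """
    n, m = s0.shape[0], basis.shape[0]
    beta = np.zeros(m)          # shape parameter starts at 0
    P = None
    residual = np.inf
    for _ in range(max_iter):
        shape = s0 + np.tensordot(beta, basis, axes=1)        # (n, 3)

        # Pose step: solve for the camera with beta held fixed.
        X = np.hstack([shape, np.ones((n, 1))])               # (n, 4)
        P_t, *_ = np.linalg.lstsq(X, s2d, rcond=None)
        P = P_t.T                                             # (2, 4)
        A, t = P[:, :3], P[:, 3]

        # Shape step: solve for beta with the camera held fixed.
        D = np.einsum("ij,mnj->nim", A, basis).reshape(2 * n, m)
        target = (s2d - s0 @ A.T - t).reshape(2 * n)
        beta, *_ = np.linalg.lstsq(D, target, rcond=None)

        # Residual test: stop once the error is below the preset error.
        shape = s0 + np.tensordot(beta, basis, axes=1)
        residual = float(np.linalg.norm(s2d - (shape @ A.T + t)))
        if residual < eps:
            break
    return beta, P, residual
```

When the residual is below eps the current fit is kept; otherwise the loop computes a new camera matrix and rebuilds the shape, which mirrors the two branches discussed above.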
Therefore, in combination with Liu, RODRIGUEZ and Cao, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the joint face alignment and 3D face reconstruction method of Liu to use the proposed 3D model fitting of Lee, because this modification would find the optimal shape parameter and pose parameter until the shape residual converges (see section DMM and self-occlusion problem S3DMM, page 3, right column, first paragraph of Lee).
Thus, the combination of Liu, RODRIGUEZ, Cao and Lee teaches a face pose estimation method, comprising: acquiring a two-dimensional face image; constructing a three-dimensional face model corresponding to the two-dimensional face image, wherein the constructing of the three-dimensional face model comprises: obtaining a plurality of three-dimensional face model samples; applying a dimensionality reduction algorithm to the plurality of three-dimensional face model samples to determine a three-dimensional average face model; determining a projection mapping matrix from the three-dimensional average face model to the two-dimensional face image based on internal face feature points of the two-dimensional face image and the three-dimensional average face model, wherein the internal face feature points include one or more of eyes, nose tip, mouth corner points, or eyebrows; constructing a first three-dimensional face model corresponding to the two-dimensional face image based on the projection mapping matrix and feature vectors of a three-dimensional feature face space, wherein the constructing of the first three-dimensional face model comprises: performing contour feature point fitting on the first three-dimensional average face model based on face contour feature points of the two-dimensional face image; determining an error between the two-dimensional face image and the three-dimensional average face model; in response to the error being less than a preset error, adopting the fitted three-dimensional average face model as the first three-dimensional face model; and in response to the error being greater than or equal to the preset error, determining a new projection mapping matrix from the fitted three-dimensional average face model to the two-dimensional face image and constructing the first three-dimensional face model based on the new projection mapping matrix; and determining a face pose of the two-dimensional face image based on face feature points of the first three-dimensional face model and face feature points of the two-dimensional face image.
Regarding claim 2, Liu, RODRIGUEZ, Cao and Lee teach the method of claim 1, wherein the constructing the three-dimensional face model corresponding to the two-dimensional face image comprises: constructing the three-dimensional face model corresponding to the two-dimensional face image by using a face shape fitting algorithm (3.1 Overview of Liu, “We denote an n-vertex 3D face shape of neutral expression and frontal pose as,.. and a subset of S with columns corresponding to l landmarks as SL. The projections of these landmarks on the 2D face image I are represented by …Here, we use a 3D-to-2D mapping matrix M to approximate the composite effect of expression and pose induced deformation and camera projection. Given an input 2D face image I, our goal is to simultaneously locate its landmarks U and reconstruct its 3D face shape S. Note that, in some context, we also write the 3D face shape and the landmarks as column vectors: S = (x1, y1, z1, x2, y2, z2, · · · , xn, yn, zn)T, and U = (u1, v1, u2, v2, · · · , ul, vl)T, where ‘T’ is transpose operator. Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks…”; abstract, Figure 2-Figure 4 of Cao; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility, “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform.”; Reconstruction accuracy across expressions, “Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs.NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions.”) In addition, the same motivation is used as the rejection for claim 1.
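For illustration, the estimation of a 3D-to-2D mapping matrix from the current 3D landmarks and 2D landmarks, as described in the quoted section 3.5 of Liu, might be sketched as below. The sketch assumes a 2 × 4 matrix applied to homogeneous 3D landmark coordinates and an ordinary least-squares fit; the function names estimate_mapping_matrix and project are hypothetical and are not taken from Liu.

```python
import numpy as np


def estimate_mapping_matrix(landmarks_3d, landmarks_2d):
    """Least-squares estimate of M (2x4) such that U ~= M [S_L; 1].

    landmarks_3d: (l, 3) 3D landmark vertices
    landmarks_2d: (l, 2) 2D landmark positions
    """
    ones = np.ones((landmarks_3d.shape[0], 1))
    homogeneous = np.hstack([landmarks_3d, ones])             # (l, 4)
    M_t, *_ = np.linalg.lstsq(homogeneous, landmarks_2d, rcond=None)
    return M_t.T                                              # (2, 4)


def project(M, points_3d):
    """Project 3D points to 2D with the mapping matrix M."""
    ones = np.ones((points_3d.shape[0], 1))
    return np.hstack([points_3d, ones]) @ M.T                 # (l, 2)
```

With at least four landmarks that are not coplanar, the least-squares system has a unique solution, so a known matrix is recovered from noise-free projections.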
Regarding claim 5, Liu,  RODRIGUEZ, Cao and Lee teach the method of claim 1, wherein the constructing the three-dimensional face model corresponding to the two-dimensional face image comprises: constructing the three-dimensional face model corresponding to the two-dimensional face image by using a facial expression fitting algorithm (3.1 Overview of Liu, “We denote an n-vertex 3D face shape of neutral expression and frontal pose as,.. and a subset of S with columns corresponding to l landmarks as SL. The projections of these landmarks on the 2D face image I are represented by …Here, we use a 3D-to-2D mapping matrix M to approximate the composite effect of expression and pose induced deformation and camera projection. Given an input 2D face image I, our goal is to simultaneously locate its landmarks U and reconstruct its 3D face shape S. Note that, in some context, we also write the 3D face shape and the landmarks as column vectors: S = (x1, y1, z1, x2, y2, z2, · · · , xn, yn, zn)T, and U = (u1, v1, u2, v2, · · · , ul, vl)T, where ‘T’ is transpose operator. Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks…”; B. Expression mesh and individual-specific blendshape generation and C. Blinear face model of Cao; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility, “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. 
As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform.”; Reconstruction accuracy across expressions. Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs.NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions.) In addition, the same motivation is used as the rejection for claim 1.
Regarding claim 6, Liu,  RODRIGUEZ, Cao and Lee teach the method of claim 1, wherein the constructing the three-dimensional face model corresponding to the two-dimensional face image comprises: constructing the three-dimensional face model corresponding to the two-dimensional face image by using a face shape and expression fitting algorithm ((3.1 Overview of Liu, “We denote an n-vertex 3D face shape of neutral expression and frontal pose as,.. and a subset of S with columns corresponding to l landmarks as SL. The projections of these landmarks on the 2D face image I are represented by …Here, we use a 3D-to-2D mapping matrix M to approximate the composite effect of expression and pose induced deformation and camera projection. Given an input 2D face image I, our goal is to simultaneously locate its landmarks U and reconstruct its 3D face shape S. Note that, in some context, we also write the 3D face shape and the landmarks as column vectors: S = (x1, y1, z1, x2, y2, z2, · · · , xn, yn, zn)T, and U = (u1, v1, u2, v2, · · · , ul, vl)T, where ‘T’ is transpose operator. Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks…”; B. Expression mesh and individual-specific blendshape generation and C. Blinear face model of Cao; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility, “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. 
As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform.”; Reconstruction accuracy across expressions. Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs.NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions.) In addition, the same motivation is used as the rejection for claim 1.
Regarding claim 7, Liu,  RODRIGUEZ, Cao and Lee teach the method of claim 6, wherein the face shape and expression fitting algorithm comprises: 
performing, based on the internal feature points of the two-dimensional face image and at least one facial expression base corresponding to the three-dimensional average face model, expression fitting on the first three-dimensional face model fitted with the face contour feature points (see Figure .2 where showing landmarks contour on 2D image; see 3.1 Overview of Liu, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.;, 4.1 Protocols “…Experiment setup. During training and testing, each image is associated with a bounding box, which specifies the face region in the image. To initialize the landmarks in it, the mean of the landmarks in all neutral frontal training images is fitted to the face region via a similarity transform. In this paper, we set the number of iterations K = 5 (discussion of convergence issue is provided in supplemental material). SIFT descriptors are computed on 32 × 32 local patches around the landmarks, and the implementation by [35] is used in our experiments.” where the mean of the landmarks is fitted to the face region which is bounding box is considered as contour feature point ; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility, “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 
3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform.”; Reconstruction accuracy across expressions. Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs.NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions.”; B. Expression mesh and individual-specific blendshape generation of Cao, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. 
We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: ..=1aiFi, where ¯F is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as …The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vi j is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mpro j is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh”; see 3D Face model fitting, pages 7-8  of Lee “A detailed description of the proposed model fitting method is shown in Algorithm 2. This is a modified version of the earlier S3DMM-based algorithm mentioned in Section “S3DMM”. 
The proposed model fitting scheme is based on the selected visible FFPs, which eliminates the self-occlusion effect. As a result, the cost function of (2) is modified as follows: arg minβ;Rθ;T Mθ .. where the symbol “°” represents the Hadamard product, which is known as entry-wise multiplication [33], while Mθ is the masking matrix at rotation angle θ. Mθ is obtained from the index table of the visible FFP for the estimated pose, as explained in Sections “Head pose estimation” and “Determination of visible FFPs”. We can calculate the shape residual between the visible FFPs of the shape model and the input 2D facial shape using this masking matrix. The shape parameter β and pose parameter (Rθ,T) can be obtained without any self-occlusion effect by minimizing this shape residual. The proposed 3D model fitting algorithm has the following two advantages compared with the previous method:
1) The pose angle ^θ estimated by the cylindrical model is used for the pose parameter initialization. Therefore, the parameter estimation starts from a relatively exact initial pose parameter, which enhances the 3D face reconstruction performance. During the alignment step, an accurate alignment result is obtained by aligning the input 2D FFPs with the FFPs of the 2D mean shape, which are obtained by rotating the 3D mean shape (s0) from 0° to ^θ and projecting it onto the x–y plane.
2) 3D model fitting is achieved on the basis of the visible FFPs by using the masking matrix. Therefore, the proposed method can reconstruct 3D faces that are less affected by self-occlusion.”)); and 
in response to the error between the two-dimensional face image and the first three-dimensional face model fitted with the facial expression (see Reconstruction accuracy across expressions of Liu “Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs. NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions.”; Fitting 3D face mesh to image of Cao “To calculate a 3D face mesh using our bilinear model that can match the face image well, we first localize a set of facial feature points in the image in the same way as we suggest in Section III-B. Then we estimate the rigid transformation of the face model as well as the identity and expression weights in the bilinear face model to minimize the matching error between the feature points on the image and the face mesh. Following previous work [5], we assume that the camera projection is weakly perspective. Every mesh vertex vk is projected to the image space as pk = sRvk+t (13) where the rigid transformation consists of a scaling factor s, a 3D rotation matrix R and a translation vector t. The mesh vertex position vk can be computed from the bilinear face model according to Eq. (10). The matching error of the feature points on the image and the mesh is defined as …where s(k) is the feature point positions on the image. This energy can be easily minimized using the coordinate descent method as described in [5].”) and
in response to the error between the two-dimensional face image and the first three-dimensional face model fitted with the facial expression being less than the preset error, adopting the first three-dimensional face model fitted with the expression as the three-dimensional face model (see Reconstruction accuracy across expressions of Liu “Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs. NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions.”; see section III. FaceWarehouse of Cao; see section DMM and self-occlusion problem S3DMM, pages 2-3 of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).”; see 3D Face model fitting, pages 7-8 “…….We can calculate the shape residual between the visible FFPs of the shape model and the input 2D facial shape using this masking matrix. 
The shape parameter β and pose parameter (Rθ,T) can be obtained without any self-occlusion effect by minimizing this shape residual. The proposed 3D model fitting algorithm has the following two advantages compared with the previous method:
1) The pose angle ^θ estimated by the cylindrical model is used for the pose parameter initialization. Therefore, the parameter estimation starts from a relatively exact initial pose parameter, which enhances the 3D face reconstruction performance. During the alignment step, an accurate alignment result is obtained by aligning the input 2D FFPs with the FFPs of the 2D mean shape, which are obtained by rotating the 3D mean shape (s0) from 0° to ^θ and projecting it onto the x–y plane.
2) 3D model fitting is achieved on the basis of the visible FFPs by using the masking matrix. Therefore, the proposed method can reconstruct 3D faces that are less affected by self-occlusion.

[Image: media_image4.png]

where, at step 5, if the shape residual error is less than the preset error ε, the shape parameter β at the fixed (Rθ,T) is the final shape parameter, which is considered as adopting the first three-dimensional face model fitted with the expression as the three-dimensional face model). In addition, the same motivation is used as the rejection for claim 1.
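As a hedged illustration of the weak-perspective projection quoted from Cao in the discussion of claim 7 (pk = sRvk + t), the projection and a matching error might be sketched as follows. The quoted passage elides Cao's energy expression, so the sum of squared 2D distances used here is an assumption, as are the function names and the choice to keep only the first two rows of R.

```python
import numpy as np


def weak_perspective_project(vertices, scale, R, t):
    """Apply p_k = s R v_k + t and keep the image-plane (x, y) rows.

    vertices: (k, 3) mesh vertices v_k
    scale:    scalar s
    R:        (3, 3) rotation matrix
    t:        (2,) translation in the image plane
    """
    return scale * vertices @ R[:2].T + t


def matching_error(feature_points_2d, vertices, scale, R, t):
    """Assumed energy: sum of squared distances between projected
    vertices and the 2D feature points located on the image."""
    diff = weak_perspective_project(vertices, scale, R, t) - feature_points_2d
    return float(np.sum(diff ** 2))
```

An error computed this way can then be compared against a preset error to decide whether the fitted model is adopted.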
Regarding claim 8, Liu,  RODRIGUEZ, Cao and Lee teach the method of claim 7, further comprising: in response to the error being greater than or equal to the preset error, constructing the three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model fitted with the facial expression (see section III. FaceWarehouse of Cao; see section  DMM and self-occlusion problem S3DMM, pages 2-3  of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).; see 3D Face model fitting, pages 7-8 “…….We can calculate the shape residual between the visible FFPs of the shape model and the input 2D facial shape using this masking matrix. The shape parameter β and pose parameter (Rθ,T) can be obtained without any selfocclusion effect by minimizing this shape residual. The proposed 3D model fitting algorithm has the following two advantages compared with the previous method:
1) The pose angle ^θ estimated by the cylindrical model is used for the pose parameter initialization. Therefore, the parameter estimation starts from a relatively exact initial pose parameter, which enhances the 3D face reconstruction performance. During the alignment step, an accurate alignment result is obtained by aligning the input 2D FFPs with the FFPs of the 2D mean shape, which are obtained by rotating the 3D mean shape (s0) from 0° to ^θ and projecting it onto the x–y plane.
2) 3D model fitting is achieved on the basis of the visible FFPs by using the masking matrix. Therefore, the proposed method can reconstruct 3D faces that are less affected by self-occlusion.

[Image: media_image4.png]

where, at step 5, if the shape residual error is not less than the preset error ε, meaning the error is greater than or equal to the preset error, the procedure returns to steps 3-4, where the 3D facial shape is reconstructed with the shape parameter, which is considered as constructing the three-dimensional face model corresponding to the two-dimensional face image based on the first three-dimensional face model fitted with the facial expression). In addition, the same motivation is used as the rejection for claim 1.
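By way of illustration only, the masked shape residual that Lee describes (a masking matrix Mθ applied by Hadamard product so that only visible FFPs contribute) might be sketched as below. Lee's modified cost function is elided in the quoted passage, so the form used here, a norm of the entry-wise masked difference, is an assumption; the name masked_shape_residual and the per-point 0/1 visibility flags are hypothetical.

```python
import numpy as np


def masked_shape_residual(s2d, projected, visible):
    """Residual over visible feature points only.

    s2d:       (n, 2) input 2D feature points
    projected: (n, 2) projected model feature points
    visible:   (n,) flags, 1 for visible and 0 for self-occluded
    """
    # Repeat each flag for the x and y coordinate of its point.
    mask = np.repeat(np.asarray(visible, dtype=float)[:, None], 2, axis=1)
    # Entry-wise (Hadamard) product zeroes out occluded points.
    return float(np.linalg.norm(mask * (s2d - projected)))
```

Points flagged as occluded contribute nothing to the residual, so a large discrepancy at a hidden point does not affect the comparison against the preset error.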
Regarding claim 11, Liu, RODRIGUEZ, Cao and Lee teach the method of claim 7, further comprising: determining at least one facial expression base corresponding to the three-dimensional average face model based on the plurality of three-dimensional face model samples (B. Expression mesh and individual-specific blendshape generation, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. 
Any face can be approximated as a linear combination of the average face and l leading PCA vectors: ..=1aiFi, where ¯F is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as …The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vi j is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mproj is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh….”; see section DMM and self-occlusion problem S3DMM, pages 2-3 of Lee “In S3DMM, the geometry of a face is defined as a shape vector S = (X1, Y1, Z1, X2, . . . , Yn, Zn)T ∈ R^(3n), which contains the X, Y, and Z-coordinates of n vertices. The original 3DMM generally uses a dense shape with thousands of vertices, whereas S3DMM uses a sparse shape with only dozens of vertices. In order to build a morphable shape model, S3DMM performs PCA on a training set of shape vectors Sj. The mean shape s0 and m shape variations si are then obtained, and a new shape S can be expressed as a linear combination of the mean shape s0 and the shape variations si as follows…”) In addition, the same motivation is used as the rejection for claim 1. 
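As a minimal sketch of the PCA-based shape model described in the quoted Cao and Lee passages (a mean shape plus a linear combination of leading PCA vectors), the construction might look as follows. The function names build_shape_model and synthesize_shape are hypothetical, and computing the principal directions with a singular value decomposition of the centered samples is one common choice assumed here, not a detail taken from either reference.

```python
import numpy as np


def build_shape_model(samples, num_components):
    """PCA on shape vectors.

    samples: (k, 3n) array, one flattened 3D shape per row
    Returns the mean shape (3n,) and the leading components (m, 3n).
    """
    mean_shape = samples.mean(axis=0)
    centered = samples - mean_shape
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_shape, vt[:num_components]


def synthesize_shape(mean_shape, components, coeffs):
    """New shape = mean shape + sum_i coeffs[i] * components[i]."""
    return mean_shape + coeffs @ components
```

Setting all coefficients to zero returns the mean shape, which corresponds to the average face model used as the starting point of the fitting.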
Regarding independent claim 12, Liu teaches a three-dimensional face reconstruction method (Abstract. “We present an approach to simultaneously solve the two problems of face alignment and 3D face reconstruction from an input 2D face image of arbitrary poses and expressions.”) , comprising:
acquiring a two-dimensional face image for processing (3.1 Overview as show in Fig.2 of Liu “Here, we use a 3D-to-2D mapping matrix M to approximate the composite effect of expression and pose induced deformation and camera projection. Given an input 2D face image I, our goal is to simultaneously locate its”);
determining a projection mapping matrix from the three-dimensional average face model to the two-dimensional face image based on internal feature points of the two-dimensional face image and the three- dimensional average face model(see 3.1 Overview”…Here, we use a 3D-to-2D mapping matrix M to approximate the composite effect of expression and pose induced deformation and camera projection. Given an input 2D face image I, our goal is to simultaneously locate its landmarks U and reconstruct its 3D face shape S. Note that, in some context, we also write the 3D face shape and the landmarks as column vectors: S = (x1, y1, z1, x2, y2, z2, · · · , xn, yn, zn)T, and U = (u1, v1, u2, v2, · · · , ul, vl)T, where ‘T’ is transpose operator. Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.”); wherein the internal face feature points include one or more of eyes, nose tip, mouth corner points, or eyebrows (Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.” Where Figure 2. Landmarks U including eyes, nose, eyebrows).
 constructing a first three-dimensional face model corresponding to the two- dimensional face image based on the projection mapping matrix and feature vectors of a three-dimensional feature face space (see 3.1 Overview, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.; see 3.3 Learning Landmark Regressors, last paragraph “We use 128-dim SIFT descriptors [24] as the local feature. The feature vector of h is a concatenation of the SIFT descriptors at all the l landmarks, i.e., a 128l-dim vector. If a landmark is invisible, no feature will be extracted, and its corresponding entries in h will be zero. It is worth mentioning that the regressors estimate the semantic positions of all landmarks including invisible landmarks.”; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility , “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform. 
As a result, the mapping matrix Mk is represented by a 2 × 4 matrix, and can be estimated as a least squares solution to the following fitting problem) and determining a face pose of the two-dimensional face image based on face feature points of the first three-dimensional face model and face feature points of the two- dimensional face image (see 3.1 Overview, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.; see 3.3 Learning Landmark Regressors, last paragraph “We use 128-dim SIFT descriptors [24] as the local feature. The feature vector of h is a concatenation of the SIFT descriptors at all the l landmarks, i.e., a 128l-dim vector. If a landmark is invisible, no feature will be extracted, and its corresponding entries in h will be zero. It is worth mentioning that the regressors estimate the semantic positions of all landmarks including invisible landmarks.”; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility , “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform. 
As a result, the mapping matrix Mk is represented by a 2 × 4 matrix, and can be estimated as a least squares solution to the following fitting problem”, where the 3D shape is projected onto the 2D image with a 3D-to-2D mapping matrix that captures the composite effect of expression and pose).  Liu does not specify whether projecting the 3D shape onto the 2D image determines a pose of the 2D image.
However, RODRIGUEZ teaches the 3D image is projected in 2D so as to generate at least one dataset representing a pose-rectified 2D projected image (¶0089 “In step E of FIG. 1, the textured 3D image is projected in 2D so as to generate at least one dataset representing a pose-rectified 2D projected image, in the visible and/or in the NIR range. Various projections could be considered. At first, deformations caused by the camera and/or by perspective are preferably compensated. Then, in one embodiment, the projection generates a frontal facial image, i.e. a 2D image as seen from a viewer in front of the user 100. It is also possible to generate a non-frontal facial image, or a plurality of 2D projections, such as for example one frontal facial projection and one another profile projection. Other projections could be considered, including cartographic projections, or projections that introduce deformations in order to magnify discriminative parts of the face, in particular the eyes, the upper half of the face, and reduce the size of more deformable parts of the face, such as the mouth. It is also possible to morph the head to a generic model before comparison, in order to facilitate comparison. Purely mathematical projections, such as projections onto not easily representable space, could also be considered” where a pose-rectified 2D projected image is considered a face pose of the two-dimensional face image) In addition, the same motivation is used as the rejection for claim 1. Both Liu and RODRIGUEZ are understood to be silent on the remaining limitations of claim 1.
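For illustration only, the least-squares estimate of a 2 × 4 mapping matrix quoted from Section 3.5 of Liu can be sketched as follows. This is a minimal sketch, not Liu's implementation: the function names `estimate_mapping_matrix` and `project`, the (n, 3) and (n, 2) array layouts, and the use of homogeneous coordinates are assumptions made for this example.

```python
import numpy as np


def estimate_mapping_matrix(shape_3d, landmarks_2d):
    """Least-squares fit of a 2x4 matrix M so that landmarks_2d ~ [shape_3d, 1] @ M.T.

    shape_3d:     (n, 3) array of 3D landmark positions (assumed layout).
    landmarks_2d: (n, 2) array of the matching 2D landmarks (assumed layout).
    """
    n = shape_3d.shape[0]
    homogeneous = np.hstack([shape_3d, np.ones((n, 1))])  # (n, 4)
    # Solve homogeneous @ M.T ~ landmarks_2d in the least-squares sense.
    mapping_t, *_ = np.linalg.lstsq(homogeneous, landmarks_2d, rcond=None)
    return mapping_t.T  # (2, 4)


def project(mapping, shape_3d):
    """Project (n, 3) points to (n, 2) image points with a 2x4 mapping matrix."""
    n = shape_3d.shape[0]
    homogeneous = np.hstack([shape_3d, np.ones((n, 1))])
    return homogeneous @ mapping.T
```

With noise-free correspondences and at least four landmarks in general position, the fitted matrix reproduces the matrix that generated the 2D landmarks.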
In the same field of endeavor, Cao teaches applying a dimensionality reduction algorithm to a plurality of three-dimensional face model samples to determine a three-dimensional average face model (B. Expression mesh and individual-specific blendshape generation, “…Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: V = F̄ + Σ_{i=1}^{l} ai Fi, where F̄ is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while v_{i_j} is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation.”);
 determining a projection mapping matrix from the three-dimensional average face model to the two-dimensional face image based on internal face feature points of the two-dimensional face image and the three-dimensional average face model, wherein the internal face feature points include one or more of eyes, nose tip, mouth corner points, or eyebrows  (B. Expression mesh and individual-specific blendshape generation, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. 
Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: ..=1aiFi, where ¯F is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as …The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vi j is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mpro j is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh….”);
constructing a first three-dimensional face model corresponding to the two- dimensional face image based on the projection mapping matrix and feature vectors of a three-dimensional feature face space, wherein the constructing of the first three-dimensional face model comprises: performing contour feature point fitting on the first three-dimensional average face model based on face contour feature points of the two-dimensional face image(B. Expression mesh and individual-specific blendshape generation, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. 
Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: ..=1aiFi, where ¯F is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as …The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vi j is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mpro j is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh…..”) In addition, the same motivation is used as the rejection for claim 1. Liu,  RODRIGUEZ and Cao are understood to be silent on the remaining limitations of claim 1. 
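For illustration only, the PCA step quoted from Cao (an average face plus leading PCA vectors, with any face approximated as a linear combination of them) can be sketched as follows. This is a minimal sketch under stated assumptions, not Cao's or Blanz and Vetter's implementation: the function names, the flattened (m, 3n) sample layout, and the use of a singular value decomposition to obtain the PCA vectors are choices made for this example.

```python
import numpy as np


def build_pca_face_space(samples, num_components):
    """Return the average face and the leading PCA vectors of a set of face samples.

    samples: (m, 3n) array, one flattened 3D face (x1, y1, z1, ..., xn, yn, zn)
             per row (assumed layout).
    """
    mean_face = samples.mean(axis=0)
    centered = samples - mean_face
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:num_components]


def reconstruct(mean_face, components, coeffs):
    """Linear combination: average face plus weighted PCA vectors."""
    return mean_face + coeffs @ components
```

The coefficients of a given face are obtained by projecting its offset from the average face onto the PCA vectors, for example `coeffs = (face - mean_face) @ components.T`.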
In the same field of endeavor, Lee teaches wherein the constructing of the three-dimensional face model comprises: applying a dimensionality reduction algorithm to a plurality of three-dimensional face model samples to determine a three-dimensional average face model (see Introduction, “Among the single view-based methods, shape-from shading (SFS) is a traditional method for deriving a 3D facial shape from the brightness variations in a single image. However, SFS-based methods have impractical constraints because the Lambertian reflectance model and a known light source direction need to be assumed to produce accurate results [10-12]. Recently, several new techniques have been proposed to overcome this problem. These methods reconstruct a 3D face by modeling the relationships between the intensities and the depth information of the face using statistical learning techniques, such as principal component analysis (PCA), partial least squares, and canonical correlation analysis [13-16]”; see section  DMM and self-occlusion problem S3DMM, pages 2-3 of Lee “In S3DMM, the geometry of a face is defined as a shape vector S = (X1, Y1, Z1, X2, ..., Yn, Zn)^T ∈ R^(3n), which contains the X, Y, and Z-coordinates of n vertices. The original 3DMM generally uses a dense shape with thousands of vertices, whereas S3DMM uses a sparse shape with only dozens of vertices. In order to build a morphable shape model, S3DMM performs PCA on a training set of shape vectors Sj. The mean shape s0 and m shape variations si are then obtained, and a new shape S can be expressed as a linear combination of the mean shape s0 and the shape variations si as follows…” where PCA (principal component analysis) is considered as a dimensionality reduction algorithm);
constructing a first three-dimensional face model corresponding to the two-dimensional face image based on the projection mapping matrix and feature vectors of a three-dimensional feature face space (see section  DMM and self-occlusion problem S3DMM, pages 2-3 of Lee “In S3DMM, the geometry of a face is defined as a shape vector S = (X1, Y1, Z1, X2, ..., Yn, Zn)^T ∈ R^(3n), which contains the X, Y, and Z-coordinates of n vertices. The original 3DMM generally uses a dense shape with thousands of vertices, whereas S3DMM uses a sparse shape with only dozens of vertices. In order to build a morphable shape model, S3DMM performs PCA on a training set of shape vectors Sj. The mean shape s0 and m shape variations si are then obtained, and a new shape S can be expressed as a linear combination of the mean shape s0 and the shape variations si as follows…. Given the 2D FFPs of an input face image, such as s2d = (x1, y1, x2, ..., yn)^T ∈ R^(2n), the shape parameter β needs to be determined such that it minimizes the shape residual between the projected 3D facial shape generated by the shape parameter and the input 2D facial shape. The optimal shape and pose parameters (β, Rθ, T) are obtained from (2): …Where S̃ is a 3 × n matrix that is reshaped from the 3n × 1 model shape vector S obtained using (1), s̃2d is a 2 × n matrix that is reshaped from the 2n × 1 input shape vector s2d, P is a 2 × 3 orthographic projection matrix, T̃ is a 3 × n translation matrix consisting of n translation vectors T = (tx, ty, tz)^T, and Rθ is a 3 × 3 rotation matrix where the yaw angle is θ. Note that in this paper, we consider mainly yaw rotation because the self-occlusion caused by yaw rotation is relatively greater than that caused by pitch rotation, and tz is set to 0 because an orthographic projection is assumed…. shown in Algorithm 1, the procedure for 3D model fitting is as follows. 
First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).”), wherein the constructing of the first three-dimensional face model comprises: 
performing contour feature point fitting on the first three-dimensional face model based on face contour feature points of the two-dimensional face image (see Self-occlusion problem, page 4 of Lee “….Figure 1 shows self-occlusion errors that occur after comparing the observed 2D FFPs with the ground-truth 2D FFPs of the facial contours. When detecting the facial contour FFPs in a half-profile view image, the visible FFPs laid on the visible facial region can be detected as those of the ground-truth facial contour, but the observed FFPs on the occluded facial region are located in the outline of the face because the occluded real facial contour cannot be observed, as shown in Figure 1a. Therefore, the 2D FFPs observed in a rotated face image have location errors, which are the differences between the observed FFPs and the occluded real FFPs, as shown in Figure 1c”; page 4, Proposed 3D face reconstruction method, Overall procedure of the proposed method. “Overall procedure of the proposed method The proposed 3D face reconstruction process starts with the localization of the FFPs in a given 2D face image. To detect self-occlusion in an input face, the head pose is estimated using a cylindrical head model-based method [23]. The estimated pose can then be used to determine which FFPs are self-occluded. Next, a sparse 3D facial shape is reconstructed using the model fitting process based on the selected visible FFPs. Subsequently, a dense 3D facial shape is interpolated from the reconstructed sparse 3D facial shape using the Thin Plate Spline (TPS) method [24,25]. Finally, the facial texture directly extracted from the input image is mapped onto the dense 3D facial shape. 
The overall procedure of the proposed method is shown in Figure 3.”); determining an error between the two-dimensional face image and the three-dimensional average face model;  in response to the error being less than a preset error, adopting the first three-dimensional average face model as the three-dimensional face model (see section  DMM and self-occlusion problem S3DMM, pages 2-3  of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).
Algorithm 1 3D Model Fitting [5]

[Image: media_image2.png, greyscale]
[Image: media_image3.png, greyscale]
” where, when the shape residual error at step 5 is less than the preset error (ε), the shape parameter β at the fixed (Rθ,T) is taken as the final shape parameter, which is considered as adopting the first three-dimensional average face model as the three-dimensional face model);
and in response to the error being greater than or equal to the preset error, determining a new projection mapping matrix from the fitted three- dimensional average face model to the two-dimensional face image and constructing the first three-dimensional face model based on the new projection mapping matrix (see section  DMM and self-occlusion problem S3DMM, pages 2-3  of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).”)
Algorithm 1 3D Model Fitting [5]

[Image: media_image2.png, greyscale]
[Image: media_image3.png, greyscale]
(where, when the shape residual error at step 5 is not less than the preset error (ε), i.e., the error is greater than or equal to the preset error, the procedure returns to steps 3-4, which update R and T of the matrix, considered as the new projection mapping matrix, and reconstruct the 3D facial shape with the shape parameter after R and T are updated, which is considered as determining a new projection mapping matrix from the fitted three-dimensional average face model to the two-dimensional face image and constructing the first three-dimensional face model based on the new projection mapping matrix). In addition, the same motivation is used as in the rejection of claim 1.
Thus, the combination of Liu,  RODRIGUEZ, Cao and Lee teaches  a three-dimensional face reconstruction method, comprising: acquiring a two-dimensional face image for processing; applying a dimensionality reduction algorithm to a plurality of three-dimensional face model samples to determine a three-dimensional average face model; determining a projection mapping matrix from the three-dimensional average face model to the two-dimensional face image based on internal feature points of the two-dimensional face image and the three- dimensional average face model, wherein the internal face feature points include one or more of eyes, nose tip, mouth corner points, or eyebrows; constructing a first three-dimensional face model corresponding to the two-dimensional face image based on the projection mapping matrix and feature vectors of a three- dimensional feature face space, wherein the constructing of the first three- dimensional face model comprises: performing face contour feature points fitting on the three-dimensional average face model based on the two-dimensional face image; determining an error between the two-dimensional face image and the fitted three-dimensional average face model; in response to the error being less than a preset error, adopting the fitted three-dimensional average face model as the first three- dimensional face model; and in response to the error being greater than or equal to the preset error, determining a new projection mapping matrix from the fitted three- dimensional average face model to the two-dimensional face image and constructing the first three-dimensional face model based on the new projection mapping matrix; and determining a face pose of the two-dimensional face image based on face feature points of the first three-dimensional face model and face feature points of the two-dimensional face image.
Regarding claim 13, Liu,  RODRIGUEZ, Cao and Lee  teach the method of claim 12, wherein after the performing the contour feature point fitting on the first three-dimensional face model based on the face contour feature points of the two- dimensional face image, the method further comprises:
 performing expression fitting on the first three-dimensional face model fitted with the face contour feature points based on the internal face feature points of the two- dimensional face image and at least one facial expression base corresponding to the three-dimensional average face model (see Figure .2 where showing landmarks contour on 2D image; see 3.1 Overview, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.;, 4.1 Protocols “…Experiment setup. During training and testing, each image is associated with a bounding box, which specifies the face region in the image. To initialize the landmarks in it, the mean of the landmarks in all neutral frontal training images is fitted to the face region via a similarity transform. In this paper, we set the number of iterations K = 5 (discussion of convergence issue is provided in supplemental material). SIFT descriptors are computed on 32 × 32 local patches around the landmarks, and the implementation by [35] is used in our experiments.” where the mean of the landmarks is fitted to the face region which is bounding box is considered as contour feature point ; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility, “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 
3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform.”; Reconstruction accuracy across expressions. Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs.NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions “.see 3.2 Training Data Preparation of Liu “training data should contain 2D face images of varying expressions and poses. As for the 3D shape S∗i corresponding to the Ii in the training data, it can either have the same expression and pose as Ii, or just have neutral expression and frontal pose no matter what expression and pose Ii has. In the former, the learned regressors will output 3D face shapes that have the same expression and pose as the input images; while in the latter, the learned regressors will generate neutral and frontal 3D shapes for any input images. In either case, the dense registration among all 3D shapes S∗ i is needed for regressor learning. In this paper, we follow the latter for two reasons: (i) dense registration of 3D face shapes with different expressions is difficult, and (ii) the reconstructed PEN 3D shapes are preferred for being used in 3D face recognition. It is, however, difficult to find in the public domain such data sets of 3D face shapes and corresponding annotated 2D images with various expressions/ poses. Thus, we construct two sets of training data by ourselves: one based on BU3DFE [36], and the other based on LFW [16]. 
BU3DFE database contains 3D face scans of 56 males and 44 females, acquired in neutral plus six basic expressions (happiness, disgust, fear, angry, surprise and sadness). All basic expressions are acquired at four levels of intensity. These 3D face scans have been manually annotated with 84 landmarks (83 landmarks provided by the database and one nose tip marked by ourselves). For each of the 100 subjects, we select one scan of neutral expression as the ground truth 3D shape.”; B. Expression mesh and individual-specific blendshape generation and C. Blinear face model of Cao; see 3D Face model fitting, pages 7-8  of Lee “A detailed description of the proposed model fitting method is shown in Algorithm 2. This is a modified version of the earlier S3DMM-based algorithm mentioned in Section “S3DMM”. The proposed model fitting scheme is based on the selected visible FFPs, which eliminates the self-occlusion effect. As a result, the cost function of (2) is modified as follows: arg minβ;Rθ;T Mθ .. where the symbol “°” represents the Hadamard product, which is known as entry-wise multiplication [33], while Mθ is the masking matrix at rotation angle θ. Mθ isobtained from the index table of the visible FFP for the estimated pose, as explained in Sections “Head pose estimation” and “Determination of visible FFPs”. We can calculate the shape residual between the visible FFPs of the shape model and the input 2D facial shape using this masking matrix. The shape parameter β and pose parameter (Rθ,T) can be obtained without any selfocclusion effect by minimizing this shape residual. The proposed 3D model fitting algorithm has the following two advantages compared with the previous method:
1) The pose angle ^θ estimated by the cylindrical model is used for the pose parameter initialization. Therefore, the parameter estimation starts from a relatively exact initial pose parameter, which enhances the 3D face reconstruction performance. During the alignment step, an accurate alignment result is obtained by aligning the input 2D FFPs with the FFPs of the 2D mean shape, which are obtained by rotating the 3D mean shape (s0) from 0° to ^θ and projecting it onto the x–y plane.
2) 3D model fitting is achieved on the basis of the visible FFPs by using the masking matrix. Therefore, the proposed method can reconstruct 3D faces that are less affected by self-occlusion.”). In addition, the same motivation is used as in the rejection of claim 1.
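For illustration only, the masked shape residual described in the quoted passage of Lee (an entry-wise product with a visibility mask so that self-occluded feature points do not contribute) can be sketched as follows. This is a minimal sketch, not Lee's Algorithm 2: the function name `masked_shape_residual`, the (n, 2) array layout, and the 0/1 visibility vector are assumptions made for this example, and the exact cost function of Lee is elided in the quotation and is not reproduced here.

```python
import numpy as np


def masked_shape_residual(projected_2d, observed_2d, visible):
    """Residual norm over visible feature points only.

    projected_2d: (n, 2) projected model feature points (assumed layout).
    observed_2d:  (n, 2) feature points detected in the input image.
    visible:      length-n sequence of 1 (visible) or 0 (self-occluded).
    """
    mask = np.asarray(visible, dtype=float)[:, None]  # broadcast over x and y
    # Entry-wise (Hadamard) product zeroes the rows of occluded points.
    return np.linalg.norm(mask * (projected_2d - observed_2d))
```

A point flagged as occluded is ignored regardless of how far its observed location lies from the projected model point.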
Regarding claim 14, Liu,  RODRIGUEZ, Cao and Lee  teach the method of claim 12, wherein after the constructing the first three-dimensional face model corresponding to the two-dimensional face image based on the projection mapping matrix and the feature vectors of the three-dimensional feature face space, the method further comprises: performing expression fitting on the first three-dimensional face model based on at least one facial expression base corresponding to the three-dimensional average face model (see Figure .2 where showing landmarks contour on 2D image; see 3.1 Overview, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.;, 4.1 Protocols “…Experiment setup. During training and testing, each image is associated with a bounding box, which specifies the face region in the image. To initialize the landmarks in it, the mean of the landmarks in all neutral frontal training images is fitted to the face region via a similarity transform. In this paper, we set the number of iterations K = 5 (discussion of convergence issue is provided in supplemental material). 
SIFT descriptors are computed on 32 × 32 local patches around the landmarks, and the implementation by [35] is used in our experiments.” where the mean of the landmarks is fitted to the face region which is bounding box is considered as contour feature point ; 3.5 Estimating 3D-to-2D Mapping and Landmark Visibility, “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform.”; Reconstruction accuracy across expressions. Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs.NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions .see 3.2 Training Data Preparation of Liu “training data should contain 2D face images of varying expressions and poses. As for the 3D shape S∗i corresponding to the Ii in the training data, it can either have the same expression and pose as Ii, or just have neutral expression and frontal pose no matter what expression and pose Ii has. In the former, the learned regressors will output 3D face shapes that have the same expression and pose as the input images; while in the latter, the learned regressors will generate neutral and frontal 3D shapes for any input images. 
In either case, the dense registration among all 3D shapes S∗i is needed for regressor learning. In this paper, we follow the latter for two reasons: (i) dense registration of 3D face shapes with different expressions is difficult, and (ii) the reconstructed PEN 3D shapes are preferred for being used in 3D face recognition. It is, however, difficult to find in the public domain such data sets of 3D face shapes and corresponding annotated 2D images with various expressions/poses. Thus, we construct two sets of training data by ourselves: one based on BU3DFE [36], and the other based on LFW [16]. BU3DFE database contains 3D face scans of 56 males and 44 females, acquired in neutral plus six basic expressions (happiness, disgust, fear, angry, surprise and sadness). All basic expressions are acquired at four levels of intensity. These 3D face scans have been manually annotated with 84 landmarks (83 landmarks provided by the database and one nose tip marked by ourselves). For each of the 100 subjects, we select one scan of neutral expression as the ground truth 3D shape.”; B. Expression mesh and individual-specific blendshape generation and C. Bilinear face model of Cao; see 3D Face model fitting, pages 7-8 of Lee, “A detailed description of the proposed model fitting method is shown in Algorithm 2. This is a modified version of the earlier S3DMM-based algorithm mentioned in Section “S3DMM”. The proposed model fitting scheme is based on the selected visible FFPs, which eliminates the self-occlusion effect. As a result, the cost function of (2) is modified as follows: arg minβ;Rθ;T Mθ .. where the symbol “°” represents the Hadamard product, which is known as entry-wise multiplication [33], while Mθ is the masking matrix at rotation angle θ. Mθ is obtained from the index table of the visible FFP for the estimated pose, as explained in Sections “Head pose estimation” and “Determination of visible FFPs”. 
We can calculate the shape residual between the visible FFPs of the shape model and the input 2D facial shape using this masking matrix. The shape parameter β and pose parameter (Rθ,T) can be obtained without any self-occlusion effect by minimizing this shape residual. The proposed 3D model fitting algorithm has the following two advantages compared with the previous method:
1) The pose angle ^θ estimated by the cylindrical model is used for the pose parameter initialization. Therefore, the parameter estimation starts from a relatively exact initial pose parameter, which enhances the 3D face reconstruction performance. During the alignment step, an accurate alignment result is obtained by aligning the input 2D FFPs with the FFPs of the 2D mean shape, which are obtained by rotating the 3D mean shape (s0) from 0° to ^θ and projecting it onto the x–y plane.
2) 3D model fitting is achieved on the basis of the visible FFPs by using the masking matrix. Therefore, the proposed method can reconstruct 3D faces that are less affected by self-occlusion.”); 
wherein the performing the contour feature point fitting on the first three-dimensional face model based on the face contour feature points of the two-dimensional face image comprises: performing the contour feature point fitting on the first three-dimensional face model fitted with the facial expression based on the face contour feature points of the two-dimensional face image (see Figure .2 where showing landmarks contour on 2D image; see 3.1 Overview, “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.;, 4.1 Protocols “…Experiment setup. During training and testing, each image is associated with a bounding box, which specifies the face region in the image. To initialize the landmarks in it, the mean of the landmarks in all neutral frontal training images is fitted to the face region via a similarity transform. In this paper, we set the number of iterations K = 5 (discussion of convergence issue is provided in supplemental material). SIFT descriptors are computed on 32 × 32 local patches around the landmarks, and the implementation by [35] is used in our experiments.” where the mean of the landmarks is fitted to the face region which is bounding box is considered as contour feature point ;  3.5 Estimating 3D-to-2D Mapping and Landmark Visibility, “In order to refine the landmarks with the updated 3D face shape, we have to project the 3D shape to the 2D image with a 3D-to-2D mapping matrix. 
In this paper, we dynamically estimate the mapping matrix based on Sk and ˆUk. As discussed earlier in Sect. 3.1, the mapping matrix is a composite effect of expression and pose induced deformation and camera projection. Here, we assume a weak perspective projection for the camera projection as in prior work [18,38], and further assume that the expression and pose induced deformation can be approximated by a linear transform.”; Reconstruction accuracy across expressions. Figure 4(b) shows the average MAE of our proposed method across expressions. Although the error increases as expressions become intensive, the maximum increment (i.e., SU vs.NE) is below 7%. This proves the robustness of the proposed method in normalizing expressions while maintaining model individualities. Figure 6 shows the reconstruction and face alignment results of a subject under seven expressions .see 3.2 Training Data Preparation of Liu “training data should contain 2D face images of varying expressions and poses. As for the 3D shape S∗i corresponding to the Ii in the training data, it can either have the same expression and pose as Ii, or just have neutral expression and frontal pose no matter what expression and pose Ii has. In the former, the learned regressors will output 3D face shapes that have the same expression and pose as the input images; while in the latter, the learned regressors will generate neutral and frontal 3D shapes for any input images. In either case, the dense registration among all 3D shapes S∗ i is needed for regressor learning. In this paper, we follow the latter for two reasons: (i) dense registration of 3D face shapes with different expressions is difficult, and (ii) the reconstructed PEN 3D shapes are preferred for being used in 3D face recognition. It is, however, difficult to find in the public domain such data sets of 3D face shapes and corresponding annotated 2D images with various expressions/ poses. 
Thus, we construct two sets of training data by ourselves: one based on BU3DFE [36], and the other based on LFW [16]. BU3DFE database contains 3D face scans of 56 males and 44 females, acquired in neutral plus six basic expressions (happiness, disgust, fear, angry, surprise and sadness). All basic expressions are acquired at four levels of intensity. These 3D face scans have been manually annotated with 84 landmarks (83 landmarks provided by the database and one nose tip marked by ourselves). For each of the 100 subjects, we select one scan of neutral expression as the ground truth 3D shape.”; B. Expression mesh and individual-specific blendshape generation and C. Bilinear face model of Cao; see 3D Face model fitting, pages 7-8 of Lee, “A detailed description of the proposed model fitting method is shown in Algorithm 2. This is a modified version of the earlier S3DMM-based algorithm mentioned in Section “S3DMM”. The proposed model fitting scheme is based on the selected visible FFPs, which eliminates the self-occlusion effect. As a result, the cost function of (2) is modified as follows: arg minβ;Rθ;T Mθ .. where the symbol “°” represents the Hadamard product, which is known as entry-wise multiplication [33], while Mθ is the masking matrix at rotation angle θ. Mθ is obtained from the index table of the visible FFP for the estimated pose, as explained in Sections “Head pose estimation” and “Determination of visible FFPs”. We can calculate the shape residual between the visible FFPs of the shape model and the input 2D facial shape using this masking matrix. The shape parameter β and pose parameter (Rθ,T) can be obtained without any self-occlusion effect by minimizing this shape residual. The proposed 3D model fitting algorithm has the following two advantages compared with the previous method:
1) The pose angle ^θ estimated by the cylindrical model is used for the pose parameter initialization. Therefore, the parameter estimation starts from a relatively exact initial pose parameter, which enhances the 3D face reconstruction performance. During the alignment step, an accurate alignment result is obtained by aligning the input 2D FFPs with the FFPs of the 2D mean shape, which are obtained by rotating the 3D mean shape (s0) from 0° to ^θ and projecting it onto the x–y plane.
2) 3D model fitting is achieved on the basis of the visible FFPs by using the masking matrix. Therefore, the proposed method can reconstruct 3D faces that are less affected by self-occlusion.”). In addition, the same motivation is used as in the rejection of claim 1.
Regarding claim 18, Liu, RODRIGUEZ, Cao and Lee  teach the method of claim 13, further comprising: determining at least one facial expression base corresponding to the three-dimensional average face model based on the plurality of three-dimensional face model samples (B. Expression mesh and individual-specific blendshape generation, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. 
Any face can be approximated as a linear combination of the average face and l leading PCA vectors: V = F̄ + Σ(i=1..l) ai Fi, where F̄ is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as … The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vij is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mproj is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh….”; see section  DMM and self-occlusion problem S3DMM, pages 2-3 of Lee, “In S3DMM, the geometry of a face is defined as a shape vector S = (X1, Y1, Z1, X2, ..., Yn, Zn)^T ∈ R^3n, which contains the X, Y, and Z-coordinates of n vertices. The original 3DMM generally uses a dense shape with thousands of vertices, whereas S3DMM uses a sparse shape with only dozens of vertices. In order to build a morphable shape model, S3DMM performs PCA on a training set of shape vectors Sj. The mean shape s0 and m shape variations si are then obtained, and a new shape S can be expressed as a linear combination of the mean shape s0 and the shape variations si as follows…”). In addition, the same motivation is used as in the rejection of claim 1.
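As a minimal illustrative sketch of the linear shape model quoted above from Cao and Lee (a shape expressed as an average face plus a weighted sum of PCA vectors), assuming shapes are stored as flat coordinate lists. The function name `reconstruct_shape` and the sample values are hypothetical and do not appear in either reference:

```python
def reconstruct_shape(mean_shape, components, coeffs):
    """Return mean_shape + sum_i coeffs[i] * components[i].

    mean_shape: flat list of coordinates (the average face).
    components: list of PCA vectors, each the same length as mean_shape.
    coeffs:     one weight per PCA vector.
    """
    if len(components) != len(coeffs):
        raise ValueError("need exactly one coefficient per PCA vector")
    shape = list(mean_shape)  # copy so the mean shape is not modified
    for a_i, component in zip(coeffs, components):
        if len(component) != len(mean_shape):
            raise ValueError("PCA vector length must match the mean shape")
        for k, value in enumerate(component):
            shape[k] += a_i * value
    return shape


# Hypothetical toy data: one 3D vertex and two PCA vectors.
mean = [0.0, 1.0, 2.0]
pca_vectors = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
new_shape = reconstruct_shape(mean, pca_vectors, [0.5, -1.0])
```

With all coefficients set to zero the function returns the average face itself, which corresponds to the mean-shape initialization described in the quoted passages.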
Regarding independent claim 19, Liu teaches an electronic device, comprising: a processor; and a memory, configured to store a program for implementing a face pose estimation method (see 4.5 Computational Efficiency, “According to our experiments on a PC with i7-4710 CPU and 8 GB memory, the Matlab implementation of the proposed method runs at ∼26 FPS (K = 5 and n = 9,677)”; see 4 Experiments, “4.1 Protocols. We conduct three sets of experiments to evaluate the proposed method in 3D shape reconstruction, face alignment, and benefits to face recognition. Datasets. The training data are constructed from two public face databases: BU3DFE and LFW, as detailed in Sect. 3.2. Respectively, two different models are trained using each of the two training sets. Our test sets include BU3DFE and AFW (Annotated Faces in-the-Wild) [40]. To evaluate the 3D shape reconstruction accuracy, a 10-fold cross validation is applied to split the BU3DFE data into training and testing subsets, resulting in 11,970 training samples and 1,330 testing samples. To evaluate the face alignment accuracy, the AFW database [40] is tested using the LFW-trained model. AFW is a widely used benchmark in the face alignment literature. It contains 205 images of 468 faces with different poses within ±90°. In [30], 337 of these faces have been manually annotated with face bounding boxes and 68 landmarks. We use them in our experiments.”), wherein the device, after being powered on and running the program of the face pose estimation method through the processor (see 4.5 Computational Efficiency, “According to our experiments on a PC with i7-4710 CPU and 8 GB memory, the Matlab implementation of the proposed method runs at ∼26 FPS (K = 5 and n = 9,677)”, where a PC with a CPU and memory that runs the implemented method is considered to be powered on), performs the following steps: The remaining limitations of claim 19 are similar in scope to claim 1 and are therefore rejected under the same rationale.
Regarding claim 20, Liu, RODRIGUEZ, Cao and Lee teach the electronic device of claim 19, wherein the constructing the three-dimensional face model corresponding to the two-dimensional face image comprises: The remaining limitations of claim 20 are similar in scope to claim 2 and are therefore rejected under the same rationale.
2.	Claims 9 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Liu, Feng, et al. "Joint face alignment and 3d face reconstruction." European Conference on Computer Vision. Springer, Cham, 2016. (“Liu”) in view of RODRIGUEZ et al., U.S. Patent Application Publication No. 20160086017 (“RODRIGUEZ”), further in view of Cao, Chen, et al. "Facewarehouse: A 3d facial expression database for visual computing." IEEE Transactions on Visualization and Computer Graphics 20.3 (2013): 413-425. (“Cao”), further in view of Lee, Youn Joo, et al. "Single view-based 3D face reconstruction robust to self-occlusion." EURASIP Journal on Advances in Signal Processing 2012.1 (2012): 1-20 (“Lee”), further in view of Kamencay, Patrik, et al. "A novel approach to face recognition using image segmentation based on spca-knn method." Radioengineering 22.1 (2013): 92-99. (“Kamencay”).
Regarding claim 9, Liu, RODRIGUEZ, Cao and Lee teach the method of claim 1, wherein the performing the contour feature point fitting on the first three-dimensional face model based on the face contour feature points of the two-dimensional face image comprises:
 selecting, from the first three-dimensional face model, three-dimensional points corresponding to the face contour feature points of the two-dimensional face image as initial three-dimensional contour feature points (B. Expression mesh and individual-specific blendshape generation of Cao, “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. 
Any face can be approximated as a linear combination of the average face and l leading PCA vectors: V = F̄ + Σ(i=1..l) ai Fi, where F̄ is the average face, and Fi is the i-th PCA vector. Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as … The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vij is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mproj is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh….”; see section  DMM and self-occlusion problem S3DMM, pages 2-3 of Lee, “In S3DMM, the geometry of a face is defined as a shape vector S = (X1, Y1, Z1, X2, ..., Yn, Zn)^T ∈ R^3n, which contains the X, Y, and Z-coordinates of n vertices. The original 3DMM generally uses a dense shape with thousands of vertices, whereas S3DMM uses a sparse shape with only dozens of vertices. In order to build a morphable shape model, S3DMM performs PCA on a training set of shape vectors Sj. The mean shape s0 and m shape variations si are then obtained, and a new shape S can be expressed as a linear combination of the mean shape s0 and the shape variations si as follows…. 
Given the 2D FFPs of an input face image, such as s2d = (x1, y1, x2, ..., yn)^T ∈ R^2n, the shape parameter β needs to be determined such that it minimizes the shape residual between the projected 3D facial shape generated by the shape parameter and the input 2D facial shape. The optimal shape and pose parameters (β, Rθ, T) are obtained from (2): … Where S̃ is a 3 × n matrix that is reshaped from the 3n × 1 model shape vector S obtained using (1), s̃2d is a 2 × n matrix that is reshaped from the 2n × 1 input shape vector s2d, P is a 2 × 3 orthographic projection matrix, T̃ is a 3 × n translation matrix consisting of n translation vectors T = (tx, ty, tz)^T, and Rθ is a 3 × 3 rotation matrix where the yaw angle is θ. Note that in this paper, we consider mainly yaw rotation because the self-occlusion caused by yaw rotation is relatively greater than that caused by pitch rotation, and tz is set to 0 because an orthographic projection is assumed…. shown in Algorithm 1, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).”); 
mapping the initial three-dimensional contour feature points to the two-dimensional face image by using the projection mapping matrix (B. Expression mesh and individual-specific blendshape generation of Cao “From smooth depth maps and corresponding color images, we generate the associated expression meshes. For each expression data, we first use Active Shape Model (ASM) [18] to locate 74 feature points on the color image, including the face contour, eye corners, brow boundary, mouth boundary, nose contour and tip. The automatically detected locations may not be accurate in all cases, especially for those expressions with relatively large deformation (e.g., mouth open and smile). We thus require a small amount of user interaction to refine the positions of some feature points–the user interaction is as simple as drag-and-dropping the feature points on the image. The 74 feature points are divided into two categories: the mi internal feature points (i.e., features on eyes, brows, nose and mouth, c.f. the green points in Fig. 3) located inside the face region, and the mc contour feature points (the yellow points in Fig. 3). Given the correspondence between the color image and the depth map, we can easily get the corresponding 3D positions from the depth map for internal feature points. We classify all contour feature points in the image as 2D. Neutral expression. We first generate the face mesh for the neutral expression by using a two-step approach. Blanz and Vetter’s morphable model is automatically fitted to produce an initial matching mesh. Then a mesh deformation algorithm is employed to refine this mesh for better matching between the depth map and the feature points. Blanz and Vetter’s morphable model performs Principal Component Analysis (PCA) on 200 neutral face models. Any face can be approximated as a linear combination of the average face and l leading PCA vectors: ..=1aiFi, where ¯F is the average face, and Fi is the i-th PCA vector. 
Our goal is to compute the coefficients ai to get the closest mesh in the PCA space. The energy to be minimized for feature point matching is defined as …The first term corresponds to internal feature matching. Cj is the 3D position of the j-th feature point, while vi j is its corresponding vertex on the mesh V. The indices for these internal feature points on the mesh are simply marked on the average face in our implementation. The second term is for contour feature matching. sk is a 2D feature point on the color image, vck is its corresponding 3D feature vertex on the mesh V, and Mpro j is the projection matrix of the camera. We use the method described in [1] to determine the indices of the contour feature points on the mesh V: We first project the face region of V to the image to get the 2D face mesh. Then we find its convex hull to get the points along the contour of the mesh. Among these points, we find the nearest one for each contour feature on the image, and assign it as the corresponding feature point on the mesh….”see section  DMM and self-occlusion problem S3DMM, pages 2-3  of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).); and 
selecting, mapped three-dimensional points corresponding to two-dimensional contour feature points as face contour feature points of the first three-dimensional face model (see Figure .2 where showing landmarks contour on 2D image; see 3.1 Overview of Liu “…..Figure 2 shows the flowchart of the proposed method. For the input 2D face image I, its 3D face shape S is initialized as the mean 3D shape of training faces. Its landmarks U are initialized by fitting the mean landmarks of training frontal faces into the face region specified by a bounding box in I via similarity transforms. U and S are iteratively updated by applying a series of regressors. Each iteration contains three main steps: (i) updating landmarks, (ii) updating 3D face shape, and (iii) refining landmarks.;, 4.1 Protocols “…Experiment setup. During training and testing, each image is associated with a bounding box, which specifies the face region in the image. To initialize the landmarks in it, the mean of the landmarks in all neutral frontal training images is fitted to the face region via a similarity transform. In this paper, we set the number of iterations K = 5 (discussion of convergence issue is provided in supplemental material). SIFT descriptors are computed on 32 × 32 local patches around the landmarks, and the implementation by [35] is used in our experiments.” where the mean of the landmarks is fitted to the face region which is bounding box is considered as contour feature point;  B. Expression mesh and individual-specific blendshape generation, see section  DMM and self-occlusion problem S3DMM, pages 2-3  of Lee “….As shown in Algorithm 1 3D, the procedure for 3D model fitting is as follows. First, the shape parameter β0 and translation parameter T0 are initialized to 0 and the input 2D FFPs s2d are aligned with the 2D mean shape obtained by projecting the 3D mean shape (s0) with a frontal pose onto the x–y plane. 
As the alignment method, we use the Procrustes analysis, which includes translation, rotation, and scaling [22]. The optimal model parameters are determined by alternately updating the pose parameter (Rθ, T) at the fixed β and updating the shape parameter β at fixed (Rθ,T) until the shape residual error converges. The cost function is solved as a least squares problem and the rotation matrix is calculated by QR decomposition, as in [5]. Finally, a new 3D facial shape S3d is reconstructed by applying the optimal shape parameter β to (1).”). In addition, the same motivation is used as in the rejection of claim 1. Liu, RODRIGUEZ, Cao and Lee are understood to be silent on the remaining limitations of claim 9.
In the same field of endeavor, Kamencay teaches using a nearest neighbor matching algorithm (see 3.2 K-Nearest Neighbor (KNN) and 3.3 SPCA-KNN).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the combination of Liu, RODRIGUEZ, Cao and Lee (the joint face alignment and 3D face reconstruction method of Liu; the division of feature points in Cao into internal feature points (i.e., features on the eyes, brows, nose and mouth) located inside the face region and contour feature points; and the proposed 3D model fitting of Lee) to use K-Nearest Neighbor (KNN) or SPCA-KNN as taught by Kamencay, because this modification would identify the closest object from the trained features (see section 5. Conclusion of Kamencay).
Thus, the combination of Liu, RODRIGUEZ, Cao, Lee and Kamencay teaches wherein the performing the contour feature point fitting on the first three-dimensional face model based on the face contour feature points of the two-dimensional face image comprises: selecting, from the first three-dimensional face model, three-dimensional points corresponding to the face contour feature points of the two-dimensional face image as initial three-dimensional contour feature points; mapping the initial three-dimensional contour feature points to the two-dimensional face image by using the projection mapping matrix; and selecting, by using a nearest neighbor matching algorithm, mapped three-dimensional points corresponding to two-dimensional contour feature points as face contour feature points of the first three-dimensional face model.
Regarding claim 15, Liu, RODRIGUEZ, Cao and Lee teach the method of claim 12. The remaining limitations of claim 15 are similar in scope to claim 9 and are therefore rejected under the same rationale.
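As a minimal illustrative sketch of the three steps recited above (candidate three-dimensional contour points, mapping them to the image with a projection matrix, and nearest neighbor selection), assuming a 2 × 3 projection matrix and plain Euclidean distance. The function names `project_points` and `match_contour_points` and the sample coordinates are hypothetical and are not taken from Liu, RODRIGUEZ, Cao, Lee, Kamencay, or the claims:

```python
import math


def project_points(points_3d, projection):
    """Map each (x, y, z) point to 2D with a 2x3 projection matrix."""
    return [
        (
            projection[0][0] * x + projection[0][1] * y + projection[0][2] * z,
            projection[1][0] * x + projection[1][1] * y + projection[1][2] * z,
        )
        for x, y, z in points_3d
    ]


def match_contour_points(points_3d, projection, contour_2d):
    """For each 2D contour feature point, return the index of the
    candidate 3D point whose projection lies nearest to it."""
    projected = project_points(points_3d, projection)
    matches = []
    for point_2d in contour_2d:
        nearest = min(
            range(len(projected)),
            key=lambda i: math.dist(point_2d, projected[i]),
        )
        matches.append(nearest)
    return matches


# Hypothetical toy data: three candidate 3D contour points and an
# orthographic projection that simply drops the z-coordinate.
candidates = [(0.0, 0.0, 5.0), (1.0, 0.0, 2.0), (0.0, 1.0, -3.0)]
orthographic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
detected_contour = [(0.9, 0.1), (0.1, 0.8)]
indices = match_contour_points(candidates, orthographic, detected_contour)
```

The brute-force search compares every contour point against every projected candidate; it is used here only for brevity and is not the SPCA-KNN classifier of Kamencay.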

Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARAH LE whose telephone number is (571)270-7842. The examiner can normally be reached Monday: 8AM-4:30PM EST, Tuesday: 8 AM-3:30PM EST, Wednesday: 8AM-2:30PM EST, Thursday and Friday off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang can be reached on (571) 272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SARAH LE/Primary Examiner, Art Unit 2619