DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments filed 1/5/2022 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 21-24, 26-28, and 31-35 are rejected under 35 U.S.C. 103 as being unpatentable over CN 107545302 A (“D1”) in view of Nistico et al. (US 2014/0055747 A1) and Pauli (US 9,253,442 B1).
Consider claim 21, D1 teaches a method for detecting one or more gaze-related parameters of a user, the method comprising: creating a left image of at least a portion of a left eye of the user using a first camera of a head-wearable device worn by the user ([0008] – [0012] (translation p.3, ¶2 – ¶6): (1) Shooting the user's face image, locating the left eye or right eye area, pre-processing the human eye image, realizing the correction of the head position, and obtaining the human eye image with a fixed pixel size.  The step (2) establishes a two-channel model, respectively inputting image information of the left eye and the right eye in the human eye image, and the specific process of separately extracting and outputting the information features of the left eye and the right eye through the two-channel model.  [0042] (translation p. 6, ¶3): At the same time, the line-of-sight direction estimation method has the scalability, and the information features or joint information features of the left and right eyes of the human eye can be used for regression analysis alone, and the predicted line of sight direction of the two eyes is obtained, and the true line of sight direction of the input image and the output predicted line of sight direction are adopted. The average angular deviation between the two is used as an error term to adaptively adjust the prediction model. Similarly, the line-of-sight direction estimation method has additiveness. After obtaining the information features of both eyes and the joint information features of both eyes, some other related information features can be directly added, and regression analysis is performed with all the features as a whole. [0044] – [0046] (translation p. 6, 
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes); creating a right image of at least a portion of a right eye of the user using a second camera of the head-wearable device ([0008] – [0012] (translation p.3, ¶2 – ¶6): (1) Shooting the user's face image, locating the left eye or right eye area, 

(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes); feeding the left and right images together as an input into a convolutional neural network ([0013] (translation p. 3, ¶7) : The corrected fixed-size left-eye and right-eye images Il and Ir are input into the two-channel model, and Il and Ir are respectively processed through one channel.  [0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. The model inputs the images of both eyes, and the images are outputted by the convolutional neural network (CNN)-based network to output the feature information of the left eye and the right eye respectively. The feature information is called the binocular feature, and the binocular features can also predict the line of sight of the eyes. ;

(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes); and obtaining the one or more gaze-related parameters from the convolutional neural network as a result of the left and right images input ([0015] (translation: p.3, ¶9): The fixed length feature vector generated by each channel is the information feature corresponding to the input image extracted by the deep neural network, and the information features generated by the two channels are connected to obtain the final left eye and right eye information features).
However, D1 does not explicitly teach creating the left image and right image using two cameras.
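Nistico teaches creating the left image and right image using two cameras ([0069]: The eye cameras 3l and 3r can either be suited for visible or near infrared light. They are located symmetrically with respect to a vertical center line that divides the user's face into two halves. The eye cameras 3l and 3r may be positioned in front and below the eyes 10l and 10r respectively, for example in or at the lower rim of a pair of eye glass lenses 8l and 8r, pointing at the eyes 10l and 10r in an angle of 30 degree to 50 degree and being mounted in the frame 4 in an angle a of 30 degree to 50 degree).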
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of creating the left image and right image using two cameras because replacing a single camera with two cameras involves only routine skill in the art. Nerwin v. Erlichman, 168 USPQ 177, 179.
However, the combination of D1 and Nistico does not explicitly teach concatenating the left image and the right image to form a concatenated image.
Pauli teaches concatenating the left image and the right image to form a concatenated image (col. 6, lines 12-24; col. 8, lines 21-40).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of concatenating the left image and the right image to form a concatenated image because such incorporation would generate a composite picture that represents an undistorted high-definition picture of the remote participant. Col. 8, lines 21-40.
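For illustration only, the concatenating step discussed above can be pictured as placing two equally sized eye images side by side in one array. This is a minimal sketch under stated assumptions, not the method of Pauli, D1, or the application: images are modeled as nested lists of grayscale values, and `concat_side_by_side` is a hypothetical helper name.

```python
def concat_side_by_side(left, right):
    """Join two H x W images row by row into one H x 2W image.

    Hypothetical helper for illustration; images are lists of rows.
    """
    if len(left) != len(right):
        raise ValueError("images must have the same number of rows")
    if any(len(l_row) != len(r_row) for l_row, r_row in zip(left, right)):
        raise ValueError("rows must have the same width")
    # Each output row is the left row followed by the right row.
    return [l_row + r_row for l_row, r_row in zip(left, right)]


left_img = [[1, 2], [3, 4]]
right_img = [[5, 6], [7, 8]]
combined = concat_side_by_side(left_img, right_img)
print(combined)  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```

With two N×N inputs, the result has N rows and 2N columns, which corresponds to the N×2N layout recited for the input layer in claim 32.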
Consider claim 22, D1 teaches the one or more gaze-related parameters are selected from a list consisting of: a gaze direction, a cyclopian gaze direction, a 3D gaze point, a 2D gaze point, a visual axis orientation, an optical axis orientation, a pupil axis orientation, a line of sight orientation, an orientation and/or a position and/or an eyelid closure, a pupil area, a pupil size, a pupil diameter, a sclera characteristic, an iris diameter, a characteristic of a blood vessel, a cornea characteristic of at least one eye, a cornea radius, an eyeball radius, a distance pupil-center to cornea-center, a distance cornea-center to eyeball-center, a distance pupil-center to limbus center, a cornea keratometric index of refraction, a cornea index of refraction, a vitreous humor index of refraction, a distance crystalline lens to eyeball-center, to cornea center and/or to corneal apex, a crystalline lens index of refraction, a degree of astigmatism, an orientation angle of a flat and/or a steep axis, a limbus major and/or minor axes orientation, an eye cyclo-torsion, an eye intra-ocular distance, an eye vergence, a statistics over eye adduction and/or eye abduction, and a statistics over eye elevation and/or eye depression, data about blink events, drowsiness and/or awareness of the user, parameters for user iris verification and/or identification ([0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. The model inputs the images of both eyes, and the images are outputted by the convolutional neural network (CNN)-based network to output the feature information of the left eye and the right eye respectively. The feature information is called the binocular feature, and the binocular features can also predict the line of sight of the eyes. ;
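(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;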
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes).
Consider claim 23, Nistico teaches locating the first and second cameras within a range of 32 to 40 degrees with respect to the median plane of the head-wearable device ([0069]: The eye cameras 3l and 3r can either be suited for visible or near infrared light. They are located symmetrically with respect to a vertical center line that divides the user's face into two halves. The eye cameras 3l and 3r may be positioned in front and below the eyes 10l and 10r respectively, for example in or at the lower rim of a pair of eye glass lenses 8l and 8r, pointing at the eyes 10l and 10r in an angle of 30 degree to 50 degree and being mounted in the frame 4 in an angle a of 30 degree to 50 degree).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of creating the left image and right image using two cameras because replacing a single camera with two cameras involves only routine skill in the art. Nerwin v. Erlichman, 168 USPQ 177, 179.
Consider claim 24, Nistico teaches locating the first and second cameras within a range of 114 to 122 degrees with respect to the median plane of the head-wearable device ([0069]: The eye cameras 3l and 3r can either be suited for visible or near infrared light. They are located symmetrically with respect to a vertical center line that divides the user's face into two halves. The eye cameras 3l and 3r may be positioned in front and below the eyes 10l and 10r respectively, for example in or at the lower rim of a pair of eye glass lenses 8l and 8r, pointing at the eyes 10l and 10r in an angle of 30 degree to 50 degree and being mounted in the frame 4 in an angle a of 30 degree to 50 degree).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of creating the left image and right image using two cameras because replacing a single camera with two cameras involves only routine skill in the art. Nerwin v. Erlichman, 168 USPQ 177, 179.
Consider claim 26, D1 teaches the first and second images are not preprocessed to obtain spatial and/or temporal patterns or arrangements prior to feeding them to the convolutional neural network ([0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. The model inputs the images of both eyes, and the images are outputted by the convolutional neural network (CNN)-based network to output the feature information of the left eye and the right eye respectively. The feature information is called the binocular feature, and the binocular features can also predict the line of sight of the eyes;
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes).
Consider claim 27, D1 teaches the left and right images do not undergo feature extraction prior to feeding them to the convolutional neural network ([0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. The model inputs the images of both eyes, and the images are outputted by the convolutional neural network (CNN)-based network to output the feature information of the left eye and the right eye respectively. The feature information is called the binocular feature, and the binocular features can also predict the line of sight of the eyes;
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes).
Consider claim 28, D1 teaches the output of the convolutional neural network is not postprocessed for obtaining the one or more gaze-related parameters ([0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. The model inputs the images of both eyes, and the images are outputted by the convolutional neural network (CNN)-based network to output the feature information of the left eye and the right eye respectively. The feature information is called the binocular feature, and the binocular features can also predict the line of sight of the eyes;
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes).
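Consider claim 31, D1 teaches the convolutional neural network comprises a two-dimensional input layer ([0048] (translation p. 7, ¶2): Referring to Figure 2, a schematic diagram of the basic neural network structure of the present invention. In order to extract excellent feature information from the image, considering the excellent performance of current convolutional neural network in image processing, the method uses CNN network as the basic network for feature extraction. The input of the network is a 36×60 grayscale picture, and the output is an x-dimensional feature. The specific value of x can be set by itself. After the image is input, first through a layer of convolution, the size of the convolution kernel is set to 5 × 5, the number of output channels is set to 20, after the first layer of convolution, the output is 20 32 × The 56-size picture, then, passes through the largest pooling layer, and after 20×2 pooling, it outputs 20 16×28 pictures. Then, the 20 pictures are convolved again, the convolution kernel is still 5×5, the output channel is 50, a total of 50 12×24 pictures are output, and then the 50 pictures are subjected to the maximum pooling of 2×2. Layer, get 50 6×12 pictures. Finally, the 50 6×12 pictures are spread out to obtain 50×6×12 numbers. Through the fully connected layer, the final desired x-dimensional features are obtained).
Consider claim 32, D1 teaches the input layer is an N×2N matrix ([0048] (translation p. 7, ¶2): Referring to Figure 2, a schematic diagram of the basic neural network structure of the present invention. In order to extract excellent feature information from the image, considering the excellent performance of current convolutional neural network in image processing, the method uses CNN network as the basic network for feature extraction. The input of the network is a 36×60 grayscale picture, and the output is an x-dimensional feature. The specific value of x can be set by itself. After the image is input, first through a layer of convolution, the size of the convolution kernel is set to 5 × 5, the number of output channels is set to 20, after the first layer of convolution, the output is 20 32 × The 56-size picture, then, passes through the largest pooling layer, and after 20×2 pooling, it outputs 20 16×28 pictures. Then, the 20 pictures are convolved again, the convolution kernel is still 5×5, the output channel is 50, a total of 50 12×24 pictures are output, and then the 50 pictures are subjected to the maximum pooling of 2×2. Layer, get 50 6×12 pictures. Finally, the 50 6×12 pictures are spread out to obtain 50×6×12 numbers. Through the fully connected layer, the final desired x-dimensional features are obtained).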
Consider claim 33, D1 teaches N is smaller or equal than 50 ([0048] (translation p. 7, ¶2): Referring to Figure 2, a schematic diagram of the basic neural network structure of the present invention. In order to extract excellent feature information from the image, considering the excellent performance of current convolutional neural network in image processing, the method uses CNN network as the basic network for feature extraction. The input of the network is a 36×60 grayscale picture, and the output is an x-dimensional feature. The specific value of x can be set by itself. After the image is input, first through a layer of convolution, the size of the convolution kernel is set to 5 × 5, the number of output channels is set to 20, after the first layer of convolution, the output is 20 32 × The 56-size picture, then, passes through the largest pooling layer, and after 20×2 pooling, it outputs 20 16×28 pictures. Then, the 20 pictures are convolved again, the convolution kernel is still 5×5, the output channel is 50, a total of 50 12×24 pictures are output, and then the 50 pictures are subjected to the maximum pooling of 2×2. Layer, get 50 6×12 pictures. Finally, the 50 6×12 pictures are spread out to obtain 50×6×12 numbers. Through the fully connected layer, the final desired x-dimensional features are obtained).
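The layer sizes recited in the quoted passage of D1 ([0048]) can be checked with a short calculation. The following is a minimal sketch, not code from D1 or from the application; the function names are hypothetical, and it assumes unpadded, stride-1 convolutions and non-overlapping 2×2 max pooling at both pooling stages (the quoted translation reads “20×2” for the first stage, but the quoted reduction from 32×56 to 16×28 is consistent with 2×2).

```python
def conv_out(h, w, k):
    """Output size of an unpadded, stride-1 convolution with a k x k kernel."""
    return h - k + 1, w - k + 1


def pool_out(h, w, p):
    """Output size of non-overlapping p x p max pooling."""
    return h // p, w // p


def trace_shapes(h=36, w=60):
    """Return (stage, channels, height, width) for each stage quoted from D1 [0048]."""
    stages = [("input", 1, h, w)]
    h, w = conv_out(h, w, 5)   # first 5x5 convolution, 20 output channels
    stages.append(("conv1", 20, h, w))
    h, w = pool_out(h, w, 2)   # first max pooling (assumed 2x2)
    stages.append(("pool1", 20, h, w))
    h, w = conv_out(h, w, 5)   # second 5x5 convolution, 50 output channels
    stages.append(("conv2", 50, h, w))
    h, w = pool_out(h, w, 2)   # second max pooling, 2x2
    stages.append(("pool2", 50, h, w))
    return stages


stages = trace_shapes()
# Number of values handed to the fully connected layer after flattening.
flattened = stages[-1][1] * stages[-1][2] * stages[-1][3]
for name, c, h, w in stages:
    print(f"{name}: {c} x {h} x {w}")
print("flattened length:", flattened)
```

Under these assumptions the sketch reproduces the sizes recited in the passage (20 maps of 32×56, 20 of 16×28, 50 of 12×24, 50 of 6×12) and a flattened vector of 50×6×12 = 3600 values fed to the fully connected layer.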
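Consider claim 34, D1 teaches the convolutional neural network has a filter kernel of size M with M being in the range from 2 to 6 ([0048] (translation p. 7, ¶2): Referring to Figure 2, a schematic diagram of the basic neural network structure of the present invention. In order to extract excellent feature information from the image, considering the excellent performance of current convolutional neural network in image processing, the method uses CNN network as the basic network for feature extraction. The input of the network is a 36×60 grayscale picture, and the output is an x-dimensional feature. The specific value of x can be set by itself. After the image is input, first through a layer of convolution, the size of the convolution kernel is set to 5 × 5, the number of output channels is set to 20, after the first layer of convolution, the output is 20 32 × The 56-size picture, then, passes through the largest pooling layer, and after 20×2 pooling, it outputs 20 16×28 pictures. Then, the 20 pictures are convolved again, the convolution kernel is still 5×5, the output channel is 50, a total of 50 12×24 pictures are output, and then the 50 pictures are subjected to the maximum pooling of 2×2. Layer, get 50 6×12 pictures. Finally, the 50 6×12 pictures are spread out to obtain 50×6×12 numbers. Through the fully connected layer, the final desired x-dimensional features are obtained).
Consider claim 35, D1 teaches M is in the range from 3 to 5 ([0048] (translation p. 7, ¶2): Referring to Figure 2, a schematic diagram of the basic neural network structure of the present invention. In order to extract excellent feature information from the image, considering the excellent performance of current convolutional neural network in image processing, the method uses CNN network as the basic network for feature extraction. The input of the network is a 36×60 grayscale picture, and the output is an x-dimensional feature. The specific value of x can be set by itself. After the image is input, first through a layer of convolution, the size of the convolution kernel is set to 5 × 5, the number of output channels is set to 20, after the first layer of convolution, the output is 20 32 × The 56-size picture, then, passes through the largest pooling layer, and after 20×2 pooling, it outputs 20 16×28 pictures. Then, the 20 pictures are convolved again, the convolution kernel is still 5×5, the output channel is 50, a total of 50 12×24 pictures are output, and then the 50 pictures are subjected to the maximum pooling of 2×2. Layer, get 50 6×12 pictures. Finally, the 50 6×12 pictures are spread out to obtain 50×6×12 numbers. Through the fully connected layer, the final desired x-dimensional features are obtained).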
Claims 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over CN 107545302 A (“D1”) in view of Nistico et al. (US 2014/0055747 A1), Pauli (US 9,253,442 B1), and Aliabadi et al. (US 10,489,680 B2).
Consider claim 29, the combination of D1 and Nistico teaches all the limitations in claim 21 but does not explicitly teach the convolutional neural network comprises at least 6 layers.
Aliabadi teaches the convolutional neural network comprises at least 6 layers (col. 5, lines 10-28: A convolutional layer of a CNN can include a kernel stack of kernels. A kernel of a convolutional layer, when applied to its input, can produce a resulting output activation map showing the response to that particular learned kernel. However, computing convolutions can be computationally expensive or intensive. And a convolutional layer can be computationally expensive. For example, convolutional layers can be the most computationally expensive layers of a CNN because they require more computations than other types of CNN layers (e.g., sub sampling layers). The resulting output activation map can then be processed by another layer of the CNN. Other layers of the CNN can include, for example, a normalization layer (e.g., a 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of using a convolutional neural network comprising at least 6 layers because such incorporation would help implement efficient eye-tracking applications. Col. 4, lines 37-67.
Consider claim 30, Aliabadi teaches the convolutional neural network comprises between 12 and 30 layers (col. 5, lines 10-28: A convolutional layer of a CNN can include a kernel stack of kernels. A kernel of a convolutional layer, when applied to its input, can produce a resulting output activation map showing the response to that particular learned kernel. However, computing convolutions can be computationally expensive or intensive. And a convolutional layer can be computationally expensive. For example, convolutional layers can be the most computationally expensive layers of a CNN because they require more computations than other types of CNN layers (e.g., sub sampling layers). The resulting output activation map can then be processed by another layer of the CNN. Other layers of the CNN can include, for example, a normalization layer (e.g., a brightness normalization layer, a batch normalization (BN) layer, a local contrast normalization (LCN) layer, or a local response normalization (LRN) layer), a rectified linear layer, an upsampling layer, 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of using a convolutional neural network comprising at least 6 layers because such incorporation would help implement efficient eye-tracking applications. Col. 4, lines 37-67.
Claims 36 and 38-41 are rejected under 35 U.S.C. 103 as being unpatentable over CN 107545302 A (“D1”) in view of Nistico et al. (US 2014/0055747 A1), Pauli (US 9,253,442 B1), and Hwang et al. (US 2017/0332901 A1).
Consider claim 36, D1 teaches the steps of: creating a left image of at least a portion of a left eye of the user using a first camera of a head-wearable device worn by the user ([0008] – [0012] (translation p.3, ¶2 – ¶6): (1) Shooting the user's face image, locating the left eye or right eye area, pre-processing the human eye image, realizing the correction of the head position, and obtaining the human eye image with a fixed pixel size.  The step (2) establishes a two-channel model, respectively inputting image information of the left eye and the right eye in the human eye image, and the specific process of separately extracting and outputting the information features of the left eye and the right eye through the two-channel model.  [0042] (translation p. 6, ¶3): At the same time, the line-of-sight direction estimation method has the scalability, and the information features or joint information features of the left and right eyes of the human eye can be used for regression analysis alone, and the predicted line of sight direction of the two eyes is obtained, and the true line of sight direction of the input 
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, ; creating a right image of at least a portion of a right eye of the user using a second camera of the head-wearable device ([0008] – [0012] (translation p.3, ¶2 – ¶6): (1) Shooting the user's face image, locating the left eye or right eye area, pre-processing the human eye image, realizing the correction of the head position, and obtaining the human eye image with a fixed pixel size.  The step (2) establishes a two-channel model, respectively inputting image information of the left eye and the right eye in the human eye image, and the specific process of separately extracting and outputting the information features of the left eye and the right eye through the two-channel model.  [0042] (translation p. 6, ¶3): At the same time, the line-of-sight direction estimation method has the scalability, and the information features or joint information features of the left and right eyes of the human eye can be used for regression analysis alone, and the predicted line of sight direction of the two eyes is obtained, and the true line of sight direction of the input image and the output predicted line of sight direction are adopted. The average angular deviation between the two is used as an error term to adaptively adjust the prediction model. Similarly, the line-of-sight direction estimation method has additiveness. After obtaining the information features of both eyes and the joint information features of both eyes, some other related information features can be directly added, and regression analysis is performed with all the features as a whole. [0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural network model for simultaneously extracting left-eye 
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes); feeding the left and right images together as an input into a convolutional neural network ([0013] (translation p. 3, ¶7) : The corrected fixed-size left-eye and right-eye images Il and Ir are input into the two-channel model, and Il and Ir are respectively processed through one channel.  [0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural 
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
(c) in FIG. 1 is a structural diagram TE-A of a network model for calculating a line-of-sight direction method for combining left and right eye images of a human eye, by combining two of (a) of FIG. 1 and (b) of FIG. The model structure obtains binocular features and binocular correlation features in one go, and takes these information as the overall characteristics, and regression analysis obtains the line of sight directions of both eyes); and obtaining the one or more gaze-related parameters from the convolutional neural network as a result of the left and right images input ([0015] (translation: p.3, ¶9): The fixed length feature vector generated by each channel is the information feature corresponding to the input image extracted by the deep neural 
However, D1 does not explicitly teach creating the left image and right image using two cameras.
Nistico teaches creating the left image and right image using two cameras ([0069]: The eye cameras 3l and 3r can either be suited for visible or near infrared light. They are located symmetrically with respect to a vertical center line that divides the user's face into two halves. The eye cameras 3l and 3r may be positioned in front and below the eyes 10l and 10r respectively, for example in or at the lower rim of a pair of eye glass lenses 8l and 8r, pointing at the eyes 10l and 10r in an angle of 30 degree to 50 degree and being mounted in the frame 4 in an angle a of 30 degree to 50 degree).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of creating the left image and right image using two cameras because replacing a single camera with two cameras involves only routine skill in the art.  Nerwin v. Erlichman, 168 USPQ 177, 179.
Pauli teaches concatenating the left image and the right image to form a concatenated image (col. 6, lines 12-24; col. 8, lines 21-40).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of concatenating the left image and the right image to form a concatenated image because such incorporation would generate a composite picture that represents an undistorted high-definition picture of the remote participant.  Col. 8, lines 21-40.
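For illustration only, the image concatenation step discussed above can be sketched as follows. This is a minimal sketch and is not taken from D1, Nistico, or Pauli: it assumes the left and right images are equal-height grayscale images stored as lists of pixel rows and joins them side by side. The function name `concatenate_images` and the 2x3 sample images are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def concatenate_images(left: Image, right: Image) -> Image:
    """Join two equal-height images side by side into a single image."""
    if len(left) != len(right):
        raise ValueError("left and right images must have the same height")
    # Append each right-image row to the matching left-image row.
    return [l_row + r_row for l_row, r_row in zip(left, right)]


# Hypothetical 2x3 sample images standing in for the two eye images.
left_eye = [[1, 2, 3], [4, 5, 6]]
right_eye = [[7, 8, 9], [10, 11, 12]]

concatenated = concatenate_images(left_eye, right_eye)
print(concatenated)  # [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]
```

In this sketch the single concatenated image, rather than two separate images, would be what is supplied to a downstream network.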

Hwang teaches a computer-readable medium comprising instructions which, when executed on a computer, cause the computer to carry out the steps ([0025] – [0028]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of implementing the method on a computer-readable medium because such incorporation would allow the method to be efficiently transmitted to and downloaded from a remote location.  [0027] – [0028].
Consider claim 38, Hwang teaches the system is integrated in a wearable device ([0011] – [0012] and [0025] – [0028]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of implementing the method on a computer having instructions because such incorporation would allow the method to be efficiently transmitted to and downloaded from a remote location.  [0027] – [0028].
Consider claim 39, Hwang teaches the wearable device is the head-wearable device ([0011] – [0012] and [0025] – [0028]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of implementing the method on a computer having instructions because such incorporation 
Consider claim 40, Hwang teaches the computer is integrated into a desktop computer, a server, a smart phone, a tablet, or a laptop ([0011] – [0012] and [0025] – [0028]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of implementing the method on a computer having instructions because such incorporation would allow the method to be efficiently transmitted to and downloaded from a remote location.  [0027] – [0028].
Consider claim 41, D1 teaches a system: a head wearable device comprising a left camera for taking a left image of at least a portion of a left eye of a user wearing the head-wearable device, and a right camera for taking a right image of at least a portion of a right eye of the user wearing the head-wearable device ([0008] – [0012] (translation p.3, ¶2 – ¶6): (1) Shooting the user's face image, locating the left eye or right eye area, pre-processing the human eye image, realizing the correction of the head position, and obtaining the human eye image with a fixed pixel size.  The step (2) establishes a two-channel model, respectively inputting image information of the left eye and the right eye in the human eye image, and the specific process of separately extracting and outputting the information features of the left eye and the right eye through the two-channel model.  [0042] (translation p. 6, ¶3): At the same time, the line-of-sight direction estimation method has the scalability, and the information features or joint information features of the left and right eyes of the human 
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
; a processing unit configured to receive the left image, receive the right image, concatenate the left image and the right image to form a concatenated image ([0049] (translation p. 7, ¶3): Referring to FIG. 3, a general structural diagram of a method for determining a line of sight direction of a combination of left and right eye images of a human eye according to the present invention is shown. The invention constructs a neural network by itself, and uses this to predict and analyze the three-dimensional line of sight direction of the user's eyes. The method obtains the 1506-dimensional feature vector by inputting the fixed-scale human eye grayscale image and the head angle vector, and then obtains the 6-dimensional binocular line-of-sight direction through regression analysis. The overall structure of the invention also encompasses the network of steps (2), (3) in the Summary of the Invention. The overall structure of the network of step (2) is as shown in FIG. 3 with the above two human eyes as input, the network inputs the images of the left eye and the right eye, respectively, and then convolves the image using the CNN network in FIG. 2, and The final feature quantity x is set to 1000, and the feature vector with length 1000 is obtained. The feature vector is respectively passed through a fully connected layer (FC) to obtain the 500-dimensional feature vector respectively. Finally, the two 500 are simply the feature vectors of the dimensions are connected as the output features of the feed the concatenated image as an input into a trained convolutional neural network ([0013] (translation p. 3, ¶7): The corrected fixed-size left-eye and right-eye images Il and Ir are input into the two-channel model, and Il and Ir are respectively processed through one channel.  [0044] – [0046] (translation p. 6, ¶5 – ¶7): (a) in Fig. 1 is a structural schematic diagram TE-I of a two-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. The model inputs the images of both eyes, and the images are outputted by the convolutional neural network (CNN)-based network to output the feature information of the left eye and the right eye respectively. The feature information is called the binocular feature, and the binocular features can also predict the line of sight of the eyes;
(b) in FIG. 1 is a structural schematic diagram TE-II of a single-channel deep neural network model for simultaneously extracting left-eye and right-eye information features. After inputting the binocular image, the image is first processed by a CNN-based network to obtain independent feature information. Then, through a fully connected layer, the independent feature information is merged to obtain the binocular correlation information. Similarly, only the image is obtained. From the correlation information of both eyes, the direction of the line of sight of both eyes can also be predicted;
; and obtain the one or more gaze-related parameters from the convolutional neural network as a result of the concatenated image input ([0015] (translation: p.3, ¶9): The fixed length feature vector generated by each channel is the information feature corresponding to the input image extracted by the deep neural network, and the information features generated by the two channels are connected to obtain the final left eye and right eye information features).
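For illustration only, the feature-level connection described in the quoted passages of D1 (each channel producing a fixed-length feature vector, with the two vectors then connected) can be sketched as follows. This is a minimal sketch under stated assumptions, not D1's implementation: `channel_features` is a hypothetical stand-in for a CNN plus fully connected layer and merely averages strided pixel subsets; the 36x60 image size and the pixel values are hypothetical; only the per-channel length of 500 is drawn from the quoted [0049].

```python
from typing import List, Sequence

FEATURE_LEN = 500  # per-channel feature length quoted from D1 [0049]


def channel_features(pixels: Sequence[float], length: int = FEATURE_LEN) -> List[float]:
    """Hypothetical stand-in for one CNN + FC channel.

    Maps a flattened image to a fixed-length vector; each entry is the
    mean of a strided subset of the pixels (0.0 if the subset is empty).
    """
    if not pixels:
        raise ValueError("empty image")
    feats = []
    for i in range(length):
        bucket = pixels[i::length]
        feats.append(sum(bucket) / len(bucket) if bucket else 0.0)
    return feats


def connect_features(left_feats: Sequence[float], right_feats: Sequence[float]) -> List[float]:
    """Connect the two per-channel vectors end to end."""
    return list(left_feats) + list(right_feats)


# Hypothetical flattened 36x60 grayscale eye images.
left_pixels = [float(v % 256) for v in range(36 * 60)]
right_pixels = [float((v * 7) % 256) for v in range(36 * 60)]

joint = connect_features(channel_features(left_pixels), channel_features(right_pixels))
print(len(joint))  # 1000
```

In this sketch the connection happens after each image has been processed separately, which is a different point in the pipeline from joining the two images before they enter the network.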
However, D1 does not explicitly teach creating the left image and right image using two cameras.
Nistico teaches creating the left image and right image using two cameras ([0069]: The eye cameras 3l and 3r can either be suited for visible or near infrared light. They are located symmetrically with respect to a vertical center line that divides the user's face into two halves. The eye cameras 3l and 3r may be positioned in front and below the eyes 10l and 10r respectively, for example in or at the lower rim of a pair of eye glass lenses 8l and 8r, pointing at the eyes 10l and 10r in an angle of 30 degree to 50 degree and being mounted in the frame 4 in an angle a of 30 degree to 50 degree).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of creating the left image and right image using two cameras because replacing a single camera with two cameras involves only routine skill in the art.  Nerwin v. Erlichman, 168 USPQ 177, 179.
Hwang teaches the wearable device is the head-wearable device ([0011] – [0012] and [0025] – [0028]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of implementing the method on a computer having instructions because such incorporation would allow the method to be efficiently transmitted to and downloaded from a remote location.  [0027] – [0028].
Claim 42 is rejected under 35 U.S.C. 103 as being unpatentable over CN 107545302 A (“D1”) in view of Nistico et al. (US 2014/0055747 A1), Pauli (US 9,253,442 B1), and Abe et al. (US 2020/0322595 A1).
	Consider claim 42, the combination of D1 and Nistico teaches all the limitations in claim 21 but does not explicitly teach the one or more gaze-related parameters are selected from a list comprising a 3D gaze point and a 2D gaze point.
	Abe teaches the one or more gaze-related parameters are selected from a list comprising a 3D gaze point and a 2D gaze point ([0131], [0137] – [0138], [0153], and [0170]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the known technique of using a 3D gaze point and a 2D gaze point because such incorporation would enable a comfortable hands-free operation with achievement of an improvement regarding the 
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAT CHI CHIO whose telephone number is (571)272-9563. The examiner can normally be reached Monday-Thursday 10am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JAMIE J ATALA can be reached on 571-272-7384. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TAT C CHIO/Primary Examiner, Art Unit 2486