United States Patent and Trademark Office    

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov

BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 15/759,819
Filing Date: 13 Mar 2018
Appellant(s): EyesMatch Ltd.



__________________
JOSEPH BACH
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 05/09/2022.


(1) Grounds of Rejection to be Reviewed on Appeal
Every ground of rejection set forth in the Office action dated 10/07/2021 from which the appeal is taken is being maintained by the examiner except for the grounds of rejection (if any) listed under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”
The following ground(s) of rejection are applicable to the appealed claims.
I. Claims 1-26 are rejected under 35 U.S.C. 103 as being unpatentable over Santos et al. (US 20130169827 A1) in view of Rav-Acha et al. (US 20130343729 A1).
II. Claims 27-29 are rejected under 35 U.S.C. 103 as being unpatentable over Santos et al. (US 20130169827 A1) in view of Rav-Acha et al. (US 20130343729 A1), as applied in the rejection of claims 1-26 above, and further in view of Rolston (US 20090051779 A1).

(2) Response to Argument
Appellant's arguments filed in the Appeal Brief filed on 05/09/2022 have been fully considered but they are not persuasive.
Appellant, on Pages 5 and 8 of the Appeal Brief, mischaracterizes the disclosure of Santos et al. by stating “Santos discloses digitally applying makeup to a static image of a user” and “Santos discloses applying virtual make-up to a photograph, not a video”. To the contrary, Santos et al., in Paragraph 30, teaches “The results of the interaction the user with the make-up simulation process are shown in real time. The use of hands reproduces the physical interaction that occurs in the conventional process, playing a pleasurable experience with mixing colors and intensity, with the advantage of being able to perform make-up in a simple manner even for those who are not familiar with aesthetics, following the partial results along of the process.” Applying make-up in real time to a static image would make no sense; it is inherently understood that the makeup would be applied to a dynamic video stream of a user from the device camera. In Paragraph 51, Santos et al. makes clear that it is capable of applying digital makeup to both dynamic and static images: “The present invention is a make-up simulation system using image processing, embodied by means of an integrated software and hardware that applies make-up effects and/or other graphics in real time by means of portable devices having a digital camera. The system simulates make-up on static or dynamic images using the identification manual, semiautomatic or automatic regions of the face.” Thus, the real-time application of digital makeup taught in Santos et al. applies to both static and dynamic user images, and the use of the word “dynamic”, in contrast to “static”, to describe images clearly means that digital makeup is applied to video.
In response to appellant's argument that the combination of Santos et al. and Rav-Acha et al., “when considered as a whole”, would not arrive at appellant’s invention, the test for obviousness is not whether the features of a secondary reference may be bodily incorporated into the structure of the primary reference; nor is it that the claimed invention must be expressly suggested in any one or all of the references.  Rather, the test is what the combined teachings of the references would have suggested to those of ordinary skill in the art.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981). In this case, Santos et al. teaches applying makeup to portions of a user’s face in a video stream, and Rav-Acha et al. is relied upon to teach editing video data automatically based on detected subject matter.
In response to appellant's argument that the examiner's conclusion of obviousness is based upon improper hindsight reasoning, it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning.  But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the applicant's disclosure, such a reconstruction is proper.  See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971). As laid out above, it would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention to combine the disclosures of Santos et al. and Rav-Acha et al.
In response to appellant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art.  See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007).  In this case, both Santos et al. and Rav-Acha et al. teach editing of video and image data based on detection of facial features. It would have been obvious to include the automatic sub-sessions or clips created from the video in order to obtain a pleasing edited video that includes only the important portions, as required.

In response to appellant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Appellant argues on page 8 of the Appeal Brief, with regard to claim 1, that the combination of Santos et al. and Rav-Acha et al. fails to teach “a user interface including pre-programmed buttons indicating various stages of a makeup session”. Examiner respectfully disagrees. Paragraph 100 of Santos et al. teaches “FIG. 4 shows the interface used for application of make-up with two toolbars (410) and (411) containing several features, such as product selection, color selection, style selection, actions to redo/undo make-up among others. After choosing the product and color that will be used, the user does the application of make-up. This process is performed from the touch screen of the portable device, by which it is possible to use the user's fingers to apply make-up, mix colors or increase the intensity of a given product. The result of applying make-up is displayed on the display of the portable device (412).” Here, presenting the options for the various products, styles and color choices, along with redo and undo functionality, is interpreted to represent the pre-programmed buttons indicating various stages of a makeup session.
Appellant further argues on page 8 of the Appeal Brief regarding Claim 1 that the combination of Santos et al. and Rav-Acha et al. fails to teach “breaking the video stream into makeup sub-sessions according to input from the pre-programmed buttons identifying the timing of start of each of the sub-sessions and inserting marks into the video stream with links to the pre-programmed buttons to thereby generate a marked video stream.”. However, Examiner respectfully disagrees. Santos et al. is relied upon to teach in particular “breaking the video stream into makeup sub-sessions according to input from the pre-programmed buttons, and generate a marked video stream with links to the pre-programmed buttons”. While the discussion that follows may use the word “photo” as previously discussed, the disclosure of Santos et al. is not only concerned with static images, but also dynamic images or video. In Paragraph 97, Santos et al. teaches “After capturing the photo, the procedure is performed to locate the face and segment regions of the same. To assist in the targeting of regions of the face, we obtained first the points of interest. The system embodied by the present invention provides an interface to the user to adjust the regions of the face by the points of interest. These adjustments are necessary to refine the segmentation result of the regions to be made up. The interface to make these adjustments is presented in FIG. 3. This interface has two sidebars (310) and (311), where in a bar to access features such as undoing and redoing, and other bar can perform functions for enlarge image in a particular region of the face, such as mouth, or eye brow, increasing in turn the accuracy of the adjustments made by the points of interest found.” In Paragraph 98, Santos et al. teaches “From the points of interest found, polygons can be formed that are interconnected by the interpolation of Bezier obtaining polygons that define the regions of the face the user. 
In (312), (313), (314) and (315) the polygons representing the regions of the face, eyes, the eyebrows and the mouth respectively are presented.” In Paragraph 99, it teaches “After the appropriate adjustments to the points of interest, then it creates the masks necessary to segment the regions of the face where the make-up is applied.” Further, in Paragraph 100, it clearly teaches “FIG. 4 shows the interface used for application of make-up with two toolbars (410) and (411) containing several features, such as product selection, color selection, style selection, actions to redo/undo make-up among others. After choosing the product and color that will be used, the user does the application of make-up. This process is performed from the touch screen of the portable device, by which it is possible to use the user's fingers to apply make-up, mix colors or increase the intensity of a given product. The result of applying make-up is displayed on the display of the portable device (412).” This clearly lays out the segmentation of the image according to different portions of the face to which makeup is applied, these are considered to be “makeup sub-sessions” as claimed, and further lays out the applicable toolbars and buttons which designate these makeup sub-sessions and application of makeup as needed. 
Rav-Acha is relied upon to teach in particular “breaking the video stream into sub-sessions by identifying the timing of start of each of the sub-sessions and inserting marks into the video stream to thereby generate a marked video stream”. To this end Rav-Acha et al. teaches in Paragraph 36, “The non-transitory computer readable medium may store instructions that may cause the computerized system to detect faces of multiple persons in the media stream and to generate, for at least two of the multiple persons, a modified media stream that is generated by assigning to the person higher importance than an importance of the other persons of the multiple persons.” In Paragraph 37, Rav Acha further elaborates “The non-transitory computer readable medium may store instructions that may cause the computerized system to detect faces of persons in the media stream; to display the faces and information about the modified media stream to the user; receive an instruction to share the modified media stream with a certain person that is identified by a certain face out of the faces; to share the modified media with the certain person if contact information required for the sharing with the certain person is available.” In Paragraph 465 Rav-Acha et al. teaches, “Tagging: Automatic tagging of media entities is achieved by applying the Detection/Recognition building block several times. Some tags are extracted by solving a detection problem. For instance adding a tag " face" whenever the face detector detected a face in a video clip, or a tag "applause" when a sound of clapping hands is detected. Other types of tags are extracted by solving a recognition (or classification) problem. For instance, a specific person-tag is added whenever the face-recognition module classifies a detected face as a specific, previously known face. Another example is classifying a scene to be "living-room scene" out of several possibilities of pre-defined scene location types. 
The combination of many detection and recognition modules can produce a rich and deep tagging of the media assets, which is valuable for many of the features described below.” In Paragraph 474, it teaches “Given a visual entity d (for example, a video segment), the attributes above can be used to compute intermediate importance scores s_1, ..., s_l (in our implementation, these scores can be negative). Such scores can be obtained by using direct measurements (e.g., SalienSee measure of a clip), or by some binary predicate using the extracted meta-data (e.g., s=1 if clip includes a `large face closeup` tag and s=0 otherwise). The final ImportanSee measure is given as a weighted sum of all attribute scores. I.e., ImportanSee(d) = max(Σ_i α_i s_i, 0), where α_i is the relative weights of each attribute.” In Paragraph 475, it teaches “Table of content: Table of (visual) content is a hierarchical segmentation of visual entities (video or set of videos and images). This feature can be implemented as a clustering of the various scenes in a video. For instance, by sampling short video chunks (e.g., 1 second of video every 5 seconds of video) and clustering these media chunks (using the clustering building block) will produce a flat or hierarchical table of contents of the video. In addition to this segmentation, each segment is attached with either a textual or visual short description (for example, a representative frame or a short clip). This representative can be selected randomly, or according to its ImportanSee measure.” In this case, tags are clearly and unambiguously used to mark a video stream to identify the content in video within chunks of video data, which are segmented clips of video data which are marked with tags. The video stream is also clearly tagged in response to detection and recognition of faces in the images.
Thus, it is clear in light of properly considering the combined teachings of both pieces of prior art as applied to the claim limitations, that these limitations as filed would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention. In this case, both Santos et al. and Rav-Acha et al. teach editing of video and image data based on detection of facial features. The automatic sub-sessions or clips created from the video would be obvious to include in order to get a pleasing edited video, which only includes important portions as required.
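For illustration only, the importance measure quoted above from Paragraph 474 of Rav-Acha et al. (a weighted sum of attribute scores, clamped at zero) can be sketched as follows. The function name, argument names and example values are hypothetical and do not appear in the reference; the sketch only restates the quoted formula.

```python
from typing import Sequence


def importance_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of attribute scores, clamped at zero.

    Mirrors the quoted formula ImportanSee(d) = max(sum_i alpha_i * s_i, 0).
    `scores` are the intermediate scores s_i (which may be negative) and
    `weights` are the relative weights alpha_i. Both names are hypothetical.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    weighted_sum = sum(w * s for w, s in zip(weights, scores))
    return max(weighted_sum, 0.0)


# Hypothetical example: a binary "large face closeup" predicate (s = 1.0)
# combined with a negative direct measurement (s = -0.5).
example = importance_score([1.0, -0.5], [2.0, 1.0])
```

A clip whose weighted sum is negative receives a score of zero under this formula, so it is never ranked below a clip with no positive attributes.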

Appellant further argues  on Page 9 of the Appeal Brief, regarding Claim 1 that Santos et al. and Rav Acha et al. fail to teach “identify outline of a face within the video stream… translate pixels belonging to the face so that the outline is centered within the digital screen, thereby generating a translated video stream”. However, Examiner respectfully disagrees. In Paragraph 97, Santos et al. teaches "To assist in the targeting of regions of the face, we obtained first the points of interest. The system embodied by the present invention provides an interface to the user to adjust the regions of the face by the points of interest. These adjustments are necessary to refine the segmentation result of the regions to be made up. The interface to make these adjustments is presented in FIG. 3. This interface has two sidebars (310) and (311), where in a bar to access features such as undoing and redoing, and other bar can perform functions for enlarge image in a particular region of the face, such as mouth, or eye brow, increasing in turn the accuracy of the adjustments made by the points of interest found.” This is further depicted in Figure 29 of Santos et al. where it is clearly shown that the face is centered and enlarged to allow for more accuracy. In Paragraph 104, Santos et al. teaches “The steps involved in the step of "Location of the regions of the face" are sequentially displayed in FIG. 6. The first step of the location of regions of the face illustrated in FIG. 7 is cutting the target region (610), since the captured picture (710) has a lot of unnecessary information and which can adversely affect this process. Therefore, we select only the region of the rectangle corresponding to the target (711) and copy the pixels are present in this region to a new image (712). 
With this, it is possible to reduce the cost of computing location of the face besides increasing the accuracy of the detection process.” In Paragraph 105, it teaches “After selecting only the region of the target, it tries to then locate the region of the face (611) and the eye region (612) and the mouth region (613) of the user. For this, we used an Artificial Intelligence area known as Machine Learning. Machine Learning Techniques for using a collection of data for "teaching machine" to answer questions on the same. The present invention employed these techniques to verify the existence of a face in a digital image.” The creation of a new image file in which the face is centered and unnecessary pixels are excluded is clearly a translation as well. Furthermore, it is clear from the disclosure that this detection of the face location, in order to center on it, is not performed wholly manually as suggested by appellant; the disclosure clearly and unambiguously teaches that machine learning is used to detect and emphasize portions of interest. In Paragraph 108, Santos et al. summarizes this process, “Using the techniques described above, the location of the face (810) takes place, as illustrated in FIG. 8, selecting as region of interest only the region defined by the classifier as a face. Thus, it avoids processing areas that do not belong to the face of the user. Then, a search is effected by the eyes (811) within the region of the face. The detection of the eye is made similarly to the face detection, only changing the training model used in the classification.” This clearly teaches centering and translating pixels of the face. Rav-Acha et al. is relied upon to teach “generating a translated video stream; displaying the translated video stream”. To this end, in Paragraph 354, Rav-Acha et al. teaches “Adding a suggested clip (from an automatically prepared candidate list)”. In Paragraph 355, it teaches “Selecting one of more faces to be emphasized or excluded from the edited video.
This lists of faces is automatically extracted from the video and can be displayed to the user using a graphical user interface similar to the figure below.” In Paragraph 364, Rav-Acha et al. further teaches “Personalized Production--besides manual post editing, the user can affect the automatic production and editing stages using a search query, which emphasizes the parts in the video, which are important to the user. The query can take the form of a full search query (text+tags+keywords). For instance, a query of the form `Danny jumping in the living room` would put more emphasize in the editing and the production stages on parts which fit the query. Another example is of a query which uses a visual tag describing a pet dog and a location tag with an image of the back yard. Another option for the user to affect the editing stage is by directly marking a sub-clip in the video which must appear in the production. Yet another example is that the user marks several people (resulting from Face Clustering and Recognition) and gets several productions, each production with the selected person highlighted in the resulting clip, suitable for sharing with that respective person.” Thus, Rav-Acha et al. clearly teaches generating and displaying a translated video stream based on the detection, tagging and clipping of video streams based on the detected faces contained in said video streams.
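For illustration only, the cropping step quoted above from Paragraph 104 of Santos et al. (selecting the rectangle corresponding to the target and copying its pixels to a new image) can be sketched as follows. The image representation (a list of pixel rows), the function name and the half-open rectangle convention are assumptions made for this sketch and are not taken from the reference.

```python
from typing import List

# Hypothetical image type: a list of rows, each row a list of pixel values.
Image = List[List[int]]


def crop_region(image: Image, left: int, top: int, right: int, bottom: int) -> Image:
    """Copy the pixels inside a target rectangle into a new image.

    The rectangle is half-open: columns left..right-1 and rows top..bottom-1.
    The source image is not modified.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    if not (0 <= left < right <= width and 0 <= top < bottom <= height):
        raise ValueError("target rectangle must lie inside the image")
    # Slicing each row creates new lists, so the result is an independent copy.
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical 4x4 image whose pixel value encodes (row * 10 + column).
captured = [[r * 10 + c for c in range(4)] for r in range(4)]
target = crop_region(captured, left=1, top=1, right=3, bottom=3)
```

In the new image the copied pixels occupy coordinates relative to the rectangle's top-left corner, which is the sense in which the cropped content is repositioned.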
Appellant argues on Page 10 of the Appeal Brief, regarding Claim 2 that the combination of Santos et al. and Rav-Acha et al. fail to teach “the controller performs the breaking according to input data entered by a user identifying the timing of start of each of the sub-sessions”. However, Examiner respectfully disagrees. Santos et al. is relied upon to teach in particular “that the controller performs the breaking according to input data entered by a user”. While the discussion that follows may use the word “photo” as previously discussed, the disclosure of Santos et al. is not only concerned with static images, but also dynamic images or video. In Paragraph 97, Santos et al. teaches “After capturing the photo, the procedure is performed to locate the face and segment regions of the same. To assist in the targeting of regions of the face, we obtained first the points of interest. The system embodied by the present invention provides an interface to the user to adjust the regions of the face by the points of interest. These adjustments are necessary to refine the segmentation result of the regions to be made up. The interface to make these adjustments is presented in FIG. 3. This interface has two sidebars (310) and (311), where in a bar to access features such as undoing and redoing, and other bar can perform functions for enlarge image in a particular region of the face, such as mouth, or eye brow, increasing in turn the accuracy of the adjustments made by the points of interest found.” In Paragraph 98, Santos et al. teaches “From the points of interest found, polygons can be formed that are interconnected by the interpolation of Bezier obtaining polygons that define the regions of the face the user. 
In (312), (313), (314) and (315) the polygons representing the regions of the face, eyes, the eyebrows and the mouth respectively are presented.” In Paragraph 99, it teaches “After the appropriate adjustments to the points of interest, then it creates the masks necessary to segment the regions of the face where the make-up is applied.” Further, in Paragraph 100, it clearly teaches “FIG. 4 shows the interface used for application of make-up with two toolbars (410) and (411) containing several features, such as product selection, color selection, style selection, actions to redo/undo make-up among others. After choosing the product and color that will be used, the user does the application of make-up. This process is performed from the touch screen of the portable device, by which it is possible to use the user's fingers to apply make-up, mix colors or increase the intensity of a given product. The result of applying make-up is displayed on the display of the portable device (412).” This clearly lays out the segmentation of the image according to different portions of the face to which makeup is applied, these are considered to be “makeup sub-sessions” as claimed, and further lays out the applicable toolbars and buttons which designate these makeup sub-sessions and application of makeup as needed. 
Rav-Acha is relied upon to teach in particular “identifying the timing of start of each of the sub-sessions”. To this end Rav-Acha et al. teaches in Paragraph 36, “The non-transitory computer readable medium may store instructions that may cause the computerized system to detect faces of multiple persons in the media stream and to generate, for at least two of the multiple persons, a modified media stream that is generated by assigning to the person higher importance than an importance of the other persons of the multiple persons.” In Paragraph 37, Rav Acha further elaborates “The non-transitory computer readable medium may store instructions that may cause the computerized system to detect faces of persons in the media stream; to display the faces and information about the modified media stream to the user; receive an instruction to share the modified media stream with a certain person that is identified by a certain face out of the faces; to share the modified media with the certain person if contact information required for the sharing with the certain person is available.” In Paragraph 465 Rav-Acha et al. teaches, “Tagging: Automatic tagging of media entities is achieved by applying the Detection/Recognition building block several times. Some tags are extracted by solving a detection problem. For instance adding a tag " face" whenever the face detector detected a face in a video clip, or a tag "applause" when a sound of clapping hands is detected. Other types of tags are extracted by solving a recognition (or classification) problem. For instance, a specific person-tag is added whenever the face-recognition module classifies a detected face as a specific, previously known face. Another example is classifying a scene to be "living-room scene" out of several possibilities of pre-defined scene location types. 
The combination of many detection and recognition modules can produce a rich and deep tagging of the media assets, which is valuable for many of the features described below.” In Paragraph 474, it teaches “Given a visual entity d (for example, a video segment), the attributes above can be used to compute intermediate importance scores s_1, ..., s_l (in our implementation, these scores can be negative). Such scores can be obtained by using direct measurements (e.g., SalienSee measure of a clip), or by some binary predicate using the extracted meta-data (e.g., s=1 if clip includes a `large face closeup` tag and s=0 otherwise). The final ImportanSee measure is given as a weighted sum of all attribute scores. I.e., ImportanSee(d) = max(Σ_i α_i s_i, 0), where α_i is the relative weights of each attribute.” In Paragraph 475, it teaches “Table of content: Table of (visual) content is a hierarchical segmentation of visual entities (video or set of videos and images). This feature can be implemented as a clustering of the various scenes in a video. For instance, by sampling short video chunks (e.g., 1 second of video every 5 seconds of video) and clustering these media chunks (using the clustering building block) will produce a flat or hierarchical table of contents of the video. In addition to this segmentation, each segment is attached with either a textual or visual short description (for example, a representative frame or a short clip). This representative can be selected randomly, or according to its ImportanSee measure.” In this case, tags are clearly and unambiguously used to mark a video stream to identify the content in video within chunks of video data, which are segmented clips of video data which are marked with tags. The video stream is also clearly tagged in response to detection and recognition of faces in the images.
Thus, it is clear in light of properly considering the combined teachings of both pieces of prior art as applied to the claim limitations, that these limitations as filed would have been obvious to a person having ordinary skill in the art at the time of the filing of the invention. In this case, both Santos et al. and Rav-Acha et al. teach editing of video and image data based on detection of facial features. The automatic sub-sessions or clips created from the video would be obvious to include in order to get a pleasing edited video, which only includes important portions as required.
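For illustration only, the chunk sampling quoted above from Paragraph 475 of Rav-Acha et al. (“1 second of video every 5 seconds of video”) can be sketched as follows. The function and parameter names are hypothetical, and the clustering of the sampled chunks described in the reference is not shown.

```python
from typing import List, Tuple


def sample_chunks(duration_s: float, chunk_s: float = 1.0,
                  period_s: float = 5.0) -> List[Tuple[float, float]]:
    """Return (start, end) times of short chunks sampled at a fixed period.

    With the defaults, one 1-second chunk is taken every 5 seconds of video.
    The final chunk is shortened if it would run past the end of the video.
    """
    if chunk_s <= 0 or period_s <= 0:
        raise ValueError("chunk_s and period_s must be positive")
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        start += period_s
    return chunks


# Hypothetical 12-second video: chunks begin at 0 s, 5 s and 10 s.
chunk_times = sample_chunks(12.0)
```

Each returned start time marks where a segment begins, which is the kind of timing information a table of contents for the video would record.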
Appellant further argues on pages 10 and 11 of the Appeal Brief, regarding Claim 5 that the combination of Santos et al. and Rav-Acha et al. fail to teach “identifying outlines of face features in the video stream” and “tracking location of each identified face feature in each frame”. In Paragraph 51, Santos et al. teaches “The present invention is a make-up simulation system using image processing, embodied by means of an integrated software and hardware that applies make-up effects and/or other graphics in real time by means of portable devices having a digital camera. The system simulates make-up on static or dynamic images using the identification manual, semiautomatic or automatic regions of the face.” The real-time simulation of makeup on dynamic images (or video streams) inherently requires the identification of the face features to which the make-up simulation is applied and the tracking of the face features to which the make-up is digitally applied. More specifically, the facial feature detection is further described in Paragraphs 141-148, where it teaches “Before starting the analysis of the image, it removes the region between the eyebrows and mouth, facilitating, in turn, the mapping of the face. Then apply the Sobel filter in horizontal and vertical direction in order to extract the edges of the face. To perform the search for points of interest was used a sliding window which moves according to the directions shown in FIG. 15. The coordinate of each point of interest corresponds to the position of the first window and the difference between its pixel is greater than a predefined threshold and which are located outside the region between the mouth and eyebrows. All points using basically the same method, changing only the parameters related to orientation and the reference point, which defines the orientation direction of the scanning line and the reference point determining the initial position of the scanning line. 
The following, the parameters used to estimate each point of interest of the face are described:
The first point of interest of the face (1510) has as a reference point of interest of the eye represented by 1518 and the guidance of its scanning line equals 150°;
The second point of interest of the face (1511) uses the eye point of interest, represented by 1519, as a reference and has a scanning line has the same direction at 60°;
The third point of interest of the face (1512) has as a reference point of interest of the eye represented by (1520) and its line scanning direction is equal to 170;
The fourth point of interest of the face (1513) uses as a reference point, the point of interest of the eye represented by (1521), where its scanning line has guidance of 10°;
The fifth point of interest of the face (1514) has as its reference point the point of interest of the mouth represented by (1522) and the orientation of its scanning line equals 200°;
The sixth point of interest of the face (1515) uses as a reference point, the point of interest of the mouth represented by (1523) and has a scanning line direction is equal to 340°;
The seventh point of interest of the face (1516) has as its reference point the point of interest of the mouth represented by (1524) and its scanning line has direction with 270°;
The point of interest of the eighth face (1517) is represent by the reference point (1525) corresponding to the distance between the center points of interest of the eye (1518) and (1519). The orientation of the scanning line from this point is 90°.”
In Paragraph 150, Santos et al. teaches “Points of interest are used to form polygons that are connected by Bezier interpolation, and calculate the regions where the make-up will be applied. It is clear that the option of adopting the mechanism of Bezier interpolation concerns the need to tap into the points of interest masks using curved lines, as in real faces. This is done by simulating design applications vector connecting edges and vertices by using the Bezier interpolation to set the bending of these edges. The process of this invention is nonetheless a vectorizing parts of the face.” Thus, Santos et al. clearly teaches “identifying outlines of face features in the video stream” and “tracking location of each identified face feature in each frame” as recited in the claim limitations as filed.
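For illustration only, the Bezier interpolation between points of interest quoted above from Paragraph 150 of Santos et al. can be sketched as follows. The reference does not state the degree of the curve or how control points are chosen; the quadratic form, the function names and the sample coordinates below are assumptions made for this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quadratic_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1].

    p0 and p2 are the end points (for example, two adjacent points of
    interest) and p1 is a control point that sets the bend of the edge.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    a = (1.0 - t) ** 2
    b = 2.0 * (1.0 - t) * t
    c = t ** 2
    return (a * p0[0] + b * p1[0] + c * p2[0],
            a * p0[1] + b * p1[1] + c * p2[1])


def curved_edge(p0: Point, p1: Point, p2: Point, steps: int = 4) -> List[Point]:
    """Sample steps + 1 evenly spaced points along the curved edge."""
    return [quadratic_bezier(p0, p1, p2, i / steps) for i in range(steps + 1)]


# Hypothetical points: the edge starts at (0, 0), ends at (2, 0) and bends
# toward the control point (1, 2).
midpoint = quadratic_bezier((0.0, 0.0), (1.0, 2.0), (2.0, 0.0), 0.5)
```

Connecting consecutive points of interest with such curved edges yields a closed outline with rounded boundaries rather than a straight-sided polygon.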
Appellant further argues on Page 11 of the Appeal Brief, regarding Claim 8, that the combination of Santos et al. and Rav-Acha et al. fails to teach “the controller further performs the step comprising: identify attributes of a make-up within the video stream; and map the attributes to a database of products, wherein the attributes include at least one of color, transparency and glossiness”. Examiner respectfully disagrees. In Paragraph 94, Santos et al. teaches “Using the camera (111), the capture of images in real-time preview mode is carried out. The information and data, such as color palettes of products used in make-up and the photos with make-up applied are saved on the storage medium (112). The information input devices (113) correspond to the portion of hardware that intercepts keyboard events and touch of the user on the touch screen.” In Paragraph 100, Santos et al. teaches “FIG. 4 shows the interface used for application of make-up with two toolbars (410) and (411) containing several features, such as product selection, color selection, style selection, actions to redo/undo make-up among others. After choosing the product and color that will be used, the user does the application of make-up. This process is performed from the touch screen of the portable device, by which it is possible to use the user's fingers to apply make-up, mix colors or increase the intensity of a given product. The result of applying make-up is displayed on the display of the portable device (412).”
Appellant further argues on Page 11 of the Appeal Brief, regarding Claim 9, that Santos et al. and Rav-Acha et al. fail to teach “identifying a sclera within the frame; identifying an iris within the sclera; and, translating pixels belonging to the iris so that the iris appears central within the sclera”. Examiner respectfully disagrees. In Paragraph 105, Santos et al. teaches “After selecting only the region of the target, it tries to then locate the region of the face (611) and the eye region (612) and the mouth region (613) of the user. For this, we used an Artificial Intelligence area known as Machine Learning. Machine Learning Techniques for using a collection of data for "teaching machine" to answer questions on the same. The present invention employed these techniques to verify the existence of a face in a digital image.” In Figure 10, Santos et al. shows the determination of points along the eye which correspond to the central points. In Paragraph 122, Santos et al. teaches “After the process described above, is then calculated from the area bounding rectangle corresponding to the eye (1018) from the user. Equation 8 shows the calculation used to obtain the bounding rectangle of the eye, where min and max return respectively the smallest and largest value of a vector, X_e represents the horizontal coordinates and Y_e is the vertical coordinates of the eye of the user. The coordinates of the bounding rectangle are stored in I_e (left), r_e (right), t_e (top) and b_e (base). The result of this operation is illustrated in (1019). The points of interest are obtained from the eye (1020) from the calculated bounding rectangle in the process described in the preceding paragraph. Each eye has four points of interest illustrated in (1021).” In Paragraph 123, Santos et al. teaches “Equation 9 shows the calculation used to obtain points of interest, where, ponto_x0^olho, ponto_x1^olho, ponto_x2^olho and ponto_x3^olho represent the coordinates of the points in the x-axis, ponto_y0^olho, ponto_y1^olho, ponto_y2^olho and ponto_y3^olho represent the coordinates of points on the y-axis. The first point is located at the top center of the bounding rectangle and the second point at the bottom center. The third point and fourth point located in the center left and center right, respectively, of the bounding rectangle. From these points then you get the eye area of the user, besides serving as a parameter to obtain the regions around the eyes where you will apply the make-up.” The treatment and isolation of the eye as disclosed is clearly also an identification of the sclera and iris.
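For illustration only, the bounding-rectangle and point-of-interest calculations quoted from Paragraphs 122 and 123 of Santos et al. may be sketched as follows. This sketch is not part of the record and is not reproduced from Santos et al.; Equations 8 and 9 are not quoted in this Answer, so the function names, the Rect type, and the assumption that image coordinates increase downward (so that "top" is the smallest vertical coordinate) are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Rect(NamedTuple):
    """Bounding rectangle of the eye: left, right, top, base (hypothetical type)."""
    left: float
    right: float
    top: float
    base: float


def eye_bounding_rect(xs: Sequence[float], ys: Sequence[float]) -> Rect:
    """Bounding rectangle from the min and max of the eye coordinates.

    Assumes image coordinates with y increasing downward, so the top edge
    is min(ys) and the base is max(ys). This convention is an assumption.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    return Rect(left=min(xs), right=max(xs), top=min(ys), base=max(ys))


def eye_points_of_interest(rect: Rect) -> List[Tuple[float, float]]:
    """Four points: top center, bottom center, center left, center right."""
    cx = (rect.left + rect.right) / 2
    cy = (rect.top + rect.base) / 2
    return [
        (cx, rect.top),     # first point: top center
        (cx, rect.base),    # second point: bottom center
        (rect.left, cy),    # third point: center left
        (rect.right, cy),   # fourth point: center right
    ]
```

Under these assumptions, eye coordinates with horizontal values 10, 14, 20, 16 and vertical values 5, 3, 6, 9 yield a rectangle with left 10, right 20, top 3 and base 9, and the four points of interest lie at the midpoints of its four sides.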
Appellant argues on Pages 11 and 12 of the Appeal Brief, regarding Claim 10, that Santos et al. and Rav-Acha et al. fail to teach “digital processing required to make the displayed video appears like a mirror image”. Examiner respectfully disagrees. In Paragraph 96, Santos et al. teaches “Many portable devices are equipped with two cameras, one front and one rear, where in general, the former having a higher quality and more features. However, with the front camera is possible to simulate a mirror, since it is located on the same side of the display, which makes it possible to perform self-shoots. Taking advantage of this feature, the present invention provides the user two capture options: 1) by means of the front camera, you can use your device as a mirror, because the user can view the result of his self-portrait before making the capture of the same, and 2) with the rear camera, capturing pictures with higher resolutions.” The display of the front camera’s image to mimic a mirror necessarily requires associated digital processing.
Regarding Appellant’s arguments on Page 12 of the Appeal Brief concerning Claims 25 and 26, namely that the references fail to show certain features of Appellant’s invention, it is noted that the features upon which Appellant relies (i.e., specific definition of the term “make-up looks”) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
For the above reasons, it is believed that the rejections should be sustained.
Respectfully submitted,
/FARHAN MAHMUD/Primary Examiner, Art Unit 2483
Conferees:
/JOSEPH G USTARIS/Supervisory Patent Examiner, Art Unit 2483

/Dave Czekaj/Supervisory Patent Examiner, Art Unit 2487

Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.