DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 2, 7, 11, and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Spizhevoy et al (US10448881) in view of Shen et al (US20190295223).
Regarding claim 1, Spizhevoy teaches a method performed by one or more processing resources of a computer system, the method comprising: 
providing a deep neural network (DNN) implementing a facial feature extraction algorithm and a quality prediction algorithm that share a common DNN backbone of the DNN (116 and 120 in fig. 1, col. 4 lines 20-38, For example, a convolutional neural network such as a deep neural network (DNN) can be used to perform both eye image segmentation and image quality estimation. A CNN for performing both eye image segmentation and image quality estimation can have a merged architecture); and 
training the DNN (col. 2 lines 18-32, obtaining a training set of eye images; providing a convolutional neural network with the training set of eye images; and training the convolutional neural network with the training set of eye images) to jointly perform (i) facial feature extraction in accordance with the facial feature extraction algorithm and (ii) a quality score in accordance with the quality prediction algorithm (col. 6 lines 46-49, col. 40 lines 18-24, Accordingly, the shared layers can be advantageously trained simultaneously when training the segmentation tower 104 and the quality estimation tower 108; For example, one or more additional operations can be performed before, after, simultaneously, or between any of the illustrated operations) by: 
training the common DNN backbone and the facial feature extraction algorithm based on a first training dataset including a plurality of training images (col. 13 lines 41-59, The process of training a CNN 100 is the process of presenting the CNN 100 with a training set of eye images 124. The training set can include both input data and corresponding reference output data. Thus, in some implementations, a CNN 100 having a merged architecture is trained, using a training set of eye images 124, to learn segmentations and quality estimations of the eye images 124); and 
training the quality prediction algorithm (col. 13 lines 59-62, The quality estimation tower 108 being trained can process an eye image 124 of the training set to generate a quality estimation tower output 132 of the eye image 124) based on different training images (col. 13 lines 54-62, it is implied that the segmentation tower and the quality estimation tower have different eye images as inputs for training) wherein a first category of the plurality of categories includes a first subset of the example images representative of those of the example images for which a facial feature extraction algorithm cannot be performed (col. 6 lines 32-35) and wherein a second category of the plurality of categories includes a second subset of the example images representative of those of the example images that are ideal for the facial feature extraction algorithm (col. 6 lines 29-32).

Spizhevoy fails to teach training the quality prediction algorithm based on a second training dataset while holding fixed the common DNN backbone, wherein the second training dataset includes example images each labeled with a score value associated with a particular category of a plurality of categories, wherein a third category of the plurality of categories includes a third subset of the example images representative of those of the example images having a suitability for the facial feature extraction algorithm between that of the first category and the second category.
However Spizhevoy does teach wherein the trained quality prediction algorithm (col. 13 lines 59-62) classifies an image as good quality if the probability of the image exceeds a high quality threshold such as 75%, 85%, or 95% (col. 6 lines 29-32) and classifies an image as bad quality if the probability of the image is less than a low quality threshold such as 25%, 15%, or 5% (col. 6 lines 32-35). The good quality images are interpreted to be images that are ideal for the facial feature extraction algorithm and the bad quality images are interpreted to be images for which a facial feature extraction algorithm cannot be performed. One of ordinary skill in the art would have found it obvious to try or label all images in which the probability falls in between the high quality threshold and low quality threshold, such as 26% to 74%, as a medium quality image to yield predictable results.
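For illustration only, the three-way labeling described above can be sketched as follows. The function name `quality_category`, the default thresholds of 75% and 25%, and the treatment of the boundary values (chosen so that the middle band matches the 26% to 74% example) are assumptions made for this sketch and are not taken from Spizhevoy.

```python
def quality_category(probability, high=0.75, low=0.25):
    """Map a quality probability in [0, 1] to a coarse category.

    Assumption for this sketch: values at or above `high` are "good" and
    values at or below `low` are "bad", so that the middle band corresponds
    to the 26%-74% example range.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= high:
        return "good"      # interpreted as ideal for feature extraction
    if probability <= low:
        return "bad"       # interpreted as unusable for feature extraction
    return "medium"        # everything between the two thresholds
```

Under these assumptions, a probability of 0.50 falls in neither the good nor the bad band and is labeled as medium quality.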

	Furthermore, Shen teaches training a quality prediction algorithm (fig. 3) based on a training dataset while holding fixed the common DNN backbone (it is unclear what is meant by “holding fixed the common DNN backbone”. The examiner interprets this limitation to mean using a training dataset only for the quality prediction algorithm. With that said, Shen teaches training images used to train a quality prediction algorithm in para. [0039] and fig. 3), wherein the training dataset includes example images each labeled with a score value associated with a particular category of a plurality of categories (304 and 308 in fig. 3, para. [0036], [0039], [0075], [0105], Utilizing such a range of aesthetic attribute scores can ensure that the input images used to train the aesthetic enhancement neural network are not too poor in quality (e.g., due to camera shake, blur, image darkness) but are also not too high in quality that the aesthetic enhancement neural network fails to learn to enhance images; At block 304, the images can be scored. In an embodiment, scoring can be carried out by evaluating various attributes of the images based on traditional photographic principles).
	Therefore taking the combined teachings of Spizhevoy and Shen as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the steps of Shen into the method of Spizhevoy. The motivation to combine Shen and Spizhevoy would be to ensure that the input images used to train the aesthetic enhancement neural network are not too poor in quality (e.g., due to camera shake, blur, image darkness) that the network would have a difficult time learning to enhance the aesthetics but are also not too high in quality that the aesthetic enhancement neural network fails to learn to enhance images (para. [0077] of Shen).
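For illustration only, the interpretation applied above, in which a score-labeled training dataset is used to train only the quality prediction algorithm, can be sketched in plain Python as follows. Every name here (`backbone`, `quality_head`, `train_head_only`), the tiny linear model, the random stand-in data, and the score labels 0.0, 0.5, and 1.0 are hypothetical and are not drawn from Spizhevoy, Shen, or the application.

```python
import random

def backbone(x, w):
    # Shared feature extractor: one linear layer followed by ReLU.
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]

def quality_head(features, v, b):
    # Linear quality-score head on top of the shared features.
    return sum(vi * fi for vi, fi in zip(v, features)) + b

def mean_squared_error(data, w, v, b):
    return sum((quality_head(backbone(x, w), v, b) - y) ** 2
               for x, y in data) / len(data)

def train_head_only(data, w, v, b, lr=0.01, epochs=200):
    """Gradient descent on the head parameters (v, b) only.

    The backbone weights w are read but never updated.
    """
    v = list(v)
    for _ in range(epochs):
        for x, y in data:
            f = backbone(x, w)
            err = quality_head(f, v, b) - y
            v = [vi - lr * 2 * err * fi for vi, fi in zip(v, f)]
            b = b - lr * 2 * err
    return v, b

rng = random.Random(0)
# Stand-in for a backbone that was already trained on a first dataset.
w = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
w_before = [row[:] for row in w]
# Hypothetical second dataset: (image stand-in, score label) pairs.
data = [([rng.uniform(0, 1) for _ in range(3)], rng.choice([0.0, 0.5, 1.0]))
        for _ in range(20)]

v0, b0 = [0.0] * 4, 0.0
loss_before = mean_squared_error(data, w, v0, b0)
v1, b1 = train_head_only(data, w, v0, b0)
loss_after = mean_squared_error(data, w, v1, b1)
```

In this sketch the second dataset changes only the head parameters, and the shared weights `w` remain equal to `w_before` after training.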


Regarding claim 2, the modified invention of Spizhevoy teaches a method wherein the plurality of categories comprises three categories (col. 6 lines 27-35 of Spizhevoy, an image probability which falls in between the high quality threshold and low quality threshold, such as 26% to 74%, would necessarily have a separate category such as a medium quality image).


Regarding claim 7, the modified invention of Spizhevoy fails to explicitly teach a method wherein the plurality of categories comprises six categories. However Spizhevoy does teach a plurality of categories based on a quality threshold (col. 6 lines 22-35 of Spizhevoy). One of ordinary skill in the art would have been able to categorize six ranges as quality thresholds (for example 0%-10%, 11%-25%, etc) among a finite number of ranges of percentages to yield predictable results. Furthermore, the claim does not specify what each category entails.
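For illustration only, dividing a quality probability into six ranges can be sketched as follows. Only the first two ranges (0%-10% and 11%-25%) come from the example above; the remaining cut points in `EDGES` and the function name `six_way_category` are hypothetical choices made for this sketch.

```python
from bisect import bisect_left

# Five cut points give six ranges. The first two follow the example ranges
# 0%-10% and 11%-25%; the last three are hypothetical.
EDGES = [0.10, 0.25, 0.50, 0.75, 0.90]

def six_way_category(probability):
    """Return a category index from 0 (lowest range) to 5 (highest range)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # bisect_left places a value equal to a cut point in the lower range,
    # so 0.10 stays in category 0 and 0.25 stays in category 1.
    return bisect_left(EDGES, probability)
```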



Regarding claim 11, the limitations are similar to those claimed in claim 1, and the claim is therefore rejected for the same reasons as stated above.


Regarding claim 13, the limitations are similar to those claimed in claim 1, and the claim is therefore rejected for the same reasons as stated above. Furthermore, Spizhevoy teaches a processor (col. 1 lines 57-58 of Spizhevoy) and a computer-readable medium (col. 38 lines 27-32 of Spizhevoy).


Claim(s) 3, 5, 6, 8-10, 12, 14, and 16-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Spizhevoy et al (US10448881) and Shen et al (US20190295223) in view of Purwar et al (US20180352150).
Regarding claim 3, the modified invention of Spizhevoy fails to teach a method wherein the third subset of the example images comprise images containing an unrecognizable face.
However Purwar teaches a quality prediction module (para. [0006], [0061] selfie quality index module; SQI modules may contain a Blur Detection Module) which is a trained convolutional neural network (para. [0062], The Blur Detection module may be configured as a CNN trained). The quality prediction module includes the Blur Detection Module which determines a blur detection score (para. [0065]). In one example, Purwar teaches a blur detection score of less than 4, such as 3 for example, and indicates that the image is too blurry (para. [0065]). An image that is too blurry is interpreted to be unrecognizable. One of ordinary skill in the art would have found it obvious to train the Blur Detection Module to determine which images yield a detection score of 3, which is interpreted to be a third subset.
Therefore taking the combined teachings of Spizhevoy and Shen with Purwar as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the steps of Purwar into the method of Spizhevoy and Shen. The motivation to combine Shen, Purwar and Spizhevoy would be to improve the quality of images taken (para. [0005] of Purwar) which results in better image analysis results (para. [0016] of Purwar).


Regarding claim 5, the modified invention of Spizhevoy fails to teach a method wherein the third subset of the example images comprise images each containing a face that is easily recognizable and capable of differentiation by someone unfamiliar with the face despite the images being low resolution or the face being blurred.
However Purwar teaches training a quality prediction module which includes a Blur Detection module (para. [0061], [0062], [0664], selfie quality index module; SQI modules may contain a Blur Detection Module; The Blur Detection module may be configured as a CNN trained; the score and/or other results of the analysis by the Blur Detection module may be provided to the system's training logic to further train the Blur Detection module). The Blur Detection module outputs scores indicating an image contains blur but may still be analyzed (para. [0065], the system may provide indicate to a user that there is blur present in the image but still allow the image to be analyzed). It would be necessary to train the Blur Detection module to determine the allowable amount of blur in an image. It is further interpreted that if the image may be analyzed, then objects in the image, such as a face, are easily recognizable as opposed to when the Blur Detection module determines that the image is too blurry to be analyzed (para. [0065]). Also, the term “capable of differentiation by someone unfamiliar with the face” is not limiting and therefore is not given patentable weight. 
Therefore taking the combined teachings of Spizhevoy and Shen with Purwar as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the steps of Purwar into the method of Spizhevoy and Shen. The motivation to combine Shen, Purwar and Spizhevoy would be to improve the quality of images taken (para. [0005] of Purwar) which results in better image analysis results (para. [0016] of Purwar).


Regarding claim 6, the modified invention of Spizhevoy fails to teach a method wherein the third subset of the example images comprise images each containing a face that is easy to recognize despite the images having minor blur or an ill-posed nature of the face.
However Purwar teaches training a quality prediction module which includes a Blur Detection module (para. [0061], [0062], [0664], selfie quality index module; SQI modules may contain a Blur Detection Module; The Blur Detection module may be configured as a CNN trained; the score and/or other results of the analysis by the Blur Detection module may be provided to the system's training logic to further train the Blur Detection module). The Blur Detection module outputs scores indicating an image contains blur but may still be analyzed (para. [0065], the system may provide indicate to a user that there is blur present in the image but still allow the image to be analyzed). It would be necessary to train the Blur Detection module to determine the allowable amount of blur in an image. It is further interpreted that if the image may be analyzed, then objects in the image, such as a face, are easily recognizable as opposed to when the Blur Detection module determines that the image is too blurry to be analyzed (para. [0065]). 
Therefore taking the combined teachings of Spizhevoy and Shen with Purwar as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the steps of Purwar into the method of Spizhevoy and Shen. The motivation to combine Shen, Purwar and Spizhevoy would be to improve the quality of images taken (para. [0005] of Purwar) which results in better image analysis results (para. [0016] of Purwar).


Regarding claim 8, Spizhevoy teaches a method performed by one or more processing resources of a computer system, the method comprising: 
receiving an image (124 in fig. 1); 
predicting a suitability of the image for performing a facial feature extraction algorithm on the image by performing a quality prediction algorithm on the image (108 in fig. 1, col. 6 lines 24-35); 
wherein the quality prediction algorithm and the facial feature extraction algorithm are jointly performed by a deep neural network (col. 6 lines 46-49, col. 40 lines 18-24, Accordingly, the shared layers can be advantageously trained simultaneously when training the segmentation tower 104 and the quality estimation tower 108; For example, one or more additional operations can be performed before, after, simultaneously, or between any of the illustrated operations) that has been trained based on a training dataset including example images (col. 2 lines 18-32, obtaining a training set of eye images; providing a convolutional neural network with the training set of eye images; and training the convolutional neural network with the training set of eye images); 
wherein the facial feature extraction algorithm and the quality prediction algorithm share a common DNN backbone of the DNN (116 and 120 in fig. 1, col. 4 lines 20-38, For example, a convolutional neural network such as a deep neural network (DNN) can be used to perform both eye image segmentation and image quality estimation. A CNN for performing both eye image segmentation and image quality estimation can have a merged architecture); 
wherein a first category of the plurality of categories includes a first subset of the example images representative of those of the example images for which the facial feature extraction algorithm cannot be performed (col. 6 lines 32-35); and
wherein a second category of the plurality of categories includes a second subset of the example images representative of those of the example images that are ideal for the facial feature extraction algorithm (col. 6 lines 29-32). 

Spizhevoy fails to explicitly teach wherein a third category of the plurality of categories includes a third subset of the example images representative of those of the example images having a suitability for the facial feature extraction algorithm between that of the first category and the second category. However Spizhevoy does teach wherein the trained quality prediction algorithm (col. 13 lines 59-62) classifies an image as good quality if the probability of the image exceeds a high quality threshold such as 75%, 85%, or 95% (col. 6 lines 29-32) and classifies an image as bad quality if the probability of the image is less than a low quality threshold such as 25%, 15%, or 5% (col. 6 lines 32-35). The good quality images are interpreted to be images that are ideal for the facial feature extraction algorithm and the bad quality images are interpreted to be images for which a facial feature extraction algorithm cannot be performed. One of ordinary skill in the art would have found it obvious to try or label all images in which the probability falls in between the high quality threshold and low quality threshold, such as 26% to 74%, as a medium quality image to yield predictable results.

Spizhevoy further fails to teach a training dataset including example images each labelled with a score value associated with a particular category of a plurality of categories. However Shen teaches wherein a training dataset includes example images each labeled with a score value associated with a particular category of a plurality of categories (304 and 308 in fig. 3, para. [0036], [0039], [0075], [0105], Utilizing such a range of aesthetic attribute scores can ensure that the input images used to train the aesthetic enhancement neural network are not too poor in quality (e.g., due to camera shake, blur, image darkness) but are also not too high in quality that the aesthetic enhancement neural network fails to learn to enhance images; At block 304, the images can be scored. In an embodiment, scoring can be carried out by evaluating various attributes of the images based on traditional photographic principles).
	Therefore taking the combined teachings of Spizhevoy and Shen as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the steps of Shen into the method of Spizhevoy. The motivation to combine Shen and Spizhevoy would be to ensure that the input images used to train the aesthetic enhancement neural network are not too poor in quality (e.g., due to camera shake, blur, image darkness) that the network would have a difficult time learning to enhance the aesthetics but are also not too high in quality that the aesthetic enhancement neural network fails to learn to enhance images (para. [0077] of Shen).

Spizhevoy also fails to teach when the suitability is greater than a predetermined quality threshold, extracting facial features from a face contained within the image by applying the facial feature extraction algorithm. However Purwar teaches when a suitability is greater than a predetermined quality threshold (para. [0046], if the SQI score satisfies the threshold value), extracting facial features from a face contained within an image by applying a facial feature extraction algorithm (para. [0047], The image processing modules may be configured to locate and/or count faces in an image, extract different zones from a face in an image, register the face using one or more facial features as landmarks, and/or normalize the face to a common coordinate system).
Therefore taking the combined teachings of Spizhevoy and Purwar as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the steps of Purwar into the method of Spizhevoy. The motivation to combine Purwar and Spizhevoy would be to improve the quality of images taken (para. [0005] of Purwar) which results in better image analysis results (para. [0016] of Purwar).
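For illustration only, the conditional step discussed above, in which facial features are extracted only when the predicted suitability is greater than a predetermined quality threshold, can be sketched as follows. The function `extract_if_suitable`, its callable parameters, and the default threshold of 0.5 are hypothetical and are not taken from Spizhevoy or Purwar.

```python
def extract_if_suitable(image, predict_quality, extract_features, threshold=0.5):
    """Run feature extraction only when the predicted suitability exceeds the threshold.

    `predict_quality` and `extract_features` are caller-supplied callables that
    stand in for the quality prediction and facial feature extraction algorithms.
    Returns (score, features), where features is None for a rejected image.
    """
    score = predict_quality(image)
    if score > threshold:
        return score, extract_features(image)
    return score, None
```

An image whose score does not exceed the threshold is returned with no features, so the extraction step is skipped rather than run on an unsuitable image.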


	Regarding claim 9, the modified invention of Spizhevoy teaches a method wherein the image is extracted from a plurality of video frames generated by a video camera (col. 26 lines 62-67 and col. 38 lines 19-26 of Spizhevoy).


	Regarding claim 10, the modified invention of Spizhevoy teaches a method wherein the computer system is part of a surveillance system (col. 26 lines 34-46 of Spizhevoy, remote processing of video information is interpreted to be part of a surveillance system).


Regarding claim 12, the limitations are similar to those claimed in claim 8, and the claim is therefore rejected for the same reasons as stated above.


Regarding claim 14, the limitations are similar to those claimed in claim 3, and the claim is therefore rejected for the same reasons as stated above.


Regarding claim 16, the limitations are similar to those claimed in claim 5, and the claim is therefore rejected for the same reasons as stated above.


Regarding claim 17, the limitations are similar to those claimed in claim 6, and the claim is therefore rejected for the same reasons as stated above.


Regarding claim 18, the limitations are similar to those claimed in claim 8, and the claim is therefore rejected for the same reasons as stated above. Furthermore, Spizhevoy teaches a processor (col. 1 lines 57-58 of Spizhevoy) and a computer-readable medium (col. 38 lines 27-32 of Spizhevoy).


Regarding claim 19, the limitations are similar to those claimed in claim 9, and the claim is therefore rejected for the same reasons as stated above.


Regarding claim 20, the limitations are similar to those claimed in claim 10, and the claim is therefore rejected for the same reasons as stated above.


Claim(s) 4 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Spizhevoy et al (US10448881) and Shen et al (US20190295223) in view of Lin et al (US20190213474).
Regarding claim 4, the modified invention of Spizhevoy fails to teach a method wherein the third subset of the example images comprise images each containing a face that is recognizable, but which is difficult to differentiate except by someone familiar with the face.
However Lin teaches example images input to a quality prediction algorithm (para. [0077], the facial quality convolutional neural network is a separate network that has already been trained to assess the qualities of faces present in frames. The training of this facial quality convolutional neural network relies on manually annotated faces with scores set as “0,” “0.5,” and “1.”). The images comprise faces that are recognizable (para. [0077], faces with a score of “0.5” and “1” are interpreted to be recognizable). The limitation “difficult to differentiate except by someone familiar with the face” is unclear because the term difficult is subjective. Any face with a score of “0.5” may be interpreted to be difficult to differentiate except by someone familiar with the face. 
Therefore taking the combined teachings of Spizhevoy and Shen with Lin as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the steps of Lin into the method of Spizhevoy and Shen. The motivation to combine Shen, Lin and Spizhevoy would be to perform frame selection in an accurate and computationally efficient manner (para. [0001] of Lin).
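For illustration only, a training set of faces annotated with the scores "0," "0.5," and "1" can be grouped into one subset per score as sketched below. The function `group_by_score` and the file names used in the usage lines are hypothetical and do not appear in Lin.

```python
from collections import defaultdict

def group_by_score(labeled_examples):
    """Group (example, score) pairs into one subset per distinct score value."""
    groups = defaultdict(list)
    for example, score in labeled_examples:
        groups[score].append(example)
    return dict(groups)

# Hypothetical annotated faces; the scores follow the 0 / 0.5 / 1 labeling scheme.
annotated = [("face_a.png", 0), ("face_b.png", 0.5), ("face_c.png", 1), ("face_d.png", 0.5)]
subsets = group_by_score(annotated)
```

In this sketch the faces scored 0.5 form the intermediate subset, separate from the faces scored 0 and the faces scored 1.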


Regarding claim 15, the limitations are similar to those claimed in claim 4, and the claim is therefore rejected for the same reasons as stated above.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEON VIET Q NGUYEN whose telephone number is (571)270-1185. The examiner can normally be reached Mon-Fri 11AM-7PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached at 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LEON VIET Q NGUYEN/           Primary Examiner, Art Unit 2663