DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Preliminary Remarks
This is a reply to the arguments filed on 01/28/2021, in which claims 1, 8, 11, 18, 21, and 22 are amended and claims 4 and 14 are canceled. Claims 1-3, 7-8, 11-13, 17-18, and 20-22 remain pending in the present application, with claims 1, 11, 21, and 22 being independent claims.
When making claim amendments, the applicant is encouraged to consider the references in their entireties, including those portions that have not been cited by the examiner and their equivalents as they may most broadly and appropriately apply to any particular anticipated claim amendments.

Response to Arguments
Applicant's arguments with respect to amended claim 1 have been considered but are not persuasive.
On pages 10-11, Applicant argues that, “Goto (US 2017/0185843), Zwol (US 2013/0142418), Chen (US 2016/0014482), and Cordova-Diba (US 2016/0042251) fail to disclose or suggest the learning model including one or more neural networks, and determining the recommended score of a plurality of candidate frames, based on at least one of a size of the selected object, a brightness of the selected object, or a focus in the database ... several images from the database may be a match to the input image” Cordova-Diba at ¶¶ 74-75.”
In response, Examiner respectfully disagrees. Reference van Zwol discloses selecting representative images for video items using a trained machine learning engine, wherein a training set, which includes input parameter values and an externally-generated score, is fed to a machine learning engine; once a machine learning model has been generated based on the training set, input parameters for unscored images are fed to the trained machine learning engine; and the trained machine learning engine 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole 

Claims 1-3, 7-8, 11-13, 17-18, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Goto et al. (US 20170185843 A1, hereinafter referred to as “Goto”) in view of van Zwol et al. (US 20130142418 A1, hereinafter referred to as “van Zwol”), further in view of Chen et al. (US 20160014482 A1, hereinafter referred to as “Chen”), and further in view of Cordova-Diba et al. (US 20160042251 A1, hereinafter referred to as “Cordova-Diba”).
Regarding claim 1, Goto discloses an image display apparatus comprising: 
a display (see Goto, FIG. 1 and paragraph [0028]: “a display”); 
a memory storing instructions (see Goto, paragraph [0154]: “computer executable instructions from the storage medium”); and 
a processor (see Goto, FIG. 1, CPU) configured to execute the instructions stored in the memory, which when executed the processor: 
controls the display to output video content (see Goto, paragraph [0038]: “The image data output from the layout information output unit 215 is displayed on the display”); and
receives a user input for selecting a frame from among a plurality of frames constituting the video content, the frame comprising an object (see Goto, paragraph [0045]: “The display screen 301 also includes a main subject specifying icon 304. The main subject specifying icon 304 is an icon that allows the user to specify a main subject that is to be identified as a subject of interest from among subjects in analysis-target images (e.g., photographs)… The user can manually select a main subject by clicking the main subject specifying icon 304 by using the pointing device 107 to display 
Regarding claim 1, Goto discloses all the claimed limitations with the exception of determines a plurality of candidate frames including the selected object from among the plurality of frames based on a similarity between the selected object and one or more objects included within each of the plurality of frames; determines a recommendation score of each of the plurality of candidate frames by using a learning model including one or more neural networks, the learning model determining the recommendation score of each of the plurality of candidate frames, based on at least one of a size of the selected object, a brightness of the selected object, or a focus of the selected object in the plurality of candidate frames; determines a recommended frame that includes an optimal image of the selected object from the plurality of candidate frames, based on the recommendation score of each of the plurality of candidate frames; and controls the display to output the recommended frame that includes the optimal image of the selected object.
van Zwol from the same or similar fields of endeavor discloses determines a recommendation score of each of the plurality of candidate frames by using a learning model (see van Zwol, paragraph [0055]: “a computationally expensive algorithm may be used to generate relatively accurate scores for candidate images from a small fraction of the total number of videos in a collection. Those scores may be used to train the machine learning engine 308, and the trained machine learning engine 308 may then be used to generate representativeness scores for the remainder of the video collection”),

determines a recommended frame that includes an optimal image of the selected object from the plurality of candidate frames (see van Zwol, paragraph [0035]: “Once the candidate images have been extracted from the video, a trained machine learning engine is used to determine which images are most representative of the video”), based on the recommendation score of each of the plurality of candidate frames (see van Zwol, paragraph [0055]: “a computationally expensive algorithm may be used to generate relatively accurate scores for candidate images from a small fraction of the total number of videos in a collection. Those scores may be used to train the machine learning engine 308, and the trained machine learning engine 308 may then be used to generate representativeness scores for the remainder of the video collection”); and
controls the display to output the recommended frame that includes the optimal image of the object (see van Zwol, paragraph [0120]: “an explore/exploit process may be followed during which, for the same video items, different users are shown different 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of van Zwol with the teachings of Goto. The motivation for doing so would be to give the system the ability to use the method disclosed in van Zwol to select representative images for video items using a trained machine learning engine; to feed a training set to a machine learning engine, wherein the training set includes, for each image in the training set, input parameter values and an externally-generated score; to generate scores for the images based on the machine learning model; to select a representative image for a particular video item, wherein candidate images for that particular video item may be ranked based on their scores; and to display the candidate image with the top score, thus determining a plurality of candidate frames including the object from among the plurality of frames; determining a recommended frame that includes an optimal image of the object from the plurality of candidate frames; determining a recommendation score of each of the plurality of candidate frames by a learning model based on attribute information of areas of the object in the plurality of candidate frames; and displaying the recommended frame that includes the optimal image of the object, in order to automatically determine an image in a video having a best picture of an object selected by a user.
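As a rough illustration of the score-and-select flow discussed above, the following Python sketch scores each candidate frame and returns the one with the top score. The `Candidate` fields, the weights in `score_frame`, and the function names are hypothetical and are not taken from Goto, van Zwol, or the claims; in particular, a fixed weighted sum stands in for a trained machine learning engine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    frame_index: int
    size: float        # assumed: fraction of the frame occupied by the object, 0..1
    brightness: float  # assumed: normalized brightness of the object, 0..1
    focus: float       # assumed: normalized sharpness of the object, 0..1


def score_frame(c: Candidate) -> float:
    # Hypothetical stand-in for a trained model: a fixed weighted sum.
    return 0.4 * c.size + 0.2 * c.brightness + 0.4 * c.focus


def recommend(candidates: List[Candidate],
              scorer: Callable[[Candidate], float] = score_frame) -> Candidate:
    if not candidates:
        raise ValueError("no candidate frames")
    # Return the candidate frame with the highest score.
    return max(candidates, key=scorer)
```

Any scorer with the same signature, including a wrapper around a trained model, could be passed as `scorer` without changing the selection step.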
Regarding claim 1, the combination teachings of Goto and van Zwol as discussed above disclose all the subject matter of the claimed invention with the exceptions of determines a plurality of candidate frames including the selected object 
Chen from the same or similar fields of endeavor discloses [determines a recommendation score of each of the plurality of candidate frames by using a learning model] including one or more neural networks (see Chen, paragraph [0090]: “a supervised learning approach such as (but not limited to) the use of techniques including (but not limited to) a support vector machine, a neural network classifier, and/or a decision tree classifier are utilized to implement a segment that can identify segmentation boundaries”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Chen with the teachings of Goto and van Zwol. The motivation for doing so would be to give the system the ability to use the learning approach disclosed in Chen, including a neural network classifier, to identify candidate images, thus using one or more neural networks in a learning model so that the model learner can be used for determining the recommended image from the plurality of images.
Regarding claim 1, the combination teachings of Goto, van Zwol, and Chen as discussed above disclose all the subject matter of the claimed invention with the exceptions of determines a plurality of candidate frames including the selected object from among the plurality of frames based on a similarity between the selected object and one or more objects included within each of the plurality of frames.
Cordova-Diba from the same or similar fields of endeavor discloses determines a plurality of candidate frames including the selected object from among the plurality of frames based on a similarity between the selected object and one or more objects included within each of the plurality of frames (see Cordova-Diba, paragraph [0061]: “objects located (i.e., detected) and identified in content of a digital media asset (e.g., in an image or the frame of a video) are made available to a user for direct interaction (e.g., through an interactive client application)” and paragraphs [0142]-[0145]: “Each object in the list of reference objects is associated with confidence score(s), representing a relative degree of visual similarity to the object candidate, and the weight value(s) calculated in step 850… for each reference object in the list of reference objects produced by step 855, a combined matching score is calculated (e.g., by data fusion module 330) based on the confidence scores and weight values associated with that reference object… the next object candidate in a list of object candidates generated by step 830, if any, is selected”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Cordova-Diba with the teachings of Goto, van Zwol, and Chen. The motivation for doing so would be to give the system the ability to use the method disclosed in Cordova-Diba to locate and identify objects in content of a digital media asset, such as an image or the frame of a video; to associate each object in the list of reference objects with confidence score(s), wherein the confidence score(s) represent a relative degree of visual similarity to the object candidate; and to calculate the matching score based on the confidence scores and weight values associated with that reference object, thus determining a plurality of candidate frames including the selected object from among the plurality of frames based on a similarity between the selected object and one or more objects included within each of the plurality of frames, in order to determine the recommended frame from a plurality of learning frames.
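The combined matching score quoted from Cordova-Diba can be pictured with a short Python sketch. The weighted-average fusion rule, the threshold value, and all names below are assumptions made for illustration only; the cited paragraphs state only that a combined matching score is calculated based on the confidence scores and weight values.

```python
from typing import Dict, List, Sequence


def combined_matching_score(confidences: Sequence[float],
                            weights: Sequence[float]) -> float:
    # Assumed fusion rule: weighted average of the confidence scores.
    if len(confidences) != len(weights) or not weights:
        raise ValueError("confidences and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(c * w for c, w in zip(confidences, weights)) / total_weight


def select_candidate_frames(frame_confidences: Dict[int, List[float]],
                            weights: Sequence[float],
                            threshold: float = 0.5) -> List[int]:
    # Keep the frames whose combined score for the selected object meets
    # the (hypothetical) similarity threshold, in frame order.
    return sorted(
        idx for idx, conf in frame_confidences.items()
        if combined_matching_score(conf, weights) >= threshold
    )
```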
Regarding claim 2, the combination teachings of Goto, van Zwol, Chen, and Cordova-Diba as discussed above also disclose the image display apparatus of claim 1, wherein the learning model is determined by, in response to inputting of a plurality of learning images to the one or more neural networks, training a method of determining a recommended image from the plurality of learning images (see Chen, paragraph [0090]: “machine learning techniques can be utilized to train a system to identify segmentation boundaries based upon a fused stream of segmentation cues. In a number of embodiments, a supervised learning approach such as (but not limited to) the use of techniques including (but not limited to) a support vector machine, a neural network classifier, and/or a decision tree classifier are utilized to implement a segment that can identify segmentation boundaries based upon a training data set of video streams in which segmentation boundaries are manually identified”).
The motivation for combining the references has been discussed in claim 1 above.
Regarding claim 3, the combination teachings of Goto, van Zwol, Chen, and Cordova-Diba as discussed above also disclose the image display apparatus of claim 1, wherein the learning model is determined by, in response to inputting of a plurality of learning images to the one or more neural networks, determining a recommendation score of each of the plurality of learning images (see Goto, paragraph [0035]: “An image 
The motivation for combining the references has been discussed in claim 1 above.
Regarding claim 7, the combination teachings of Goto, van Zwol, Chen, and Cordova-Diba as discussed above also disclose the image display apparatus of claim 1, wherein the processor when executing the instructions is further configured to: 
track the object in the plurality of frames (see Chen, paragraph [0096]: “a face detector is applied to some or all of the video frames in a video data stream”); and 
based on a tracking result, determine the plurality of candidate frames (see Chen, paragraph [0096]: “a face detector that can detect the presence of a face (without performing identification) is utilized to identify candidate anchor frames and then a facial recognition process is applied to the candidate anchor faces to detect anchor frames”).
The motivation for combining the references has been discussed in claim 1 above.
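The two-stage approach quoted from Chen for claim 7 (detect first, then recognize only on the detected frames) can be sketched as below. The `detect` and `recognize` callables are hypothetical placeholders supplied by the caller; no particular detector or recognition library is implied by the references.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def candidate_frame_indices(frames: Sequence[Frame],
                            detect: Callable[[Frame], bool],
                            recognize: Callable[[Frame], bool]) -> List[int]:
    # Stage 1: cheap detection pass over all frames.
    detected = [i for i, frame in enumerate(frames) if detect(frame)]
    # Stage 2: recognition only on the frames that passed detection.
    return [i for i in detected if recognize(frames[i])]
```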
Regarding claim 8, the combination teachings of Goto, van Zwol, Chen, and Cordova-Diba as discussed above also disclose the image display apparatus of claim 1, wherein the processor when executing the instructions is further configured to: 
recognize a plurality of objects in the frame (see Chen, paragraph [0064]: “recognizing elements within individual frames of video such as (but not limited to) text, faces, images”); and 

The motivation for combining the references has been discussed in claim 1 above.
Claim 11 is rejected for the same reasons as discussed in claim 1 above.
Claim 12 is rejected for the same reasons as discussed in claim 2 above.
Claim 13 is rejected for the same reasons as discussed in claim 3 above.
Claim 17 is rejected for the same reasons as discussed in claim 7 above.
Claim 18 is rejected for the same reasons as discussed in claim 8 above.
Regarding claim 20, the combination teachings of Goto, van Zwol, Chen, and Cordova-Diba as discussed above also disclose a non-transitory computer-readable recording medium having embodied thereon a program for executing the method of operating the image display apparatus of claim 11 (see Goto, FIG. 1, ROM 102, RAM 103, HDD 104).
The motivation for combining the references has been discussed in claim 1 above.
Claim 21 is rejected for the same reasons as discussed in claim 1 above. In addition, the combination teachings of Goto, van Zwol, Chen, and Cordova-Diba as discussed above also disclose determines a plurality of candidate frames including the object that sequentially precede the frame in the video content and sequentially succeed the frame in the video content from among the plurality of frames (see Cordova-Diba, paragraph [0064]: “a user of the network device may interact with objects 
The motivation for combining the references has been discussed in claim 1 above.
Claim 22 is rejected for the same reasons as discussed in claim 21 above.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NIENRU YANG whose telephone number is (571)272-4212.  The examiner can normally be reached on Monday - Friday 10 AM - 6 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THAI TRAN, can be reached on 571-272-7382.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 


NIENRU YANG
Examiner
Art Unit 2484



/NIENRU YANG/Examiner, Art Unit 2484

/THAI Q TRAN/Supervisory Patent Examiner, Art Unit 2484