DETAILED ACTION
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 9-15, and 21-26 are rejected under 35 U.S.C. 103 as being unpatentable over Davoust (US Patent 10271109 B1) in view of Chen et al. (US 20180014041).
Regarding claim 1, Davoust discloses a non-transitory computer-readable medium of a computer having computer-readable program instructions embodied thereon (col 12 lines 32-63), wherein said instructions, when executed, cause said computer to:
receive an audibly spoken question including a noun-phrase, a video stream comprising audio visual work, and metadata about said video stream (col 2 lines 18-50 video content 103 corresponding to a movie is rendered upon a display for viewing by a user; While watching the video content 103, the user presents a verbal query 106 in the form of a question: "Who is the man at the right?"; col 7 lines 40-50 The query response service 218 can then determine with respect to the time metadata 237 the items that are currently shown in the video content 103, within a predetermined threshold before or after the verbal query 106);
convert said audibly spoken question to text (col 7 lines 32-38 converting the audio received from the microphone 285 either to text or profile representations);
capture image data of a still frame of said video stream associated with a point in time of said video stream when said audibly spoken question is received (Fig. 3A-3C & col 7 lines 60-67 & col 8 lines 1-40 e.g. video content with grid being superimposed thereon); 
receive said text and extracting therefrom said noun-phrase, said extracted noun-phrase including a query subject (col 2 lines 18-50 While watching the video content 103, the user presents a verbal query 106 in the form of a question: "Who is the man at the right?"; col 7 lines 39-50 The query response service 218 performs natural language processing on the verbal query 106 to determine the items that are inquired about and the nature of the inquiry, e.g., who, what, when, where, why, how, etc.);
receive additional information about said identified audiovisual work, said additional information including the identity of said query subject (col 2 lines 30-49 The response 109 in this case specifies the character name ("George") and the name of the cast member who plays the character ("Jim Kingsboro"); In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display),
generate a textual description of the identity of said identified query subject (col 2 lines 30-49 The response 109 in this case specifies the character name ("George") and the name of the cast member who plays the character ("Jim Kingsboro"); In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display); and
generate a script comprising said noun-phrase and said textual description of said identified query subject (col 2 lines 30-49 In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display).
Davoust fails to specifically teach categorize said query subject as said audiovisual work, and based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image; using said received metadata, identify said audiovisual work of which said still image is a component part.
Chen teaches categorize said query subject as said audiovisual work (¶128 a user may search for all of the Brad Pitt scenes in Ocean's Eleven or all movie scenes containing Brad Pitt generally), and based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image (¶128 Image recognition may be performed on the key frames or individual key frames to identify faces, products, or corporate logos; The faces of actors/products/or logos may be identified in the key frames; therefore, the key frames that do not contain Brad Pitt would be frames in which the query subject is not depicted); using said received metadata, identify said audiovisual work of which said still image is a component part (¶128 The search may query the metadata that was gathered via image recognition of the key frames or may perform a search through the key frames of an asset in real time based on the search criteria of a user).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the invention to have implemented the teaching of categorize said query subject as said audiovisual work, and based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image; using said received metadata, identify said audiovisual work of which said still image is a component part from Chen into the medium as disclosed by Davoust. The motivation for doing this is to improve methods for presentation of key frames.

Regarding claim 2, Davoust discloses the medium of claim 1, wherein said audibly spoken question is converted to text by a speech recognition module (col 7 lines 32-50 query response service 218 performs natural language processing on the verbal query 106 to determine the items that are inquired about and the nature of the inquiry). 

Regarding claim 9, Davoust discloses the medium of claim 1, wherein said medium is included in a display device (Fig. 2 display 206). 

Regarding claim 10, Davoust discloses the medium of claim 9, wherein said display device is a smart television (col 5 lines 58-67 e.g. smart television). 

Regarding claim 11, Davoust discloses the medium of claim 1, wherein said medium is included in a mobile device (col 5 lines 58-67 e.g. cellular telephones). 

Regarding claim 12, Davoust discloses the medium of claim 1, wherein said video stream is received via a telecommunications network (col 2 lines 50-59 The network 209 includes, for example, the Internet, intranets, extranets, wide area networks (WANs), local area networks (LANs), wired networks, wireless networks, or other suitable networks, etc., cable networks, satellite networks, or any combination of two or more such networks). 

Regarding claim 13, Davoust discloses the medium of claim 1, wherein said computer-readable program instructions, when executed, cause said computer further to vocalize a response based at least in part on said script (col 2 lines 30-49 In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display). 

Regarding claim 14, Davoust discloses the medium of claim 13, wherein said vocalization is performed using a voice user interface (col 6 lines 56-65 The speech synthesizer 288 may be executed to generate audio corresponding to synthesized speech for textual inputs; The content information application 287 is executed to receive verbal queries 106 from users via the microphone 285 and to present responses 109 via the speech synthesizer 288 and the audio device 286). 

Regarding claim 15, Davoust discloses the medium of claim 14, wherein said voice user interface comprises a digital assistant (col 5 lines 58-67 e.g. personal digital assistants). 

Regarding claims 21-26 (drawn to a method):
The proposed combination of Davoust and Chen, explained in the rejection of CRM claims 1 & 9-13, renders obvious the steps of the method of claims 21-26 because these steps occur in the operation of the proposed combination as discussed above. Thus, arguments similar to those presented above for claims 1 & 9-13 are equally applicable to claims 21-26.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Davoust and Chen as applied to claim 1 above, and further in view of Tang (US 20200380292).
Regarding claim 8, the combination of Davoust and Chen discloses the medium of claim 7, but fails to teach wherein said categorizing said query subject comprises a target categorization module assigning a category based on said extracted noun-phrase.
Tang teaches wherein said categorizing said query subject comprises a target categorization module assigning a category based on said extracted noun-phrase (Tang ¶68 in step S430, a category corresponding to the largest one of the plurality of second feature similarities S2.sub.ref(i) may be determined as the category of the object).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the invention to have implemented the teaching of wherein said categorizing said query subject comprises a target categorization module assigning a category based on said extracted noun-phrase from Tang into the medium as disclosed by the combination of Davoust and Chen. The motivation for doing this is to improve methods and device for identifying an object.

Response to Arguments
Applicant's arguments filed 05/11/2022 have been fully considered but they are not persuasive. 
The applicant argues that the prior art of record does not teach “categorize said query subject as said audiovisual work”, “based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image” and “receive additional information about said identified audiovisual work, said additional information including the identity of said query subject”.
Regarding the above argument, the examiner respectfully disagrees. Regarding “categorize said query subject as said audiovisual work”, Chen teaches this in ¶128 a “user may search for all of the Brad Pitt scenes in Ocean's Eleven or all movie scenes containing Brad Pitt generally”. ¶128 continues with “the search may query the metadata that was gathered via image recognition of the key frames or may perform a search through the key frames of an asset in real time based on the search criteria of a user. The same image recognition may occur for corporate logos in movies, for example finding all of the scenes where the Coca-Cola logo is displayed.” That is, the service finds all frames containing Brad Pitt, or any query subject. Under the broadest reasonable interpretation, this reads on “categorize said query subject as said audiovisual work” because the query subject is recognized or categorized in the frames.
Next, Chen teaches “based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image” in ¶128 Image recognition may be performed on the key frames or individual key frames to identify faces, products, or corporate logos; The faces of actors/products/or logos may be identified in the key frames; therefore, the key frames that do not contain Brad Pitt would be frames in which the query subject is not depicted. That is, after the service knows what query subject to search for (e.g. Brad Pitt), it identifies frames that include the query subject. The frames that do not include the query subject read on “determine that said query subject is not depicted in said still image”.
Lastly, regarding the limitation “receive additional information about said identified audiovisual work, said additional information including the identity of said query subject”, Davoust teaches this in col 2 lines 30-49 The response 109 in this case specifies the character name ("George") and the name of the cast member who plays the character ("Jim Kingsboro"); In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display. That is, more information is received about the query subject including the character name and the cast member who plays the character. 
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, Chen teaches categorize said query subject as said audiovisual work in ¶128 a user may search for all of the Brad Pitt scenes in Ocean's Eleven or all movie scenes containing Brad Pitt generally. Chen then teaches based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image in ¶128 Image recognition may be performed on the key frames or individual key frames to identify faces, products, or corporate logos; The faces of actors/products/or logos may be identified in the key frames; therefore, the key frames that do not contain Brad Pitt would be frames in which the query subject is not depicted. And lastly, Davoust teaches receive additional information about said identified audiovisual work, said additional information including the identity of said query subject in col 2 lines 30-49 The response 109 in this case specifies the character name ("George") and the name of the cast member who plays the character ("Jim Kingsboro"); In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN KY, whose telephone number is (571)272-7648. The examiner can normally be reached Monday-Friday, 9 AM-5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at 571-272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KEVIN KY/Primary Examiner, Art Unit 2669