DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/31/2022 has been entered.
 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 9-15, and 21-26 are rejected under 35 U.S.C. 103 as being unpatentable over Davoust (US Patent 10271109 B1) in view of Chen et al (US 20180014041).
Regarding claim 1, Davoust discloses a non-transitory computer-readable medium of a computer having computer-readable program instructions embodied thereon (col 12 lines 32-63), wherein said instructions, when executed, cause said computer to:
receive an audibly spoken question including a noun-phrase, a video stream comprising audio visual work, and metadata about said video stream (col 2 lines 18-50 video content 103 corresponding to a movie is rendered upon a display for viewing by a user; While watching the video content 103, the user presents a verbal query 106 in the form of a question: "Who is the man at the right?"; col 7 lines 40-50 The query response service 218 can then determine with respect to the time metadata 237 the items that are currently shown in the video content 103, within a predetermined threshold before or after the verbal query 106);
convert said audibly spoken question to text (col 7 lines 32-38 converting the audio received from the microphone 285 either to text or profile representations);
capture image data of a still frame of said video stream associated with a point in time of said video stream when said audibly spoken question is received (Fig. 3A-3C & col 7 lines 60-67 & col 8 lines 1-40 e.g. video content with grid being superimposed thereon); 
receive said text and extracting therefrom said noun-phrase, said extracted noun-phrase including a query subject (col 2 lines 18-50 While watching the video content 103, the user presents a verbal query 106 in the form of a question: "Who is the man at the right?"; col 7 lines 39-50 The query response service 218 performs natural language processing on the verbal query 106 to determine the items that are inquired about and the nature of the inquiry, e.g., who, what, when, where, why, how, etc.);
receive additional information about said identified audiovisual work, said additional information including the identity of said query subject (col 2 lines 30-49 The response 109 in this case specifies the character name ("George") and the name of the cast member who plays the character ("Jim Kingsboro"); In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display),
generate a textual description of the identity of said identified query subject (col 2 lines 30-49 The response 109 in this case specifies the character name ("George") and the name of the cast member who plays the character ("Jim Kingsboro"); In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display); and
(col 2 lines 30-49 In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display).
Davoust fails to specifically teach categorize said query subject as said audiovisual work, and based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image; using said received metadata, identify said audiovisual work of which said still image is a component part.
Chen teaches categorize said query subject as said audiovisual work (¶128 a user may search for all of the Brad Pitt scenes in Ocean's Eleven or all movie scenes containing Brad Pitt generally), and based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image (¶128 Image recognition may be performed on the key frames or individual key frames to identify faces, products, or corporate logos; The faces of actors/products/or logos may be identified in the key frames; therefore the key frames not having Brad Pitt would be the frames not depicted in the image); using said received metadata, identify said audiovisual work of which said still image is a component part (¶128 The search may query the metadata that was gathered via image recognition of the key frames or may perform a search through the key frames of an asset in real time based on the search criteria of a user).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the invention to have implemented the teaching of categorize said query subject as said audiovisual work, and based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image; using said received metadata, identify said audiovisual work of which said still image is a component part from Chen into the medium as disclosed by Davoust. The motivation for doing this is to improve methods for presentation of key frames.

(col 7 lines 32-50 query response service 218 performs natural language processing on the verbal query 106 to determine the items that are inquired about and the nature of the inquiry). 

Regarding claim 9, Davoust discloses the medium of claim 1, wherein said medium is included in a display device (Fig. 2 display 206). 

Regarding claim 10, Davoust discloses the medium of claim 9, wherein said display device is a smart television (col 5 lines 58-67 e.g. smart television). 

Regarding claim 11, Davoust discloses the medium of claim 1, wherein said medium is included in a mobile device (col 5 lines 58-67 e.g. cellular telephones). 

Regarding claim 12, Davoust discloses the medium of claim 1, wherein said video stream is received via a telecommunications network (col 2 lines 50-59 The network 209 includes, for example, the Internet, intranets, extranets, wide area networks (WANs), local area networks (LANs), wired networks, wireless networks, or other suitable networks, etc., cable networks, satellite networks, or any combination of two or more such networks). 

Regarding claim 13, Davoust discloses the medium of claim 1, wherein said computer-readable program instructions, when executed, cause said computer further to vocalize a response based at least in part on said script (col 2 lines 30-49 In various examples, the system may read out the response 109 using a speech synthesizer, or the system may present the response 109 via the display). 

Regarding claim 14, Davoust discloses the medium of claim 13, wherein said vocalization is performed using a voice user interface (col 6 lines 56-65 The speech synthesizer 288 may be executed to generate audio corresponding to synthesized speech for textual inputs; The content information application 287 is executed to receive verbal queries 106 from users via the microphone 285 and to present responses 109 via the speech synthesizer 288 and the audio device 286). 

Regarding claim 15, Davoust discloses the medium of claim 14, wherein said voice user interface comprises a digital assistant (col 5 lines 58-67 e.g. personal digital assistants). 

Regarding claims 21-26 (drawn to a method):
The proposed combination of Davoust and Chen, explained in the rejection of CRM claims 1 and 9-13, renders obvious the steps of the method of claims 21-26 because these steps occur in the operation of the proposed combination as discussed above. Thus, arguments similar to those presented above for claims 1 and 9-13 are equally applicable to claims 21-26.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Davoust and Chen as applied to claim 1 above, and further in view of Tang (US 20200380292).
Regarding claim 8, the combination of Davoust and Chen discloses the medium of claim 7, but fails to teach wherein said categorizing said query subject comprises a target categorization module assigning a category based on said extracted noun-phrase.
Tang teaches wherein said categorizing said query subject comprises a target categorization module assigning a category based on said extracted noun-phrase (Tang ¶68 in step S430, a category corresponding to the largest one of the plurality of second feature similarities S2.sub.ref(i) may be determined as the category of the object).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the invention to have implemented the teaching of wherein said categorizing said query subject comprises a target categorization module assigning a category based on said extracted noun-phrase from Tang into the medium as disclosed by the combination of Davoust and Chen. The motivation for doing this is to improve methods and device for identifying an object.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-2, 8-15, 21-26 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. For example, applicant argues that the prior art of record does not teach “categorize said query subject as said audiovisual work, and based on said categorizing said query subject as said audiovisual work, determine that said query subject is not depicted in said still image; using said received metadata, identify said audiovisual work of which said still image is a component part”. This does not rely on any reference applied in the prior rejection of record.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN KY whose telephone number is (571)272-7648. The examiner can normally be reached Monday-Friday 9-5PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at 571-272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KEVIN KY/
Primary Examiner, Art Unit 2669