Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination
2.    A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 06/23/2022 has been entered.
Information Disclosure Statement
3.	The information disclosure statement (IDS) submitted on 06/23/2022 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Detailed Action
4. 	This action is in response to the filing with the office dated 03/04/2022. 
	Claims 1, 7, 9, 15, and 17 are amended. Claims 2, 6, 8, 10, 14, 16 and 18 are cancelled. Claims 1, 3-5, 7, 9, 11-13, 15, 17, and 19-24 are now pending in this office action.
Allowable Subject Matter
5.	Claims 1, 9 and 17 are allowed as being independent claims.
6.    	Dependent claims 3-5, 7, and 21-22 are allowed as being dependent on independent claim 1. Dependent claims 11-13, 15, and 23-24 are allowed as being dependent on independent claim 9. Dependent claims 19-20 are allowed as being dependent on independent claim 17.
Reasons for Allowance
7. 	The following is an examiner's statement of reasons for allowance: Applicant's amendments to claims 1, 9 and 17 were fully considered, found to be persuasive, and overcome the prior art cited in the Final rejection. While display apparatuses are known to depict a second scene query based on a first scene query by determining user intent (LISTER; Patrick M. et al. (US 20150382079 A1), Paragraph [0082]), there does not appear to be a specific teaching of “perform voice recognition of the second voice scene query, depict a second phrase corresponding to the second scene query that follows the first phrase as a further query of the first scene query, and interpret, using a machine learning technique, the second scene query as hierarchical supplemental information to the first scene query to refine potential scene command related to the first scene query and to select and depict scene images corresponding to a sub-group of the sequence of video segments within the video content” as claimed in view of the rest of the limitations of claim 1. The same applies to claims 9 and 17. While each element of the limitation may be known in some parts, the combination as claimed would not be obvious absent impermissible hindsight. The cited prior art does not teach or suggest this limitation in combination with the rest of the limitations in the dependent claims.
	Claim 1: A display apparatus comprising: user input circuitry for receiving user commands; a display for displaying video content and a user interface; a processor in communication with the user input circuitry and the display; and non-transitory computer readable media in communication with the processor that stores instruction code, which when executed by the processor, causes the processor to: receive, from the user input circuitry, a first scene query from the user input circuitry while displaying the video content, wherein the first scene query comprises a voice command from a user; recognize the first scene query by voice recognition and depict a first phrase that corresponds to the first scene query in real-time as the voice command is being received from the user; determine a sequence of video segments in time within the video content that are related to a type of scene associated with the first scene query; update the user interface to depict scene images corresponding to the sequence of video segments, where the scene images are associated with unique identifiers for facilitating voice control; and in response to receiving a second scene query in voice following the first scene query, perform voice recognition of the second voice scene query, depict a second phrase corresponding to the second scene query that follows the first phrase as a further query of the first scene query, and interpret, using a machine learning technique, the second scene query as hierarchical supplemental information to the first scene query to refine potential scene command related to the first scene query and to select and depict scene images corresponding to a sub-group of the sequence of video segments within the video content.
	Application No. 15/985,251, Docket No. 515218.5000100
	Claim 9: A method for controlling a display apparatus comprising: receiving, via user input circuitry, user commands; displaying video content, the video content and a user interface; receiving, from the user input circuitry, a first scene query from the user input circuitry while displaying the video content, wherein the first scene query comprises a voice command from a user; recognizing the first scene query by voice recognition and depicting a first phrase that corresponds to the first scene query in real-time as the voice command is being received from the user; determining a sequence of video segments in time within the video content that are related to a type of scene associated with the first scene query; updating the user interface to depict scene images corresponding to the sequence of video segments, where the scene images are associated with unique identifiers for facilitating voice control; and in response to receiving a second scene query in voice following the first scene query, performing voice recognition of the second voice scene query, depict a second phrase corresponding to the second scene query that follows the first phrase as a further query of the first scene query, and interpreting, using a machine learning technique, the second scene query as hierarchical supplemental information to the first scene query to refine potential scene command related to the first scene query and to select and depict scene images corresponding to a sub-group of the sequence of video segments within the video content.
	Claim 17: A non-transitory computer readable media that stores instruction code for controlling a display apparatus, the instruction code being executable by a machine for causing the machine to: display a video content and a user interface; receive, from a user input circuitry of the machine, a first scene query from the user input circuitry while displaying the video content, wherein the first scene query comprises a voice command from a user; recognize the first scene query by voice recognition and depicting a first phrase that corresponds to the first scene query in real-time as the voice command is being received from the user; determine a sequence of video segments in time within the video content that are related to a type of scene associated with the first scene query; update the user interface to depict scene images corresponding to the sequence of video segments, where the scene images are associated with unique identifiers for facilitating voice control; and in response to receiving a second scene query in voice following the first scene query, perform voice recognition of the second voice scene query, depict a second phrase corresponding to the second scene query that follows the first phrase as a further query of the first scene query, and interpret, using a machine learning technique, the second scene query as hierarchical supplemental information to the first scene query to refine potential scene command related to the first scene query and to select and depict scene images corresponding to a sub-group of the sequence of video segments within the video content.
	The cited prior art on record VAN OS; Marcel (US 20150382047 A1) teaches: Systems and processes are disclosed for controlling television user interactions using a virtual assistant. A virtual assistant can interact with a television set-top box to control content shown on a television. Speech input for the virtual assistant can be received from a device with a microphone. User intent can be determined from the speech input, and the virtual assistant can execute tasks according to the user's intent, including causing playback of media on the television. Virtual assistant interactions can be shown on the television in interfaces that expand or contract to occupy a minimal amount of space while conveying desired information. Multiple devices associated with multiple displays can be used to determine user intent from speech input as well as to convey information to users. In some examples, virtual assistant query suggestions can be provided to the user based on media content shown on a display.
	The cited prior art on record Gunatilake; Priyan (US 20130006625 A1) teaches: A system, method, and computer program product for automatically analyzing multimedia data audio content are disclosed. Embodiments receive multimedia data, detect portions having specified audio features, and output a corresponding subset of the multimedia data and generated metadata. Audio content features including voices, non-voice sounds, and closed captioning, from downloaded or streaming movies or video clips are identified as a human probably would do, but in essentially real time. Particular speakers and the most meaningful content sounds and words and corresponding time-stamps are recognized via database comparison, and may be presented in order of match probability. Embodiments responsively pre-fetch related data, recognize locations, and provide related advertisements. The content features may be also sent to search engines so that further related content may be identified. User feedback and verification may improve the embodiments over time.
	The cited prior art on record Soni; Sachin (US 20180089203 A1) teaches: The present disclosure is directed towards methods and systems for providing relevant video scenes in response to a video search query. The systems and methods identify a plurality of key frames of a media object and detect one or more content features represented in the plurality of key frames. Based on the one or more detected content features, the systems and methods associate tags indicating the detected content features with the plurality of key frames of the media object. The systems and methods, in response to receiving a search query including search terms, compare the search terms with the tags of the selected key frames, identify a selected key frame that depicts at least one content feature related to the search terms, and provide a preview image of the media item depicting the at least one content feature.
	Claims 1, 9 and 17: The cited prior art on record VAN OS; Marcel (US 20150382047 A1), Gunatilake; Priyan (US 20130006625 A1) and Soni; Sachin (US 20180089203 A1) do not teach or suggest in combination with the rest of the limitations in the dependent claims “perform voice recognition of the second voice scene query, depict a second phrase corresponding to the second scene query that follows the first phrase as a further query of the first scene query, and interpret, using a machine learning technique, the second scene query as hierarchical supplemental information to the first scene query to refine potential scene command related to the first scene query and to select and depict scene images corresponding to a sub-group of the sequence of video segments within the video content”.
	In addition, neither the references cited nor any other reference uncovered would have provided a basis of evidence for asserting a motivation to combine, nor would one of ordinary skill in the art at the time the invention was made, knowing the teachings of the prior art of record, have combined them to arrive at the present invention as recited in the context of independent claims 1, 9 and 17 as a whole.
Conclusion
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SUMAN RAJAPUTRA, whose telephone number is (571) 272-4669. The examiner can normally be reached Monday-Friday, 8:30-5:00. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, ASHISH THOMAS, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/S. R./ 
Examiner, Art Unit 2164

/ASHISH THOMAS/
Supervisory Patent Examiner, Art Unit 2164