DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 04/29/2021 has been entered.
Response to Arguments
Applicant’s arguments with respect to claims 1-3, 5-11, 13-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-6, 8-9, 13-14, 16-17, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. US 2014/0211969 (hereinafter Kim) in view of Onuma US 2019/0228230, further in view of Lee et al. US 2015/0222780 (hereinafter Lee).

Regarding claim 1, Kim teaches an artificial intelligence device comprising: one or more speakers configured to output audio (see fig. 1, mobile terminal comprising audio output module 152; [0073], audio output module 152 implemented using one or more speakers); a display configured to display a video (see fig. 1, the mobile terminal further comprising display 151); and one or more processors configured to: acquire object identification information of each of a plurality of objects contained in the video, acquire one or more objects capable of audio output from among the detected plurality of objects (fig. 1, controller 180; [0112], [0115], fig. 4, separating the audio signal of the video into sound sources generated by characters (objects), see also fig. 6a; [0186], controller 180 includes reference audio/image to identify a sound source), cause, on a display, a display of one or more volume adjustment items that correspond to adjusting an audio volume of a particular object from the acquired one or more objects ([0125-0126], see fig. 6b-6d, when the user input selects a character as shown in fig. 6b and 6d, the controller displays a volume adjust bar 610), and adjust the audio volume of the output audio for at least one specific object from the acquired one or more objects according to an operation command of a respective volume adjustment item from among the one or more volume adjustment items corresponding to the at least one specific object ([0126], the controller 180 controls a volume adjust bar 610 to adjust the volume of the selected character, fig. 4, 12-13), but does not teach cause the display to display a plurality of indicators for distinguishing each of the identified plurality of objects.
Onuma teaches cause the display to display a plurality of indicators for distinguishing each of the identified plurality of objects, wherein each of the plurality of indicators is displayed adjacent to a corresponding one of the identified plurality of objects ([0052], describes that operation unit displays information such as an object name (object identifying information) for each object as shown in fig. 9).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to display object identifying information as in Onuma such that the user is able to recognize each individual object easily in order to manipulate/edit the content.
Kim in view of Onuma does not teach, but Lee teaches, when an identified object among the identified plurality of objects is selected, enlarging a size of the selected identified object while adjusting a volume of an audio spoken by the selected identified object ([0212], [0217-0218], describes enlarging a region and amplifying the audio of a predetermined subject).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to enlarge a region of a predetermined subject as in Lee in order to allow the user to be aware that the currently amplified audio is the voice of the subject in the enlarged region ([0212]).
(see fig. 4, 12-13 volume adjust bar).

Regarding claim 6, Kim teaches the artificial intelligence device of claim 5, wherein the one or more processors are further configured to mute the audio output from the at least one specific object according to a command selecting a corresponding volume icon from the one or more volume icons (see fig. 4, 12-13, volume adjust bar can be used to mute (reduce to zero) audio volume).

Regarding claim 8, Kim discloses the artificial intelligence device of claim 1, wherein the one or more objects are acquired by: acquiring a plurality of the one or more objects capable of outputting audio (fig. 1, controller 180; [0112], [0115], fig. 4, separating the audio signal of the video into sound sources generated by characters (objects), see also fig. 6a), clustering the acquired plurality of the one or more objects into a plurality of clusters ([0162], the device sorts the sound sources separated from an audio signal of a video by categories (clustering)), and controlling audio output contained in a selected cluster from the plurality of clusters ([0162], the device is able to control a volume to be adjusted for each group/category (cluster)).

Claims 9, 17 are rejected for similar reasons as described in claim 1 above.

(see fig. 4, 12-13 volume adjust bar).

Regarding claim 14, Kim teaches the method of claim 13, further comprising muting the audio output from the at least one specific object according to a command selecting a corresponding volume icon from the one or more volume icons (see fig. 4, 12-13, volume adjust bar can be used to mute (reduce to zero) audio volume).
Claims 16, 20 are rejected for similar reasons as described in claim 8 above.

Claims 2-3, 10-11, 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Kim, Onuma in view of Lee as applied to claims 1, 5-6, 8-9, 13-14, 16-17, 20 above, and further in view of Polavarapu et al. US 2020/0265238 (hereinafter Polavarapu).

Regarding claim 2, Kim, Onuma in view of Lee teaches all the limitations of claim 1 above but does not teach, and Polavarapu teaches, the plurality of objects are detected based at least in part on using an object detection model, and the one or more processors are further configured to acquire the identification information of the plurality of objects by using an object identification model ([0026-0027], describes detecting and identifying objects in a video frame).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to detect and identify objects as in 

Regarding claim 3, Kim, Onuma in view of Lee, further in view of Polavarapu, teaches the artificial intelligence device of claim 2, wherein the object detection model and the object identification model are trained by deep learning algorithm (Polavarapu: [0026-0027], describes detecting and identifying objects in a video frame using a deep learning machine learning model), the object detection model is configured to extract a bounding box that represents a shape of the particular object based on image data corresponding to a frame from the video, and the object identification model is configured to acquire the identification information by identifying the particular object contained in the extracted bounding box (Polavarapu: [0054-0059], [0062-0064], region of interest (bounding box) 404 used to identify the objects in a video frame as shown in fig. 4).
The motivation for combining the prior art is discussed in claim 2 above.
Claims 10, 18 are rejected for similar reasons as described in claim 2 above.
Claims 11, 19 are rejected for similar reasons as described in claim 3 above.

Claims 7, 15 are rejected under 35 U.S.C. 103 as being unpatentable over Kim, Onuma in view of Lee as applied to claims 1, 5-6, 8-9, 13-14, 16-17, 20 above, and further in view of Honma et al. US 2019/0222798 (hereinafter Honma).

Regarding claim 7, Kim, Onuma in view of Lee teaches all the limitations of claim 1 above but does not teach, and Honma teaches, control audio outputs of the one or more speakers to ([0193], describes that audio is output from different speakers based on the position of the object, for localizing a sound at a position, in space, of the audio object).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to output audio to different speakers as in Honma in order to localize the sound output relative to the sound source (speaking) character.
Claim 15 is rejected for similar reasons as described in claim 7 above.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GIRUMSEW WENDMAGEGN whose telephone number is (571)270-1118.  The examiner can normally be reached on 9:00-7:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thai Tran can be reached on (571) 272-7382.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR 


GIRUMSEW WENDMAGEGN
Primary Examiner
Art Unit 2484



/GIRUMSEW WENDMAGEGN/             Primary Examiner, Art Unit 2484