DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office action is in response to the RCE filed on 09/20/2022.  Claims 1, 5-7, 9-12, 15-17, and 20-23 remain pending in the application. Claims 1, 12, and 17 are independent.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Claim Objections
Applicant's amendment to claims corrects previous objections; therefore, the previous objections are withdrawn.

Claim Rejections - 35 USC § 112
Applicant's amendment to claims corrects previous rejections; therefore, previous rejections are withdrawn.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 5, 10-12, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Brouwer et al. (US 2019/0246165 A1, filed on 10/18/2017), hereinafter Brouwer’165, in view of Totoki (US 2013/0163958 A1, published on 06/27/2013), hereinafter Totoki, and MARUYAMA et al. (US 2014/0344853 A1, published on 11/20/2014), hereinafter MARUYAMA'853.

Independent Claims 1, 12, and 17
Brouwer’165 discloses an additional object display method, performed by a terminal (Brouwer’165, ABSTRACT; ¶ [0002]: methods and systems for facilitating commenting and messaging associated with particular portions of a video that may reflect the presence of an object in various frames of the video by contextually augmenting and annotating moving pictures or images with tags using region tracking on computing devices with screen displays, including mobile devices and virtual reality headsets), the method comprising: 
displaying a trigger control in a video playback interface (Brouwer’165, FIG. 2; ¶ [0068]: a mobile computing device 14 playing a video that is streamed from a server to the device, wherein the computing device 14 features standard controls known to video streaming applications 13; there may be different types of controls with different functions that can be activated; additional controls and/or functions can be made available; 1009 in FIG. 27; ¶ [0167]: a video player, which has standard known movie controls 1007, 1007b and Messaging or Chat 1009 functionality, or similar functionality, for establishing different types of communications such as sending messages or comments; the Chat or Messaging Function (1009) may be placed elsewhere in the interface; i.e., 1009 in FIG. 27 can be a trigger control to activate functionality for adding annotations, messages or comments to a video/movie), the video playback interface being used for playing a video comprising a plurality of picture frames (Brouwer’165, 1040 in FIG. 33; ¶ [0172]: FIG . 33 shows a video 1040 with a series of frames); 
pausing playback of the video in response to an activation operation on the trigger control (Brouwer’165, FIGS. 3-4; ¶¶ [0070]-[0071], [0079], and [0119]: a user clicks on an object 11 (e.g., an airplane) at position 12 in a movie, where the object is the one to which the user wants to place or "attach" an annotation to; at the same time, the movie is paused or stopped) (Brouwer’165, 1009 in FIG. 27; ¶¶ [0167]-[0168]: user can create a Message or Comment tag by creating a Tag Marker 1012 in a video by first clicking the pause button of the video player or clicking a key, e.g., 1009 in FIG. 27 or spacebar, to pause the movie), and displaying a reference picture frame of the plurality of picture frames in the video playback interface, the reference picture frame being a picture frame corresponding to a pause time point in the video (Brouwer’165, FIG. 5; ¶ [0071]: a user clicks on the side of the cowling of an airplane 20 to mark a location to place an annotation or Screen Tag which serves as an anchor, placed by a user at a specific position 12 in a movie or image; this anchor or Screen Tag 20 is associated with the ( x , y ) coordinate of the position 12 and the frame number corresponding to a pause time point in the video; FIGS. 33-34; ¶¶ [0128] and [0172]-[0177]: compares a screen selection of the video frame (e.g., 1041b) at the location that the user clicked on with the images of specific key frames (e.g., 1041b to 1041c), wherein a user clicks in the video, or pauses the video, at frame position 1041b, which shows the cube at time 0; analyzing each of the frames from 1041b through 1041c; determine if the tracking of the object is possible within this period of time; i.e., 1041b is used as reference frame for tracking);
obtaining a target object in response to a  (Brouwer’165, FIGS. 4-5; ¶ [0071]: a user clicks on an object 11 (e.g., an airplane) at position 12 in a movie, where the object is the one to which the user wants to place or "attach" an annotation to; a user clicks on the side of the cowling of an airplane 20 to mark a location to place an annotation or Screen Tag which serves as an anchor, placed by a user at a specific position 12 in a movie or image; this anchor or Screen Tag 20 is associated with the ( x , y ) coordinate of the position 12 and the frame number) (Brouwer’165, FIGS. 27 and 33-34; ¶¶ [0164], [0168] and [0173]: after pausing the video, user then clicks, e.g., on the left top corner 1012b of the cube 1011b, where he/she intends to post a Message or Comment tag, which will then be associated with the actual location of the top left corner of the cube 1011b and 1011c for the duration of playback time from t+n to t+n2); 
tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame  first set of picture frames, frame-by-frame, in a reverse order starting from a last frame of the first set of picture frames (Brouwer’165, FIG. 34; ¶¶ [0174]-[0175]: analyze a predefined area around the clicked location 1012b of the image by means of object tracking to determine whether the object will still be visible for at least a predetermined time so that a Tag Marker 1012b, to which the actual Message or Comment Tag is associated, can be placed at the desired location; i.e., the display position of the anchored object in video must be tracked; if the tracking of the object is possible for this predetermined time, a Tag Marker (1012b) , which is then rendered for each of the frames at the chosen locations of the object (1012b, c) for the predetermined time interval and superimposed on the video frame (1041b) and subsequent frames of a video (1040) during playback; FIG. 116; ¶ [0116]: if a part (112) with an element (113) cannot successfully track for a predetermined duration starting from Frame (120), the region tracking software would determine the last frame (125) where the Screen Tag was still visible (113b) and calculate backwards the predetermined time interval of the frames from that point to determine that the frame (100) is the start frame required where the Screen Tag should be created to fulfill the four second requirement; ¶¶ [0093] and [0175]: the system can check the frames (each of frames or each of key frames) in sequence starting with the frame that the user clicked on; the calculation can also start at the last (100th) frame and calculate backward when tracking; the system will start from either end using frame 1 and 98/100, e.g., followed by 19/2 and 80/99, and so forth; i.e., simultaneously tracking in a chronological order and in a reverse order of playback time); 
in response to the tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set  (Brouwer’165, FIGS. 13-14 and 18-19; ¶¶ [0111] and [0114]: tracking positions of an airplane (along with Screen Tags 20, 21) traveled from top right to the bottom left of the screen (84) in each picture frame; e.g., three frames (81 , 82 and 83) in a video (80) from a video sequence)
wherein the first display information comprises pixel coordinates of a target point in the target object in the each picture frame, the target point being a position point corresponding to the end position of the  (Brouwer’165, FIGS. 4-5; ¶¶ [0070]-[0071]: a user clicks on the side of the cowling of an airplane 11 to mark a location to place or "attach" an annotation or Screen Tag 20, which serves as an anchor, placed by a user at a specific position 12 of the airplane 11 in a movie or image; this anchor or Screen Tag 20 is associated with the ( x , y ) coordinate of the position 12 on the airplane 11 and the frame number); 
obtaining relative position information between the additional object and the target point according to a display position of  (Brouwer’165, FIGS. 19 and 25-26; ¶¶ [0076]-[0077] and [0114]: based on the video's frame count, time (optional), frame rate and screen resolution, or any combination of these, a separate interactive layer (85) is created that matches the exact same frame of the video, and this interactive Dynamic Content layer (85) contains all the Tag Containers, Links and Screen Tags and their positions relative to the video (80); FIGS. 4-5; ¶¶ [0154] and [0070]-[0071]: when a message or comment Tag is shared with other users, the system will maintain the relation of the Tag Marker to that identified object in the video and accordingly open the video and show the frame and the object (1011) that the Message or Comment Tag (1020) is associated with; i.e., relative position of the Screen Tag 20 (i.e., additional object) and the click position 12 (i.e., target point) on the airplane 11 is calculated at the reference frame and maintained in other frames of video); 
generating second display information according to the first display information and the relative position information, the second display information indicating at least one of a display position, a display size, or a display posture of an additional object corresponding to the target object in the each picture frame; and displaying the additional object according to the second display information in the each picture frame during the playback of the video (Brouwer’165, FIGS. 13-14 and 18-19; ¶¶ [0111], [0114], and [0154]: Screen Tags 20, 21 will follow their positively identified elements (e.g., corresponding positions on airplane) for at least a predetermined time, e.g., an airplane along with Screen Tags 20, 21 travels from top right to the bottom left of the screen by maintaining the relation of the Tag Marker to that identified object in the video; i.e., generating and displaying Screen Tags 20, 21 according to the display position of the airplane and the relative position of the Tag Marker and the corresponding positions on the airplane) (Brouwer’165, FIGS. 27 and 34; ¶¶ [0171] and [0173]-[0176]: during playback the Message or Comment Tag would appear at that location of the Tag Marker 1012, which can then be interacted with during playback or when paused; a Tag Marker 1012b, which is then rendered for each of the frames at the chosen locations of the object for the interval from 1012b to 1012c and superimposed on the video frame 1041b and subsequent frames of a video 1040 during playback) (Brouwer’165, FIGS. 6 and 27-28; ¶¶ [0071]-[0073]: Screen Tag 20 is an icon on screen depicting the type of content associated with identified/attached/anchored object; Screen Tags 20 may be of different geometric shapes, colors or sizes, a logo, or text image or any other graphical elements; ¶ [0151]: a Tag Marker 1012 features a visible symbol, or graphical element, or marker of a particular size, which may differ in shape, size and color depending on the type of content the Message or Comment Tag is associated with).
Brouwer’165 further discloses an apparatus comprising: a memory storing at least one instruction, at least one program, a code set, or an instruction set; and a processor configured to execute the at least one instruction, the at least one program, the code set, or the instruction set, and upon execution, to perform the method described above (Brouwer’165; FIG. 1; ¶ [0067]: memory and processor are inherent in a server platform including server(s) 1 and client devices 4-9 as shown in FIG. 1).
Brouwer’165 further discloses a non-transitory computer readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set configured to be loaded and executed by a processor, and upon being executed, cause the processor to perform the method described above (Brouwer’165; FIG. 1; ¶ [0067]: non-transitory computer readable storage medium and processor are inherent in a server platform including server(s) 1 and client devices 4-9 as shown in FIG. 1).
Brouwer’165 fails to explicitly disclose (1) obtaining a target object in response to the drag operation on the trigger control, the target object being a display object corresponding to an end position of the drag operation in the reference picture frame; (2) tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame, wherein the tracking comprises: tracking the target object in the first set of picture frames, frame-by-frame, in a chronological order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); and tracking the target object in the second set of picture frames, frame-by-frame, in a reverse order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); (3) in response to the tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set and the second set; (4) the target point being a position point corresponding to the end position of the drag operation on the trigger control in the reference picture frame; and (5) obtaining relative position information between the additional object and the target point according to a display position of a preview picture of the additional object in the reference picture frame and the corresponding end position of the drag operation in the reference picture frame.
Totoki teaches a system and a method for drawing an anchor superposed on a layer for reproducing a video (Totoki, ¶ [0082]), wherein (1) obtaining a target object in response to a drag operation on the trigger control, the target object being a display object corresponding to an end position of the drag operation in the reference picture frame (Totoki, FIGS. 5B and 5D; ¶¶ [0077] and [0080]: the user sets an anchor an12 (i.e., trigger control) at the desired position (i.e., end position of a drag operation) of the battery pack (i.e., target object) by moving the position of this anchor an12 displayed on the screen d12 using a drag operation);
(4) the target point being a position point corresponding to the end position of the drag operation on the trigger control in the reference picture frame (Totoki, FIG. 5B; step S102 in FIG. 9; ¶¶ [0096] and [0077]: identify a start frame (i.e., reference picture frame) in which position coordinates of an object and an anchor are set, wherein the position coordinates of an object (i.e., target object) are the center coordinates of the anchor (i.e., trigger control) as indicated in the screen d12 of FIG. 5B (e.g., drag the anchor an12 to the desired position of the battery pack); i.e., the target point being a position point corresponding to the desired position coordinates of the battery pack at the end of drag operation on the anchor an12 (e.g., touch position or the center coordinates of the anchor an12) in the start frame);
and (5) obtaining relative position information between the additional object and the target point according to a display position of a preview picture of the additional object in the reference picture frame and the corresponding end position of the drag operation in the reference picture frame (Totoki, FIGS. 5A-D; ¶¶ [0075]-[0081]: tracks the object in which the anchor is set, to automatically generate an anchor position where the movement amount of the object is large; previewing the anchor position information that has been automatically generated, and for confirming the anchor position; the preview anchored position (e.g., an14) can be moved/adjusted to a desired position by a drag operation; FIGS. 9-10, ¶¶ [0036]-[0041] and [0096]-[0100]: the movement amount may be the difference in object coordinates between frames; the movement amount may be the movement amount of the center coordinates of the anchor; the position coordinates (i.e., target point) of an object (i.e., target object) are the center coordinates of the anchor (i.e., preview of additional object) mapped on the object; extracts the object area by edge extraction, and uses the center coordinates of the object area as object coordinates; based on the movement amount of the object coordinates between the present frame and another past frame other than the present frame, generate/store a frame identifier and anchor position information of the determination target in a frame identified as a saving target; i.e., anchor position information in each identified frame is determined by the center coordinates of the target object area in each identified frame, the relative position information between the target point position (i.e., the end position of drag operation) in the start frame and the center coordinates of the target object area in the start frame (i.e., converting target point position from video frame coordinates to object coordinates), and the relative position information between the center coordinates of the anchor (i.e., preview of additional object) in the start frame and the target point position (i.e., the end position of drag operation) in the start frame).
Brouwer’165 and Totoki are analogous art because they are from the same field of endeavor, a system and a method for drawing an anchor superposed on a layer for reproducing a video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Totoki to Brouwer’165.  Motivation for doing so would be to provide users with more operation control for anchoring/attaching an object/tag/annotation/comment/message/image at a desired position of an item in video, and to enhance user experience.
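For illustration only, limitation (5) and the subsequent generation of second display information can be sketched as a simple offset calculation. This is a minimal sketch of the claim language as construed above, not a reproduction of any implementation in Brouwer’165 or Totoki; the function names `relative_offset` and `additional_object_positions` and the use of (x, y) pixel tuples are assumptions made for the example.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates; an assumed representation


def relative_offset(preview_pos: Point, target_point: Point) -> Point:
    """Offset of the additional object's preview from the target point,
    both taken in the reference picture frame (hypothetical helper)."""
    return (preview_pos[0] - target_point[0], preview_pos[1] - target_point[1])


def additional_object_positions(
    target_positions: Dict[int, Point], offset: Point
) -> Dict[int, Point]:
    """Apply the fixed offset to the tracked target point in each frame,
    giving a display position for the additional object per frame index."""
    return {
        index: (x + offset[0], y + offset[1])
        for index, (x, y) in target_positions.items()
    }
```

Under this sketch the additional object keeps the same displacement from the target point in every tracked frame as it had in the reference picture frame.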
MARUYAMA'853 teaches a system and a method for generating an object to be superimposed and displayed on a video (MARUYAMA'853, ¶ [0002]), wherein (2) tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame, wherein the tracking comprises: tracking the target object in the first set of picture frames, frame-by-frame, in a chronological order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); and tracking the target object in the second set of picture frames, frame-by-frame, in a reverse order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); (3) in response to the tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set and the second set (MARUYAMA'853, ¶¶ [0108] and [0020]-[0023]: receive an input of the input information that contains frames and coordinates that are positional information that the user inputs with intention of displaying a comment to track an object in video; FIGS. 11A-B and 12A; ¶¶ [0109] and [0135]: calculates coordinate values (initial trajectory) along a series of time axis that is a motion of an object as a target followed by a user, on the basis of the input information and the video; ¶¶ [0111], [0168]-[0170] and [0036]: a previous trajectory is calculated back to the early direction in the time axis of the video from the coordinates of the starting point of the initial trajectory; the calculation of the previous trajectory is similar to the calculation of the initial trajectory except that pictures have to be input in the reverse order of the time passage direction that goes back in time from a point (i.e., the time point when the user requests to insert/post a comment) indicated with a frame, a picture, a time or a coordinate in a frame, to which a user assigns positional information as well as a point in the vicinity of them).
Brouwer’165 in view of Totoki, and MARUYAMA'853 are analogous art because they are from the same field of endeavor, a system and a method for generating an object to be superimposed and displayed on a video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of MARUYAMA'853 to Brouwer’165 in view of Totoki.  Motivation for doing so would be to maximize displaying time and improve viewability and readability (MARUYAMA'853, ¶¶ [0016] and [0017]).
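For illustration only, the bidirectional tracking recited in limitation (2) can be sketched as two frame-by-frame passes that both start at the reference picture frame. This is a minimal sketch under stated assumptions and is not the tracking algorithm of any cited reference: each frame is modeled as a list of candidate points, and the hypothetical `nearest_candidate` callback stands in for whatever tracker is actually used.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]
Frame = List[Point]  # toy model: a frame is a list of candidate object positions


def nearest_candidate(frame: Frame, previous: Point) -> Point:
    """Toy tracker (assumption): pick the candidate closest to the last position."""
    return min(
        frame,
        key=lambda p: (p[0] - previous[0]) ** 2 + (p[1] - previous[1]) ** 2,
    )


def track_bidirectional(
    frames: List[Frame],
    ref_index: int,
    ref_point: Point,
    locate: Callable[[Frame, Point], Point],
) -> Dict[int, Point]:
    """Track from the reference frame forward, then from it backward."""
    positions: Dict[int, Point] = {ref_index: ref_point}

    # First set: frames after the reference frame, in chronological order.
    previous = ref_point
    for index in range(ref_index + 1, len(frames)):
        previous = locate(frames[index], previous)
        positions[index] = previous

    # Second set: frames before the reference frame, in reverse order.
    previous = ref_point
    for index in range(ref_index - 1, -1, -1):
        previous = locate(frames[index], previous)
        positions[index] = previous

    return positions
```

The returned mapping corresponds to the first display information of limitation (3), restricted here to display position.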

Claim 5
Brouwer’165 in view of Totoki and MARUYAMA'853 discloses all the elements as stated in Claim 1 and further discloses wherein the drag operation comprises a first drag operation, the method further comprising: moving a position of the preview picture of the additional object in the video playback interface in response to a second drag operation on the preview picture of the additional object (Brouwer’165, ¶ [0191]: user has option to reposition the selection in case the user is not happy with selected object) (Totoki, FIGS. 5A-D; ¶¶ [0075]-[0081]: previewing the anchor position information that has been automatically generated, and for confirming the anchor position; the preview anchored position (e.g., an14) can be moved/adjusted to a desired position by a drag operation).  

Claim 10
Brouwer’165 in view of Totoki and MARUYAMA'853 discloses all the elements as stated in Claim 1 and further discloses displaying, corresponding to the trigger control, a switch control in the video playback interface; displaying an additional object selection interface in response to an activation operation on the switch control, the additional object selection interface comprising at least two candidate objects; and obtaining, in response to a selection operation in the additional object selection interface, a candidate object corresponding to the selection operation as a new additional object corresponding to the trigger control (Brouwer’165, FIGS. 27-28; ¶¶ [0097], [0100], and [0138]: user may select a type of icon/symbol that depicts the type of category that the annotation belongs from a drop down or popup menu; i.e., displaying a switch control allowing users to change different types of icons/symbols, which depict different types of categories that the annotation belongs, from drop down or popup menu).  

Claim 11
Brouwer’165 in view of Totoki and MARUYAMA'853 discloses all the elements as stated in Claim 1 and further discloses wherein the additional object is a static display object or a dynamic display object (Brouwer’165, FIGS. 13-14; ¶ [0111]: as the video plays back, the Tag Containers 40, 42 will remain static while the Screen Tags 20, 21 will follow the positively identified parts and the Links 50, 51, if used, will remain connected with the Tag Containers 40, 42 for at least a predetermined time; i.e., Tag Containers 40, 42 are static display objects and the Screen Tags 20, 21 are dynamic display objects) (Brouwer’165; FIG. 27; ¶¶ [0158]-[0159], [0161], and [0164]: two different types of Message and Comment Tags 1020; one is General Tags which are associated to topics and not associated with objects and the other is Object Tags which are associated with objects; General Tags 1013 appear in a particular section of the video player, such as the top right corner, i.e., a static display object; Object Tags 1012, unlike General Tags, they are associated with an object identified in a video or an image and follow the associated object for at least predetermined time; i.e., a dynamic display object).

Claims 6-7, 15-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Brouwer’165 in view of Totoki and MARUYAMA'853 as applied to Claims 1, 12, and 17 respectively above, and further in view of Franklin et al. (US 2016/0196052 A1, published on 07/07/2016), hereinafter Franklin.

Claims 6, 15, and 20
Brouwer’165 in view of Totoki and MARUYAMA'853 discloses all the elements as stated in Claims 1, 12, and 17 respectively and further discloses that the anchor information includes an anchor area (size) in addition to anchor position information (px, py) indicating coordinates of an anchor on a screen (Totoki, ¶ [0063]).
Brouwer’165 in view of Totoki and MARUYAMA'853 fails to explicitly disclose wherein the first display information comprises the display size of the target object in the each picture frame; and calculating a zoom ratio of the additional object in the each picture frame according to the display size of the target object in the each picture frame and an original size of the target object, the original size of the target object being a display size of the target object in the reference picture frame; obtaining the display size of the additional object in the each picture frame according to an original size of the additional object and the zoom ratio; and generating the second display information comprising the display size of the additional object in the each picture frame.
Franklin teaches a system and a method for adding one or more graphical features or overlaying user interface elements in pictures and/or video (Franklin, ¶ [0015]), wherein the first display information comprises the display size of the target object in the each picture frame; and calculating a zoom ratio of the additional object in the each picture frame according to the display size of the target object in the each picture frame and an original size of the target object, the original size of the target object being a display size of the target object in the reference picture frame; obtaining the display size of the additional object in the each picture frame according to an original size of the additional object and the zoom ratio; and generating the second display information comprising the display size of the additional object in the each picture frame (Franklin, ¶¶ [0016] and [0084]-[0088]: to ensure that at least one overlay UI element in a composite overlay UI element is properly positioned, oriented, and/or proportioned with respect to the appropriate features (e.g., faces, eyes, mouth, etc.) within images and/or video, generate modified overlay element information by modifying one or more attributes of the overlay UI element, e.g., modifying the overlay position information, overlay size information, overlay orientation information of overlay element information associated with the UI element, etc., to align, orient, and/or proportion to a particular feature or features as indicated in the overlay feature alignment information, e.g., align adjacent to a top edge of a detected feature such as a head, on top of a detected feature such as eyes, below a detected feature such as a face, overlapping a detected feature such as nose, etc., to features detected in the image recognition information, e.g., eyes, face, mouth, head, etc., so that the rendered and visually presented overlay UI element in a composite overlay UI element may be substantially positioned and proportioned at least near one or more features detected in the background images and/or background videos represented by image and/or video information; i.e., the size of the added overlay UI element will be modified according to the zoom ratio of the target feature size detected in each video frame with respect to the target feature size detected in the initial video frame, and the original size of the added overlay UI element in the initial video frame).
Brouwer’165 in view of Totoki and MARUYAMA'853, and Franklin are analogous art because they are from the same field of endeavor, a system and a method for adding one or more graphical features or overlaying user interface elements in pictures and/or video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Franklin to Brouwer’165 in view of Totoki and MARUYAMA'853.  Motivation for doing so would be to improve/enhance visual presentation of composite overlay video by ensuring each composite overlay UI element is properly presented when the target feature in video moves closer to or farther from the camera, and to enhance user experience (Franklin, ¶¶ [0015]-[0016] and [0021]).
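For illustration only, the zoom-ratio limitation of Claims 6, 15, and 20 reduces to a proportional scaling. This is a minimal sketch that follows the claim language rather than any implementation described in Franklin; the function names and the choice of a single scalar size for the target object (for example, a bounding-box width) are assumptions made for the example.

```python
from typing import Tuple


def zoom_ratio(target_size: float, original_target_size: float) -> float:
    """Ratio of the target object's size in the current frame to its size
    in the reference picture frame (the 'original size')."""
    if original_target_size <= 0:
        raise ValueError("original target size must be positive")
    return target_size / original_target_size


def additional_object_size(
    original_size: Tuple[float, float], ratio: float
) -> Tuple[float, float]:
    """Scale the additional object's original (width, height) by the zoom ratio."""
    width, height = original_size
    return (width * ratio, height * ratio)
```

For example, if the target object appears at half its reference-frame size in a later frame, the additional object would be displayed at half its original width and height in that frame.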

Claims 7 and 16
Brouwer’165 in view of Totoki and MARUYAMA'853 discloses all the elements as stated in Claims 1 and 12 respectively and further discloses wherein the first display information comprises the display position  (Brouwer’165, FIGS. 13-14 and 18-19; ¶¶ [0111] and [0114]: tracking positions of an airplane (along with Screen Tags 20, 21) traveled from top right to the bottom left of the screen (84) in each picture frame; e.g., three frames (81 , 82 and 83) in a video (80) from a video sequence); and wherein generating the second display information according to the first display information and the relative position information comprises: obtaining the display position  (Brouwer’165, FIGS. 4-5; ¶¶ [0070]-[0071]: a user clicks on the side of the cowling of an airplane 11 to mark a location to place or "attach" an annotation or Screen Tag 20, which serves as an anchor, placed by a user at a specific position 12 of the airplane 11 in a movie or image; this anchor or Screen Tag 20 is associated with the ( x , y ) coordinate of the position 12 on the airplane 11 and the frame number; FIGS. 13-14, 18-19, and 25-26; ¶¶ [0076]-[0077], [0114], [0111], and [0154]: based on the video's frame count, time (optional), frame rate and screen resolution, or any combination of these, a separate interactive layer (85) is created that matches the exact same frame of the video, and this interactive Dynamic Content layer (85) contains all the Tag Containers, Links and Screen Tags and their positions relative to the video (80); Screen Tags 20, 21 will follow their positively identified elements (e.g., corresponding positions on airplane) for at least a predetermined time, e.g., an airplane along with Screen Tags 20, 21 travels from top right to the bottom left of the screen by maintaining the relation of the Tag Marker to that identified object in the video; i.e., generating and displaying Screen Tags 20, 21 according to the display position of the airplane and the relative position of the Tag Marker and the corresponding positions on the airplane) (Brouwer’165, FIGS. 27 and 34; ¶¶ [0171] and [0173]-[0175]: during playback the Message or Comment Tag would appear at that location of the Tag Marker 1012; a Tag Marker 1012b, which is then rendered for each of the frames at the chosen locations of the object for the interval from 1012b to 1012c and superimposed on the video frame 1041b and subsequent frames of a video 1040 during playback).
Brouwer’165 in view of Totoki and MARUYAMA'853 fails to explicitly disclose wherein the first display information comprises the display posture of the target object in each picture frame; and obtaining the display posture of the additional object in each picture frame according to the display posture of the target object in each picture frame; and generating the second display information comprising the display posture of the additional object in each picture frame.
Franklin teaches a system and a method for adding one or more graphical features or overlaying user interface elements in pictures and/or video (Franklin, ¶ [0015]), wherein the first display information comprises the display posture of the target object in each picture frame; and obtaining the display posture of the additional object in each picture frame according to the display posture of the target object in each picture frame; and generating the second display information comprising the display posture of the additional object in each picture frame (Franklin, ¶¶ [0084]-[0088]: to ensure that at least one overlay UI element in a composite overlay UI element is properly positioned, oriented, and/or proportioned with respect to features within images and/or video, generate modified overlay element information by modifying one or more attributes of the overlay UI element, e.g., modifying the overlay position information, overlay size information, overlay orientation information of overlay element information associated with the UI element, etc., to align, orient, and/or proportion to a particular feature or features as indicated in the overlay feature alignment information, e.g., align adjacent to a top edge of a detected feature such as a head, on top of a detected feature such as eyes, below a detected feature such as a face, overlapping a detected feature such as a nose, etc., to features detected in the image recognition information, e.g., eyes, face, mouth, head, etc., so that the rendered and visually presented overlay UI element in a composite overlay UI element may be substantially positioned and proportioned at least near one or more features detected in the background images and/or background videos represented by image and/or video information).
Brouwer’165 in view of Totoki and MARUYAMA'853, and Franklin are analogous art because they are from the same field of endeavor, a system and a method for adding one or more graphical features or overlaying user interface elements in pictures and/or video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Franklin to Brouwer’165 in view of Totoki and MARUYAMA'853.  The motivation for doing so would be to improve the visual presentation of a composite overlay image or video by ensuring each composite overlay UI element is properly presented, and to enhance the user experience (Franklin, ¶¶ [0015]-[0016] and [0021]).
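
For illustration only, the relationship discussed above (deriving the display position and display posture of an additional object in each picture frame from the tracked target object and a fixed relative position) can be sketched as follows. This sketch is not drawn from Brouwer’165, Totoki, MARUYAMA'853, Franklin, or the application; the names DisplayInfo, additional_object_display, and second_display_information are hypothetical, and the display posture is assumed to be a single in-plane rotation angle.

```python
import math
from dataclasses import dataclass


@dataclass
class DisplayInfo:
    """Hypothetical per-frame display information: position plus posture."""
    x: float
    y: float
    angle_deg: float  # assumption: posture modeled as an in-plane rotation


def additional_object_display(target: DisplayInfo, rel_dx: float, rel_dy: float) -> DisplayInfo:
    """Place the additional object at a fixed offset expressed in the target's own frame.

    The offset is rotated by the target's angle so the additional object keeps
    the same relative position as the target turns, and it inherits the
    target's angle as its own posture.
    """
    theta = math.radians(target.angle_deg)
    x = target.x + rel_dx * math.cos(theta) - rel_dy * math.sin(theta)
    y = target.y + rel_dx * math.sin(theta) + rel_dy * math.cos(theta)
    return DisplayInfo(x, y, target.angle_deg)


def second_display_information(first, rel):
    """Map the target's per-frame display info to the additional object's."""
    return [additional_object_display(t, rel[0], rel[1]) for t in first]


if __name__ == "__main__":
    # Target drifts toward the left of the screen and rotates over three frames.
    frames = [DisplayInfo(300, 40, 0), DisplayInfo(200, 120, 45), DisplayInfo(100, 200, 90)]
    for info in second_display_information(frames, (10.0, -5.0)):
        print(info)
```

Under these assumptions, an unrotated target simply shifts the additional object by the offset, while a rotated target also rotates the offset, so the additional object stays attached to the same point of the target across frames.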

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Brouwer’165 in view of Totoki and MARUYAMA'853 as applied to Claim 1 above, and further in view of Nebehay et al. ("Clustering of Static-Adaptive Correspondences for Deformable Object Tracking", 2015 IEEE Conference on Computer Vision and Pattern Recognition, published on 06/01/2015, pp. 2784-2791), hereinafter Nebehay.

Claim 9
Brouwer’165 in view of Totoki and MARUYAMA'853 discloses all the elements as stated in Claim 1 and further discloses wherein tracking the target object in the each picture frame of the video, and obtaining the first display information comprises: tracking the target object in the each picture frame of the video by using a  (Brouwer’165, FIG. 22; ¶¶ [0064]-[0065], [0080]-[0081], and [0126]: augmenting and annotating moving pictures or images with tags using pixel region tracking; use one of several known methods for tracking the collection of pixels around the location where the user wants to insert the tag, which include a method by which video frames are compared to detect and track an object in motion, a color comparison method where the system searches for color/shade differences, marker-less tracking, where the frame is converted to black and white to increase contrast, and a slam-tracking method, which the system can use for tracking pixels corresponding to a selected portion of an object and which uses reference points in high contrast images, which may be converted to black and white images, in order to detect and track an object).
	Brouwer’165 in view of Totoki and MARUYAMA'853 fails to explicitly disclose tracking the target object in each picture frame of the video by using a clustering of static-adaptive correspondences for deformable object tracking (CMT) algorithm.
	Nebehay teaches a method for object tracking in a video sequence (Nebehay, FIGS. 1 and 10; Abstract; Section 3.0 in Pages 2785-2786), wherein tracking the target object in each picture frame of the video by using a clustering of static-adaptive correspondences for deformable object tracking (CMT) algorithm (Nebehay, Title; Abstract; Sections 3.1-3.3 in Pages 2786-2787: employ both static correspondences from the initial appearance of the object as well as adaptive correspondences from the previous frame to address the stability-plasticity dilemma; employ a pairwise dissimilarity measure between correspondences based on their geometric compatibility, directly reflecting the deformation of the object of interest, allowing inlier correspondences to be separated from outliers).
Brouwer’165 in view of Totoki and MARUYAMA'853, and Nebehay are analogous art because they are from the same field of endeavor, a method for object tracking in a video sequence.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Nebehay to Brouwer’165 in view of Totoki and MARUYAMA'853.  The motivation for doing so would be to improve on state-of-the-art tracking results through the flexible nature of the hierarchical clustering algorithm, which allows for the propagation of inliers even when correspondences are located on deformed parts of the object (Nebehay, Section 5 in Page 2789).

Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Brouwer’165 in view of Totoki and MARUYAMA'853 as applied to Claims 1, 12, and 17 respectively above, and further in view of Microsoft ("Screen Captures for Dragging Anchoring Control of a Picture in Microsoft Word 2016", released on 09/22/2015), hereinafter Microsoft.

Claims 21-23
	Brouwer’165 in view of Totoki and MARUYAMA'853 discloses all the elements as stated in Claims 1, 12, and 17 respectively and further discloses performing the drag operation by dragging the trigger control/preview picture away from an initial position of a preview picture of the additional object  (Brouwer’165, ¶¶ [0097] and [0100]: the user may select a type of icon that depicts the type of category from a drop down or popup menu; ¶ [0191]: the user has the option to reposition the selection in case the user is not happy with the selected object; i.e., it must have a preview function so that the user can select different types of icons or reposition the selected icon when the user is not happy with the selection) (Totoki, FIGS. 5B and 5D; ¶¶ [0077] and [0080]: the user sets an anchor an12/an14 (i.e., trigger control/preview) at the desired position (i.e., end position of a drag operation) of the battery pack (i.e., target object) by moving the position of this anchor an12/an14 displayed on the screen d12/d14 using a drag operation).
	Brouwer’165 in view of Totoki and MARUYAMA'853 fails to explicitly disclose performing the drag operation by dragging the trigger control away from an initial position of a preview picture of the additional object without moving the preview picture.
Microsoft teaches a system and a method for anchoring an object, wherein performing the drag operation by dragging the trigger control away from an initial position of a preview picture of the additional object without moving the preview picture (Microsoft, Pages 1-2: an anchor control is initially located at the same paragraph as the picture) (Microsoft, Pages 3-7: the anchor control is dragged away from its initial position without moving the picture) (Microsoft, Page 8: when the anchor control is dropped at the third paragraph below the picture, the third paragraph below the picture becomes a new anchored paragraph; Pages 9-12: when the anchored paragraph is moved further down by inserting additional paragraphs between the picture and the anchored paragraph, the relative position between the picture and the anchored paragraph is still maintained; Page 13: after undoing the paragraphs inserted between the picture and the anchored paragraph, the relative position between the picture and the anchored paragraph is still maintained; Pages 14-15: indicating that the whole process is performed using Microsoft Word 2016).
Brouwer’165 in view of Totoki and MARUYAMA'853, and Microsoft are analogous art because they are from the same field of endeavor, a system and a method for anchoring an object.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Microsoft to Brouwer’165 in view of Totoki and MARUYAMA'853.  The motivation for doing so would be to provide a user interface that is simple and easy to operate for defining a spaced relationship between graphic objects and for facilitating naturally binding data to graphic objects (see footnote 3).

Response to Arguments
Applicant’s arguments filed on 09/20/2022 with respect to Claims 1, 12, and 17 have been fully considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Ramachandran et al. (US 2011/0261258 A1, published on 10/27/2011) discloses a system and method that associate relevant additional information with a video stream by creating a spot within the video that is linked to the additional information (Ramachandran, Abstract), wherein if the content item remains stationary or relatively stationary, then the HotSpot icon created for the content item can also remain stationary; however, if the content item sufficiently changes position within the video viewing area, then the HotSpot icon should also change position (Ramachandran, ¶ [0093]).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HWEI-MIN LU whose telephone number is (313)446-4913. The examiner can normally be reached Mon - Fri: 9:00 AM - 6:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, WILLIAM L BASHORE can be reached on (571)272-4088. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/HWEI-MIN LU/Examiner, Art Unit 2175

1 See, for example, US 2009/0327856 A1 to Mouilleseaux et al., published on 12/31/2009, 115 in FIG. 1; ¶¶ [0042]-[0043]: the video clip has paused at particular frame 127 in response to a user activating control 115 to add an annotation to the clip.
2 See, for example, US 2008/0046956 A1 to Kulas, published on 02/21/2008, FIG. 4; ¶¶ [0028]-[0031].
3 See, for example, US 2019/0114057 A1 to KERR, filed on 10/13/2017, FIGS. 11A-B; ¶¶ [0100]-[0103], [0021], and [0031].