DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This office action is in response to the amendment filed on 01/27/2022.  Claims 1, 3-7, 9-12, 14-17, and 19-23 remain pending in the application. Claims 1, 12, and 17 are independent.

Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in PEOPLE'S REPUBLIC OF CHINA on 01/18/2018. It is noted, however, that applicant has not filed a certified copy of the CN201810050497.6 application as required by 37 CFR 1.55. 

Claim Objections
Applicant's amendment to the claims does not address the previous objections; therefore, the previous objections are converted to 112 rejections.  Applicant's amendment to the claims also raises the following new issue.
Claim 1 is objected to because of the following informalities:  
in Claim 1, lines 6-8, "displaying a reference picture frame of the plurality of picture frames in the video playback interface, the reference picture frame corresponding to a pause time point in the video" appears to be "displaying a reference picture frame of the plurality of picture frames in the video playback interface, the reference picture frame being a picture frame corresponding to a pause time point in the video" according to Claims 12 and 17.  
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 3-7, 9-12, 14-17, and 19-23 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 1, lines 12-15, Claim 12, lines 16-19, and Claim 17, lines 15-18 recite "simultaneously tracking/track the target object in a first set of picture frames of the plurality of pictures frames that are temporally displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are temporally displayed before the reference picture frame".  According to ¶ [0118] in the specification of the present invention: "Start tracking after a target object corresponding to a sticker is obtained and an operation of starting tracking is detected (for example, a user clicks an area other than the sticker in a video playback interface), obtain a video frame image A when a video is still, track an area B where the sticker is located, initialize tracking objects (including two tracking objects, for simultaneous forward and backward tracking) according to the image A and the area B, and obtain a timestamp C and a video duration D of the current video", however, there is no description found in the specification to support the afore-mentioned limitation "… tracking/track the target object … that are temporally displayed after the reference picture and … that are temporally displayed before the reference picture frame".  If the Examiner has overlooked the portion of the original Specification that describes these features of the present invention, the Applicant should point it out (by paragraph number or page number with line number) in response to the Office Action.  For examination purposes, "simultaneously tracking/track the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame" is considered.
Claims 3-7, 9-11, 14-16, and 19-23 are rejected for fully incorporating the deficiency of their respective base claims.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4-5 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 4 recites the limitation "obtaining a display position of the preview picture in the reference picture frame; and obtaining … according to the display position of the preview picture in the reference picture frame …" in lines 2-4.  There is insufficient antecedent basis for this limitation "the preview picture" in the claim.  For examination purpose, "obtaining a display position of a preview picture of the additional object in the reference picture frame; and obtaining … according to the display position of the preview picture of the additional object in the reference picture frame …" is considered.
Claim 5 is rejected for fully incorporating the deficiency of its base claim.
Claim 20 recites the limitation "The apparatus according to claim 17" in line 1.  There is insufficient antecedent basis for this limitation "the apparatus" in the claim.  For examination purposes, "The non-transitory computer readable storage medium according to claim 17 …" is considered.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3-6, 10-12, 14-15, 17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Brouwer et al. (US 2019/0246165 A1, filed on 10/18/2017), hereinafter Brouwer’165, in view of LaForge et al. (US 2015/0277686 A1, published on 10/01/2015), hereinafter LaForge, MARUYAMA et al. (US 2014/0344853 A1, published on 11/20/2014), hereinafter MARUYAMA'853, and Toklu et al. ("Semi-automatic video object segmentation in the presence of occlusion", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 4, Pages 624-629, JUNE 2000), hereinafter Toklu.

Independent Claims 1, 12, and 17
Brouwer’165 discloses an additional object display method, performed by a terminal (Brouwer’165, ABSTRACT; ¶ [0002]: methods and systems for facilitating commenting and messaging associated with particular portions of a video that may reflect the presence of an object in various frames of the video by contextually augmenting and annotating moving pictures or images with tags using region tracking on computing devices with screen displays, including mobile devices and virtual reality headsets), the method comprising: 
displaying a trigger control in a video playback interface (Brouwer’165, FIG. 2; ¶ [0068]: a mobile computing device 14 playing a video that is streamed from a server to the device, wherein the computing device 14 features standard controls known to video streaming applications 13; there may be different types of controls with different functions that can be activated; additional controls and/or functions can be made available; 1009 in FIG. 27; ¶ [0167]: a video player, which has standard known movie controls 1007, 1007b and Messaging or Chat 1009 functionality, or similar functionality, for establishing different types of communications such as sending messages or comments; the Chat or Messaging Function (1009) may be placed elsewhere in the interface; i.e., 1009 in FIG. 27 can be a trigger control to activate functionality for adding annotations, messages or comments to a video/movie), the video playback interface being used for playing a video comprising a plurality of picture frames (Brouwer’165, 1040 in FIG. 33; ¶ [0172]: FIG . 33 shows a video 1040 with a series of frames); 
pausing playback of the video in response to an activation operation on the trigger control (Brouwer’165, FIGS. 3-4; ¶¶ [0070]-[0071], [0079], and [0119]: a user 1), and displaying a reference picture frame of the plurality of picture frames in the video playback interface, the reference picture frame being a picture frame corresponding to a pause time point in the video (Brouwer’165, FIG. 5; ¶ [0071]: a user clicks on the side of the cowling of an airplane 20 to mark a location to place an annotation or Screen Tag which serves as an anchor, placed by a user at a specific position 12 in a movie or image; this anchor or Screen Tag 20 is associated with the ( x , y ) coordinate of the position 12 and the frame number corresponding to a pause time point in the video; FIGS. 33-34; ¶¶ [0128] and [0172]-[0177]: compares a screen selection of the video frame (e.g., 1041b) at the location that the user clicked on with the images of specific key frames (e.g., 1041b to 1041c), wherein a user clicks in the video, or pauses the video, at frame position 1041b, which shows the cube at time 0; analyzing each of the frames from 1041b through 1041c; determine if the tracking of the object is possible within this period of time; i.e., 1041b is used as reference frame for tracking);
obtaining a target object in response to a operation in the reference picture frame (Brouwer’165, FIGS. 4-5; ¶ [0071]: a user clicks on an object 11 (e.g., an airplane) at position 12 in a movie, where the object is the one to which the user wants to place or "attach" an annotation to; a user clicks on the side of the cowling of an airplane 20 to mark a location to place an annotation or Screen Tag which serves as an anchor, placed by a user at a specific position 12 in a movie or image; this anchor or Screen Tag 20 is associated with the ( x , y ) coordinate of the position 12 and the frame number) (Brouwer’165, FIGS. 27 and 33-34; ¶¶ [0164], [0168] and [0173]: after pausing the video, user then clicks, e.g., on the left top corner 1012b of the cube 1011b, where he/she intends to post a Message or Comment tag, which will then be associated with the actual location of the top left corner of the cube 1011b and 1011c for the duration of playback time from t+n to t+n2); 
simultaneously tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame  first set of picture frames, frame-by-frame, in a reverse order starting from a last frame of the first set of picture frames (Brouwer’165, FIG. 34; ¶¶ [0174]-[0175]: analyze a predefined area around the clicked location 1012b of the image by means of object tracking to determine whether the object will still be visible for at least a predetermined time so that a Tag Marker 1012b, th) frame and calculate backward when tracking; the system will start from either end using frame 1 and 98/100, e.g., followed by 19/2 and 80/99, and so forth; i.e., simultaneously tracking in a chronological order and in a reverse order of playback time); 
in response to the simultaneously tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set at least one of a display position, a display size, or a display posture of an additional object corresponding to the target object in the each picture frame; and displaying the additional object according to the second display information in the each picture frame during the playback of the video (Brouwer’165, FIGS. 13-14 and 19; ¶¶ [0111] and [0114]: Screen Tags 20 will follow their positively identified elements for at least a predetermined time, e.g., an airplane along with Screen Tags travels from top right to the bottom left of the screen; i.e., generating and displaying Screen Tags 20 according display position of positively identified elements) (Brouwer’165, FIGS. 27 and 34; ¶¶ [0171] and [0173]-[0176]: during playback the Message or Comment Tag would appear at that location of the Tag Marker 1012, which can then be interacted with during playback or when paused; a Tag Marker 1012b, which is then rendered for each of the frames at the chosen locations of the object for the interval from 1012b to 1012c and superimposed on the video frame 1041b and subsequent frames of a video 1040 during playback) (Brouwer’165, FIGS. 6 and 27-28; ¶¶ [0071]-[0073]: Screen Tag 20 is an icon on screen depicting the type of content associated with identified/attached/anchored object; Screen Tags 20 may be of different geometric shapes, colors or sizes, a logo, or text image or any other graphical elements; ¶ [0151]: a Tag Marker 1012 features a visible symbol, or graphical element, or marker of a particular size, which may differ in shape, size and color depending on the type of content the Message or Comment Tag is associated with).
Brouwer’165 further discloses an apparatus comprising: a memory storing at least one instruction, at least one program, a code set, or an instruction set; and a processor configured to execute the at least one instruction, the at least one program, the code set, or the instruction set, and upon execution, to perform the method 
Brouwer’165 further discloses a non-transitory computer readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set configured to be loaded and executed by a processor, and upon being executed, cause the processor to perform the method described above (Brouwer’165; FIG. 1; ¶ [0067]: a non-transitory computer readable storage medium and a processor are inherent in a server platform including server(s) 1 and client devices 4-9 as shown in FIG. 1).
Brouwer’165 fails to explicitly disclose (1) obtaining a target object in response to the drag operation on the trigger control, the target object being a display object corresponding to an end position of the drag operation in the reference picture frame; (2) simultaneously tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame, wherein the simultaneously tracking comprises simultaneously: tracking the target object in the first set of picture frames, frame-by-frame, in a chronological order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); and tracking the target object in the second set of picture frames, frame-by-frame, in a reverse order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); and (3) in response to the simultaneously tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set and the second set.
LaForge teaches a system and a method for superimposing an object/image over an identified specific section in a video (LaForge, ¶ [0040]), wherein obtaining a target object in response to a drag operation on the trigger control, the target object being a display object corresponding to an end position of the drag operation in the reference picture frame (LaForge, ¶ [0165]: user can drag the desired BOM image and drop it over the selected Reference Video Frame at the desired location with the help of a computer key or mouse or by using the touchscreen of a touchscreen enabled device; FIG. 4; ¶¶ [0177]-[0178]: user can drag the BOM image 302 and place it properly over the exact position required. e.g., place a hat image 302 over the hair section 402 of the TARGET image; the application can detect the entire section of the TARGET image which is to be covered by the BOM image).
Brouwer’165 and LaForge are analogous art because they are from the same field of endeavor, a system and a method for superimposing an object/image over an identified specific section in a video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of LaForge to Brouwer’165.  The motivation for doing so would have been to simplify user operations for adding an object/tag/annotation/comment/message/image attached/anchored to an item in a video and to enhance user experience.
Brouwer’165 in view of LaForge fails to explicitly disclose (1) simultaneously tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame, wherein the simultaneously tracking comprises simultaneously: tracking the target object in the first set of picture frames, frame-by-frame, in a chronological order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); and tracking the target object in the second set of picture frames, frame-by-frame, in a reverse order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when the user requests to insert an annotation/comment); (2) in response to the simultaneously tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set and the second set.
MARUYAMA'853 teaches a system and a method for generating an object to be superimposed and displayed on a video (MARUYAMA'853, ¶ [0002]), wherein (1) and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame, wherein the   (i.e., the time point when user request to insert an and tracking the target object in the second set of picture frames, frame-by-frame, in a reverse order starting from the reference picture frame corresponding to the pause time point (i.e., the time point when user request to insert an annotation/comment); (2) in response to the and the second set (MARUYAMA'853, ¶¶ [0108] and [0020]-[0023]: receive an input of the input information that contains frames and coordinates that are positional information that the user inputs with intention of displaying a comment to track an object in video; FIGS. 11A-B and 12A; ¶¶ [0109] and [0135]: calculates coordinate values (initial trajectory) along a series of time axis that is a motion of an object as a target followed by a user, on the basis of the input information and the video; ¶¶ [0111], [0168]-[0170] and [0036]: a previous trajectory is calculated back to the early direction in the time axis of the video from the coordinates of the starting point of the initial trajectory; the calculation of the previous trajectory is similar to the calculation of the initial trajectory except that pictures have to be input in the reverse order of the time passage direction that goes back in time from a point (i.e., the time point when user request to insert/post a comment) indicated with a frame, a picture, a time or a coordinate in a frame, to which a user assigns positional information as well as a point in the vicinity of them).
Brouwer’165 in view of LaForge, and MARUYAMA'853 are analogous art because they are from the same field of endeavor, a system and a method for generating an object to be superimposed and displayed on a video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the
Brouwer’165 in view of LaForge, and MARUYAMA'853 fails to explicitly disclose (1) simultaneously tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame, wherein the simultaneously tracking comprises simultaneously: tracking the target object in the first set of picture frames, frame-by-frame, in a chronological order starting from the reference picture frame; and tracking the target object in the second set of picture frames, frame-by-frame, in a reverse order starting from the reference picture frame; (2) in response to the simultaneously tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set and the second set.
Toklu teaches a system and a method for object tracking in a video (Toklu, ABSTRACT), wherein (1) simultaneously tracking the target object in a first set of picture frames of the plurality of pictures frames that are displayed after the reference picture frame and in a second set of picture frames of the plurality of picture frames that are displayed before the reference picture frame, wherein the simultaneously tracking comprises simultaneously: tracking the target object in the first set of picture frames, frame-by-frame, in a chronological order starting from the reference picture frame; and tracking the target object in the second set of picture frames, frame-by-frame, in a reverse order starting from the reference picture frame; (2) in response to the simultaneously tracking, obtaining first display information indicating at least one of a display position, a display size, or a display posture of the target object in each picture frame of the first set and the second set (Toklu, FIG. 1; Section II and Section III A in Pages 624-625: after the user selects a keyframe among the frames that form the life span of the video object to be tracked, the object is simultaneously tracked in both forward and backward time directions starting from the keyframe (i.e., reference frame) until the end of the life span of the object in both directions; find the boundary (i.e., position, size, and posture) of the moved region (i.e., region occupied by the object) in the next frame given the alpha plane of the object in the current frame).
Brouwer’165 in view of LaForge and MARUYAMA'853, and Toklu are analogous art because they are from the same field of endeavor, a system and a method for object tracking in a video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Toklu to Brouwer’165 in view of LaForge and MARUYAMA'853.  Motivation for doing so would enhance.
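
For illustration only, the limitation "simultaneously tracking ... in a chronological order ... and ... in a reverse order starting from the reference picture frame" may be understood along the lines of the following minimal sketch. It is not taken from the specification or from any cited reference. The names track_bidirectional, Box, and step are hypothetical, and the per-frame tracker passed in as step is an assumed placeholder for whatever single-frame tracking routine is actually used. The two directions are interleaved in one loop here; running them in two parallel tasks would be an equally plausible reading.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical bounding box: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def track_bidirectional(
    frames: Sequence[object],
    ref_index: int,
    ref_box: Box,
    step: Callable[[object, Box, object], Box],
) -> Dict[int, Box]:
    """Track an object forward and backward from a reference frame.

    `step(prev_frame, prev_box, next_frame)` is an assumed single-frame
    tracker that returns the box of the object in `next_frame`.
    Returns a mapping from frame index to the tracked box.
    """
    if not 0 <= ref_index < len(frames):
        raise IndexError("ref_index is outside the frame sequence")

    boxes: Dict[int, Box] = {ref_index: ref_box}
    fwd = ref_index  # last frame index tracked in chronological order
    bwd = ref_index  # last frame index tracked in reverse order

    # Advance both directions in lockstep until each reaches its end.
    while fwd < len(frames) - 1 or bwd > 0:
        if fwd < len(frames) - 1:
            boxes[fwd + 1] = step(frames[fwd], boxes[fwd], frames[fwd + 1])
            fwd += 1
        if bwd > 0:
            boxes[bwd - 1] = step(frames[bwd], boxes[bwd], frames[bwd - 1])
            bwd -= 1
    return boxes
```

In this sketch the returned mapping plays the role of the "first display information" for each picture frame of the first set and the second set.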

Claims 3, 14, and 19
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claims 1, 12, and 17 respectively and further discloses wherein the first display information comprises pixel coordinates of a target point in the target object in the each picture frame, the target point being a position point corresponding to the end position of the drag operation in the target object; and wherein generating the second display information according to the first display information comprises: obtaining pixel coordinates of the additional object in the each picture frame according to the pixel coordinates of the target point in the target object in the each picture frame and relative position information between the additional object and the target point; and generating the second display information comprising the pixel coordinates of the additional object in the each picture frame (Brouwer’165, FIG. 5; ¶ [0071]: a user clicks on the side of the cowling of an airplane 20 to mark a location to place an annotation or Screen Tag, which serves as an anchor, placed by a user at a specific position 12 in a movie or image; this anchor or Screen Tag 20 is associated with the ( x , y ) coordinate of the position 12 and the frame number; FIGS. 13-14 and 19; ¶¶ [0111] and [0114]: Screen Tags 20 will follow their positively identified elements for at least a predetermined time, e.g., an airplane along with Screen Tags travels from top right to the bottom left of the screen; i.e., generating and displaying Screen Tags 20 according to the display position of positively identified elements; in order for Screen Tags to follow identified elements in video, pixel coordinates of the Screen Tags must be obtained according to pixel coordinates of the target point and relative position between the Screen Tag and target point) (Brouwer’165, FIGS. 27 and 34; ¶¶ [0171] and [0173]-[0175]: during playback the Message or Comment Tag would appear at that location of the Tag Marker 1012; a Tag Marker 1012b, which is then rendered for each of the frames at the chosen locations of the object for the interval from 1012b to 1012c and superimposed on the video frame 1041b and subsequent frames of a video 1040 during playback) (LaForge, ¶ [0165]: user can drag the desired BOM image and drop it over location of the new image on the target image).
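
For illustration only, the relationship recited in Claims 3 and 4 between the pixel coordinates of the target point, the relative position information, and the pixel coordinates of the additional object can be sketched as simple vector arithmetic. This is an assumed reading, not a method disclosed by the application or the cited references; the function names relative_offset and additional_object_positions and the (x, y) tuple representation are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical pixel coordinate: (x, y).
Point = Tuple[int, int]


def relative_offset(preview_pos: Point, drag_end: Point) -> Point:
    """Offset of the preview picture from the drag end position,
    both measured in the reference picture frame."""
    return (preview_pos[0] - drag_end[0], preview_pos[1] - drag_end[1])


def additional_object_positions(
    target_points: Dict[int, Point], offset: Point
) -> Dict[int, Point]:
    """Apply the same offset to the tracked target point of every frame.

    `target_points` maps a frame index to the pixel coordinates of the
    target point obtained by tracking (the "first display information").
    The result maps each frame index to the pixel coordinates of the
    additional object (the "second display information").
    """
    return {
        index: (x + offset[0], y + offset[1])
        for index, (x, y) in target_points.items()
    }
```

Under this reading, an additional object dropped 20 pixels to the right of and 20 pixels above the target point keeps that same displacement in every tracked frame.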

Claim 4
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claim 3 and further discloses obtaining a display position of the preview picture in the reference picture frame (LaForge, 1422 in FIG.14B; ¶ [0229]: user can drag/drop the sticker 1422 to place it at the desired coordinates over the image 1411 in the main display area 1421; i.e., a display position of the preview picture in the image 1411 must be obtained before and after drag and drop operation); and obtaining the relative position information between the additional object and the target point according to the display position of the preview picture in the reference picture frame and the corresponding end position of the drag operation in the reference picture frame (LaForge, ¶ [0165]: user can drag the desired BOM image and drop it over the selected Reference Video Frame at the desired location; FIG. 4; ¶¶ [0177]-[0178]: user can drag the BOM image 302 and place it properly over the exact position required. e.g., place a hat image 302 over the hair section 402 of the TARGET image; i.e., the target point being a position point corresponding to the end position of the drag 

Claim 5
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claim 4 and further discloses wherein the drag operation comprises a first drag operation, the method further comprising: moving a position of the preview picture of the additional object in the video playback interface in response to a second drag operation on the preview picture of the additional object (Brouwer’165, ¶ [0191]: user has option to reposition the selection in case the user is not happy with selected object) (LaForge, 1422 in FIG.14B; ¶ [0229]: user can drag/drop the sticker 1422 to place it at the desired coordinates over the image 1411 in the main display area 1421; i.e., user can apply drag/drop operation to reposition the preview picture of the icon/sticker so that it is anchored to different object/location).  

Claims 6, 15, and 20
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claims 1, 12, and 17 respectively and further discloses wherein the first display information comprises the display size of the target object in the each picture frame; and wherein generating the second display information according to the first display information comprises: calculating a zoom ratio of the additional object in the each picture frame according to the display size of the target object in the each picture frame and an original size of the target object, the original size of the target object being a display size of the target object in the reference picture frame; obtaining the display size of the additional object in the each picture frame according to an original size of the additional object and the zoom ratio; and generating the second display information comprising the display size of the additional object in the each picture frame (LaForge, ¶¶ [0016] and [0156]-[0157]: modifications (e.g., changing size) done by the user in a single image frame are automatically applied to all image frames in which similar modifications would be applicable according to properties of the target image, such as size/width, and properties of the superimposing new image, such as size/width; i.e., the size/width of the superimposing new image will be adjusted according to the size/width of the target image in all image frames, according to the zoom ratio of the target image in all image frames with respect to the target image in the reference image frame).
Motivation for doing so would enhance visual presentation of the video with tags in sequence by shrinking or enlarging a tag according to the relative size of the item associated with the tag when the item moves closer to or farther from the camera in the video.
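
For illustration only, the zoom ratio recited in Claims 6, 15, and 20 can be sketched as the ratio of the target object's display size in a given frame to its size in the reference picture frame. Neither the claims nor the cited references specify which dimension is used, so basing the ratio on width alone is an assumption of this sketch, as are the names zoom_ratio and scaled_size and the (width, height) tuple representation.

```python
from typing import Tuple

# Hypothetical display size: (width, height) in pixels.
Size = Tuple[int, int]


def zoom_ratio(current_size: Size, original_size: Size) -> float:
    """Ratio of the target object's width in the current frame to its
    width in the reference picture frame (width-based by assumption)."""
    if original_size[0] <= 0:
        raise ValueError("original width must be positive")
    return current_size[0] / original_size[0]


def scaled_size(additional_original: Size, ratio: float) -> Size:
    """Display size of the additional object after applying the ratio."""
    width, height = additional_original
    return (round(width * ratio), round(height * ratio))
```

For example, if the target object shrinks from 100 pixels wide in the reference picture frame to 50 pixels wide in a later frame, the ratio is 0.5 and a 30 by 20 pixel additional object would be displayed at 15 by 10 pixels in that frame.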

Claim 10
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claim 1 and further discloses displaying, corresponding to the trigger control, a switch control in the video playback interface; displaying an additional object selection interface in response to an activation operation on the switch control, the additional object selection interface comprising at least two candidate objects; and obtaining, in response to a selection operation in the additional object selection interface, a candidate object corresponding to the selection operation as a new additional object corresponding to the trigger control (Brouwer’165, FIGS. 27-28; ¶¶ [0097], [00100], and [0138]: user may select a type of icon/symbol that depicts the type of category from a drop down or popup menu; i.e., displaying a switch control allowing users to change different types of icons/symbols that depict different types of categories from drop down or popup menu) (LaForge, FIGS. 3-4 and 14B; ¶¶ [0177] and [0228]-[0229]: in response to selection of option 1420, various categories of stickers or BOMS available are displayed in the image display section 1404; user can select any of the stickers or images in the display area 1404 to insert/superimpose it over the image 1411 in the main display area 1421).  

Claim 11
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claim 1 and further discloses wherein the additional object is a static display object or a dynamic display object (Brouwer’165, FIGS. 13-14; ¶ [0111]: as the video plays back, the Tag Containers 40, 42 will remain static while the Screen .

Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu as applied to Claims 1 and 12 respectively above, and further in view of Franklin et al. (US 2016/0196052 A1, published on 07/07/2016), hereinafter Franklin.

Claims 7 and 16
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claims 1 and 12 respectively and further discloses wherein the first display information comprises the display position and the display posture of the target object in the each picture frame (Toklu, FIG. 1; Section III A in Page 625: find the boundary (i.e., position, size, and posture) of the moved region (i.e., region occupied by and wherein generating the second display information according to the first display information comprises: obtaining the display position  (Brouwer’165, FIG. 5; ¶ [0071]: a user clicks on the side of the cowling of an airplane 20 to mark a location to place an annotation or Screen Tag, which serves as an anchor, placed by a user at a specific position 12 in a movie or image; this anchor or Screen Tag 20 is associated with the (x, y) coordinate of the position 12 and the frame number; FIGS. 13-14 and 19; ¶¶ [0111] and [0114]: Screen Tags 20 will follow their positively identified elements for at least a predetermined time, e.g., an airplane along with Screen Tags travels from top right to the bottom left of the screen; i.e., generating and displaying Screen Tags 20 according to the display position of positively identified elements; in order for Screen Tags to follow identified elements in video, the display position of the Screen Tags must be obtained according to the display position of the target point and the relative position between the Screen Tag and the target point) (Brouwer’165, FIGS. 27 and 34; ¶¶ [0171] and [0173]-[0175]: during playback the Message or Comment Tag would appear at that location of the Tag Marker 1012; a Tag Marker 1012b, which is then rendered for each of the frames at the chosen locations of the object for the interval from 1012b to 1012c and superimposed on the video frame 1041b and subsequent frames of a video 1040 during playback).
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu fails to explicitly disclose wherein the first display information comprises the display posture of the target object in each picture frame; and wherein generating the second display information according to the first display information comprises: obtaining the display posture of the additional object in each picture frame according to the display posture of the target object in each picture frame; and generating the second display information comprising the display posture of the additional object in each picture frame.
Franklin teaches a system and a method for adding one or more graphical features or overlaying user interface elements in pictures and/or video (Franklin, ¶ [0015]), wherein the first display information comprises the display posture of the target object in each picture frame; and wherein generating the second display information according to the first display information comprises: obtaining the display posture of the additional object in each picture frame according to the display posture of the target object in each picture frame; and generating the second display information comprising the display posture of the additional object in each picture frame (Franklin, orientation information of overlay element information associated with the UI element, etc., to align, orient, and/or proportion to a particular feature or features as indicated in the overlay feature alignment information, e.g., align adjacent to a top edge of a detected feature such as a head, on top of a detected feature such as eyes, below a detected feature such as a face, overlapping a detected feature such as a nose, etc., to features detected in the image recognition information, e.g., eyes, face, mouth, head, etc., so that the rendered and visually presented overlay UI element in a composite overlay UI element may be substantially positioned and proportioned at least near one or more features detected in the background images and/or background videos represented by image and/or video information).
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu, and Franklin are analogous art because they are from the same field of endeavor, a system and a method for adding one or more graphical features or overlaying user interface elements in pictures and/or video.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Franklin to Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu.  Motivation for doing so would improve visual presentation of composite overlay image or video by .
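
For illustration only, the position relationship discussed for Claims 7 and 16 (the additional object follows the target by keeping a fixed relative position to the target point) can be sketched as follows. This sketch is not taken from the application or from the cited references. The function name `overlay_position` and the (x, y) pixel tuples are assumptions made for the example, and display posture is deliberately left out.

```python
def overlay_position(target_pos, target_ref_pos, overlay_ref_pos):
    """Return the additional object's display position in one picture frame.

    target_pos      -- (x, y) of the target point in this frame
    target_ref_pos  -- (x, y) of the target point in the reference frame
    overlay_ref_pos -- (x, y) where the additional object was placed in the
                       reference frame

    Assumption: the offset between the additional object and the target
    point stays constant across frames (no rotation or scaling).
    """
    # Relative position fixed at the moment of placement.
    dx = overlay_ref_pos[0] - target_ref_pos[0]
    dy = overlay_ref_pos[1] - target_ref_pos[1]

    # Re-apply that offset to the tracked target position in this frame.
    return (target_pos[0] + dx, target_pos[1] + dy)


# Hypothetical values: the target moves from (10, 20) to (50, 60).
position_in_frame = overlay_position((50, 60), (10, 20), (15, 18))
```

Under these assumptions, the additional object moves by the same displacement as the tracked target point in every frame.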

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu as applied to Claim 1 above, and further in view of Nebehay et al. ("Clustering of Static-Adaptive Correspondences for Deformable Object Tracking", 2015 IEEE Conference on Computer Vision and Pattern Recognition, published on 06/01/2015, pp. 2784-2791), hereinafter Nebehay.

Claim 9
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claim 1 and further discloses wherein tracking the target object in the each picture frame of the video, and obtaining the first display information comprises: tracking the target object in the each picture frame of the video by using a  (Brouwer’165, FIG. 22; ¶¶ [0064]-[0065], [0080]-[0081], and [0126]: augmenting and annotating moving pictures or images with tags using pixel region tracking; use one of several known methods for tracking the collection of pixels around the location where the user wants to insert the tag, which include a method by which video frames are compared to detect and track an object in motion, color comparison method where the system searches for color/shade differences, marker-less tracking, where the frame is converted to black and white to 
	Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu fails to explicitly disclose tracking the target object in each picture frame of the video by using a clustering of static-adaptive correspondences for deformable object tracking (CMT) algorithm.
	Nebehay teaches a method for object tracking in a video sequence (Nebehay, FIGS. 1 and 10; Abstract; Section 3.0 in Pages 2785-2786), wherein tracking the target object in each picture frame of the video by using a clustering of static-adaptive correspondences for deformable object tracking (CMT) algorithm (Nebehay, Title; Abstract; Sections 3.1-3.3 in Pages 2786-2787: employ both static correspondences from the initial appearance of the object as well as adaptive correspondences from the previous frame to address the stability-plasticity dilemma; employ a pairwise dissimilarity measure between correspondences based on their geometric compatibility, directly reflecting the deformation of the object of interest, allowing inlier correspondences to be separated from outliers).
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu, and Nebehay are analogous art because they are from the same field of endeavor, a method for object tracking in a video sequence.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Nebehay to Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu.  .

Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu as applied to Claims 1, 12, and 17 respectively above, and further in view of Microsoft ("Screen Captures for Dragging Anchoring Control of a Picture in Microsoft Word 2016", released on 09/22/2015³), hereinafter Microsoft.

Claims 21-23
	Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu discloses all the elements as stated in Claims 1, 12, and 17 respectively and further discloses performing the drag operation by dragging the trigger control/preview picture away from an initial position of a preview picture of the additional object (Brouwer’165, ¶¶ [0097] and [00100]: user may select a type of icon that depicts the type of category from a drop down or popup menu; ¶ [0191]: user has option to reposition the selection in case the user is not happy with selected object; i.e., it must have preview function so that user can select different types of icons or reposition the selected icon when user is not happy with the selection) (LaForge, ¶ [0165]: user can drag the desired BOM image and drop it over the selected Reference Video Frame at the desired drag the BOM image 302 and place it properly over the exact position required, e.g., place a hat image 302 over the hair section 402 of the TARGET image; the application can detect the entire section of the TARGET image which is to be covered by the BOM image; FIG. 11B; ¶ [0201]: BOM preview 1102 is placed over the image 1101 selected by the user from BOMS displayed in scrolling area 1103; 1422 in FIG. 14B; ¶ [0229]: the sticker 1422 highlighted or selected in the display area 1404 is shown/previewed superimposed over the image 1411 in the main display area 1421; i.e., the trigger control initially is at the same location as the preview picture in scrolling area 1103 or display area 1404; when a BOM image is selected and dragged, the preview picture can be placed over the image 1101 or 1411 at a desired location).
	Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu fails to explicitly disclose performing the drag operation by dragging the trigger control away from an initial position of a preview picture of the additional object without moving the preview picture.
Microsoft teaches a system and a method for anchoring an object, wherein performing the drag operation by dragging the trigger control away from an initial position of a preview picture of the additional object without moving the preview picture (Microsoft, Pages 1-2: an anchor control initially is located at the same paragraph as the picture) (Microsoft, Pages 3-7: the anchor control is dragged away from its initial position without moving the picture) (Microsoft, Page 8: when the anchor control is dropped at the third paragraph below the picture, the third paragraph below 
Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu, and Microsoft are analogous art because they are from the same field of endeavor, a system and a method for anchoring an object.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Microsoft to Brouwer’165 in view of LaForge, MARUYAMA'853, and Toklu.  The motivation for doing so would be to provide a user interface that is simple and easy to operate for defining a spaced relationship between graphic objects and for facilitating naturally binding data to graphic objects⁴.

Response to Arguments
Applicant’s arguments filed on 01/27/2022 with respect to Claims 1, 12, and 17 have been fully considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Aaron Martinez ("Snapchat Update v9.28.0.0 - How to Use 3D Stickers on Snapchat (Moving Emojis)", posted on Apr 21, 2016 at www.youtube.com/watch?v=pXrTy8NOnhQ) discloses a Snapchat system for inserting and attaching various emojis to moving objects in the recorded video (e.g., attaching eyes, nose, mouth, and ear emojis to corresponding features in a moving face) and the position and orientation of these emojis are consistent with attached moving objects (Martinez, Pages 2-28 of screen captures at time stamp 4:52 - 6:14).

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HWEI-MIN LU whose telephone number is (313)446-4913. The examiner can normally be reached Mon - Fri: 9:00 AM - 6:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.


If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, WILLIAM L BASHORE can be reached on (571)272-4088. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/HWEI-MIN LU/Examiner, Art Unit 2175    

/REZA NABI/Primary Examiner, Art Unit 2175

        1 See, for example US 2009/0327856 A1 to Mouilleseaux et al., published on 12/31/2009, 115 in Figure 1; paragraphs [0042]-[0043]: the video clip has paused at particular frame 127 in response to a user activating control 115 to add an annotation to the clip.
        2 See, for example US 2008/0046956A1 to Kulas, published on 02/21/2008, Figure 4; paragraphs [0028]-[0031].
        3 See https://en.wikipedia.org/wiki/Microsoft_Office_2016.
        4 See, for example US 2019/0114057 A1 to KERR, filed on 10/13/2017, FIGS. 11A-B; ¶ [0100]-[0103], [0021], and [0031].