DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendments, filed 7/21/2022, have been entered and made of record. Claims 1, 6, and 21 have been amended. Claim 26 has been added. Claims 11-16 have been withdrawn. Claims 1-26 are pending.
Response to Arguments
Applicant's arguments in the Remarks filed on 7/21/2022 have been considered but are moot in view of the new ground(s) of rejection.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Sathish in view of Wait
Claims 1-10 and 21-25 are rejected under 35 U.S.C. 103 as being unpatentable over Sathish(USPubN 2013/0259446) in view of Wait et al.(USPubN 2019/0013047; hereinafter Wait).
As per claim 1, Sathish teaches a method of presenting video summarization, comprising: 
presenting, via a display device of a client device, a summary view of event(“FIG. 4A illustrates a user interface (e.g., interface 401) for presenting a customized media item (e.g., personalized video) of an event (e.g., a concert) on a single two-dimensional screen” in Para.[0086], The user interface can be interpreted as a display device of a client device.);
receiving, via a user input device of the client device, a first request to view at least one of a plurality of video streams, the first request including an indication of a first time associated with the at least one of the plurality of video streams(“the system 100 comprises one or more user equipment (UEs) 101a-101c (also collectively referred to as the UEs 101) containing a user interface client 109a-109c (also collectively referred to as user interface client 109) having connectivity to a media platform 103 via a communication network 105” in Para.[0042], “media platform determines one or more viewpoints of a live event selected by a user” in Abs, “The system 100 may render a user interface for determining a selection of at least one source-target pair by a remote user as a viewpoint for streaming one or more media segments of interest. By way of example, the system 100 may render a user interface in the form of a map on a user device. The map may cover an area that is selected by a user, such as a specific stage, specific coordinates, a boundary around a specific location, or the like. Thus, based on the user interface of the map, the user may query for media segments that is associated with the viewpoint marked in the map” in Para.[0029], “the user enters a plurality of source-target pairs in sequence with various time periods in-between for media segments associated with a camera movement flow in the concert hall. The system 100 matches/selects media segments accordingly to compile a customized cut for the user” in Para.[0031], “Alternatively, or in addition to the foregoing, the user may select an object of interest, such as the violinist, as a basis for performing a media segment match/selection process. 
Further, the user may enter characteristics associated with an object, such as any performer moving on the stage, and may further select one or more characteristics associated with the media segments/items, such as sudden changes of sound/lighting volumes (e.g., climax of the music, audience clapping, etc.), time of day, season, orientation, depth of field, white balance, author(s), etc. In another embodiment, the system 100 suggests an object of interest, characteristics associated with an object, characteristics associated with the media segments/items, or a combination thereof for the user to select” in Para.[0032], Sathish teaches the system may render a user interface for determining a selection of at least one source-target pair by a remote user as a viewpoint for streaming one or more media segments of interest. The user enters a plurality of source-target pairs in sequence with various time periods in-between for media segments associated with a camera movement flow in the concert hall. The system matches/selects media segments accordingly to compile a customized cut for the user. In addition to the foregoing, the user may select an object of interest, such as the violinist, as a basis for performing a media segment match/selection process. Further, the user may enter characteristics associated with an object, such as any performer moving on the stage, and may further select one or more characteristics associated with the media segments/items, such as sudden changes of sound/lighting volumes (e.g., climax of the music, audience clapping, etc.), time of day, season, orientation, depth of field, white balance, author(s), etc. The user may query for media segments with different characteristics associated with the media segments such as time.); 
transmitting, by the processing circuit via a communications interface of the client device, a second request to retrieve a plurality of image frames based on the indication of the first time to a first database maintaining the plurality of image frames(“The media platform then determines respective media segments that depict the respective one or more viewpoints” in Abs, “transmit the plurality of media items taken by different user devices and related information (e.g., context data and/or metadata) to the media platform 103 for further processing and/or storage in the media items database 113 and the context data database 115” in Para.[0042], “the media platform 103 may receive the plurality of media items (e.g., videos) and context data associated with the media items from the UEs 101 and then buffer the information in the media items database 113 and the context data database 115, respectively. Alternatively, the context data can be buffered as a part of the respective media items. The media items database 113 can be utilized for collecting and buffering the plurality of media items. More specifically, the media items database 113 may include a plurality of media items (e.g., videos), one or more media segments (e.g., video referring to the violinist, and/or the singer), one or more customized media items (e.g., personalized video), or a combination thereof. Further, the context data database 115 may be utilized to store current and historical data about one or more events, and which media items belong to which event, media channels and/or customized media items. Moreover, the media platform 103 may have access to additional historical data (e.g., historical sensor data or additional historical information about a region that may or may not be associated with events) to determine if an event is occurring or has occurred at a particular time. 
This feature can be useful in determining if newly uploaded media items can be associated with one or more events” in Para.[0044], “the communication network 105 of system 100 includes one or more networks” in Para.[0045]); 
receiving, from the first database, the plurality of image frames; and providing, by the processing circuit to the display device of the client device, a representation of a plurality of video stream objects corresponding to the plurality of image frames received from the first database (“FIGS. 4A-4C are diagrams of a user interface utilized in the process of FIG. 3, according to various example embodiments. As shown, the example user interface of FIG. 4A includes one or more user interface elements, such as the viewpoints, and/or functionalities created and/or modified based, at least in part, on information, data, and/or signals resulting from the process 300 described with respect to FIG. 3. More specifically, FIG. 4A illustrates a user interface (e.g., interface 401) for presenting a customized media item (e.g., personalized video) of an event (e.g., a concert) on a single two-dimensional screen. As previously discussed, the interface 401 is generated by the media platform 103 based on the viewpoints selected by a remote user and the context information associated with the one or more media segments determined from a plurality of media items captured during the event. As shown in FIG. 4A, a user is able to touch or select a source position 403 and one or more target positions on the stage (e.g., a violinist 405, a singer 407, etc.) to determine which respective viewpoints V and S is presented as arrows on one or more display screens. FIG. 4B shows on the top the violinist video includes viewpoints 421-431 of the violinist, a flutist, a cellist, a pianist, the singer, and the guitarist. FIG. 4B also shows the singer video from different viewpoints 433-443. In addition, a user has the option to present and/or playback the media items, media segments, personalized videos, or a combination thereof by touching an automatic mixing element 409 and or a change view element 411 in different manners. 
A user is able to touch or select the automatic mixing element 409 to concurrently present the violinist video and the singer video, and the change view element 411 to set the videos in a picture-in-picture mode. An interface 461 shown in FIG. 4C has the singer video shown in a main screen 463 and the violinist video shown in a secondary screen 465” in Para.[0086]).
Sathish is silent about presenting a summary stream including at least one or more sampled images of a plurality of synchronized video streams simultaneously, wherein the plurality of synchronized video streams is stored in each of at least two video recorders that captured the plurality of synchronized video streams, and receiving a first request to view at least a portion of the summary stream.
Wait teaches presenting a summary stream including at least one or more sampled images of a plurality of synchronized video streams simultaneously, wherein the plurality of synchronized video streams is stored in each of at least two video recorders that captured the plurality of synchronized video streams, and receiving a first request to view at least a portion of the summary stream(“The camera architecture 100 includes cameras 110A through 110H positioned around and/or within the event location 105. The cameras 110A through 110H may be devices that are capable of capturing and/or generating (e.g., taking) images (e.g., pictures) and/or videos (e.g., a sequence of images) of the event location 105 … the images (e.g., arrays of images or image arrays) and/or videos captured by one or more of the cameras 110A through 110H may be stored in a data store such as memory (e.g., random access memory), a disk drive (e.g., a hard disk drive or a flash disk drive)” in Para.[0021], “the operation of the cameras 110A through 110H may be synchronized with each other and the cameras 110A through 110H may capture images and/or videos of the event location 105 in a synchronized and/or coordinated manner (e.g., the videos captured by the cameras 110A through 110H may be synchronized in time)” in Para.[0024], “a server computing device may analyze and/or process the interesting portions of the videos 210, 220, 230, and/or 240 to generate video 250 (e.g., a content item) based on the interesting portions. The server computing device may identify a subset of the interesting portions of the videos 210, 220, 230, and/or 240 and may generate the video 250 based on the subset of the interesting portions of the videos 210, 220, 230, and/or 240. 
For example, the server computing device may identify a subset of the interesting portions 210A, 210D, 210F, 220A, 230B, 230C, 240G, and 240X and may generate the video 250 based on the subset” in Para.[0049], “The summary video may present a subset of the interesting portions to provide the viewer of the summary video with a recap or summary focusing on specific people, objects, and/or occurrences that are depicted in the videos of the event” in Para.[0034], Fig. 2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sathish with the above teachings of Wait in order to enhance the user's experience by presenting the video with a synopsis of the main events easily and effectively.
As per claim 2, Sathish and Wait teach all of the limitations of claim 1.
Sathish teaches comprising: receiving, via the user input device, a third request including an indication of a second time associated with the at least one of the plurality of video streams; and updating, by the processing circuit, the representation of the plurality of video stream objects based on the third request(Para.[0031], [0032]).
As per claim 3, Sathish and Wait teach all of the limitations of claim 1.
Sathish teaches comprising: receiving, via the user input device, a third request indicating instructions to view a single video stream object of the plurality of video stream objects; transmitting, by the processing circuit via the communications interface to the first database, a fourth request to retrieve a high definition version of images frames corresponding to the single video stream object; receiving, from the first database, the high definition version of the image frames; and updating, by the processing circuit, the representation of the plurality of video stream objects to present the single video stream object including the high definition version of the image frames(“the media platform 103 determines to generate a compilation of at least a portion of the media segments based, at least in part, on the metadata (such as time/date, location, name of event, etc.) and the synchronization. The compilation is dynamically generated during the live event, playback of one or more of the media items, or a combination thereof. The media platform 103 causes, at least in part, a generation of a video for respective one or more viewpoints, wherein the video compiles one or more media segments that depict the respective one or more viewpoints. In one embodiment, the media platform 103 generates a video for each object of interest, a violinist, a singer, etc. In another embodiment, the media platform 103 compresses and/or compiles the multiple personalized videos into a single media stream” in Para.[0077], “the media platform 103 makes high resolution cuts through post-creation, by fetching of higher quality video and audio” in Para.[0084]).
As per claim 4, Sathish and Wait teach all of the limitations of claim 1.
Sathish teaches comprising: identifying, by the processing circuit, a feature of interest assigned to at least one image frame of at least one video stream object; and updating, by the processing circuit, the representation of the plurality of video stream objects to present a display object corresponding to the feature of interest(Para.[0086]).
As per claim 5, Sathish and Wait teach all of the limitations of claim 4.
Sathish teaches wherein the feature of interest includes at least one of an indication of motion detected, a person detected, an object deposited or removed, or a tripwire crossed in the at least one image frame(Para.[0032], “when the system 100 determines to present the customized media items/segments and/or user interface (UI) on a three-dimensional display, the system 100 causes a rendering of a user interface that can include one or more objects with facets associated with the respective one or more user interface elements, one or more videos, or a combination thereof. By way example, the system 100 can determine to render a user interface consisting of a cube for a customized media item consisting of six viewpoints, or an object determined by a user based on the same concept of associating a facet of the object with a viewpoint and/or personalized video. In this example, a user can use a gesture on the facet of the cube interface to cause the system 100 to rotate the UI and/or select one or more corresponding viewpoints or personalized videos to present and/or playback. In another example, a user can use a split gesture to cause the system 100 to divide two or more personalized videos of the UI (e.g., a cube) to create two more presentations on the same screen. Further, a select and combinational gesture by a user can cause the system 100 to combine two or more personalized videos in different manners” in Para.[0041], “when the plurality of media items is captured by the UEs 101, related context data (e.g., metadata) is also simultaneously generated for example from the sensor modules 107 within the UEs 101 and the context data can then be determined and associated with the plurality of media items by the media platform 103 or by the UEs 101 themselves. 
By way of example, the context data associated with the plurality of media items can include time information, a position of the UEs 101, an altitude of the UEs 101, a tilt of the UEs 101, an orientation/angle of the UEs 101, a zoom level of the camera lens of the UEs 101, a focal length of the camera lens of the UEs 101, a field of view of the camera lens of the UEs 101, a radius of interest of the UEs 101 while capturing the media content, a range of interest of the UEs 101 while capturing the media content, or a combination thereof. The position of the UEs 101 can be also be detected from one or more sensors of the UE 101 (e.g., via GPS). The user's location can be determined by Cell of Origin, wireless local area network triangulation, or other location extrapolation technologies. Further, the altitude can be detected from one or more sensors such as an altimeter and/or GPS. The tilt of the UEs 101 can be based on a reference point (e.g., a camera sensor location) with respect to the ground based on accelerometer information. Moreover, the orientation can be based on compass (e.g., magnetometer) information and may be based on a reference to north. One or more zoom levels, a focal length, and a field of view can be determined according to a camera sensor. Further, the radius of interest and/or focus can be determined based on one or more of the other parameters contained in parameter database 117 or another sensor (e.g., a range detection sensor)” in Para.[0043]).
As per claim 6, Sathish teaches a video summarization device, comprising: a communications interface; a display device; a user input device; a processing circuit(“Communication Interface, Display, Input Device, Processor” in Figs. 5, 6, 7), and the other limitations of claim 6 have been discussed in the rejection of claim 1 and are rejected under the same rationale.
As per claim 7, the limitations of claim 7 have been discussed in the rejection of claim 2 and are rejected under the same rationale.
As per claim 8, the limitations of claim 8 have been discussed in the rejection of claim 3 and are rejected under the same rationale.
As per claim 9, the limitations of claim 9 have been discussed in the rejection of claim 4 and are rejected under the same rationale.
As per claim 10, the limitations of claim 10 have been discussed in the rejection of claim 5 and are rejected under the same rationale.
As per claim 21, Sathish teaches a non-transitory computer readable medium including instructions that, when executed by a processor(“The term "computer-readable medium" as used herein refers to any medium that participates in providing information to processor 502, including instructions for execution. Such a medium may take many forms, including, but not limited to computer-readable storage medium (e.g., non-volatile media, volatile media), and transmission media. Non-transitory media, such as non-volatile media, include, for example, optical or magnetic disks, such as storage device 508” in Para.[0097]), and the other limitations of claim 21 have been discussed in the rejection of claim 1 and are rejected under the same rationale.
As per claim 22, the limitations of claim 22 have been discussed in the rejection of claim 2 and are rejected under the same rationale.
As per claim 23, the limitations of claim 23 have been discussed in the rejection of claim 3 and are rejected under the same rationale.
As per claim 24, the limitations of claim 24 have been discussed in the rejection of claim 4 and are rejected under the same rationale.
As per claim 25, the limitations of claim 25 have been discussed in the rejection of claim 5 and are rejected under the same rationale.

Sathish in view of Wait and Kodama
Claims 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sathish(USPubN 2013/0259446) in view of Wait et al.(USPubN 2019/0013047; hereinafter Wait) further in view of Kodama et al.(USPubN 2004/0156548; hereinafter Kodama).
As per claim 17, Sathish and Wait teach all of the limitations of claim 1.
Sathish and Wait are silent about wherein the at least one or more sampled images include a first video quality and the plurality of image frames include a second video quality higher than the first video quality.
Kodama teaches wherein the at least one or more sampled images include a first video quality and the plurality of image frames include a second video quality higher than the first video quality(“according to the JPEG 2000 scheme (ISO/IEC FCD 15444-1), while a first original image is stored in a high-definition state, a second image can be created by extracting a part of the first original image, which part is a part of the first original image having a specific resolution, a specific image quality or so. By applying this method, it becomes possible to output (display, print or transmit) a thumbnail image or so as mentioned above” in Para.[0425]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sathish and Wait with the above teachings of Kodama in order to facilitate efficient indexing and viewing of a video program.
As per claim 18, Sathish and Wait teach all of the limitations of claim 17.
Sathish and Wait are silent about wherein the at least one or more sampled images are thumbnail images and the plurality of image frames are high-definition images.
Kodama teaches wherein the at least one or more sampled images are thumbnail images and the plurality of image frames are high-definition images(“according to the JPEG 2000 scheme (ISO/IEC FCD 15444-1), while a first original image is stored in a high-definition state, a second image can be created by extracting a part of the first original image, which part is a part of the first original image having a specific resolution, a specific image quality or so. By applying this method, it becomes possible to output (display, print or transmit) a thumbnail image or so as mentioned above” in Para.[0425]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sathish and Wait with the above teachings of Kodama in order to facilitate efficient indexing and viewing of a video program with images of two different qualities.
As per claim 19, the limitations of claim 19 have been discussed in the rejection of claim 17 and are rejected under the same rationale.
As per claim 20, the limitations of claim 20 have been discussed in the rejection of claim 18 and are rejected under the same rationale.

Sathish in view of Wait and Iwamoto
Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Sathish(USPubN 2013/0259446) in view of Wait et al.(USPubN 2019/0013047; hereinafter Wait) further in view of Iwamoto(USPubN 2018/0376058).
As per claim 26, Sathish and Wait teach all of the limitations of claim 1.
Sathish and Wait are silent about further comprising presenting a table of times associated with the at least one or more sampled images for user selection.
Iwamoto teaches further comprising presenting a table of times associated with the at least one or more sampled images for user selection(“where a timeline of images (recorded video images) of a plurality of continuous frames is displayed as an example of an object for selecting an image. However, it is not always necessary to display images of a plurality of continuous frames or display the timeline. For example, it is also possible to display a thumbnail image of each of a plurality of images as an example of an object for selecting any one of a plurality of the images. A plurality of images may be images of frames of a moving image or may be still images. In this case, the display control unit 125 displays the above-described display patterns on a plurality of the thumbnail images in a superimposed way according to the imaging directions of the imaging apparatus 110 used when images corresponding to the thumbnail images have been captured. Then, the display control unit 125 displays in an enlarged way an image selected from a plurality of the thumbnail images by the user. In this case, the user may be allowed to select only one of a plurality of the thumbnail images, or select at least two thereof at the same time.” in Para.[0098]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Sathish and Wait with the above teachings of Iwamoto in order to facilitate efficient indexing and viewing of a video program.
Conclusion 
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SUNGHYOUN PARK whose telephone number is (571)270-1333. The examiner can normally be reached M - Thur 6:00 am - 4 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THAI Q TRAN, can be reached on (571)272-7382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SUNGHYOUN PARK/Examiner, Art Unit 2484