DETAILED ACTION
1. 	This communication is responsive to the amendment, filed December 04, 2020.
2. 	Claims 1-20 are pending in this application.  Claims 1, 9, 15, 18 are independent claims. This action is made Final.
3. 	The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Examined under the first inventor to file provisions of the AIA 
4. The present application was filed on August 30, 2016, which is on or after March 16, 2013, and thus is being examined under the first inventor to file provisions of the AIA.

5. In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Response to Arguments
6.  Applicant’s arguments in the amendment filed on December 04, 2020, with respect to claims 1-20, have been fully considered and are found persuasive.

The applicant argues:
1. The newly amended claim language of “displaying at least one of the first summary frames 
and at least one corresponding piece of first summary information of the at least one of the first 
summary frames, among the plurality of pieces of first summary information, together with at 
least one frame of the video, wherein the at least one corresponding piece of first summary 
information comprises matching information used to search for the at least one of the first 
summary frames and further comprises at least one of text information indicating a reproduction 
location in time of the at least one of the first summary frames in the video or text information 
indicating a reproduction location in time of a next key frame in the video” overcomes the
previously cited art.

The examiner responds: 
1. The examiner agrees.

The previously cited art does not explicitly teach 1) searching based on matching summary text
information and 2) summary information comprising text information indicating a reproduction
location in time of a key frame.

Williams [Fig 7] shows matching the search word *dog* (item 704) to dog
video items 706.

Williams [0099] “the user enters a search query "dog" into a search portion 704 of the 
interface. The data processing environment can respond by presenting a series of search results 
in a results portion 706 of the interface. Each entry in the search result corresponds to a key 
frame of a corresponding video item relating to the theme of dogs”.



The examiner cites the new art of Shichman et al. (“Shichman”, US 2018/0132011).

Shichman [0103] “For example, user interface unit 320 may enable performing textual search in metadata”.

Shichman [0043] “Metadata for (or of) a data object as referred to herein may be, or may include, one or more data elements that describe, or provide other information for, other data, beyond the primary data (e.g., the video clip itself) for the data object. For example, metadata of a video clip may be its length, the time it was received by a server, specific times of the clip, or who is seen in the video clip (e.g., a name of a player shown in a video may be included in metadata), or what happens in the clip (such as `3-point basket`)”.

Shichman [0046] “For example, metadata may include parameters and descriptions of events, e.g., scores, fouls and time information of a sports event that may be shown in a related or associated video clip”

Thus, Shichman teaches text searches based on *who is seen in the video clip* such as a *dog*, *description of the event* such as “dog”, and *specific times of the clip*.

Shichman [0062] “For example and as shown, an event described by event object 451 may have a start time that is segment 423 of video 420 and an end time that is segment 425 of video 420”.

Thus, Shichman teaches the *specific times of the clip* are indexed to source video times.

Thus, Shichman teaches the event objects (key frames) are reproduction objects of locations in time.
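
For illustration only, the kind of textual search over clip metadata described in the Shichman passages quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not Shichman's implementation: the `ClipMetadata` record, its field names, and the `search_metadata` function are hypothetical and do not appear in any cited reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClipMetadata:
    """Hypothetical metadata record for one video clip (illustrative only)."""
    description: str      # what happens in the clip
    people: List[str]     # who is seen in the clip
    start_s: float        # start time within the source video, in seconds
    end_s: float          # end time within the source video, in seconds


def search_metadata(clips: List[ClipMetadata], query: str) -> List[ClipMetadata]:
    """Return clips whose textual metadata contains the query (case-insensitive)."""
    q = query.casefold()
    return [
        clip for clip in clips
        if q in clip.description.casefold()
        or any(q in person.casefold() for person in clip.people)
    ]


clips = [
    ClipMetadata("3-point basket", ["Player A"], 12.0, 19.5),
    ClipMetadata("Dog catches a ball", ["Dog"], 40.0, 47.0),
]
hits = search_metadata(clips, "dog")
# Each hit carries the start and end times that index it to the source video.
```

Each matching record carries the start and end times that locate the clip within the source video, which is the property the examiner relies on above.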

The examiner cites the new art of Brown et al. (“Brown”, US 2010/0106707).

Brown [abstract] “search component is configured to enable a search of the index of the extensible indexing and search tool according to at least one of the set of attributes of the person”

Brown [0031] “additional metadata that captures a more detailed description of the extracted attribute and/or person. For example, each attribute may be annotated with information such as an identification (ID) of the sensor(s) used to capture the attribute, the location of the sensor(s) that captured the attribute, or a timestamp indicating the time and date that the attribute was captured”.

Thus, Brown teaches a search capability for the location where a capture device captured data and for the date/time timestamp of when the capture happened.

The examiner is always available for interviews.


Claim Rejections - 35 USC § 103
8.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


9.	The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
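3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.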


10.	Claims 1-3, 5-11 are rejected under 35 U.S.C. 103 as being unpatentable over Williams et al. (“Williams”, US 2009/0007202) in view of Shichman et al. (“Shichman”, US 2018/0132011) in further view of Brown et al. (“Brown”, US 2010/0106707) in further view of Zhang (“Zhang”, US 5,635,982).

Claim 1:
Williams teaches a method of providing a summary of a video in an electronic device ([abstract] “Functionality is described for forming a summary representation of a video item”)  the method comprising: selecting key frames from a plurality of frames of the video, ([abstract] “The functionality operates by: (a) receiving a video item; (b) dividing the video item into a plurality of segments; (c) extracting at least one key frame from each of the plurality of segments to form a plurality of key frames”, as shown in Fig 2, [0071] “In phase 208 (corresponding to block 106), the data processing environment selects a key frame from each segment”)

Williams teaches the key frames being located at preset time intervals of the video ([abstract] “The functionality operates by: (a) receiving a video item; (b) dividing the video item into a plurality of segments”, thus, the key frames are located at preset time positions (intervals) of the entire video item)

(Fig 2 item 212 shows selecting three summary frames from the five key frames of level item 208, as shown in Fig 2, [0073] “In phases 212 and 214 (corresponding to block 110), the data processing environment selects final key frames associated with each scene”, thus, a synthesis process is done to the key frames to find an ultimate final representative frame item 220, Williams [0056] “In this case, the data processing environment can select a key frame for the group that serves as the best representative for the group (where the same factors discussed above for block 106 can be used to determine the best representative)”)

Williams teaches generating a plurality of pieces of first summary information corresponding to the first summary frames (Fig 7 shows how final  first summary frame  item 708 has associated key frames item 710 and first summary information text title “Safety for your pets”,  and text description of “planning to take your dog with you …”, [0103] “The user interface presentation 702 can also include textual information associated with each search result item to help the user make a decision”, [0060] “The summarization procedure can also provide textual information which accompanies the key frames and video vignettes”)

Williams teaches displaying at least one of the first summary frames and at least one 
corresponding piece of first summary information of the at least one of the first summary frames, (Williams Fig 7 shows how final  first summary frame item 708 has associated key frames item 710 and corresponding pieces of first summary information text title of “Safety for your pets”, and text description information “planning to take your dog with you …”,  Williams [0103] “The user interface presentation 702 can also include textual information associated with each search result item to help the user make a decision”, Williams [0060] “The summarization procedure can also provide textual information which accompanies the key frames and video vignettes”) among the plurality of pieces of first summary information, together with at  least one frame of the video, (as discussed above in Williams Fig 7 and [0103] Williams Fig 7 shows text pieces of summary information of a text title, and text description, key frames 710 of video 708)

Williams does not explicitly teach wherein the at least one corresponding piece of first summary 
information comprises matching information used to search for the at least one of the first 
summary frames and further comprises at least one of text information indicating a reproduction
location in time of the at least one of the first summary frames in the video or text information 
indicating a reproduction location in time of a next key frame in the video. However, Shichman
is analogous art of generating key frames on a user interface [Shichman 0014]. Shichman [0103] 
“For example, user interface unit 320 may enable performing textual search in metadata”. 
Shichman [0043] “Metadata for (or of) a data object as referred to herein may be, or may
include, one or more data elements that describe, or provide other information for, other data,
beyond the primary data (e.g., the video clip itself) for the data object. For example, metadata of 
a video clip may be its length, the time it was received by a server, specific times of the clip, or 
who is seen in the video clip (e.g., a name of a player shown in a video may be included in 
metadata), or what happens in the clip (such as `3-point basket`)”. Shichman [0046] “For 
example, metadata may include parameters and descriptions of events, e.g., scores, fouls and 
time information of a sports event that may be shown in a related or associated video clip” Thus, 
Shichman teaches text searches based on *who is seen in the video clip* such as a *dog*,
*description of the event* such as “dog”, and *specific times of the clip*. Shichman [0062] “An 

associated with an event object. For example, an object representing an event may include a start 
and an end time of the event. For example and as shown, an event described by event object 451 
may have a start time that is segment 423 of video 420 and an end time that is segment 425 of 
video 420”. Thus, Shichman teaches the *specific times of the clip* are indexed to source video 
times. Thus, Shichman teaches the event objects (key frames) are reproduction objects of 
locations in time. It would have been obvious to one of ordinary skill in the art before the filing date of the invention to combine the video key frame selection user interface of Williams with the video key frame selection user interface of Shichman, so that users have access to editing [Shichman 0102] and publishing [Shichman 0099] tools to create their own event objects and distribute them to other users to appreciate.

The modified Williams + Shichman teaches wherein the matching information comprises key point information about a key point of the at least one of the first summary frames, (as discussed above,  Shichman [0103] “For example, user interface unit 320 may enable performing textual search in metadata”, as discussed above, Shichman [0043] “Metadata for (or of) a data object as referred to herein may be, or may include, one or more data elements that describe, or provide other information for, other data, beyond the primary data (e.g., the video clip itself) for the data object. For example, metadata of a video clip may be its length, the time it was received by a server, specific times of the clip, or who is seen in the video clip (e.g., a name of a player shown in a video may be included in metadata), or what happens in the clip (such as `3-point basket`)”, thus, a *key point* can be who is in it) 


which the video has been captured. However, Brown is analogous art of generating key frames
on a user interface [Brown 0036]. Brown [abstract] “search component is configured to enable a search of the index of the extensible indexing and search tool according to at least one of the set of attributes of the person”. Brown [0031] “additional metadata that captures a more detailed description of the extracted attribute and/or person. For example, each attribute may be annotated with information such as an identification (ID) of the sensor(s) used to capture the attribute, the location of the sensor(s) that captured the attribute, or a timestamp indicating the time and date that the attribute was captured”. Thus, Brown teaches search capability for a location attribute. It would have been obvious to one of ordinary skill in the art before the filing date of the invention to combine the video key frame selection user interface of the modified Williams + Shichman with the video key frame selection user interface of Brown, so that “smart surveillance” can be implemented, which applies automated signal analysis and pattern recognition to video cameras and sensors with the goal of automatically extracting "usable information" from video and sensor streams [Brown 0009].


The modified Williams + Shichman + Brown teaches date and time information about a date and time at which the video has been captured (Brown [abstract] “search component is configured to enable a search of the index of the extensible indexing and search tool according to at least one of the set of attributes of the person”. Brown [0031] “additional metadata that captures a more detailed description of the extracted attribute and/or person. For example, each attribute may be annotated with information such as an identification (ID) of the sensor(s) used to capture the attribute, the location of the sensor(s) that captured the attribute, or a timestamp indicating the time and date that the attribute was captured”)

The modified Williams + Shichman + Brown does not explicitly teach wherein the determining the first summary frames comprises determining a given key frame as a first summary frame based on (i) a variation in the given key frame compared with other key frames being equal to or greater than a preset threshold degree. However, Zhang is analogous art of creating representative frames from video segments [abstract]. Zhang [abstract] “selecting a key frame once the difference of content between the current frame and a preceding selected key frame exceeds a set of preselected thresholds”. Thus, Zhang compares a number of current frames that match with a current key frame, based on using a threshold comparison and once the comparison exceeds a threshold, then Zhang has determined a number of frames that match the key frame. It would have been obvious to one of ordinary skill in the art before the filing date of the invention to combine the video key frame selection user interface of the modified Williams + Shichman + Brown with the video key frame selection user interface of Zhang, so that a user can apply multiple key frame determination thresholds such as pair-wise pixel comparison to compare pixels [Zhang 22-24], and likelihood ratio comparisons to compare regions [Zhang 49-51] when selecting key frames to determine gradual transitions [Zhang Col 1 39-41].

The modified Williams + Shichman + Brown + Zhang teaches (ii) comparing the given key frame with the plurality of frames of the video by using a preset matching criterion and determining that a number of frames, among the plurality of frames of the video, that match with the given key frame based on a result of the comparing by using the preset matching criterion exceeds a (Zhang [abstract] “selecting a key frame once the difference of content between the current frame and a preceding selected key frame exceeds a set of preselected thresholds”. Thus, Zhang compares a number of current frames that match with a current key frame, based on using a threshold comparison and once the comparison exceeds a threshold, then Zhang has determined a number of frames that match the key frame. Zhang [Col 37-51] teaches “If this accumulated difference, Σ_k, exceeds T_k, threshold for potential key frame, then, it sets the Flag to 1, a potential key frame has been detected, at block 509 and proceeds to block 510, where further verification is made by calculating D_a, the difference between current frame i and last key frame recorded, F_k, based on a selected difference metric. If, at block 511, D_a is greater than T_d, threshold for key frame, then, at block 512, the current frame, F_i, is recorded as a current key frame and reinitialization of F_k as current frame, Σ_k to zero, and f_k, image feature of previous key frame, as current image feature is carried out before repeating the process again from block 504 if the end of the frame of the video sequence has not been reached. Otherwise, if, at block 511, D_a is not greater than T_d, then, it proceeds to analyze the next frame”. As described above, Zhang [Col 5 16-48] and shown in Fig 3A1 item 310, when the number of matching frames does exceed the minimum, then “a cut is declared at point PI and a shot starting at frame F_s and ending at frame F_e is recorded at block 310”)
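
For illustration only, the two-threshold key frame test in the Zhang passage quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not Zhang's implementation: the function name `select_key_frames`, the use of consecutive-frame differences as the accumulated quantity, the seeding of the first frame as a key frame, and the caller-supplied `diff` metric are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def select_key_frames(
    frames: Sequence[Frame],
    diff: Callable[[Frame, Frame], float],
    t_k: float,  # threshold for a potential key frame (on the accumulated difference)
    t_d: float,  # threshold for a key frame (on the difference to the last key frame)
) -> List[int]:
    """Return the indices of frames recorded as key frames."""
    if not frames:
        return []
    key_indices = [0]        # assumption: the first frame seeds the key frame list
    last_key = frames[0]
    accumulated = 0.0        # accumulated difference since the last key frame
    for i in range(1, len(frames)):
        # Assumption: the accumulated value sums consecutive-frame differences.
        accumulated += diff(frames[i - 1], frames[i])
        if accumulated > t_k:
            # Potential key frame detected; verify against the last key frame.
            d_a = diff(last_key, frames[i])
            if d_a > t_d:
                key_indices.append(i)
                last_key = frames[i]
                accumulated = 0.0  # reinitialize after recording a key frame
        # Otherwise, proceed to analyze the next frame.
    return key_indices


# Toy example: each "frame" is a single brightness value.
toy_frames = [0, 1, 2, 10, 11, 12, 30]
result = select_key_frames(toy_frames, lambda a, b: abs(a - b), t_k=3, t_d=5)
```

In the toy example, frames at indices 3 and 6 differ sharply from the last recorded key frame and are recorded, while the gradual changes in between are not.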


Claim 2:
The modified Williams + Shichman + Brown + Zhang teaches at least one of a variation of pixel values in the given key frame, appearance of a new object in the given key frame, and a change of an action of an object in the given key frame ([Zhang Col 4 41-45] “This method is only suitable for use in detecting sharp transition between camera shots in a video or film such as that depicted in FIG. 1B. The content between shot 110 and shot 112 is completely different from one another”, thus, a sharp change in pixel values)


Claim 3:
The modified Williams + Shichman + Brown + Zhang teaches that the first summary frames are displayed together with the at least one frame of the video, (as discussed above, Williams [abstract] the first summary frames are derived from the video as they are used to represent the video, so simply,  displaying the video displays all the frames, thus, displays summary frames with non-selected frames, [Col 12 9-13] “invoke a video vignette associated with the selected key frame. The vignette may correspond to only one of the vignettes associated with the video item or may correspond to several vignettes pieced together to form a compilation-type summary”)

The modified Williams + Zhang teaches wherein the method further comprises: receiving a user input to select a first summary frame from among the first summary frames; and reproducing the video corresponding to a location of the selected first summary frame. (Williams Fig 7 shows a user selecting representative frame 708. Williams [Col 12 18-21] “The user may select any frame in the main interface presentation 706 or the panel 710 to invoke a video vignette associated with the selected key frame”)


Claim 5:
The modified Williams + Shichman + Brown + Zhang teaches that the first summary frames are displayed together with the at least one frame of the video, (as discussed above, Williams [abstract] the first summary frames are derived from the video as they are used to represent the video, so simply, displaying the video displays all the frames, thus, displays summary frames with non-selected frames, Williams [0101] “invoke a video vignette associated with the selected key frame. The vignette may correspond to only one of the vignettes associated with the video item or may correspond to several vignettes pieced together to form a compilation-type summary”)

The modified Williams + Shichman + Brown + Zhang teaches wherein the method further comprises: receiving a user input to select a partial area of at least one of the first summary frames; (the key frames, and final representation came from the first frame selections, so selecting a final representation frame is selecting a first summary frame, Williams [0101] “invoke a video vignette associated with the selected key frame”)

The modified Williams + Shichman + Brown + Zhang teaches obtaining at least one first summary information corresponding to the selected partial area. (Williams Fig 7 shows how final first summary frame item 708 has associated key frames item 710 and text description “Safety for your pets”, Williams [0101] “panel 710 that provides additional key frames selected from the video item”)

The modified Williams + Shichman + Brown + Zhang teaches obtaining a plurality of pieces of second summary information from a plurality of videos stored in the electronic device; (as discussed above, Williams [0101] “The user may select any frame in the main interface presentation 706 or the panel 710 to invoke a video vignette associated with the selected key frame”)

The modified Williams + Shichman + Brown + Zhang teaches searching for at least one second summary information that matches with the at least one first summary information, from the plurality of pieces of second summary information; (Williams Fig 7 item 704 shows search term “dog” and showing matching dog representative frames, Williams [0099] “In this illustrative scenario, assume that the user enters a search query "dog" into a search portion 704 of the interface”) and displaying at least one summary frame, of the plurality of videos, corresponding to the searched at least one second summary information. (Williams Fig 7 item 704 shows search term “dog” and showing matching dog representative frames, Williams [0099] “In this illustrative scenario, assume that the user enters a search query "dog" into a search portion 704 of the interface”)

Claim 6:
The modified Williams + Shichman + Brown + Zhang teaches extracting summary videos of the 
video by using the first summary frames; (Williams [abstract] “The functionality operates by:
(a) receiving a video item; (b) dividing the video item into a plurality of segments; (c) extracting at least one key frame from each of the plurality of segments to form a plurality of key frames”, 
Williams  [0071] “In phase 208 (corresponding to block 106), the data processing 
environment selects a key frame from each segment”)

The modified Williams + Shichman + Brown + Zhang teaches generating a master summary 
based on the summary videos. (Williams Fig 7 item 706 shows a list of search results, Williams 
[0099] “In this illustrative scenario, assume that the user enters a search query "dog" into a search 
portion 704 of the interface”)


Claim 7:
The modified Williams + Shichman + Brown + Zhang teaches displaying a plurality of second summary frames that are stored in the electronic device; (as discussed above, Williams [0101] “The user may select any frame in the main interface presentation 706 or the panel 710 to invoke a video vignette associated with the selected key frame”) receiving a user input to select a second summary frame; and reproducing the video corresponding to a location of the selected second summary frame (Williams [Col 12 18-21] “The user may select any frame in the main interface presentation 706 or the panel 710 to invoke a video vignette associated with the selected key frame”)

Claims 9-10 are similar in scope to claims 1-2 and are rejected under similar rationale

Claim 11 is similar in scope to claim 5 and is rejected under similar rationale


11.	Claims 4, 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Williams et al. (“Williams”, US 2009/0007202) in view of Shichman et al. (“Shichman”, US 2018/0132011) in further view of Brown et al. (“Brown”, US 2010/0106707) in further view of Zhang (“Zhang”, US 5,635,982) in further view of Anderson et al. (“Anderson”, US 7,248,778).

Claim 4:
The modified Williams + Shichman + Brown + Zhang does not explicitly disclose receiving a user input to select a first location and a second location of the video; extracting a portion of the first summary frames included between the first location and the second location from the first summary frames. However, Anderson is analogous art of a user interface of making representative videos [Col 7 42-43]. Anderson Fig 4 item 430 shows how a user can set a start time and end time for making a clip. Anderson [Col 5 15-16] “enables the user to graphically select the length of the clip, indicated as 10 seconds for the first clip, and to modify the starting and ending points for the clip”. Fig 4 item 490 shows a “create video” button. [Col 5 51-55] “Create video button 490 initiates the creation of the condensed version of the footage, comprising the selected clips, which are assembled in accordance with the chosen parameters”. Thus, based on the user input of processing the start and end locations, then the video segment between the selected points is extracted. It would have been obvious to one of ordinary skill in the art before the filing date of the invention to combine the video key frame selection user interface of the modified Williams + Shichman + Brown + Zhang with the video key frame selection user interface of Anderson so that users can set various key frame parameters.

The modified Williams + Shichman + Brown + Zhang + Anderson teaches extracting a portion of the first summary frames included between the first location and the second location from the first summary frames (Anderson [Col 5 15-16] “enables the user to graphically select the length of the clip, indicated as 10 seconds for the first clip, and to modify the starting and ending points for the clip” Fig 4 item 490 shows a “create video” button. [Col 5 51-55] “Create video button 490 initiates the creation of the condensed version of the footage, comprising the selected clips, which are assembled in accordance with the chosen parameters”)
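
For illustration only, extracting the portion of summary frames that lies between two user-selected locations can be sketched as follows. This is a minimal sketch under stated assumptions, not Anderson's or the applicant's implementation: the function name `frames_between` and the representation of each summary frame by its reproduction time in seconds are hypothetical.

```python
from typing import List


def frames_between(
    summary_frame_times: List[float],
    first_location: float,
    second_location: float,
) -> List[float]:
    """Return the summary frame times lying between the two selected locations."""
    # Accept the two locations in either order (assumption: bounds are inclusive).
    start, end = sorted((first_location, second_location))
    return [t for t in summary_frame_times if start <= t <= end]


selected = frames_between([2.0, 8.5, 15.0, 21.0], 5.0, 20.0)
```

Here only the summary frames at 8.5 and 15.0 seconds fall within the selected section.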

The modified Williams + Shichman + Brown + Zhang + Anderson teaches extracting at least one first summary information corresponding to the extracted portion of the first summary frames (Williams [0059] “series of key frames extracted from the video item”)

The modified Williams + Shichman + Brown + Zhang + Anderson teaches obtaining a plurality of pieces of second summary information from a plurality of videos stored in the electronic device; (Williams Fig 7 shows how frames have second summary information of textual “dog” information, Williams [0103] “The user interface presentation 702 can also include textual information associated with each search result item to help the user make a decision”) 

The modified Williams + Shichman + Brown + Zhang + Anderson teaches searching for at least one second summary information that matches with the at least one first summary information, (Williams Fig 7 item 704 shows search term “dog” and showing matching dog representative frames, Williams [0103] “The user interface presentation 702 can also include textual information associated with each search result item to help the user make a decision”) from the plurality of pieces of second summary information; and displaying at least one summary frame, of the plurality of videos, corresponding to the searched at least one second summary information. (Williams Fig 7 shows how frames and text associated with “dog” are displayed, Williams [0103] “The user interface presentation 702 can also include textual information associated with each search result item to help the user make a decision”)

Claim 12 is similar in scope to claim 4 and is rejected under similar rationale

Claim 13:
The modified Williams + Shichman + Brown + Zhang + Anderson teaches obtain locations of 
the first summary frames, obtain summary videos corresponding to reproduction locations of the 
first summary frames,(Williams [Col 12 18-21] “The user may select any frame in the main 
interface presentation 706 or the panel 710 to invoke a video vignette associated with the 
selected key frame”, thus, location selection is detected) and generate a master summary based 
on the summary videos (the applicant’s fig 15 teaches that a “master summary” is simply a list 
of videos, Williams fig 7 shows how the list of videos is displayed)


12.	Claims 8, 14 are rejected under 35 U.S.C. 103 as being unpatentable over Williams et al. (“Williams”, US 2009/0007202) in view of Shichman et al. (“Shichman”, US 2018/0132011) in .

Claim 8:
The modified Williams + Shichman + Brown + Zhang does not explicitly disclose in response to determining that a storage space of the electronic device is less than or equal to a preset threshold value. However, Mathur is analogous art of a user interface of making segments of a video [abstract]. Mathur [abstract] “At a first memory usage threshold, the operating system requests at least one of the application programs to limit its use of memory”. Mathur teaches [Col 5 10-18] “if the result of comparison 100 is negative, the operating system performs a second test 106, comparing current memory usage or availability against a usage or availability threshold that is referred to as an “intermediate” threshold. This threshold, referred to as a low memory threshold, is less critical than the critical memory threshold described above, and is reached while there is still enough memory available so that any particular application program can safely be allowed to shut itself down”. [Col 2 31-36] “At the second, more critical threshold, the operating system closes one or more of the application programs. The application is closed using a standard operating system mechanism that allows the application to shut down in an orderly fashion, while saving files and performing any other necessary housekeeping”. Thus, Mathur teaches a second threshold level that is less critical than the critical level, and in response the application is required to shut down and save files, which would save a segmented file. It would have been obvious to one of ordinary skill in the art before the filing date of the invention to combine the video key frame selection user interface of the modified Williams + Shichman + Brown + Zhang with the video key frame selection user interface of Mathur so system memory can be reduced.

The modified Williams + Shichman + Brown + Zhang + Anderson + Mathur teaches storing only the first summary frames and the plurality of pieces of first summary information, from among data included in the video, in the electronic device (the first summary of frames can be the entire plurality of frames, thus, as discussed above in Mathur [Col 2 31-36] the file can be saved)

Claim 14 is similar in scope to claim 8 and is rejected under similar rationale.

13.	Claims 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Anderson et al. (“Anderson”, US 7,248,778) in view of Williams et al. (“Williams”, US 2009/0007202) in further view of Shichman et al. (“Shichman”, US 2018/0132011) in further view of Brown et al. (“Brown”, US 2010/0106707).



Claim 15:
Anderson teaches a memory (Fig 5 main memory item 516), a processor (Fig 5 processor item 502), and an input unit configured to receive a user input to select a first location and a second location of a video (Anderson Fig 4 item 430 shows how a user can set a start time and end time for making a clip. Anderson [Col 5 15-16] “enables the user to graphically select the length of the clip, indicated as 10 seconds for the first clip, and to modify the starting and ending points for the clip”. Fig 4 item 490 shows a “create video” button. [Col 5 51-55] “Create video button 490 initiates the creation of the condensed version of the footage, comprising the selected clips, which are assembled in accordance with the chosen parameters”. Thus, based on the user input of processing the start and end locations, the video segment between the selected points is extracted)

Anderson teaches the first location and the second location defining a first section of the video (Anderson Fig 4 shows how the start and end points correspond to item 425 video clip. Anderson [Col 5 15-16] “enables the user to graphically select the length of the clip, indicated as 10 seconds for the first clip, and to modify the starting and ending points for the clip”)

Anderson teaches a display (Fig 1 item 125 display, “A display device is also coupled to processor 110” [Col 3 6-7])

Anderson teaches in response to receiving via the input unit the user input to select the first location and the second location (Anderson Fig 4 item 430 shows how a user can set a start time and end time for making a clip. Anderson [Col 5 15-16] “enables the user to graphically select the length of the clip, indicated as 10 seconds for the first clip, and to modify the starting and ending points for the clip”. Fig 4 item 490 shows a “create video” button. [Col 5 51-55] “Create video button 490 initiates the creation of the condensed version of the footage, comprising the selected clips, which are assembled in accordance with the chosen parameters”. Thus, based on the user input of processing the start and end locations, the video segment between the selected points is extracted and first summary frames are obtained)
 
Anderson teaches locating the second summary information corresponding to at least one summary frame included in a second section of the video that exceeds the first section (as discussed above in Anderson Fig 4 item 490 and Anderson [Col 5 51-55], a new clip with summary frames is created in response to selecting the first and second locations based on selecting the “Create Video” button)

Anderson teaches control the display to display a partial video, of the second section of the video, corresponding to the located second summary information (as discussed above in Anderson Fig 4 item 490 and Anderson [Col 5 51-55], a new clip with summary frames is created in response to selecting the first and second locations based on selecting the “Create Video” button)

Anderson does not explicitly teach wherein the processor is configured to select key frames from a plurality of frames of the video, the key frames being located at preset time intervals of the video. However, Williams is analogous art of a user interface of making representative videos [Col 7 42-43].  Anderson Fig 4 item 430 shows how a user can set a start time and end time for making a clip. Anderson [Col 5 15-16] “enables the user to graphically select the length of the clip, indicated as 10 seconds for the first clip, and to modify the starting and ending points for the clip”. Williams [abstract] teaches “The functionality operates by: (a) receiving a video item; (b) dividing the video item into a plurality of segments; (c) extracting at least one key frame from each of the plurality of segments to form a plurality of key frames”, as shown in Fig 2, [0071] “In phase 208 (corresponding to block 106), the data processing environment selects a key frame from each segment”. It would have been obvious to one of ordinary skill in the art before the filing date of the invention to combine Anderson with Williams so a user can jump to specific portions of a video without having to traverse the entire video [Williams 0101].

The modified Anderson + Williams teaches determine first summary frames from among the key frames of the video (Williams [0101] “In another case, the user's selection of the key frame prompts the data processing module to present a panel 710 that provides additional key frames selected from the video item. These additional key frames may correspond to respective scenes within the video item. The user may select any frame in the main interface presentation 706 or the panel 710 to invoke a video vignette associated with the selected key frame”)

The modified Anderson + Williams teaches matching with first summary information corresponding to at least one summary frame included between the first location and second location (Williams [abstract] “The functionality operates by: (a) receiving a video item; (b) dividing the video item into a plurality of segments; (c) extracting at least one key frame from each of the plurality of segments to form a plurality of key frames”, as shown in Fig 2, [0071] “In phase 208 (corresponding to block 106), the data processing environment selects a key frame from each segment”)



at least one of text information indicating a reproduction location in time of the corresponding summary frame in the video or text information indicating a reproduction location in time of a next key frame in the video. However, Shichman is analogous art of generating key frames on a user interface [Shichman 0014]. Shichman [0103] “For example, user interface unit 320 may enable performing textual search in metadata”. Shichman [0043] “Metadata for (or of) a data object as referred to herein may be, or may include, one or more data elements that describe, or provide other information for, other data, beyond the primary data (e.g., the video clip itself) for the data object. For example, metadata of a video clip may be its length, the time it was received by a server, specific times of the clip, or who is seen in the video clip (e.g., a name of a player shown in a video may be included in metadata), or what happens in the clip (such as `3-point basket`)”. Shichman [0046] “For example, metadata may include parameters and descriptions of events, e.g., scores, fouls and time information of a sports event that may be shown in a related or associated video clip”. Thus, Shichman teaches text searches based on *who is seen in the video clip* such as a *dog*, *description of the event* such as “dog”, and *specific times of the clip*. Shichman [0062] “An event object may be associated with a start and an end time. A start and end time may be associated with an event object. For example, an object representing an event may include a start and an end time of the event. For example and as shown, an event described by event object 451 may have a start time that is segment 423 of video 420 and an end time that is segment 425 of video 420”. Thus, Shichman teaches the *specific times of the clip* are indexed to source video times. Thus, Shichman teaches the event objects (key frames) are

The modified Anderson + Williams + Shichman teaches wherein the matching information comprises key point information about a key point of the at least one of the first summary frames (as discussed above, Shichman [0103] “For example, user interface unit 320 may enable performing textual search in metadata”, and as discussed above, Shichman [0043] “Metadata for (or of) a data object as referred to herein may be, or may include, one or more data elements that describe, or provide other information for, other data, beyond the primary data (e.g., the video clip itself) for the data object. For example, metadata of a video clip may be its length, the time it was received by a server, specific times of the clip, or who is seen in the video clip (e.g., a name of a player shown in a video may be included in metadata), or what happens in the clip (such as `3-point basket`)”. Thus, a *key point* can be who is in it)

The modified Anderson + Williams + Shichman does not explicitly teach place information about a place in which the video has been captured, and date and time information about a date and time at which the video has been captured. However, Brown is analogous art of generating key frames on a user interface [Brown 0036]. Brown [abstract] “search component is configured to enable a search of the index of the extensible indexing and search tool according to at least one of the set of attributes of the person”. Brown [0031] “additional metadata that captures a more detailed the location of the sensor(s) that captured the attribute, or a timestamp indicating the time and date that the attribute was captured”. Thus, Brown teaches search capability for a location attribute. It would have been obvious to one of ordinary skill in the art before the filing date of the invention to combine the video key frame selection user interface of the modified Anderson + Williams + Shichman with the video key frame selection user interface of Brown, so that “smart surveillance” can be implemented, which applies automated signal analysis and pattern recognition to video cameras and sensors with the goal of automatically extracting "usable information" from video and sensor streams [Brown 0009].

Claim 16:
The modified Anderson + Williams teaches determine a summary frame of the video based on the searched second summary information, and wherein the display is further configured to display the summary frame (Williams Fig 7 shows final key 708 and the corresponding “training your dog” text are displayed based on a search for “dogs”, Williams [0103] “The user interface presentation 702 can also include textual information associated with each search result item to help the user make a decision”)

Claim 17:
The modified Anderson + Williams teaches is further configured to, when at least two partial videos correspond to the searched second summary information, control the display to display the at least two partial videos (Williams [0101] “The user may select any frame in the main interface presentation 706 or the panel 710 to invoke a video vignette associated with the selected key frame. The vignette may correspond to only one of the vignettes associated with the video item or may correspond to several vignettes pieced together to form a compilation-type summary”)


Claims 18-20 are similar in scope to claims 15-17 and are rejected under similar rationale.



Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 



Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Erik Stitt whose telephone number is (571)270-5064.  The examiner can normally be reached on M-F 11am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Savla, Arpan, can be reached on (571) 272-1077. The fax phone number for the organization where this application or proceeding is assigned is 571-270-6064.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



EVS

/Arpan P. Savla/
Supervisory Patent Examiner, Art Unit 2145