DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
Response to Arguments
3.	Applicant's arguments with respect to the rejections of claims 21-30, 32-38 and 41 have been considered but are moot in view of the new grounds of rejection.  

Response to Amendment
4.	In response to the amendment, the rejection of claims 34, 36, and 37 under 35 U.S.C. 112(b) is withdrawn.        
         
Claim Rejections - 35 USC § 103
5.	The text of those sections of Title 35, U.S. Code not included in this section can be found in a prior Office action.

6.	Claims 21-26 are rejected under AIA  35 U.S.C. 103 as being unpatentable over Liu et al. (US Publication 2016/0092561, hereinafter Liu) in view of Kardashov et al. (US Publication 2017/0133053, hereinafter Kardashov).       
Regarding claim 21, Liu discloses a computer-implemented method, comprising:
receiving image data corresponding to a first image, a second image, and a third image (Liu, fig. 6, para’s 0077-0081, receiving a plurality of portions/images of video content which correspond to a first portion/image, a second portion/image, and a third portion/image);
determining first annotation data corresponding to the first image, the first annotation data indicating, at least in part, a first time corresponding to when the first image was captured (Liu, fig’s 5 and 6, para’s 0008-0010, 0059, 0070-0081, 0091-0097, determine first metadata including a first time when the first portion was captured); 
determining second annotation data corresponding to the second image, the second annotation data indicating, at least in part, a second time corresponding to when the second image was captured (Liu, fig’s 5 and 6, para’s 0008-0010, 0059, 0070-0081, 0091-0097, determine second metadata including a second time when the second portion was captured);
determining third annotation data corresponding to the third image, the third annotation data, indicating, at least in part, a third time corresponding to when the third image was captured (Liu, fig’s 5 and 6, para’s 0008-0010, 0059, 0070-0081, 0091-0097, determine third metadata including a third time when the third portion was captured); 
	receiving a request to generate a video summarization (Liu, para’s 0007-0010, metadata includes time data for when the image data for a plurality of frames was captured, and may be used to remove undesirable portions of video, generate video editing hints or suggestions for a video editing interface or may be used to automatically generate video summary, e.g., a highlight video that highlights the important parts of a video or videos; further, para. 0059, claim 11, also disclose location/time analyzer may analyze the location and time of the captured video.  Such information may be used to help segment the video sequence into different clips, and generate a single video summary based on multiple video sequences.  In some instances, timestamps of different video sequences can help group videos taken at the same time together; video summary can be generated based on those grouped video sequences; therefore the disclosure above implies and/or makes obvious that a request can be received to generate a video highlight that contains video frames that were captured at specific time data and/or generate a video summary from among different video sequences that were taken at the same time); and 
generating the video summarization including the first image and the third image but not including the second image (Liu, fig. 2, para’s 0052-0053 and fig. 9, para’s 0097 and 0098, summarize video including the sequences that were taken at the same time and remove some portion of the video, i.e., to exclude the second image).
Liu does not explicitly disclose:
the request indicating selection of a first time period corresponding to the video summarization;
based at least in part on the request indicating the selection of the first time period, determining that the first image was captured in the first time period; based at least in part on the request indicating the selection of the first time period, determining that the third image was captured in the first time period;
based at least in part on determining that the first image and the third image were captured in the first time period and the second image was captured outside the first time period, generating the video summarization including the first image and the third image but not including the second image.
Kardashov discloses:
the request indicating selection of a first time period corresponding to the video summarization;
based at least in part on the request indicating the selection of the first time period, determining that the first image was captured in the first time period; based at least in part on the request indicating the selection of the first time period, determining that the third image was captured in the first time period;
based at least in part on determining that the first image and the third image were captured in the first time period and the second image was captured outside the first time period, generating the video summarization including the first image and the third image but not including the second image (Kardashov, para. 0067, a summary window 808 may be provided that includes information indicating the time period for which selector 804 is positioned with respect to time listing 802. An “ORDER” icon 810 may be provided to the user in summary window 808 (for example) that, when clicked (or otherwise selected) sends a request to, for example, cloud computing system 110 of FIG. 1 to generate a synopsis video for the selected time period. Cloud computing system 110 may then retrieve stored background images and VMD files for video captured during the selected time period, generate one or more chapters of synopsis video by including images of detected objects from VMD files generated during the time of that chapter, and provide a notification (e.g., a text message, an email, a telephone call, an application programming interface (API) message, etc.) to the user when the synopsis video is ready to be accessed by the user; video images that are not captured during the selected time period obviously would not be retrieved).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Kardashov’s teachings into Liu’s invention for enhancing user’s playback experience by providing a video summary of content captured during a specific time period.
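For illustration only, the time-period selection relied upon in the combination of Liu and Kardashov for claim 21 might be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from either reference or from the application; the names AnnotatedImage, captured_at, and summarize_by_time_period, as well as the example dates, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AnnotatedImage:
    """Hypothetical record pairing an image identifier with its capture time."""
    image_id: str
    captured_at: datetime


def summarize_by_time_period(images, period_start, period_end):
    """Return the IDs of images captured within [period_start, period_end]."""
    return [
        img.image_id
        for img in images
        if period_start <= img.captured_at <= period_end
    ]


# Hypothetical example: the second image falls outside the selected period.
images = [
    AnnotatedImage("first", datetime(2015, 6, 1, 10, 0)),
    AnnotatedImage("second", datetime(2015, 6, 2, 10, 0)),
    AnnotatedImage("third", datetime(2015, 6, 1, 11, 30)),
]
summary = summarize_by_time_period(
    images, datetime(2015, 6, 1, 0, 0), datetime(2015, 6, 1, 23, 59)
)
print(summary)  # ['first', 'third']
```

In this sketch the request is reduced to a start time and an end time; the first and third images are retained and the second image is omitted because its capture time lies outside the selected period.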

Regarding claim 22, Liu-Kardashov discloses the computer-implemented method of claim 21, further comprising:
determining, using the first annotation data and the third annotation data, a priority metric (Liu, fig. 5, para’s 0070-0075, generating prioritization data of segments based on metadata); and
determining that the priority metric satisfies a condition, wherein generating the video summarization is further based at least in part on determining that the priority metric satisfies the condition (Liu, fig. 5, para’s 0070-0075, fig. 9, para. 0097-0107, generating video summary based on prioritization data condition).
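For illustration only, the priority-metric limitation of claim 22 might be sketched as below. The averaging rule, the threshold value, and the names priority_metric and satisfies_condition are assumptions made for this sketch; Liu's paragraphs 0070-0075 are cited above only for generating prioritization data from metadata, not for this particular formula.

```python
def priority_metric(annotation_scores):
    """Combine hypothetical per-image priority scores into one metric (mean)."""
    if not annotation_scores:
        raise ValueError("at least one score is required")
    return sum(annotation_scores) / len(annotation_scores)


def satisfies_condition(metric, threshold=0.5):
    """Assumed condition: the metric must meet or exceed a threshold."""
    return metric >= threshold


# Hypothetical scores derived from the first and third annotation data.
metric = priority_metric([0.75, 0.5])
include_in_summary = satisfies_condition(metric)
print(metric, include_in_summary)  # 0.625 True
```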

Regarding claim 23, Liu-Kardashov discloses the computer-implemented method of claim 21, further comprising:
determining a first geographic location associated with the request (Liu, para’s 0007-0010, metadata includes location information from the device capturing the image data and landmark detection data, and may be used to remove undesirable portions of video, generate video editing hints or suggestions for a video editing interface or may be used to automatically generate video summary, e.g., a highlight video that highlights the important parts of a video or videos; para. 0059, claim 11, also disclose location/time analyzer may analyze the location and time of the captured video.  Such information may be used to help segment the video sequence into different clips, and generate a single video summary based on multiple video sequences; therefore the disclosure above implies and/or makes obvious that a geographic location can be determined and used to generate video summary from among different video sequences);
generating fourth annotation data corresponding to a fourth image, wherein the image data corresponds to the fourth image (Liu, para’s 0091-0096, fig. 8, generating metadata for a fourth portion/image);
determining, based on the first annotation data, that the first image corresponds to the first geographic location; determining, based on the third annotation data, that the third image corresponds to the first geographic location (Liu, para’s 0007-0010, para. 0059, claim 11, determine that a segment, i.e., the first image and the third image correspond to a geographic location, i.e., the first geographic location); and
determining, based on the fourth annotation data, that the fourth image corresponds to a second geographic location different from the first geographic location, wherein the video summarization does not include the fourth image (Liu, para’s 0007-0010, para. 0059, claim 11, para’s 0097-0107, 0109, identifying that video frames, i.e., the fourth portion, contain objects or landmarks associated with a second, different geographic location, as is well known in the art; associating the first geographic location with the first portion and the third portion, but the second, different geographic location with the fourth portion; and generating the video summarization to exclude the fourth portion).

Regarding claim 24, Liu-Kardashov discloses the computer-implemented method of claim 21, wherein the first image is a video frame (Liu, para. 0112, still image or video frame).

Regarding claim 25, Liu-Kardashov discloses the computer-implemented method of claim 21, further comprising:
determining a first object associated with the request (Liu, para’s 0007-0010, metadata includes landmark detection data, object detection data, face detection data, considered as objects as also disclosed in para’s 0060, 0109-0111; metadata may be used to remove undesirable portions of video, generate video editing hints or suggestions for a video editing interface or may be used to automatically generate video summary, e.g., a highlight video that highlights the important parts of a video or videos; therefore the disclosure above implies and/or makes obvious that one or more objects can be determined in video sequences, and can be used to generate video summary from among different video sequences);
generating fourth annotation data corresponding to a fourth image, wherein the image data corresponds to the fourth image (Liu, para’s 0091-0107, fig. 8, generating metadata for a fourth portion/image);
determining, based on the first annotation data, that the first image corresponds to the first object; determining, based on the third annotation data, that the third image corresponds to the first object (Liu, para’s 0007-0010, para. 0060, claim 5, determine that a segment, i.e., the first image and the third image correspond to an object or a face, i.e., the first object); and
determining, based on the fourth annotation data, that the fourth image does not correspond to the first object, wherein the video summarization does not include the fourth image based at least in part on determining that the fourth image does not correspond to the first object (Liu, para’s 0007-0010, para. 0060, claim 5, para’s 0097-0107, identifying that video frames, i.e., the fourth portion, containing a second object or not containing the first object, as also well known in the art; associating the first object with the first portion and the third video portion, but not the fourth portion, and generating video summarization to exclude the fourth portion).

Regarding claim 26, Liu-Kardashov discloses the computer-implemented method of claim 25, wherein the first image is captured by an image capture device (Liu, para. 0082, image sensor).

7.	Claim 27 is rejected under AIA  35 U.S.C. 103 as being unpatentable over Liu-Kardashov, as applied to claim 21 above, in view of Yamaji (US Publication 2016/0086342). 
Regarding claim 27, Liu-Kardashov discloses the computer-implemented method of claim 21, further comprising:
processing a portion of video data corresponding to the image data and a fourth image to identify an object represented in the fourth image (Liu, para’s 0007-0010, para. 0060, claim 5, processing and determining that a segment, i.e., a portion of video data corresponding to the image data and a fourth image, contains an object or a face, i.e., identifying an object);
determining the motion of the object in the portion (Liu, para’s 0011 and 0067, tracking object motion); and 
based at least in part on the motion of the object in the portion, not including the portion in the video summarization (Liu, para’s 0097-0107, generating video summarization to exclude the portion).
Liu-Kardashov does not explicitly disclose determining that the motion of the object in the portion comprises determining that the object does not move in the portion.
Yamaji discloses determining that the motion of the object in the portion comprises determining that the object does not move in the portion (Yamaji, para’s 0011 and 0024, fig’s 4a-4c, detect motion of a face and still face).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Yamaji’s teachings into Liu-Kardashov’s invention for enhancing user’s playback experience by providing a video summary of objects in motion.
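For illustration only, the limitation of claim 27 (excluding a portion in which the identified object does not move) might be sketched as follows. The representation of motion as a list of per-frame object centers, the pixel tolerance, and the names object_is_static and keep_portion are assumptions made for this sketch and are not drawn from Liu, Kardashov, or Yamaji.

```python
import math


def object_is_static(centers, tolerance=1.0):
    """Return True if every tracked center stays within `tolerance` of the first."""
    first = centers[0]
    return all(math.dist(first, c) <= tolerance for c in centers[1:])


def keep_portion(centers, tolerance=1.0):
    """A portion is kept only when the tracked object moves."""
    return not object_is_static(centers, tolerance)


# Hypothetical per-frame object centers (x, y) for two video portions.
static_track = [(10.0, 10.0), (10.2, 10.1), (10.0, 10.0)]
moving_track = [(10.0, 10.0), (20.0, 15.0)]
print(keep_portion(static_track))  # False
print(keep_portion(moving_track))  # True
```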


8.	Claims 28, 29, 32, and 35-37 are rejected under AIA  35 U.S.C. 103 as being unpatentable over Sampathkumaran et al. (US Publication 2015/0286719, hereinafter Sampathkumaran) in view of Kardashov et al. (US Publication 2017/0133053, hereinafter Kardashov). 
Regarding claim 28, Sampathkumaran discloses a computer-implemented method, comprising:
receiving image data corresponding to a first image, a second image, and a third image (Sampathkumaran, para’s 0004, 0040, receiving video stream comprising a plurality of video segments, i.e., a first image, a second image and a third image);
 generating first annotation data corresponding to the first image; generating second annotation data corresponding to the second image; generating third annotation data corresponding to the third image (Sampathkumaran, para’s 0040-0043, fig. 2, generating face index files for frames of each of the video segments, i.e., first annotation data,  second annotation data, and third annotation data); 
receiving a request to generate a video summarization, the request corresponding to a first object and a second object (Sampathkumaran, fig’s 3 and 4; para’s 0004-0010, 0044-0050, receiving a request to summarize the video stitching segments that show one or more desired faces, i.e., a first object/face and a second object/face); 
based at least in part on the first annotation data, determining that the first image of the image data includes a first representation of the first object and a second representation of the second object (Sampathkumaran, fig. 4, para’s 0044-0050, reading the first face index file to determine that frame(s) of the first image/segment include the first object/face and the second object/face);
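based at least in part on the second annotation data, determining that the second image of the image data does not include the second representation of the second object (Sampathkumaran, fig. 4, para’s 0044-0050, reading the second face index file to determine that frame(s) of the second image/segment do not include the second object/face);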
determining that the third image of the image data includes the first representation of the first object and the second representation of the second object based on the third annotation data (Sampathkumaran, fig. 4, para’s 0044-0050, reading the third face index file to determine that frame(s) of the third image/segment include the first object/face and the second object/face); and 
based at least in part on determining that the first image includes the first representation and the second representation, and that the third image includes the first representation and the second representation, generating the video summarization including the first image and the third image but not including the second image (Sampathkumaran, fig.4, para’s 0044-0050, generating a redacted version of the full video showing the first image/segment and the third image/segment but not the second image/segment based on the determination that the first image and third image includes the first representation and the second representation).
Sampathkumaran does not explicitly disclose:
the request indicating selection of a first time period corresponding to the video summarization;
based at least in part on the request indicating the selection of the first time period, determining that the first image was captured in the first time period;
based at least in part on determining that the first image and the third image were captured in the first time period, generating the video summarization including the first image and the third image but not including the second image.
Kardashov discloses:
the request indicating selection of a first time period corresponding to the video summarization;
based at least in part on the request indicating the selection of the first time period, determining that the first image was captured in the first time period;
based at least in part on determining that the first image and the third image were captured in the first time period, generating the video summarization including the first image and the third image but not including the second image (Kardashov, para. 0067, a summary window 808 may be provided that includes information indicating the time period for which selector 804 is positioned with respect to time listing 802. An “ORDER” icon 810 may be provided to the user in summary window 808 (for example) that, when clicked (or otherwise selected) sends a request to, for example, cloud computing system 110 of FIG. 1 to generate a synopsis video for the selected time period. Cloud computing system 110 may then retrieve stored background images and VMD files for video captured during the selected time period, generate one or more chapters of synopsis video by including images of detected objects from VMD files generated during the time of that chapter, and provide a notification (e.g., a text message, an email, a telephone call, an application programming interface (API) message, etc.) to the user when the synopsis video is ready to be accessed by the user; video images that are not captured during the selected time period obviously would not be retrieved).
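It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Kardashov’s teachings into Sampathkumaran’s invention for enhancing user’s playback experience by providing a video summary of content captured during a specific time period.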
 
Regarding claim 29, Sampathkumaran-Kardashov discloses the computer-implemented method of claim 28, wherein the first image is a video and the image data is video data (Sampathkumaran, para. 0041, video data). 

Regarding claim 32, Sampathkumaran discloses a system comprising: 
at least one processor; and at least one memory including instructions that, when executed by the at least one processor (see Sampathkumaran, para’s 0011-0023, system components), cause the system to:
receive image data corresponding to a first image, a second image, and a third image (Sampathkumaran, para’s 0004, 0040, receiving video stream comprising a plurality of video segments, i.e., a first image, a second image and a third image);
receive a request to generate a video summarization, the request corresponding to an identity (Sampathkumaran, fig’s 3 and 4; para’s 0004-0010, 0044-0050, receiving a request to summarize the video stitching segments that show a name of a face in the video);
based at least in part on first annotation data corresponding to the first image, determine that the first image corresponds to the identity (Sampathkumaran, para’s 0040-0043, fig. 2, generating face index files for frames of each of the video segments, i.e., first annotation data, second annotation data, and third annotation data; reading the first face index file to determine frame(s) of the first image/segment that correspond to the name);
based at least in part on third annotation data corresponding to the third image, determine that the third image corresponds to the identity (Sampathkumaran, para’s 0040-0043, fig. 2, reading the third face index file to determine frame(s) of the third image/segment that correspond to the name); and
based at least in part on determining that the first image corresponds to the identity, that the third image corresponds to the identity, and that the first image and the third image were captured in the first time period, generate the video summarization including the first image and the third image but not including the second image (Sampathkumaran, fig. 4, para’s 0044-0050, based on the determination of which segment(s) correspond to the name, generating a redacted version of the full video showing the first image/segment and the third image/segment but not the second image/segment).
Sampathkumaran does not explicitly disclose:
the request indicating selection of a first time period corresponding to the video summarization;
based at least in part on the request indicating the selection of the first time period, determine that the first image and the third image were captured in the first time period.
Kardashov discloses:
the request indicating selection of a first time period corresponding to the video summarization;
based at least in part on the request indicating the selection of the first time period, determine that the first image and the third image were captured in the first time period (Kardashov, para. 0067, a summary window 808 may be provided that includes information indicating the time period for which selector 804 is positioned with respect to time listing 802. An “ORDER” icon 810 may be provided to the user in summary window 808 (for example) that, when clicked (or otherwise selected) sends a request to, for example, cloud computing system 110 of FIG. 1 to generate a synopsis video for the selected time period. Cloud computing system 110 may then retrieve stored background images and VMD files for video captured during the selected time period, generate one or more chapters of synopsis video by including images of detected objects from VMD files generated during the time of that chapter, and provide a notification (e.g., a text message, an email, a telephone call, an application programming interface (API) message, etc.) to the user when the synopsis video is ready to be accessed by the user; video images that are not captured during the selected time period obviously would not be retrieved).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Kardashov’s teachings into Sampathkumaran’s invention for enhancing user’s playback experience by providing a video summary of content captured during a specific time period.
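For illustration only, the combination applied to claim 32 (selecting segments that both correspond to a requested identity and were captured in the selected time period) might be sketched as below. The dictionary-based face index, the capture_times mapping, the function select_segments, and the example names and dates are hypothetical; neither Sampathkumaran's face index file format nor Kardashov's VMD files are reproduced here.

```python
from datetime import datetime


def select_segments(face_index, capture_times, identity, start, end):
    """Return segment IDs whose face index lists `identity` and whose
    capture time falls within [start, end]."""
    selected = []
    for segment_id, names in face_index.items():
        in_period = start <= capture_times[segment_id] <= end
        if identity in names and in_period:
            selected.append(segment_id)
    return selected


# Hypothetical annotation data: names of faces found in each segment.
face_index = {
    "first": {"Alice", "Bob"},
    "second": {"Bob"},
    "third": {"Alice"},
}
capture_times = {
    "first": datetime(2015, 6, 1, 9, 0),
    "second": datetime(2015, 6, 1, 9, 30),
    "third": datetime(2015, 6, 1, 10, 0),
}
selected = select_segments(
    face_index,
    capture_times,
    "Alice",
    datetime(2015, 6, 1, 0, 0),
    datetime(2015, 6, 1, 23, 59),
)
print(selected)  # ['first', 'third']
```

In this sketch the second segment is omitted because its face index does not list the requested identity, even though it was captured within the selected period.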

Regarding claim 35, Sampathkumaran-Kardashov discloses the system of claim 32, wherein the first image is a video frame (Sampathkumaran, para. 0041, video frames).

Regarding claim 36, Sampathkumaran-Kardashov discloses the system of claim 32, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to:
determine a first object associated with the request (Sampathkumaran, fig’s 3 and 4; para’s 0004-0010, 0044-0050, an object like a face can be entered for the request to summarize the video);
generate fourth annotation data corresponding to a fourth image, wherein the image data corresponds to the fourth image (Sampathkumaran, para’s 0040-0043, fig. 2, generating face index files for frames of a fourth image/segment, i.e., fourth annotation data);
determine, based on the first annotation data, that the first image corresponds to the first object; determine, based on the third annotation data, that the third image corresponds to the first object (Sampathkumaran, fig.4; 0044-0050, reading the first face index file to determine that frame(s) of the first image/segment include the first object/face; and reading the third face index file to determine that frame(s) of the third image/segment include the first object/face); and
determine, based on the fourth annotation data, that the fourth image does not correspond to the first object (Sampathkumaran, fig. 4, para’s 0044-0050, reading the fourth face index file to determine that frame(s) of the fourth image/segment do not include the first object/face), wherein the video summarization does not include the fourth image based at least in part on determining that the fourth image does not correspond to the first object (Sampathkumaran, fig. 4, para’s 0044-0050, generating a redacted version of the full video that does not show the fourth image/segment based on the determination that the fourth image does not correspond to the first object/face).

Regarding claim 37, Sampathkumaran-Kardashov discloses the system of claim 36, wherein the first image is captured by an image capture device (Sampathkumaran, para. 0030, camera).

9.	Claim 41 is rejected under AIA  35 U.S.C. 103 as being unpatentable over Sampathkumaran-Kardashov, as applied to claim 28 above,  in view of Santillan et al. (US Publication 2014/0125702, hereinafter Santillan). 
Regarding claim 41, Sampathkumaran-Kardashov discloses the computer-implemented method of claim 28.
Sampathkumaran-Kardashov does not explicitly disclose but Santillan discloses wherein at least one of the first, the second, and the third annotation data is generated by a server in response to the server receiving the request from a client device (Santillan, para. 0047, client device transmits an image data request, as shown at step 608, and in response will receive image and video files and related metadata “annotation”, as shown at step 602, and then execute a process for rendering a three-dimensional immersive environment reflecting the data and information in the image files, video files and metadata received from the servers).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Santillan’s teachings into Sampathkumaran-Kardashov’s invention for conserving storage at a thin client device.

10.	Claim 30 is rejected under AIA  35 U.S.C. 103 as being unpatentable over Sampathkumaran-Kardashov, as applied to claim 29 above, in view of Qureshi (US Patent 9,223,458).
Regarding claim 30, Sampathkumaran-Kardashov discloses the computer-implemented method of claim 29.
Sampathkumaran-Kardashov does not explicitly disclose receiving audio data associated with the video data; identifying a song represented in the audio data; and associating the song with the video data.
Qureshi discloses receiving audio data associated with the video data; identifying a song represented in the audio data; and associating the song with the video data (Qureshi, col. 5, lines 1-17, receiving audio data that matches with video data; associating the song title corresponding to the audio data with the video data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Qureshi’s features into Sampathkumaran-Kardashov’s invention for enhancing user’s video playback experience by combining music into video summarization.

11.	Claims 33 and 34 are rejected under AIA  35 U.S.C. 103 as being unpatentable over Sampathkumaran-Kardashov, as applied to claim 32 above, in view of Liu et al. (US Publication 2016/0092561, hereinafter Liu). 
Regarding claim 33, Sampathkumaran-Kardashov discloses the system of claim 32, but does not explicitly disclose:
determine, using the first annotation data and the third annotation data, a priority metric; and
determine that the priority metric satisfies a condition,
wherein generating the video summarization is further based at least in part on determining that the priority metric satisfies the condition.
Liu discloses:
determining, using the first annotation data and the third annotation data, a priority metric (Liu, fig. 5, para’s 0070-0075, generating prioritization data of segments based on metadata); and
determining that the priority metric satisfies a condition, wherein generating the video summarization is further based at least in part on determining that the priority metric satisfies the condition (Liu, fig. 5, para’s 0070-0075, fig. 9, para. 0097-0107, generating video summary based on prioritization data condition).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Liu’s features into Sampathkumaran-Kardashov’s invention for effectively generating video summary by selecting video segments based on priority metric condition.

Regarding claim 34, Sampathkumaran-Kardashov discloses the system of claim 32, but does not explicitly disclose:
determine a first geographic location associated with the request;
generating fourth annotation data corresponding to a fourth image, wherein the image data corresponds to the fourth image;
determine, based on the first annotation data, that the first image corresponds to the first geographic location;
determine, based on the third annotation data, that the third image corresponds to the first geographic location; and
determine, based on the fourth annotation data, that the fourth image corresponds to a second geographic location different from the first geographic location,
wherein the video summarization does not include the fourth image.
Liu discloses:
determine a first geographic location associated with the request (Liu, para’s 0007-0010, metadata includes location information from the device capturing the image data and landmark detection data, and may be used to remove undesirable portions of video, generate video editing hints or suggestions for a video editing interface or may be used to automatically generate video summary, e.g., a highlight video that highlights the important parts of a video or videos; para. 0059, claim 11, also disclose location/time analyzer may analyze the location and time of the captured video.  Such information may be used to help segment the video sequence into different clips, and generate a single video summary based on multiple video sequences; therefore the disclosure above implies and/or makes obvious that a geographic location can be determined and used to generate video summary from among different video sequences);
generating fourth annotation data corresponding to a fourth image, wherein the image data corresponds to the fourth image (Liu, para’s 0091-0096, fig. 8, generating metadata for a fourth portion/image);
determine, based on the first annotation data, that the first image corresponds to the first geographic location; determine, based on the third annotation data, that the third image corresponds to the first geographic location (Liu, para’s 0007-0010, para. 0059, claim 11, determining that a segment, i.e., the first image and the third image, corresponds to a geographic location, i.e., the first geographic location); and
determine, based on the fourth annotation data, that the fourth image corresponds to a second geographic location different from the first geographic location, wherein the video summarization does not include the fourth image (Liu, para’s 0007-0010, 0059, and 0097-0107, claim 11, identifying that video frames, i.e., the fourth portion, contain objects associated with a second, different geographic location, as known in the art; associating the first geographic location with the first portion and the third portion, and the second, different geographic location with the fourth portion; and generating a video summarization that excludes the fourth portion).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Liu’s features into Sampathkumaran-Kardashov’s invention for enhancing a user’s playback experience by generating a video summary of preferred geographic locations only.

12.	Claim 38 is rejected under AIA  35 U.S.C. 103 as being unpatentable over Sampathkumaran-Kardashov, as applied to claim 32 above, in view of Liu et al. (US Publication 2016/0092561, hereinafter Liu), and further in view of Yamaji (US Publication 2016/0086342).
Regarding claim 38, Sampathkumaran-Kardashov discloses the system of claim 32, but does not explicitly disclose:
process a portion of the video data corresponding to the image data and a fourth image to identify an object represented in the fourth image;
determine that the object does not move in the portion; and
based at least in part on determining that the object does not move in the portion, not including the portion in the video summarization. 
Liu discloses:
process a portion of the video data corresponding to the image data and a fourth image to identify an object represented in the fourth image (Liu, para’s 0007-0010, para. 0060, claim 5, processing and determining that a segment, i.e., a portion of video data corresponding to the image data and a fourth image, contains an object or a face, i.e., an identity of an object);
determining the motion of the object in the portion (Liu, para’s 0011 and 0067, tracking object motion); and 
based at least in part on the motion of the object in the portion, not including the portion in the video summarization (Liu, para’s 0097-0107, generating video summarization to exclude the portion).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Liu’s teachings into Sampathkumaran-Kardashov’s invention for enhancing a user’s playback experience by providing a video summary that excludes portions based on object motion.
Sampathkumaran-Kardashov-Liu does not explicitly disclose that determining the motion of the object in the portion comprises determining that the object does not move in the portion.
Yamaji discloses that determining the motion of the object in the portion comprises determining that the object does not move in the portion (Yamaji, para’s 0011 and 0024, fig’s 4a-4c, detecting motion of a face and a still face).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Yamaji’s teachings into Sampathkumaran-Kardashov-Liu’s invention for enhancing a user’s playback experience by providing a video summary of objects in motion.

Allowable Subject Matter
13. 	Claims 39-40 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all the limitations of the base claim and any intervening claims.

Conclusion
14.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

15.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to LOI H TRAN, whose telephone number is (571)270-5645.  The examiner can normally be reached from 8:00 AM to 5:00 PM PST, with the first Friday of each biweek off.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THAI TRAN, can be reached at 571-272-7382.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/LOI H TRAN/Primary Examiner, Art Unit 2484