DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/23/2021 has been entered.
 
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
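A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.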

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 8-12 and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Suri et al. (US 20160034786, hereinafter “Suri”), in view of Ni et al. (US 20180020243, hereinafter “Ni”), and further in view of Sinha et al. (US 20190180109, hereinafter “Sinha”).
Regarding claim 1, Suri discloses,
A computer-implemented method comprising: 
“receiving, by a processor, a video (At 302, the receiving module 202 receives a corpus of video data)”; 
“segmenting, by the processor, the video into a set of video segments (video data may include video segments…..Video segments represent a set of consecutive video frames. In at least one example, a video segment may be defined as a fixed number of video frames (e.g., 20 video frames, 40 video frames, etc.), Para. [0023])”;
“classifying, by the processor, each video segment into a class (the classifying module 124 may apply the learned classifier to the video segments. As a result, a set of feature values indicating probabilities that a video segment belongs to at least one of the categories in the predefined set of semantic categories may be included in the feature values used to train the scoring model, Para. [0061])”; 
“calculating, by the processor, importance scores for each video segment of a class within the set of video segments (The scoring module 126 receives the feature values, including the set of values resulting from feature extraction and the set of feature values resulting from the classifying module 124. The scoring module 126 applies the scoring model to the feature values. In at least one example, the set of values resulting from the classifying module 124 may be used by the scoring module 126 to determine the desirability score for a video frame and/or video segment, Paras. [0063]-[0064])”; 
“determining, by the processor, a winning video segment of the class within the set of video segments based on the importance scores for each video segment within the class (At 810, the segmenting module 206 identifies video segments with desirability scores above a predetermined threshold. In at least one example, video segments with the desirability scores above a predetermined threshold represent video segments including groups of video frames having the highest desirability scores after the classifier and scoring model are applied to the video frames, Para. [0110])”; and 
“storing, by the processor, the winning video segment from the set of video segments (the techniques described herein may be useful for identifying a set of thumbnails that are representative of a video segment having a high level of desirability, or a desirability score above a predetermined threshold. The set of thumbnails may represent a set of video frames having a desirability score above a predetermined threshold, or a set of video frames representing video segments having desirability scores above predetermined thresholds. When a user desires to playback video data, he or she may click on a representative thumbnail, start playback from the video frame represented as a thumbnail, and continue playback until an end of an interesting video segment, Para. [0027]).”
However, Suri does not explicitly disclose, “removing the winning video segment from the set of video segments and set aside the winning video segment into main memory or secondary memory.”
In a similar field of endeavor, Ni discloses, “removing (i.e., extracting) the winning video segment from the set of video segments (the identified frames corresponding to the determined "highlight" scene can be extracted, identified or otherwise utilized for creation of a short-form video clip or segment. In some embodiments, as discussed in more detail below, such creation of a highlight video segment can involve, but not limited to, generating (or creating or extracting) a highlight video segment from the frames of the stream 600, Para. [0113]) and set aside the winning video segment into main memory or secondary memory (the storage module 308 for storing a detected highlight video segment (or clip) and the information (e.g., data and metadata) associated with the highlight video segment).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Suri by specifically providing removing the winning video segment from the set of video segments, as taught by Ni, for the purpose of providing novel systems and methods for automatically, in real-time, identifying and compiling video clips during live streams of video.

Further, the combination of Suri and Ni does not disclose, “calculating, by the processor, importance scores of each remaining video segments of the class within the set of video segments.”
In a similar field of endeavor, Sinha discloses, “calculating, by the processor, importance scores of each remaining video segments of the class within the set of video segments (The suppression scores obtained from the similarity distance calculation may be multiplied by the suppression curve. The frames may be ranked again and the frame with the highest overall score may be determined to be the final top frame. The identified top frame may then be removed and the process may be iterated until the next top frame is identified. For example, in a subsequent iteration, the curve may be centered about the next top image frame, thereby causing neighboring frames to have a decreased score).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri and Ni by specifically providing calculating, by the processor, importance scores of each remaining video segments of the class within the set of video segments, as taught by Sinha, for the purpose of providing improved techniques for automatically selecting image frames from a video and providing the selected image frames to a device for display.
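For illustration only, the sequence mapped above for claim 1 (fixed-length segmenting, determining a winning segment, removing it, and recalculating scores for the remaining segments) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Suri, Ni, Sinha, or the present application. The function names segment_fixed, select_winners and neighbor_suppression, the 0.5 suppression factor, and the rule that only immediately adjacent segments are suppressed are all hypothetical stand-ins for the scoring model and suppression curve described in the references.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def segment_fixed(frames: Sequence[int], size: int) -> List[Tuple[int, ...]]:
    """Split a frame sequence into consecutive segments of at most `size` frames."""
    return [tuple(frames[i:i + size]) for i in range(0, len(frames), size)]


def neighbor_suppression(idx: int, winner: int) -> float:
    """Hypothetical suppression rule: halve the score of segments adjacent to the winner."""
    return 0.5 if abs(idx - winner) == 1 else 1.0


def select_winners(
    scores: List[float],
    rounds: int,
    suppress: Callable[[int, int], float] = neighbor_suppression,
) -> List[int]:
    """Repeatedly pick the top-scoring segment, remove it, and rescore the rest."""
    remaining: Dict[int, float] = dict(enumerate(scores))
    winners: List[int] = []
    for _ in range(min(rounds, len(scores))):
        # Determine the winning segment among those still in the set.
        winner = max(remaining, key=lambda idx: remaining[idx])
        # "Store" the winner and remove it from the set.
        winners.append(winner)
        del remaining[winner]
        # Recalculate scores for each remaining segment.
        for idx in remaining:
            remaining[idx] *= suppress(idx, winner)
    return winners


if __name__ == "__main__":
    print(segment_fixed(range(5), 2))
    print(select_winners([0.9, 0.8, 0.3, 0.7], rounds=2))
```

In this sketch, rescoring after each removal changes the second selection: with the assumed suppression rule, the segment adjacent to the first winner is demoted, so a non-adjacent segment is selected next.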
Regarding claim 2, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 1),  in addition Ni discloses, “after removing the winning video segment from the set of video segments (i.e., extracting segments labeled with the "game" label from the training videos), determining, by the processor, a second winning video segment of the class within the remaining video segments based on the importance scores for each video segment within the class (In Step 422, a highlight score each extracted game segment is determined. In some embodiments, the highlight scores can be determined by an annotator, Paras. [0128]-[0130]), storing, by the processor, the second winning video segment from the remaining video segments (Step 420 involves extracting segments labeled with the "game" label from the training videos. Such extraction can be performed by any known or to be known extraction algorithm that enables the extraction of a portion of a video file to be extracted based on an applied label, Paras. [0128]-[0130]), and removing the second winning video segment from the set of video segments (the storage module 308 for storing a detected highlight video segment (or clip) and the information (e.g., data and metadata) associated with the highlight video segment).”
Regarding claim 3, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 1), further Suri discloses, “wherein classifying each video segment into a class comprises utilizing machine learning to classify each video segment into a class (The learning module 204 may be configured to train a classifier and a scoring model based on the low level, high level, and derivative feature values. The classifier includes a plurality of classifiers that may be used to generate a plurality of high level semantic feature values. The classifier may be used to estimate a probability that a video frame, video segment, video file, and/or video collection belongs to at least one of the categories in the predefined set of semantic categories (e.g., indoor, outdoor, mountain, lake, city, country, home, party, sporting event, zoo, concert, etc.), Para. [0059]).”
Regarding claim 4, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 1), further Suri discloses, “receiving, by the processor, the class from a user (Feature extraction may be performed on a video frame, video segment, video file, and/or video collection level. In at least some examples, the level of feature extraction depends on the level of ranking desired by the user. For example, if the user wants to rank video files in a video collection, feature extraction may be performed at the video file level, Para. [0101]).”
Regarding claim 5, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 1),  further Suri discloses, “receiving, by the processor, the class from a database (The classifier may be used to estimate a probability that a video frame, video segment, video file, and/or video collection belongs to at least one of the categories in the predefined set of semantic categories (e.g., indoor, outdoor, mountain, lake, city, country, home, party, sporting event, zoo, concert, etc.), Para. [0059])”.
Regarding claim 8, Suri discloses,

“a processor, a memory communicatively coupled to the processor (the service provider 102 may include one or more server(s) 110, which may include one or more processing unit(s) 112 and computer-readable media 114 such as memory)”, the memory having stored therein instructions that when executed cause the processor to: 
“receive a video (At 302, the receiving module 202 receives a corpus of video data)”; 
“segment the video into a set of video segments (video data may include video segments…..Video segments represent a set of consecutive video frames. In at least one example, a video segment may be defined as a fixed number of video frames (e.g., 20 video frames, 40 video frames, etc.), Para. [0023])”;
“classify each video segment into a class (the classifying module 124 may apply the learned classifier to the video segments. As a result, a set of feature values indicating probabilities that a video segment belongs to at least one of the categories in the predefined set of semantic categories may be included in the feature values used to train the scoring model, Para. [0061])”; 
“calculate importance scores for each video segment of a class within the set of video segments (The scoring module 126 receives the feature values, including the set of values resulting from feature extraction and the set of feature values resulting from the classifying module 124. The scoring module 126 applies the scoring model to the feature values. In at least one example, the set of values resulting from the classifying module 124 may be used by the scoring module 126 to determine the desirability score for a video frame and/or video segment, Paras. [0063]-[0064])”; 
“determine a winning video segment of the class within the set of video segments based on the importance scores for each video segment within the class (At 810, the segmenting module 206 identifies video segments with desirability scores above a predetermined threshold. In at least one example, video segments with the desirability scores above a predetermined threshold represent video segments including groups of video frames having the highest desirability scores after the classifier and scoring model are applied to the video frames, Para. [0110])”; and 
“store the winning video segment from the set of video segments (the techniques described herein may be useful for identifying a set of thumbnails that are representative of a video segment having a high level of desirability, or a desirability score above a predetermined threshold. The set of thumbnails may represent a set of video frames having a desirability score above a predetermined threshold, or a set of video frames representing video segments having desirability scores above predetermined thresholds. When a user desires to playback video data, he or she may click on a representative thumbnail, start playback from the video frame represented as a thumbnail, and continue playback until an end of an interesting video segment, Para. [0027]).”
However, Suri does not explicitly disclose, “removing the winning video segment from the set of video segments and set aside the winning video segment into main memory or secondary memory.”
In a similar field of endeavor, Ni discloses, “removing (i.e., extracting) the winning video segment from the set of video segments (the identified frames corresponding to the determined "highlight" scene can be extracted, identified or otherwise utilized for creation of a short-form video clip or segment. In some embodiments, as discussed in more detail below, such creation of a highlight video segment can involve, but not limited to, generating (or creating or extracting) a highlight video segment from the frames of the stream 600, Para. [0113]) and set aside the winning video segment into main memory or secondary memory (the storage module 308 for storing a detected highlight video segment (or clip) and the information (e.g., data and metadata) associated with the highlight video segment).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Suri by specifically providing removing the winning video segment from the set of video segments, as taught by Ni, for the purpose of providing novel systems and methods for automatically, in real-time, identifying and compiling video clips during live streams of video.
Further, the combination of Suri and Ni does not disclose, “calculate importance scores of each remaining video segments of the class within the set of video segments.”
In a similar field of endeavor, Sinha discloses, “calculate importance scores of each remaining video segments of the class within the set of video segments (The suppression scores obtained from the similarity distance calculation may be multiplied by the suppression curve. The frames may be ranked again and the frame with the highest overall score may be determined to be the final top frame. The identified top frame may then be removed and the process may be iterated until the next top frame is identified. For example, in a subsequent iteration, the curve may be centered about the next top image frame, thereby causing neighboring frames to have a decreased score).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri and Ni by specifically providing calculate importance scores of each remaining video segments of the class within the set of video segments, as taught by Sinha, for the purpose of providing improved techniques for automatically selecting image frames from a video and providing the selected image frames to a device for display.
Regarding Claim 9, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 8),  in addition Ni discloses, “after removing the winning video segment from the set of video segments (i.e., extracting segments labeled with the "game" label from the training videos), determining, by the processor, a second winning video segment of the class within the remaining video segments based on the importance scores for each video segment within the class (In Step 422, a highlight score each extracted game segment is determined. In some embodiments, the highlight scores can be determined by an annotator, Paras. [0128]-[0130]), storing, by the processor, the second winning video segment from the remaining video segments (Step 420 involves extracting segments labeled with the "game" label from the training videos. Such extraction can be performed by any known or to be known extraction algorithm that enables the extraction of a portion of a video file to be extracted based on an applied label, Paras. [0128]-[0130]), and removing the second winning video segment from the set of video segments (the storage module 308 for storing a detected highlight video segment (or clip) and the information (e.g., data and metadata) associated with the highlight video segment).”
Regarding Claim 10, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 8), further Suri discloses, “wherein classifying each video segment into a class comprises utilizing machine learning to classify each video segment into a class (The learning module 204 may be configured to train a classifier and a scoring model based on the low level, high level, and derivative feature values. The classifier includes a plurality of classifiers that may be used to generate a plurality of high level semantic feature values. The classifier may be used to estimate a probability that a video frame, video segment, video file, and/or video collection belongs to at least one of the categories in the predefined set of semantic categories (e.g., indoor, outdoor, mountain, lake, city, country, home, party, sporting event, zoo, concert, etc.), Para. [0059]).”
Regarding Claim 11, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 8), further Suri discloses, “further instructions that when executed cause the processor to receive the class from a user (Feature extraction may be performed on a video frame, video segment, video file, and/or video collection level. In at least some examples, the level of feature extraction depends on the level of ranking desired by the user. For example, if the user wants to rank video files in a video collection, feature extraction may be performed at the video file level, Para. [0101]).”
Regarding Claim 12, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 8), further Suri discloses, “further instructions that when executed cause the processor to receive the class from a database (The classifier may be used to estimate a probability that a video frame, video segment, video file, and/or video collection belongs to at least one of the categories in the predefined set of semantic categories (e.g., indoor, outdoor, mountain, lake, city, country, home, party, sporting event, zoo, concert, etc.), Para. [0059]).”
Regarding claim 15, Suri discloses,
A computer program product for action localization, the computer program product comprising a computer readable storage medium (computer readable storage medium is interpreted as non-transitory subject matter, as specification discloses, A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire, Para. [0062]) having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
 “receive a video (At 302, the receiving module 202 receives a corpus of video data)”; 
“segment the video into a set of video segments (video data may include video segments…..Video segments represent a set of consecutive video frames. In at least one example, a video segment may be defined as a fixed number of video frames (e.g., 20 video frames, 40 video frames, etc.), Para. [0023])”;
“classify each video segment into a class (the classifying module 124 may apply the learned classifier to the video segments. As a result, a set of feature values indicating probabilities that a video segment belongs to at least one of the categories in the predefined set of semantic categories may be included in the feature values used to train the scoring model, Para. [0061])”; 
“calculate importance scores for each video segment of a class within the set of video segments (The scoring module 126 receives the feature values, including the set of values resulting from feature extraction and the set of feature values resulting from the classifying module 124. The scoring module 126 applies the scoring model to the feature values. In at least one example, the set of values resulting from the classifying module 124 may be used by the scoring module 126 to determine the desirability score for a video frame and/or video segment, Paras. [0063]-[0064])”; 
“determine a winning video segment of the class within the set of video segments based on the importance scores for each video segment within the class (At 810, the segmenting module 206 identifies video segments with desirability scores above a predetermined threshold. In at least one example, video segments with the desirability scores above a predetermined threshold represent video segments including groups of video frames having the highest desirability scores after the classifier and scoring model are applied to the video frames, Para. [0110])”; and 
“store the winning video segment from the set of video segments (the techniques described herein may be useful for identifying a set of thumbnails that are representative of a video segment having a high level of desirability, or a desirability score above a predetermined threshold. The set of thumbnails may represent a set of video frames having a desirability score above a predetermined threshold, or a set of video frames representing video segments having desirability scores above predetermined thresholds. When a user desires to playback video data, he or she may click on a representative thumbnail, start playback from the video frame represented as a thumbnail, and continue playback until an end of an interesting video segment, Para. [0027]).”
However, Suri does not explicitly disclose, “removing the winning video segment from the set of video segments and set aside the winning video segment into main memory or secondary memory.”
In a similar field of endeavor, Ni discloses, “removing (i.e., extracting) the winning video segment from the set of video segments (the identified frames corresponding to the determined "highlight" scene can be extracted, identified or otherwise utilized for creation of a short-form video clip or segment. In some embodiments, as discussed in more detail below, such creation of a highlight video segment can involve, but not limited to, generating (or creating or extracting) a highlight video segment from the frames of the stream 600, Para. [0113]) and set aside the winning video segment into main memory or secondary memory (the storage module 308 for storing a detected highlight video segment (or clip) and the information (e.g., data and metadata) associated with the highlight video segment).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Suri by specifically providing removing the winning video segment from the set of video segments, as taught by Ni, for the purpose of providing novel systems and methods for automatically, in real-time, identifying and compiling video clips during live streams of video.
Further, the combination of Suri and Ni does not disclose, “calculate importance scores of each remaining video segments of the class within the set of video segments.”
In a similar field of endeavor, Sinha discloses, “calculate importance scores of each remaining video segments of the class within the set of video segments (The suppression scores obtained from the similarity distance calculation may be multiplied by the suppression curve. The frames may be ranked again and the frame with the highest overall score may be determined to be the final top frame. The identified top frame may then be removed and the process may be iterated until the next top frame is identified. For example, in a subsequent iteration, the curve may be centered about the next top image frame, thereby causing neighboring frames to have a decreased score).”
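Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri and Ni by specifically providing calculate importance scores of each remaining video segments of the class within the set of video segments, as taught by Sinha, for the purpose of providing improved techniques for automatically selecting image frames from a video and providing the selected image frames to a device for display.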
Regarding Claim 16, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 15), in addition Ni discloses, “after removing the winning video segment from the set of video segments (i.e., extracting segments labeled with the "game" label from the training videos), determining, by the processor, a second winning video segment of the class within the remaining video segments based on the importance scores for each video segment within the class (In Step 422, a highlight score each extracted game segment is determined. In some embodiments, the highlight scores can be determined by an annotator, Paras. [0128]-[0130]), storing, by the processor, the second winning video segment from the remaining video segments (Step 420 involves extracting segments labeled with the "game" label from the training videos. Such extraction can be performed by any known or to be known extraction algorithm that enables the extraction of a portion of a video file to be extracted based on an applied label, Paras. [0128]-[0130]), and removing the second winning video segment from the set of video segments (the storage module 308 for storing a detected highlight video segment (or clip) and the information (e.g., data and metadata) associated with the highlight video segment).”
Regarding Claim 17, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 15), further Suri discloses, “wherein classifying each video segment into a class comprises utilizing machine learning to classify each video segment into a class (The learning module 204 may be configured to train a classifier and a scoring model based on the low level, high level, and derivative feature values. The classifier includes a plurality of classifiers that may be used to generate a plurality of high level semantic feature values. The classifier may be used to estimate a probability that a video frame, video segment, video file, and/or video collection belongs to at least one of the categories in the predefined set of semantic categories (e.g., indoor, outdoor, mountain, lake, city, country, home, party, sporting event, zoo, concert, etc.), Para. [0059]).”
Regarding Claim 18, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 15), further Suri discloses, “further instructions that when executed cause the processor to receive the class from a user (Feature extraction may be performed on a video frame, video segment, video file, and/or video collection level. In at least some examples, the level of feature extraction depends on the level of ranking desired by the user. For example, if the user wants to rank video files in a video collection, feature extraction may be performed at the video file level, Para. [0101]).” 

Claims 6, 7, 13, 14, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Suri, in view of Ni, in view of Sinha, and further in view of Tandon et al. (US 20190228231, hereinafter “Tandon”).
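Regarding claim 6, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 3), however the combination of Suri, Ni and Sinha does not explicitly disclose, “using an attention model within the neural network to classify each video segment into a class.”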
In a similar field of endeavor, Tandon discloses, “using an attention model within the neural network to classify each video segment into a class (an aesthetics score for each of the video frames 104a-n is determined using based on a predictive model. Predictive models can include machine learning models such as neural networks. In particular, deep convolutional neural networks may be used. Based on a query from segmentation application 110, predictive model 120 can find a similar image and base its score on the score of that image, Para. [0071]-[0073]).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri, Ni and Sinha by specifically providing using an attention model within the neural network to classify each video segment into a class, as taught by Tandon for the purpose of using predictive models trained to provide aesthetic scores (e.g. a representation of subjective quality) to segment videos by using changes in quality to segment video.
Regarding claim 7,  the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 1), however the combination of Suri, Ni and Sinha does not explicitly disclose, “wherein calculating, by the processor, importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network.”
In a similar field of endeavor, Tandon discloses, “wherein calculating, by the processor, importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network (an aesthetics score for each of the video frames 104a-n is determined using based on a predictive model. Predictive models can include machine learning models such as neural networks. In particular, deep convolutional neural networks may be used. Based on a query from segmentation application 110, predictive model 120 can find a similar image and base its score on the score of that image, Para. [0071]-[0073]).”
Therefore, it would have been obvious to one of ordinary skill in art before the effective filing date of the claimed invention to modify the combination of Suri, Ni and Sinha by specifically providing wherein calculating, by the processor, importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network, as taught by Tandon for the purpose of using predictive models trained to provide aesthetic scores (e.g. a representation of subjective quality) to segment videos by using changes in quality to segment video.
Regarding claim 13, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 10), however the combination of Suri, Ni and Sinha does not explicitly disclose, “using an attention model within the neural network to classify each video segment into a class.”
In a similar field of endeavor, Tandon discloses, “using an attention model within the neural network to classify each video segment into a class (an aesthetics score for each of the video frames 104a-n is determined based on a predictive model. Predictive models can include machine learning models such as neural networks. In particular, deep convolutional neural networks may be used. Based on a query from segmentation application 110, predictive model 120 can find a similar image and base its score on the score of that image, Para. [0071]-[0073]).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri, Ni and Sinha by specifically providing using an attention model within the neural network to classify each video segment into a class, as taught by Tandon, for the purpose of using predictive models trained to provide aesthetic scores (e.g., a representation of subjective quality) to segment videos based on changes in quality.
Regarding claim 14, the combination of Suri and Ni discloses everything claimed as applied above (see claim 8), however the combination of Suri and Ni does not explicitly disclose, “wherein calculating importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network.”
In a similar field of endeavor, Tandon discloses, “wherein calculating importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network (an aesthetics score for each of the video frames 104a-n is determined based on a predictive model. Predictive models can include machine learning models such as neural networks. In particular, deep convolutional neural networks may be used. Based on a query from segmentation application 110, predictive model 120 can find a similar image and base its score on the score of that image, Para. [0071]-[0073]).”
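Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri and Ni by specifically providing wherein calculating importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network, as taught by Tandon, for the purpose of using predictive models trained to provide aesthetic scores (e.g., a representation of subjective quality) to segment videos based on changes in quality.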
Regarding claim 19, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 17), however the combination of Suri, Ni and Sinha does not explicitly disclose, “using an attention model within the neural network to classify each video segment into a class.”
In a similar field of endeavor, Tandon discloses, “using an attention model within the neural network to classify each video segment into a class (an aesthetics score for each of the video frames 104a-n is determined based on a predictive model. Predictive models can include machine learning models such as neural networks. In particular, deep convolutional neural networks may be used. Based on a query from segmentation application 110, predictive model 120 can find a similar image and base its score on the score of that image, Para. [0071]-[0073]).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri, Ni and Sinha by specifically providing using an attention model within the neural network to classify each video segment into a class, as taught by Tandon, for the purpose of using predictive models trained to provide aesthetic scores (e.g., a representation of subjective quality) to segment videos based on changes in quality.
Regarding claim 20, the combination of Suri, Ni and Sinha discloses everything claimed as applied above (see claim 15), however the combination of Suri, Ni and Sinha does not explicitly disclose, “wherein calculating importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network.”
In a similar field of endeavor, Tandon discloses, “wherein calculating importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network (an aesthetics score for each of the video frames 104a-n is determined based on a predictive model. Predictive models can include machine learning models such as neural networks. In particular, deep convolutional neural networks may be used. Based on a query from segmentation application 110, predictive model 120 can find a similar image and base its score on the score of that image, Para. [0071]-[0073]).”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Suri, Ni and Sinha by specifically providing wherein calculating importance scores for each video segment of a class within the set of video segments comprises calculating importance scores using a neural network, as taught by Tandon, for the purpose of using predictive models trained to provide aesthetic scores (e.g., a representation of subjective quality) to segment videos based on changes in quality.

Relevant reference(s):
US 9445136: The invention is directed to retrieving video data from a server. During retrieval of the video data, a client device receives information indicating bit rates of representations of multimedia content. In addition, the client device receives information indicating priority values for segments of the representations. The segments correspond to particular temporal sections of the representations. The client device requests selected ones of the segments based on the priority values for the segments and an estimated throughput.
US 20170316256: The present disclosure relates to a computer-implemented method that includes identifying interesting moments from a video. The video is received and includes image frames. Continual motion of one or more objects in the video is identified based on identifying foreground motion in the image frames. Video segments from the video that include the continual motion are generated. A segment score for each of the video segments is generated based on animation criteria.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GOLAM SOROWAR, whose telephone number is (571)270-3761. The examiner can normally be reached Mon-Fri, 8:30AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/GOLAM SOROWAR/           Primary Examiner, Art Unit 2641