DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on June 18, 2021 has been entered.
Response to Amendment
Claims 1-11 and 20-24 are pending. Claims 1-11 and 20-22 are amended directly or based on dependency on an amended claim. Claims 23 and 24 are new.
Response to Arguments
Applicant's arguments filed June 18, 2021 have been fully considered but they are not persuasive. 
Regarding the argument on pages 10-11 that the content type intended by the applicant in the independent claims differs from the examiner's interpretation of content type, it is again noted that the features upon which applicant relies (i.e., advertising content/promotional content) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Claim 22, which does identify the content type more specifically, is rejected under an additional reference.
With respect to the new limitation “storing the sentinel frames in a sentinel frame database”, this is still taught by the previously cited references: Deng (“The scene change detector 225 stores the key frame(s) and the key image signature(s) in the image store 220”, [0030], 
Claims 23 and 24 will be allowable when combined into the independent claims. 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having 

Claims 1-8, 10, 11, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Petersohn (US 20080316307 A1) in view of Gao et al. (US 20110064318 A1) in view of Kasutani (US 20090066838 A1) in view of Deng (US 20130259323 A1).

Regarding claims 1 and 20, Petersohn discloses a method and an apparatus comprising one or more computer processors; and a non-transitory computer-readable storage medium comprising instructions ([0149], [0150]) that, when executed, control the one or more computer processors to be configured for: extracting, by a computing device, frame features for a plurality of frames from a video (the color or edge similarity may then be used as a similarity measure, [0069]); identifying, by the computing device, a pattern from a sequence of frames of the plurality of frames, the pattern being identified based on a pattern analysis using the frame features for frames in the sequence of frames (performing a similarity analysis between two frame sequences A and B, [0069], video frames directly before and after the transition, that is, outside of the transition, may be utilized for the similarity analysis, [0072], visually similar frame sequences are designated with the same character, [0101]); clustering, by the computing device, a set of candidate frames into one or more groups based on the frame features for the set of candidate frames (a high value holds the frame sequences together (no scene boundary), and a low value separates the frame sequences into individual scenes, [0077], similarity-based scene segmentation, [0080], shot similarity values are checked to decide whether the current shot can be merged into an existing group or scene, if the shot similarity values are too small and merging is not possible a new scene is declared [0136] [scenes considered groups, frames that can be considered part of that group/scene or a new group/scene considered candidate frames]); selecting, by the computing device, sentinel features for each of the one or more 

Petersohn does not give details on the limitations of selecting, by the computing device, sentinel features from the frame features for frames in the sequence of frames based on the pattern; generating a sentinel frame for each of the one or more groups, using the sentinel features for the frames in each of the one or more groups; outputting, by the computing device, a set of sentinel frames for each of the one or more groups using the sentinel features; or storing the sentinel frames in a sentinel frame database.

Gao et al. teach selecting, by the computing device, sentinel features from the frame features for frames in the sequence of frames based on the pattern (“Thus, a candidate key frame whose dominant feature or features have higher weight values in the visual theme model will 



To the extent that Petersohn and Gao et al. do not make totally explicit outputting, by the computing device, a set of sentinel frames for each of the one or more groups using the sentinel features, an additional reference is provided to teach this feature, as well as storing the sentinel frames in a sentinel frame database.

Kasutani teaches clustering, by the computing device, a set of candidate frames into one or more groups based on the frame features for the set of candidate frames (S101, S103, Fig. 3, [0100]) and generating a sentinel frame for each of the one or more groups, using the sentinel features for the frames in each of the one or more groups, and outputting, by the computing device, a set of sentinel frames for each of the one or more groups using the sentinel features (display representative images, Fig. 3, a representative image selection program and a representative image group selection program, [0019], plurality of key frames is extracted from one video as a combination of representative images, [0052], Fig. 4, identifies a group of key frames of the videos corresponding to the video identifiers output from the video selector 21, the p-th video is cp (1 ≤ p ≤ n), the representative image group combination extractor 82 extracts d key frames from each video, i.e., n x d key frames in all from one representative image group combination, [0150], if the evaluation values are calculated using color features of the respective key frames, the representative image group selector 84 selects the representative image group combination in which colors of key frames selected from within the same video are similar and, at the same time, a color difference from the other videos is most emphasized, [0176] [colors/color distance interpreted as sentinel features], see also Fig. 4, key frame group of video picture a - key frame group of video picture d for a set of sentinel frames [key frame group interpreted as set of sentinel frames; videos a-d interpreted as “for each of the one or more groups”]). Kasutani also teaches storing the sentinel frames in a sentinel frame database (“The key frame storage unit 31 stores c (where c is a positive integer) images representing each video (hereinafter, "key frame", and a set of key 

Petersohn and Gao et al. and Kasutani are in the same art of analyzing key frames (Petersohn, abstract; Gao et al., abstract; Kasutani, Fig. 4, [0052]). The combination of Kasutani with Petersohn and Gao et al. will enable the use of a set of key frames. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the set described by Kasutani with the invention of Petersohn and Gao et al. as this was known at the time of filing, the combination would have predictable results, and as Kasutani indicates that, using these techniques, a user can easily grasp the contents of videos displayed in a list ([0019]).

To the extent the transition in the video from a first content type to a second content type is not made entirely clear by the previous references, another reference is provided herein.

Deng teaches clustering, by the computing device, a set of candidate frames into one or more groups based on the frame features for the set of candidate frames (image grouping for scene detection, [0036], cluster scenes into scene clusters, [0042]) and generating a sentinel frame for each of the one or more groups, using the sentinel features for the frames in each of the one or more groups (representative key image formed or selected from the key frames of the scenes 

Petersohn and Gao et al. and Kasutani and Deng are in the same art of analyzing key frames (Petersohn, abstract; Gao et al., abstract; Kasutani, Fig. 4, [0052]; Deng, [0042]). The combination of Deng with Petersohn and Gao et al. and Kasutani will enable the use of transition detection. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the 

Regarding claim 2, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Petersohn further indicates identifying the pattern comprises: selecting a transitional frame in the sequence of frames, wherein the transitional frame is different from the sentinel frame; and using at least one frame in the sequence of frames that is before or after the transitional frame to select the sentinel features (video frames directly before and after the transition, that is, outside of the transition, may be utilized for the similarity analysis, or may be postulated that the video frames to be chosen should lie a number x, e.g. x=5, of video frames from the frame sequence boundary [0072]).

Regarding claim 3, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 2. Petersohn further indicates selecting the sentinel features comprises using a frame in the sequence of frames on both sides of the transitional frame in the sequence of frames to select the sentinel features (video frames directly before and after the transition, that is, outside of the transition, may be utilized for the similarity analysis, or may be postulated that the video frames to be chosen should lie a number x, e.g. x=5, of video frames from the frame sequence boundary [0072]).

Regarding claim 4, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 2. Petersohn and Gao et al. further indicate identifying the pattern comprises: comparing a first frame on a first side of the transitional frame to a second frame on a second side of the transitional frame to determine if the first frame and the second frame are similar within a threshold; and when the first frame and the second frame are similar within the threshold, identifying the sequence of frames as including the pattern (Petersohn, if there is great similarity (in relation to a predetermined threshold value), this implies that both the shots belong to a common scene, [0011], scene boundaries are inserted for all coherence values below a fixed threshold value, [0012], only the coherence value for the DISSOLVE type transition between the both frame sequences C and D is still below the threshold value and leads to the setting of a scene boundary, which corresponds to the correct relations, [0101], uniting may involve the use of a threshold value in order to identify frame sequence transitions as scene boundaries, [0144]; Gao et al., iteratively merge color clusters in Ds with color distances smaller than a threshold T1, until all the remaining color clusters in Ds are mutually distant from each other according to the threshold, [0037]).
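
For illustration only, the threshold comparison recited in claim 4 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or from any cited reference: the per-frame feature is assumed to be a normalized histogram, the similarity measure is assumed to be an L1 distance, and the names histogram_distance and sequence_has_pattern are hypothetical.

```python
from typing import Sequence


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two histograms with the same number of bins (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(x - y) for x, y in zip(a, b))


def sequence_has_pattern(
    features: Sequence[Sequence[float]],
    transitional_index: int,
    threshold: float,
) -> bool:
    """Return True when the frames on either side of the transitional frame
    are similar within the threshold (hypothetical reading of the claim language)."""
    # A transitional frame at either end has no frame on one of its sides.
    if transitional_index <= 0 or transitional_index >= len(features) - 1:
        return False
    before = features[transitional_index - 1]
    after = features[transitional_index + 1]
    return histogram_distance(before, after) <= threshold
```

Under these assumptions, a sequence whose neighboring frames have nearly identical histograms is identified as including the pattern, while a sequence whose neighbors differ strongly is not.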

Regarding claim 5, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 2. Petersohn and Gao et al. further indicate grouping sequential transitional frames in the sequence of frames into a group, wherein the at least one frame in the sequence of frames is near the group within a threshold (Petersohn, if there is great similarity (in relation to a predetermined threshold value), this implies that both the shots belong to a common scene, [0011], scene boundaries are inserted for all coherence values below a fixed threshold value, [0012], only the coherence value for the DISSOLVE type transition between the both frame sequences C and D is still below the threshold value and leads to the setting of a scene boundary, which corresponds to the correct relations, [0101], uniting may involve the use of a threshold value in order to identify frame sequence transitions as scene boundaries, [0144]; Gao et al., iteratively merge color clusters in Ds with color distances smaller than a threshold T1, until all the remaining color clusters in Ds are mutually distant from each other according to the threshold, [0037]).
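
For illustration only, the grouping of sequential transitional frames recited in claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or from any cited reference: each frame is assumed to carry a boolean flag marking it as transitional, "near the group within a threshold" is assumed to mean within a fixed number of frame positions, and the names group_transitional_frames and frames_near_group are hypothetical.

```python
from typing import List, Sequence, Tuple


def group_transitional_frames(flags: Sequence[bool]) -> List[Tuple[int, int]]:
    """Group runs of consecutive transitional frames into inclusive (start, end) index pairs."""
    groups: List[Tuple[int, int]] = []
    start = None
    for i, is_transitional in enumerate(flags):
        if is_transitional and start is None:
            start = i  # a new run of transitional frames begins
        elif not is_transitional and start is not None:
            groups.append((start, i - 1))  # the run ended at the previous frame
            start = None
    if start is not None:
        groups.append((start, len(flags) - 1))  # the run extends to the last frame
    return groups


def frames_near_group(group: Tuple[int, int], num_frames: int, max_offset: int) -> List[int]:
    """Indices of non-group frames within max_offset positions of the group (assumed threshold)."""
    start, end = group
    before = [i for i in range(start - max_offset, start) if i >= 0]
    after = [i for i in range(end + 1, end + 1 + max_offset) if i < num_frames]
    return before + after
```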

Regarding claim 6, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Petersohn further indicates the transitional frame is a solid color frame (transition frame is a black frame, [0005]).
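
For illustration only, detection of a solid color frame such as a black frame can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or from any cited reference: a frame is assumed to be a collection of (r, g, b) tuples with 8-bit channels, the tolerance value is an assumed parameter, and the name is_solid_color_frame is hypothetical.

```python
from typing import Iterable, Tuple


def is_solid_color_frame(pixels: Iterable[Tuple[int, int, int]], tolerance: int = 8) -> bool:
    """Return True when every color channel varies by at most `tolerance` across the frame."""
    pixel_list = list(pixels)
    if not pixel_list:
        return False  # an empty frame is not treated as a solid color frame
    for channel in range(3):
        values = [p[channel] for p in pixel_list]
        if max(values) - min(values) > tolerance:
            return False
    return True
```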

Regarding claim 7, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Petersohn further indicates the pattern comprises a first sequence of frames and a transitional frame (frame sequences with transitions, abstract, [0005], [0016], [0023]-[0025]).

Regarding claim 8, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 7. Petersohn further indicates locating a second sequence of frames in addition to the first 

Regarding claim 10, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Petersohn and Gao et al. further indicate receiving an input identifying the pattern (Petersohn, computation of brightness or color histograms for the key-frames, [0134], Gao et al., examples of feature types include color, texture, shapes, and faces among many others, [0021], extracts the common principle color components Dpc for a given a set of images, [0035], in addition to a color feature type, the visual theme model may encompass a texture feature type that includes features such as smooth, rough, [0041], once the visual theme model 

Regarding claim 11, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Gao et al. further indicate the set of sentinel frames is not previously known before selecting the sentinel features (“Thus, a candidate key frame whose dominant feature or features have higher weight values in the visual theme model will be ranked higher than another candidate key frame whose dominant features have lower weight values.  Continuing with the Halloween example from above, a candidate key frame that is predominantly black would be ranked higher than a candidate key frame that is predominantly blue.  A candidate key frame that is predominantly black, orange, and perhaps brown would be ranked higher than a candidate key frame that is predominantly black with little or no orange and brown or one that is predominantly orange with little or no black”, [0023], particular candidate key frame selected may be selected manually or automatically, if automatic, the candidate key frame from the set having the highest rank may be selected, [0024]).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Petersohn (US 20080316307 A1) and Gao et al. (US 20110064318 A1) and Kasutani (US 20090066838 A1) and Deng (US 20130259323 A1) as applied to claim 1 above, further in view of Dimitrova et al. (US 6100941 A).

Regarding claim 9, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Petersohn and Gao et al. and Kasutani and Deng do not explicitly disclose the sentinel features comprise features determined from decoding the frames of the video.

Dimitrova et al. teach sentinel features comprise features determined from decoding the frames of the video (Compressed signal Xn is decompressed by a decompression circuit 70, and then decoded by an entropy decoder, col. 5, lines 1-10, filters keyframes for similarity or unicolor, col. 5, lines 45-55, col. 5, line 65 – col. 6, line 10, if input starts off as a compressed signal, it must be decompressed, col. 8, lines 20-40, filter thread 84 uses the frame list buffer and composes a frame key list which lists only the frames which have "key" or important characteristics, col. 14, lines 25-40, signatures of key frames of known commercials are extracted and stored in a database, col. 17, lines 50-60, decompression, col. 20, lines 10-25).

Petersohn and Gao et al. and Dimitrova et al. are in the same art of analyzing key frames (Petersohn, abstract; Gao et al., abstract; Dimitrova et al., col. 14, lines 25-40). The combination of Dimitrova et al. with Petersohn and Gao et al. and Kasutani and Deng will enable frame decoding. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the decoding described by Dimitrova et al. with the invention of Petersohn and Gao et al. and Kasutani and Deng as this was known at the time of filing, the combination would have predictable results, and as Dimitrova et al. indicate that if input starts off as a compressed signal, it must be decompressed (col. 8, lines 20-40), and as many inputs arrive encoded, decoding would be obvious to try.

Claims 21-22 are rejected under 35 U.S.C. 103 as being unpatentable over Petersohn (US 20080316307 A1) and Gao et al. (US 20110064318 A1) and Kasutani (US 20090066838 A1) and Deng (US 20130259323 A1) as applied to claim 1 above, further in view of Prasad et al. (US 20160337691 A1).

Regarding claim 21, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Petersohn and Gao et al. further indicate the frame features extracted include one or more of: a color layout descriptor, an edge histogram, on-screen textual markings, on-screen logos, and ticker information (Petersohn, computation of brightness or color histograms for the key-frames, [0134], Gao et al., examples of feature types include color, texture, shapes, and faces among many others, [0021], extracts the common principle color components Dpc for a given a set of images, [0035], each key frame would be ranked according to similarities shared between that key frame and the weighted feature values of the color feature type, [0043]), however, to make this more explicit, another reference is provided herein.

Prasad et al. teach frame features extracted include one or more of: a color layout descriptor, an edge histogram, on-screen textual markings, on-screen logos, and ticker information (“In one embodiment, the video feature detecting module identifies advertisement breaks that occur in a broadcast content in a near real-time by analyzing video frames of the broadcast content for a presence of a sequence of black frames, a scene cut, fades in scenes, advertisement start and end animation frames, a presence or an absence of a channel icon, a shift in a position or a change in a size of the channel icon, a presence of black bands on a top and a bottom, and/or a left and a right of a video frame, size of the black bands, a presence or an absence of text in 

Petersohn and Gao et al. and Prasad et al. are in the same art of analyzing frames (Petersohn, abstract; Gao et al., abstract; Prasad et al., [0030]). The combination of Prasad et al. with Petersohn and Gao et al. and Kasutani and Deng will enable the recognition of ad content. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the ad content recognition of Prasad et al. with the invention of Petersohn and Gao et al. and Kasutani and Deng as this was known at the time of filing, the combination would have predictable results, and as Prasad et al. indicate “However, such prior approaches mandate a high computation power which is required to match the program content with the database of advertisements. Hence, such approaches are not suitable for near real-time applications on devices such as set top boxes and digital video recorders. Further, the database of advertisements has to be refreshed frequently with new advertisements. Updating the database may not be possible for all applications. Accordingly, there remains a need for a system and a method for detecting occurrence of advertisements in a broadcast content with less complexity and more accuracy.” ([0006]) indicating the practical advantages in combination with Petersohn and Gao et al. and Kasutani and Deng.

Regarding claim 22, Petersohn and Gao et al. and Kasutani and Deng disclose the method of claim 1. Petersohn and Gao et al. and Kasutani and Deng do not explicitly disclose the first content type is program content and the second content type is advertising content.



Petersohn and Gao et al. and Prasad et al. are in the same art of analyzing frames (Petersohn, abstract; Gao et al., abstract; Prasad et al., [0030]). The combination of Prasad et al. with Petersohn and Gao et al. and Kasutani and Deng will enable the recognition of ad content. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the ad content recognition of Prasad et al. with the invention of Petersohn and Gao et al. and 

Allowable Subject Matter
Claims 23 and 24 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The closest prior art of US 6100941 A is similar, but the indication “A commercial found counter for each commercial in the database is incremented every time that commercial is detected. If a commercial is not seen within a predetermined period of time (e.g. a month) then the signatures corresponding to that commercial are removed from the database. If two known commercials sandwich a set of frames within a specified period of time (e.g. a minute) then those sandwiched frames are placed in a potential commercial database. These sandwiched frames could represent one or a plurality of commercials. If a subset of these frames matches known commercial frames at least two times, then these potential commercial frames are added to the database of known commercials. In 
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M ENTEZARI HAUSMANN whose telephone number is (571)270-5084.  The examiner can normally be reached on 10-7 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT M RUDOLPH, can be reached on (571)272-8243.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHELLE M ENTEZARI/Primary Examiner, Art Unit 2661