Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on December 10, 2020 and January 03, 2022 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Response to Arguments and Amendments
The amendment filed on June 29, 2022 has been entered. Claims 1-7, 9-17, and 19-22 are pending in the application. Claims 1, 11, and 20 are amended, claims 8 and 18 are cancelled, and claims 21 and 22 are new.
The applicant claims that Christensen does not disclose “detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows” and “generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content responsive to detecting the feature match”. As mentioned in the interview, the examiner agrees with this assertion. The applicant claims that Christensen does not disclose “detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content” and “generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content responsive to detecting the text match”. As mentioned in the interview, the examiner agrees with this assertion.
Applicant’s arguments with respect to the 35 U.S.C. 102 rejections for claims 1-20 have been considered but are moot because the arguments are directed towards amended claim language, addressed on new grounds of rejection below.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
	
Claims 1-4, 6-7, 9-14, 16-17, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Christensen (U.S. Publication No. 20090205000) in view of Wold (U.S. Publication No. 20210357451).
Regarding claim 1, Christensen discloses a method comprising ([0003] – particular methods):
receiving podcast content ([0006] - receive a plurality of broadcast streams; [0027] – The terms “broadcast stream”…broadly refer to…podcasts);
identifying potential repetitive content within the podcast content by performing two or more of a first set of acts, a second set of acts, or a third set of acts, wherein ([0061] - The stream analysis module 320 in FIG. 3 can identify media content transmitted in the broadcast stream 121 in many other ways):
the first set of acts comprises ([0062] - stream analysis module 320):
detecting a fingerprint match between query fingerprint data representing at least one audio segment within the podcast content and reference fingerprint data representing known repetitive content within other podcast content, the other podcast content being different from the podcast content, wherein the known repetitive content comprises content segments that repeatedly appear in the other podcast content ([0062] - the stream analysis module 320 can further analyze the broadcast stream for a digital fingerprint, and/or compare the digital fingerprint of the scanned media content and/or broadcast stream with the digital fingerprints already stored in the media storage database 340),
and responsive to detecting the fingerprint match, generating a first set of labels identifying the at least one audio segment as potential repetitive content within the podcast content ([0062] - If the fingerprint for the media content and/or the broadcast stream match information that is stored in media storage database 340, then the stream analysis module can assign the broadcast stream and/or media segment a unique event identifier, and/or store the unique event identifier with the matching information in the media storage database 340),
the second set of acts comprises ([0065] - media storage database 340):
for each of a plurality of time-windows within the podcast content, determining a set of audio features, wherein each of the sets of audio features include the same audio feature types ([0065] - In FIG. 3, the broadcast scanning system 160 comprises the media storage database 340, which can be configured to store the results from the analysis performed by the stream analysis module 320…In one embodiment, the media storage database 340 stores waveforms associated with the audio of the broadcast stream 121 Such as a song, advertisement, radio and/or television program, and/or an introduction to media content),45(Attorney Docket No. 20-1660-US)
and the third set of acts comprises ([0055] - broadcast scanning module):
generating a transcript of at least a portion of the podcast content ([0055] - the broadcast scanning module scans a transcript of a show transmitted by the alternate broadcast source 130),
dividing the generated transcript into query text sentences ([0061] - Identifying methodologies can comprise: textual string acquisition from broadcast stream 121 with normalization of acquired text),
responsive to identifying potential repetitive content within the podcast content by performing two or more of the first set of acts, the second set of acts, or the third set of acts, selecting, from two or more of the generated first set of labels, the generated second set of labels, or the generated third set of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140);
and responsive to selecting the consolidated set of labels, performing an action ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140).
However, Christensen does not disclose detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows;
and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content;
detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content;
and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content.
Wold does teach detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows ([0133] - the feature identification logic 215 performs harmonic pitch class profile analysis using short-time Fourier transform over a set of overlapping time windows);
and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content ([0139] - When comparing the normalized feature vectors to normalized feature vectors of known media content items, similarities may be found even when the unidentified media content item and a known media content item have different tempos);
detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content ([0085] - Sounds received may be represented as paths between states of the HMM and multiple paths may represent multiple possible text matches for the same sound. Ultimately, the speech recognition engine outputs text in the form of a sequence of words, text in the form of a sequence of phonemes, or text in the form of a combination of words and phonemes);
and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content ([0160] - After identifying a match count, the metadata matching logic 230 may determine a match score as a percentage of the normalized text matched. For example, the metadata matching logic 230 may divide the match count by the greater of the number of words in either the unidentified media content item or from the known media content item to generate a match percentage score).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed application to have modified Christensen to incorporate the teachings of Wold in order to implement detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows; and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content; detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content; and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content. Doing so allows unidentified media content to be compared to known media content items for similarities (Wold [0028]).
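For illustration only, the match percentage score quoted above from Wold [0160] (match count divided by the greater of the two word counts) can be sketched as follows. The function names, the tokenization rule, and the definition of "match count" as a multiset word overlap are assumptions made for this sketch; neither reference specifies them, and this is not presented as the implementation of either reference.

```python
import re
from collections import Counter


def normalize(text):
    # Assumed normalization: lowercase and keep alphanumeric word tokens only.
    return re.findall(r"[a-z0-9']+", text.lower())


def match_percentage(query_text, reference_text):
    """Match count divided by the greater of the two word counts."""
    query_words = normalize(query_text)
    reference_words = normalize(reference_text)
    if not query_words or not reference_words:
        return 0.0
    # Assumed definition of "match count": multiset overlap of the words.
    overlap = Counter(query_words) & Counter(reference_words)
    match_count = sum(overlap.values())
    return match_count / max(len(query_words), len(reference_words))
```

Under these assumptions, a query sentence of eight words that shares five words with a five-word reference sentence scores 5/8, because the denominator is the larger of the two word counts.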


Regarding claim 2, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses the method, wherein the audio feature types comprise one or more of:
a power of an audio signal, a pitch, an intensity, a rate of speaking, or a timbre ([0065] - The media storage database 340 can further store additional information that provides more detailed information about broadcast stream 121, and such information comprises without limitation; stream type, signal strength, broadcast clarity, time and/or date of each broadcast stream transmission, errors in the broadcast stream 121, frequency and/or channel identification, signal anomalies and/or signal fingerprints that are distinctly characteristic of a particular broadcast source 120, station identification, combinations of the same, or the like).
Regarding claim 3, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses the method, wherein the second set of acts further comprises:
for each of the plurality of time-windows, comparing the time-window to each other time-window of the plurality of time-windows ([0054] - The broadcast scanning system 160 can be configured to capture not only the media content from the airwaves and/or internet broadcast streams but also the… duration; [0065] - In FIG. 3, the broadcast scanning system 160 comprises the media storage database 340, which can be configured to store the results from the analysis performed by the stream analysis module 320…In one embodiment, the media storage database 340 stores waveforms associated with the audio of the broadcast stream 121 such as a song, advertisement, radio and/or television program, and/or an introduction to media content),
wherein detecting the feature match is based on the comparing ([0065] - broadcast scanning system 160 comprises the media storage database 340, which can be configured to store the results from the analysis performed by the stream analysis module 320).
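As a non-authoritative sketch of the claimed comparison, the following compares the feature set of each time-window to that of every other time-window and reports the pairs that match. The use of cosine similarity, the threshold value, and all names are assumptions chosen for illustration; the claim language does not specify a similarity measure.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    # Cosine similarity of two equal-length feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_feature_matches(window_features, threshold=0.99):
    """Return index pairs (i, j), i < j, of time-windows whose features match.

    window_features: one feature vector per time-window, each holding the
    same feature types in the same order (for example power, pitch, intensity).
    threshold: assumed similarity cutoff for declaring a feature match.
    """
    matches = []
    for (i, fi), (j, fj) in combinations(enumerate(window_features), 2):
        if cosine_similarity(fi, fj) >= threshold:
            matches.append((i, j))
    return matches
```

Each matched pair would then be a candidate for a label in the second set of labels; how the windows are chosen and how features are extracted is outside the scope of this sketch.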
Regarding claim 4, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses the method, wherein the third set of acts further comprises:
detecting a speaker match between at least one speaker associated with the at least one of the query text sentences and speaker identifiers associated with at least one of the reference text sentences ([0063] - stream analysis module 320 assigns a unique event identifier to each specific broadcast stream and/or media segment, and/or stores the unique event identifier as well as the artist/advertiser and/or title of the song/advertisement, the time, date, channel, content identifier, ISCI Code and/or station and/or other relevant data),
wherein generating the third set of labels is further responsive to detecting the speaker match ([0063] - stream analysis module 320 assigns a unique event identifier to each specific broadcast stream and/or media segment, and/or stores the unique event identifier as well as the artist/advertiser and/or title of the song/advertisement, the time, date, channel, content identifier, ISCI Code and/or station and/or other relevant data).
Regarding claim 6, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses the method, wherein selecting the consolidated set of labels comprises:
determining that two or more of the first set of labels, the second set of labels, or the third set of labels include one or more labels that identify one or more segments of the podcast content as potential repetitive content ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140);
and responsive to determining that two or more of the first set of labels, the second set of labels, or the third set of labels include the one or more labels that identify the one or more segments of the podcast content as potential repetitive content, selecting the one or more labels to include in the consolidated set of labels ([0051] - The broadcast receiving system 140 can also be configured to receive a content identifier, a broadcaster media event identifier, a unique event identifier and/or any combination of identifiers from the broadcast scanning system 160).
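The selection recited in claim 6 can be read as a simple agreement rule: a label enters the consolidated set when at least two of the three label sets contain it. A minimal sketch follows, assuming each label is a (start_seconds, end_seconds) tuple and that two labels agree only when their tuples are identical; both are illustrative assumptions, as the claim does not define a label format or a tolerance for overlapping segments.

```python
from collections import Counter


def consolidate_labels(*label_sets, min_votes=2):
    """Select labels that appear in at least min_votes of the given label sets."""
    votes = Counter()
    for labels in label_sets:
        # Count each label at most once per label set.
        votes.update(set(labels))
    return sorted(label for label, count in votes.items() if count >= min_votes)


# Hypothetical label sets from the fingerprint, feature, and text matching acts.
first_labels = {(0, 30), (600, 630)}
second_labels = {(0, 30)}
third_labels = {(600, 630), (900, 930)}
consolidated = consolidate_labels(first_labels, second_labels, third_labels)
```

In this hypothetical input, the segment (900, 930) is identified by only one set of acts and is therefore left out of the consolidated set.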
Regarding claim 7, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses the method, wherein performing the action comprises transmitting the selected consolidated set of labels to a client device ([0007] - device monitoring module configured to transmit the unique identifier to at least one of the plurality of broadcast receiving devices).
Regarding claim 9, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses the method, wherein identifying potential repetitive content within the podcast content by performing two or more of the first set of acts, the second set of acts, or the third set of acts comprises identifying potential repetitive content within the podcast content by performing the first set of acts, the second set of acts, and the third set of acts ([0007] - management module configured to process the database entry and to assign the media content a unique event identifier, wherein the unique event identifier is unique to a specific instance of the media content in the specific broadcast stream; a media storage database module configured to store in a media storage database at least the unique event identifier and the database entry, wherein the unique event identifier is database linked to the database entry; and a device monitoring module configured to transmit the unique identifier to at least one of the plurality of broadcast receiving devices).
Regarding claim 10, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses the method, wherein the podcast content comprises an advertisement in which a speaker is speaking over background music ([0034] - media content can comprise a song, a video, a program, a show, an advertisement, combinations of the same, or the like),
wherein identifying potential repetitive content within the podcast content by performing two or more of the first set of acts, the second set of acts, or the third set of acts comprises identifying potential repetitive content within the podcast content by performing the second set of acts and one or more of the first set of acts or the third set of acts ([0007] - management module configured to process the database entry and to assign the media content a unique event identifier, wherein the unique event identifier is unique to a specific instance of the media content in the specific broadcast stream; a media storage database module configured to store in a media storage database at least the unique event identifier and the database entry, wherein the unique event identifier is database linked to the database entry; and a device monitoring module configured to transmit the unique identifier to at least one of the plurality of broadcast receiving devices),
wherein the second set of labels includes a label identifying at least one time-window corresponding to the advertisement as potential repetitive content ([0054] - The broadcast scanning system 160 can be configured to capture not only the media content from the airwaves and/or internet broadcast streams but also the… duration).
Regarding claim 11, Christensen in view of Wold teaches all limitations of claim 1, above. Christensen discloses a non-transitory computer-readable storage medium, having stored thereon program instructions that, upon execution by a processor, cause performance of a set of operations comprising ([0131] - the acts, methods, and processes described herein are implemented within, or using, software modules (programs) that are executed by one or more general purpose computers. The software modules may be stored on or within any suitable computer-readable medium):
receiving podcast content ([0006] - receive a plurality of broadcast streams; [0027] – The terms “broadcast stream”…broadly refer to…podcasts);
identifying potential repetitive content within the podcast content by performing two or more of a first set of acts, a second set of acts, or a third set of acts, wherein ([0061] - The stream analysis module 320 in FIG. 3 can identify media content transmitted in the broadcast stream 121 in many other ways):
the first set of acts comprises ([0062] - stream analysis module 320):
detecting a fingerprint match between query fingerprint data representing at least one audio segment within the podcast content and reference fingerprint data representing known repetitive content within other podcast content, the other podcast content being different from the podcast content, wherein the known repetitive content comprises content segments that repeatedly appear in the other podcast content ([0062] - the stream analysis module 320 can further analyze the broadcast stream for a digital fingerprint, and/or compare the digital fingerprint of the scanned media content and/or broadcast stream with the digital fingerprints already stored in the media storage database 340),
and responsive to detecting the fingerprint match, generating a first set of labels identifying the at least one audio segment as potential repetitive content within the podcast content ([0062] - If the fingerprint for the media content and/or the broadcast stream match information that is stored in media storage database 340, then the stream analysis module can assign the broadcast stream and/or media segment a unique event identifier, and/or store the unique event identifier with the matching information in the media storage database 340),
the second set of acts comprises ([0065] - media storage database 340):
for each of a plurality of time-windows of the podcast content, determining a set of audio features, wherein each of the sets of audio features include the same audio feature types ([0065] - In FIG. 3, the broadcast scanning system 160 comprises the media storage database 340, which can be configured to store the results from the analysis performed by the stream analysis module 320…In one embodiment, the media storage database 340 stores waveforms associated with the audio of the broadcast stream 121 such as a song, advertisement, radio and/or television program, and/or an introduction to media content),
and the third set of acts comprises ([0055] - broadcast scanning module):
generating a transcript of at least a portion of the podcast content ([0055] - the broadcast scanning module scans a transcript of a show transmitted by the alternate broadcast source 130),
dividing the generated transcript into query text sentences ([0061] - Identifying methodologies can comprise: textual string acquisition from broadcast stream 121 with normalization of acquired text),
responsive to identifying potential repetitive content within the podcast content by performing two or more of the first set of acts, the second set of acts, or the third set of acts, selecting, from two or more of the generated first set of labels, the generated second set of labels, or the generated third set of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140);
and responsive to selecting the consolidated set of labels, performing an action ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140).
However, Christensen does not disclose detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows;
and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content;
detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content;
and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content.
Wold does teach detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows ([0133] - the feature identification logic 215 performs harmonic pitch class profile analysis using short-time Fourier transform over a set of overlapping time windows);
and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content ([0139] - When comparing the normalized feature vectors to normalized feature vectors of known media content items, similarities may be found even when the unidentified media content item and a known media content item have different tempos);
detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content ([0085] - Sounds received may be represented as paths between states of the HMM and multiple paths may represent multiple possible text matches for the same sound. Ultimately, the speech recognition engine outputs text in the form of a sequence of words, text in the form of a sequence of phonemes, or text in the form of a combination of words and phonemes);
and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content ([0160] - After identifying a match count, the metadata matching logic 230 may determine a match score as a percentage of the normalized text matched. For example, the metadata matching logic 230 may divide the match count by the greater of the number of words in either the unidentified media content item or from the known media content item to generate a match percentage score).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed application to have modified Christensen to incorporate the teachings of Wold in order to implement detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows; and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content; detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content; and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content. Doing so allows unidentified media content to be compared to known media content items for similarities (Wold [0028]).
Regarding claim 12, Christensen in view of Wold teaches all limitations of claim 11, above. Christensen discloses the non-transitory computer-readable storage medium, wherein the audio feature types comprise one or more of: 
a power of an audio signal, a pitch, an intensity, a rate of speaking, or a timbre ([0065] - The media storage database 340 can further store additional information that provides more detailed information about broadcast stream 121, and such information comprises without limitation; stream type, signal strength, broadcast clarity, time and/or date of each broadcast stream transmission, errors in the broadcast stream 121, frequency and/or channel identification, signal anomalies and/or signal fingerprints that are distinctly characteristic of a particular broadcast source 120, station identification, combinations of the same, or the like).
Regarding claim 13, Christensen in view of Wold teaches all limitations of claim 11, above. Christensen discloses the non-transitory computer-readable storage medium, wherein the second set of acts further comprises:
for each of the plurality of time-windows, comparing the time-window to each other time-window of the plurality of time-windows ([0054] - The broadcast scanning system 160 can be configured to capture not only the media content from the airwaves and/or internet broadcast streams but also the… duration; [0065] - In FIG. 3, the broadcast scanning system 160 comprises the media storage database 340, which can be configured to store the results from the analysis performed by the stream analysis module 320…In one embodiment, the media storage database 340 stores waveforms associated with the audio of the broadcast stream 121 such as a song, advertisement, radio and/or television program, and/or an introduction to media content),
wherein detecting the feature match is based on the comparing ([0065] - broadcast scanning system 160 comprises the media storage database 340, which can be configured to store the results from the analysis performed by the stream analysis module 320).
Regarding claim 14, Christensen in view of Wold teaches all limitations of claim 11, above. Christensen discloses the non-transitory computer-readable storage medium, wherein the third set of acts further comprises:
detecting a speaker match between at least one speaker associated with the at least one of the query text sentences and speaker identifiers associated with at least one of the reference text sentences ([0063] - stream analysis module 320 assigns a unique event identifier to each specific broadcast stream and/or media segment, and/or stores the unique event identifier as well as the artist/advertiser and/or title of the song/advertisement, the time, date, channel, content identifier, ISCI Code and/or station and/or other relevant data),
wherein generating the third set of labels is further responsive to detecting the speaker match ([0063] - stream analysis module 320 assigns a unique event identifier to each specific broadcast stream and/or media segment, and/or stores the unique event identifier as well as the artist/advertiser and/or title of the song/advertisement, the time, date, channel, content identifier, ISCI Code and/or station and/or other relevant data).
Regarding claim 16, Christensen in view of Wold teaches all limitations of claim 11, above. Christensen discloses the non-transitory computer-readable storage medium, wherein selecting the consolidated set of labels comprises:
determining that two or more of the first set of labels, the second set of labels, or the third set of labels include one or more labels that identify one or more segments of the podcast content as potential repetitive content ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140);
and responsive to determining that two or more of the first set of labels, the second set of labels, or the third set of labels include the one or more labels that identify the one or more segments of the podcast content as potential repetitive content, selecting the one or more labels to include in the consolidated set of labels ([0051] - The broadcast receiving system 140 can also be configured to receive a content identifier, a broadcaster media event identifier, a unique event identifier and/or any combination of identifiers from the broadcast scanning system 160).
Regarding claim 17, Christensen in view of Wold teaches all limitations of claim 11, above. Christensen discloses the non-transitory computer-readable storage medium, wherein performing the action comprises transmitting the selected consolidated set of labels to a client device ([0007] - device monitoring module configured to transmit the unique identifier to at least one of the plurality of broadcast receiving devices).
Regarding claim 19, Christensen in view of Wold teaches all limitations of claim 11, above. Christensen discloses the non-transitory computer-readable storage medium, wherein identifying potential repetitive content within the podcast content by performing two or more of the first set of acts, the second set of acts, or the third set of acts comprises identifying potential repetitive content within the podcast content by performing the first set of acts, the second set of acts, and the third set of acts ([0007] - management module configured to process the database entry and to assign the media content a unique event identifier, wherein the unique event identifier is unique to a specific instance of the media content in the specific broadcast stream; a media storage database module configured to store in a media storage database at least the unique event identifier and the database entry, wherein the unique event identifier is database linked to the database entry; and a device monitoring module configured to transmit the unique identifier to at least one of the plurality of broadcast receiving devices).
Regarding claim 20, Christensen discloses a computing system comprising ([0081] - computing system 360):
a processor ([0081] - computing system 360 also comprises a central processing unit (CPU) 362, which may comprise a conventional microprocessor and/or baseband chip);
and a non-transitory computer-readable storage medium, having stored thereon program instructions that, upon execution by a processor, cause performance of a set of operations comprising ([0131] - the acts, methods, and processes described herein are implemented within, or using, software modules (programs) that are executed by one or more general purpose computers. The software modules may be stored on or within any suitable computer-readable medium):
receiving podcast content ([0006] - receive a plurality of broadcast streams; [0027] - “The terms ‘broadcast stream’…broadly refer to…podcasts”);
identifying potential repetitive content within the podcast content by performing two or more of a first set of acts, a second set of acts, or a third set of acts, wherein ([0061] - The stream analysis module 320 in FIG. 3 can identify media content transmitted in the broadcast stream 121 in many other ways):
the first set of acts comprises ([0062] - stream analysis module 320):
detecting a fingerprint match between query fingerprint data representing at least one audio segment within the podcast content and reference fingerprint data representing known repetitive content within other podcast content, the other podcast content being different from the podcast content, wherein the known repetitive content comprises content segments that repeatedly appear in the other podcast content ([0062] - the stream analysis module 320 can further analyze the broadcast stream for a digital fingerprint, and/or compare the digital fingerprint of the scanned media content and/or broadcast stream with the digital fingerprints already stored in the media storage database 340),
and responsive to detecting the fingerprint match, generating a first set of labels identifying the at least one audio segment as potential repetitive content within the podcast content ([0062] - If the fingerprint for the media content and/or the broadcast stream match information that is stored in media storage database 340, then the stream analysis module can assign the broadcast stream and/or media segment a unique event identifier, and/or store the unique event identifier with the matching information in the media storage database 340),
the second set of acts comprises ([0065] - media storage database 340):
for each of a plurality of time-windows of the podcast content, determining a set of audio features, wherein each of the sets of audio features include the same audio feature types ([0065] - In FIG. 3, the broadcast scanning system 160 comprises the media storage database 340, which can be configured to store the results from the analysis performed by the stream analysis module 320…In one embodiment, the media storage database 340 stores waveforms associated with the audio of the broadcast stream 121 such as a song, advertisement, radio and/or television program, and/or an introduction to media content),
and the third set of acts comprises ([0055] - broadcast scanning module):
generating a transcript of at least a portion of the podcast content ([0055] - the broadcast scanning module scans a transcript of a show transmitted by the alternate broadcast source 130),
dividing the generated transcript into query text sentences ([0061] - Identifying methodologies can comprise: textual string acquisition from broadcast stream 121 with normalization of acquired text),
responsive to identifying potential repetitive content within the podcast content by performing two or more of the first set of acts, the second set of acts, or the third set of acts, selecting, from two or more of the generated first set of labels, the generated second set of labels, or the generated third set of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140);
and responsive to selecting the consolidated set of labels, performing an action ([0062] - broadcast scanning system 160 is also configured to send to broadcast receiving devices 140 the unique event identifier that corresponds and/or that is associated with the broadcast stream that is being received by the broadcast receiving devices 140).
However, Christensen does not disclose detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows;
and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content;
detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content;
and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content.
Wold does teach detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows ([0133] - the feature identification logic 215 performs harmonic pitch class profile analysis using short-time Fourier transform over a set of overlapping time windows);
and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content ([0139] - When comparing the normalized feature vectors to normalized feature vectors of known media content items, similarities may be found even when the unidentified media content item and a known media content item have different tempos);
detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content ([0085] - Sounds received may be represented as paths between states of the HMM and multiple paths may represent multiple possible text matches for the same sound. Ultimately, the speech recognition engine outputs text in the form of a sequence of words, text in the form of a sequence of phonemes, or text in the form of a combination of words and phonemes);
and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content ([0160] - After identifying a match count, the metadata matching logic 230 may determine a match score as a percentage of the normalized text matched. For example, the metadata matching logic 230 may divide the match count by the greater of the number of words in either the unidentified media content item or from the known media content item to generate a match percentage score).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed application to have modified Christensen to incorporate the teachings of Wold in order to implement detecting a feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows; and responsive to detecting the feature match, generating a second set of labels identifying the at least one time-window as potential repetitive content within the podcast content; detecting a text match between at least one of the query text sentences and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content; and responsive to detecting the text match, generating a third set of labels identifying at least one audio segment of the podcast content that corresponds to the at least one of the query text sentences as potential repetitive content within the podcast content. Doing so allows unidentified media content to be compared to known media content items for similarities (Wold [0028]).
Regarding claim 21, Christensen in view of Wold teaches all limitations of claim 1, above. 
However, Christensen does not disclose the method, wherein detecting the feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows comprises detecting the feature match by comparing the determined set of audio features of each time-window of the plurality of time-windows to the respective determined set of audio features for each other time-window of the plurality of time-windows.
Wold does teach the method, wherein detecting the feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows comprises detecting the feature match by comparing the determined set of audio features of each time-window of the plurality of time-windows to the respective determined set of audio features for each other time-window of the plurality of time-windows ([0133] - the feature identification logic 215 performs harmonic pitch class profile analysis using short-time Fourier transform over a set of overlapping time windows [0139] - When comparing the normalized feature vectors to normalized feature vectors of known media content items, similarities may be found even when the unidentified media content item and a known media content item have different tempos).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed application to have modified Christensen to incorporate the teachings of Wold in order to implement the method, wherein detecting the feature match between the set of audio features of at least one time-window of the plurality of time-windows and the set of audio features of at least one other time-window of the plurality of time-windows comprises detecting the feature match by comparing the determined set of audio features of each time-window of the plurality of time-windows to the respective determined set of audio features for each other time-window of the plurality of time-windows. Doing so allows unidentified media content to be compared to known media content items for similarities (Wold [0028]).
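For illustration only, the comparison recited in claim 21 (comparing the feature set of each time-window to the feature set of each other time-window) can be sketched as follows. This sketch is not taken from Wold or the application. The use of cosine similarity, the default threshold of 0.95, and the function names `cosine_similarity` and `matching_windows` are assumptions made for the example.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matching_windows(window_features, threshold=0.95):
    """Return indices of time-windows whose features match at least one other window.

    window_features holds one feature vector per time-window, all of the same
    feature types and length. The threshold is a hypothetical value.
    """
    matched = set()
    # combinations() visits every unordered pair of windows exactly once.
    for i, j in combinations(range(len(window_features)), 2):
        if cosine_similarity(window_features[i], window_features[j]) >= threshold:
            matched.update((i, j))
    return sorted(matched)


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
    print(matching_windows(features))
```

The all-pairs comparison grows quadratically with the number of time-windows, which is acceptable for a sketch but may call for indexing or hashing in practice.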
Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Christensen (U.S. Publication No. 20090205000) in view of Wold (U.S. Publication No. 20210357451) in view of Henkin (U.S. Publication No. 20110213655).
Regarding claim 5, Christensen in view of Wold teaches all of the limitations as in claim 1, above.
However, Christensen in view of Wold does not teach the method, wherein the third set of acts further comprises:
determining a similarity score between each of the at least one query text sentences and a respective one of the reference text sentences that matches the query text sentence,
wherein generating the third set of labels is further responsive to each of the similarity scores exceeding a predefined threshold similarity.
Henkin does teach the method, wherein the third set of acts further comprises:
determining a similarity score between each of the at least one query text sentences and a respective one of the reference text sentences that matches the query text sentence ([0052] - The statistical distribution of words and phrases on the web page may be determined and scored against a taxonomy of topics stored in a database on a server. A score indicating how related the web page is to each topic in the taxonomy is determined),
wherein generating the third set of labels is further responsive to each of the similarity scores exceeding a predefined threshold similarity ([0643] - for each source, phrase, target combination assert that relevancy threshold is above a pre-defined thresh (configurable by publisher, default is 0.2 for entire system)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Christensen in view of Wold to incorporate the teachings of Henkin in order to implement the method of claim 5, wherein the third set of acts further comprises: determining a similarity score between each of the at least one query text sentences and a respective one of the reference text sentences that matches the query text sentence, wherein generating the third set of labels is further responsive to each of the similarity scores exceeding a predefined threshold similarity. Doing so allows for a degree of relevancy between two items to be determined so that the system can decide whether to link those two items (Henkin [0052]).
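For illustration only, the thresholded similarity scoring recited in claim 5 can be sketched as follows. This sketch is not taken from Henkin, Wold, or the application. It substitutes the character-based ratio from Python's standard `difflib.SequenceMatcher` for whatever similarity measure an implementation would actually use, and the function name `label_matching_sentences` and the default threshold of 0.8 are assumptions made for the example.

```python
from difflib import SequenceMatcher


def label_matching_sentences(query_sentences, reference_sentences, threshold=0.8):
    """Return indices of query sentences whose best similarity score exceeds the threshold.

    Scores come from difflib.SequenceMatcher.ratio(), which ranges from 0.0 to 1.0.
    The threshold is a hypothetical predefined value.
    """
    labels = []
    for index, query in enumerate(query_sentences):
        # Score the query against every reference sentence and keep the best score.
        best_score = max(
            (
                SequenceMatcher(None, query.lower(), reference.lower()).ratio()
                for reference in reference_sentences
            ),
            default=0.0,
        )
        if best_score > threshold:
            labels.append(index)
    return labels


if __name__ == "__main__":
    queries = ["This episode is sponsored by Acme.", "Today we discuss patents."]
    references = ["this episode is sponsored by acme."]
    print(label_matching_sentences(queries, references))
```

Each returned index would then be mapped back to the audio segment from which the corresponding sentence was transcribed.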
Regarding claim 15, Christensen in view of Wold teaches all of the limitations as in claim 11, above.
However, Christensen in view of Wold does not teach the non-transitory computer-readable storage medium, wherein the third set of acts further comprises:
determining a similarity score between each of the at least one query text sentences and a respective one of the reference text sentences that matches the query text sentence,
wherein generating the third set of labels is further responsive to each of the similarity scores exceeding a predefined threshold similarity.
Henkin does teach the non-transitory computer-readable storage medium, wherein the third set of acts further comprises:
determining a similarity score between each of the at least one query text sentences and a respective one of the reference text sentences that matches the query text sentence ([0052] - The statistical distribution of words and phrases on the web page may be determined and scored against a taxonomy of topics stored in a database on a server. A score indicating how related the web page is to each topic in the taxonomy is determined),
wherein generating the third set of labels is further responsive to each of the similarity scores exceeding a predefined threshold similarity ([0643] - for each source, phrase, target combination assert that relevancy threshold is above a pre-defined thresh (configurable by publisher, default is 0.2 for entire system)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Christensen in view of Wold to incorporate the teachings of Henkin in order to implement the non-transitory computer-readable storage medium, wherein the third set of acts further comprises: determining a similarity score between each of the at least one query text sentences and a respective one of the reference text sentences that matches the query text sentence, wherein generating the third set of labels is further responsive to each of the similarity scores exceeding a predefined threshold similarity. Doing so allows for a degree of relevancy between two items to be determined so that the system can decide whether to link those two items (Henkin [0052]).
Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Christensen (U.S. Publication No. 20090205000) in view of Wold (U.S. Publication No. 20210357451) in view of King (U.S. Publication No. 20110043652).
Regarding claim 22, Christensen in view of Wold teaches all of the limitations as in claim 1, above.
However, Christensen in view of Wold does not teach the method, wherein performing the action comprises performing an action that facilitates a computing device ignoring or removing the segments of repetitive content identified by the selected consolidated set of labels when generating an audio summary of the podcast content or a text summary of the podcast content.
King does teach the method, wherein performing the action comprises performing an action that facilitates a computing device ignoring or removing the segments of repetitive content identified by the selected consolidated set of labels when generating an audio summary of the podcast content or a text summary of the podcast content ([0641] - The text analysis component 1008 may employ auto-summarizing or auto-abstracting functions to automatically generate an abstract of the received audio).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Christensen in view of Wold to incorporate the teachings of King in order to implement the method, wherein performing the action comprises performing an action that facilitates a computing device ignoring or removing the segments of repetitive content identified by the selected consolidated set of labels when generating an audio summary of the podcast content or a text summary of the podcast content. Doing so allows the system to identify the most important information of a file and allows a text file to be condensed (King [0641]).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Johnson (U.S. Publication No. 20070124144) teaches synthesized interoperable communications. Sharifi (U.S. Patent No. 8484017) teaches identifying media content.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETHAN DANIEL KIM whose telephone number is (571) 272-1405.  The examiner can normally be reached on Monday - Friday 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ETHAN DANIEL KIM/
Examiner, Art Unit 2658

/RICHEMOND DORVIL/
Supervisory Patent Examiner, Art Unit 2658