DETAILED ACTION
This action is in response to the initial filing of Application no. 16/875,927 on 05/15/2020.
Claims 1 – 22 are still pending in this application, with claims 1, 19 and 20 being independent.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claims 9 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims since the prior art fails to teach or suggest in reasonable combination the limitations recited in claim 9. For example, Athineos et al. (US 2007/0282860) discloses that one or more words occur at time offsets within lyrical content associated with known media content ([0070] [0071]), yet fails to teach or suggest determining that one or more words occur at a first time offset within the lyrical content associated with unidentified media content and comparing these one or more words with the one or more words at the first time offset of the lyrical content associated with known media content.
Claim 12 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims since the prior art fails to teach or suggest in reasonable combination the limitations recited in claim 12, “responsive to determining the lyrical similarity satisfies the similarity threshold: determining a set of features of the unidentified media content item; determining a feature similarity between features of the unidentified and known media; and identifying the unidentified media content item as a cover of the known media content item based on the lyrical similarity and feature similarity.”
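For illustration only, the two-stage flow quoted from claim 12 above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the cosine measure, the averaging of the two similarities, and both threshold values are hypothetical, since the claim does not specify how the feature similarity is computed or how the two similarities are combined.

```python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_cover(lyrical_similarity: float,
             extract_features: Callable[[], Sequence[float]],
             known_features: Sequence[float],
             lyrical_threshold: float = 0.5,    # assumed value
             combined_threshold: float = 0.6    # assumed value
             ) -> bool:
    # Stage 1: the lyrical similarity must satisfy the threshold first.
    if lyrical_similarity < lyrical_threshold:
        return False
    # Stage 2: features are determined only responsive to stage 1 passing.
    features = extract_features()
    feature_similarity = cosine(features, known_features)
    # Assumption: the decision uses the plain average of both similarities.
    combined = (lyrical_similarity + feature_similarity) / 2
    return combined >= combined_threshold
```

The feature extractor is passed as a callable so that the sketch makes the ordering visible: no features are computed for an item whose lyrics fail the threshold.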
Claim 21 (with dependent claim 22) is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims since the prior art fails to teach or suggest in reasonable combination the limitations recited in claim 21.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1 – 8, 12 – 17 and 19 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 – 21 of copending Application No. 15/862,433 in view of Oztaskent et al. (US 9,507,860) ("Oztaskent"), and further in view of Ochmanek et al. (US 2017/0109504) (“Ochmanek”) and further in view of Correya et al. (“Large-Scale Cover Song Detection in Digital Music Libraries Using Metadata, Lyrics and Audio Features”).
This is a provisional nonstatutory double patenting rejection.

The claim mapping is as follows.
Current Application

1. A method comprising: receiving an unidentified media content item; determining lyrical content associated with the unidentified media content item; determining, by a processing device, a lyrical similarity between the lyrical content associated with the unidentified media content item and additional lyrical content associated with a known media content item of a plurality of known media content items; and identifying, by the processing device, the unidentified media content item as a cover of the known media content item based at least in part on the lyrical similarity, resulting in an identified cover-media content item.

2. The method of claim 1, wherein the lyrical content comprises one or more words that are lyrics of a musical composition.

3. The method of claim 1, wherein determining the lyrical content comprises: determining that the unidentified media content item comprises audio; analyzing the audio of the unidentified media content item to extract one or more words from the audio; and generating the lyrical content using the extracted one or more words, wherein the lyrical content comprises a text representation of the one or more words.

4. The method of claim 3, wherein analyzing the audio comprises performing an automatic speech recognition (ASR) operation on the audio of the unidentified media content item, wherein an output of the ASR operation is text comprising the lyrical content.

5. The method of claim 1, wherein determining the lyrical content comprises: determining that the unidentified media content item comprises audio; analyzing the audio of the unidentified media content item to extract a sequence of phonemes from the audio; and generating the lyrical content using the extracted sequence of phonemes, wherein the lyrical content comprises a representation of the one or more phonemes.

6. The method of claim 1, wherein determining the lyrical similarity between the lyrical content associated with the unidentified media content item and the additional lyrical content associated with a known media content item comprises: comparing the lyrical content associated with the unidentified media content item with the additional lyrical content associated with a known media content item; and generating a similarity score between the lyrical content associated with the unidentified media content item with the additional lyrical content associated with the known media content item.

7. The method of claim 6, wherein the lyrical content associated with the known media content item is stored within a repository of lyrical content for the plurality of known media content items, wherein the repository comprises an inverted index of at least one of sets of words or sets of phonemes generated from transcriptions of lyrical content associated with the plurality of known media content items.

8. The method of claim 6, wherein generating the similarity score between the lyrical content associated with the unidentified media content item and the additional lyrical content associated with a known media content item comprises: determining a match count by counting at least one of a) a number of words and n-grams that match between the lyrical content of the unidentified media content item and the additional lyrical content of the known media content item or b) a number of phonemes and n-grams that match between the lyrical content of the unidentified media content item and the additional lyrical content of the known media content item; and dividing the match count by a greater of a) a number of words or phonemes from the lyrical content of the unidentified media content item or b) a number of words or phonemes from the lyrical content of the known media content item.

9. The method of claim 1, wherein determining the lyrical similarity between the lyrical content associated with the unidentified media content item and the additional lyrical content associated with a known media content item further comprises: identifying one or more words in the lyrical content associated with the unidentified media content item; determining that the one or more words occur at a first time offset within the lyrical content associated with the unidentified media content item; comparing the one or more words in the lyrical content associated with the unidentified media content item with one or more additional words that occur at the first time offset within the additional lyrical content associated with the known media content item; and generating a similarity score between the one or more words and the one or more additional words that occur at the first time offset.

10. The method of claim 1, further comprising: determining a set of features of the unidentified media content item; determining a feature similarity between the set of features of the unidentified media content item and an additional set of features associated with the known media content item; determining a combined similarity score based on the lyrical similarity and the feature similarity; determining whether the combined similarity score meets or exceeds a similarity threshold; and identifying the unidentified media content item as a cover of the known media content item responsive to determining that the combined similarity score meets or exceeds the similarity threshold.

11. The method of claim 10, wherein determining the set of features for the unidentified media content item comprises: generating a plurality of signal-based vectors from the unidentified media content item, wherein the plurality of signal-based vectors represents at least one of pitch, timbre, or rhythm; determining a beat of the unidentified media content item; dividing the unidentified media content item into a plurality of segments; for each beat in the unidentified media content item, determining a plurality of normalized signal-based vectors from the plurality of signal-based vectors for a subset of the plurality of segments; and generating the set of features from the normalized plurality of signal-based vectors.

12. The method of claim 1, further comprising: determining whether the lyrical similarity satisfies a similarity threshold; responsive to determining that the lyrical similarity satisfies the similarity threshold: determining a set of features of the unidentified media content item; determining a feature similarity between the set of features of the unidentified media content item and an additional set of features associated with the known media content item; and identifying the unidentified media content item as a cover of the known media content item based on the lyrical similarity and the feature similarity.

13. The method of claim 1, further comprising: comparing the lyrical content associated with the unidentified media content item with additional lyrical content associated with two or more of the plurality of known media content items; determining similarity values for each of the two or more of the plurality of known media content items based on the comparing, wherein a similarity value represents a similarity between the additional lyrical content associated with one of the two or more known media content items and the lyrical content associated with the unidentified media content item; determining a set of known media content items having similarity values that meet or exceed a similarity threshold; and comparing the set of features of the unidentified media content item to sets of features of each of the known media content items in the set of known media content items.

14. The method of claim 1, further comprising: determining a publishing rights holder of the known media content item; and determining a publishing resource allocation for the identified cover-media content item.

15. The method of claim 1, further comprising: updating metadata of the identified cover-media content item to include cover information that identifies the identified cover-media content item as a cover of the known media content item.

16. The method of claim 1, further comprising: dividing the unidentified media content item into a plurality of segments; generating, for one or more segments of the plurality of segments, a respective digital fingerprint from the segment; comparing each respective digital fingerprint to a plurality of stored digital fingerprints, wherein each of the plurality of stored digital fingerprints is associated with a respective known media content item of the plurality of known media content items; determining, based on the comparing, that at least a threshold amount of the digital fingerprints of the one or more segments fail to match stored digital fingerprints of any the plurality of known media content items; and determining that the unidentified media content item does not correspond to any of the plurality of known media content items, wherein the determining of the lyrical content associated with the unidentified media content item is performed after determining that the unidentified media content item does not correspond to any of the plurality of known media content items.

17. The method of claim 1, further comprising: identifying a geographic location associated with the unidentified media content item based on at least one of a) the geographic location associated with a user account that uploaded the unidentified media content item or b) metadata of the unidentified media content item; and determining one or more spoken languages associated with the geographic location; wherein determining the lyrical content associated with the unidentified media content item comprises, for each spoken language of the one or more spoken languages associated with the geographic location, processing the unidentified media content item using a machine learning model trained to perform speech recognition on audio content comprising speech in the spoken language.

18. The method of claim 1, further comprising: determining timing information of at least one of words or phonemes in the lyrical content associated with the unidentified media content item; generating a first cross-similarity matrix between words or phonemes at timing offsets from the unidentified media content and additional words or additional phonemes at additional timing offsets from the known media content item; determining one or more musical features representing at least one of pitch, timbre or rhythm from the media content item; generating, for at least one of the one or more musical features, an additional cross-similarity matrix between the musical features at timing offsets from the unidentified media content item and additional musical features at additional timing offsets from the known media content item; determining a similarity score between the unidentified media content item and the known media content item based on the first cross-similarity matrix and the additional cross-similarity matrix; and determining that the similarity score meets or exceeds a similarity threshold.

19. A non-transitory computer readable medium comprising instructions that, when executed by a processing device, cause the processing device to perform operations comprising: receiving an unidentified media content item; determining lyrical content associated with the unidentified media content item; determining, by a processing device, a lyrical similarity between the lyrical content associated with the unidentified media content item and additional lyrical content associated with a known media content item of a plurality of known media content items from a media content repository; determining one or more musical features representing at least one of pitch, timbre or rhythm from the media content item; determining, by the processing device, a musical similarity between the one or more musical features of the unidentified media content item and the known media content item; and identifying, by the processing device, the unidentified media content item as a cover of the known media content item based on the lyrical similarity and the musical similarity, resulting in an identified cover-media content item.

20. A system comprising: a first computing device comprising a media content sharing platform, wherein the first computing device is to: receive an unidentified media content item uploaded to the media content sharing platform; determine lyrical content associated with the unidentified media content item; determine that a lyrical similarity between the lyrical content associated with the unidentified media content item and additional lyrical content associated with a known media content item of a plurality of known media content items exceeds a similarity threshold; and responsive to determining that the lyrical similarity exceeds the similarity threshold, perform at least one of a) send the media content item to a second computing device connected to the first computing device via a network, or b) extract one or more features of the unidentified media content item and send at least one of the one or more features or the lyrical content to the second computing device.

21. The system of claim 20, further comprising: the second computing device, to: receive at least one of the unidentified media content item, the lyrical content or the one or more features; determine an additional similarity between the one or more features of the unidentified media content item and the known media content item; identify the unidentified media content item as a cover of the known media content item based on the lyrical similarity and the additional similarity, resulting in an identified cover-media content item; and notify the first computing device that the identified cover-media content item is a cover of the known media content item.

22. The system of claim 21, wherein determining that the lyrical similarity exceeds the similarity threshold comprises: the first computing device sending the lyrical content to the second computing device; the second computing device comparing the lyrical content to additional lyrical content of one or more of the plurality of known media items; the second computing device determining the lyrical similarity between the lyrical content associated with the unidentified media content item and the additional lyrical content associated with the known media content item; the second computing device determining that the lyrical similarity meets or exceeds a similarity threshold; and the second computing device reporting at least one of a) the lyrical similarity or b) that the lyrical similarity meets or exceeds the similarity threshold to the first computing device.
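For illustration only, the similarity score recited in claim 8 of the current application (a match count divided by the greater of the two word counts) can be sketched as below. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names are hypothetical, whitespace tokenization and lowercasing stand in for any real normalization, a "match" is assumed to be a distinct word or n-gram present in both texts, and n = 3 is an assumed default. Under this literal reading the score can exceed 1 because the numerator counts both words and n-grams while the denominator counts words only.

```python
def ngrams(tokens, n):
    """Return the set of contiguous n-token tuples in `tokens`."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def lyrical_similarity(unidentified, known, n=3):
    """Match count divided by the greater of the two word counts."""
    a = unidentified.lower().split()
    b = known.lower().split()
    if not a or not b:
        return 0.0
    # Assumption: a match is a distinct word or n-gram found in both texts.
    word_matches = len(set(a) & set(b))
    ngram_matches = len(ngrams(a, n) & ngrams(b, n))
    match_count = word_matches + ngram_matches
    # Divide by the greater of the two word counts, as the claim recites.
    return match_count / max(len(a), len(b))


score = lyrical_similarity("twinkle twinkle little star",
                           "twinkle little star shines")
# Three shared words plus one shared trigram, over max(4, 4) words.
```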

Application No. 15/862,433

1. A method comprising: receiving an unidentified media content item; determining a set of features of the unidentified media content item; determining metadata associated with the unidentified media content item; determining, by a processing device, a first similarity between the metadata associated with the unidentified media content item and additional metadata associated with a known media content item of a plurality of known media content items from a media content repository; determining, by the processing device, a second similarity between the set of features of the unidentified media content item and an additional set of features associated with the known media content item; and identifying, by the processing device, the unidentified media content item as a cover of the known media content item based on the first similarity and the second similarity.

2. The method of claim 1, further comprising updating the metadata of the unidentified media content item to include cover information that identifies the unidentified media content item as a cover of the known media content item.

3. The method of claim 1, wherein the metadata comprises a metadata description attribute that describes the unidentified media content item.

4. The method of claim 1, further comprising: comparing the metadata associated with the unidentified media content item with additional metadata associated with two or more of the plurality of known media content items; determining similarity values for each of the two or more of the plurality of known media content items based on the comparing, wherein a similarity value represents a similarity between metadata associated with a known media content item and the metadata associated with the unidentified media content item; determining a set of known media content items having similarity values that meet or exceed a similarity threshold; and comparing the set of features of the unidentified media content item to sets of features of each of the known media content items in the set of known media content items.

5. The method of claim 1, wherein determining the first similarity between the metadata associated with the unidentified media content item and the additional metadata associated with the known media content item of the plurality of known media content items comprises: normalizing the metadata of the unidentified media content item to generate a normalized descriptive text for the unidentified media content item; comparing the normalized descriptive text of the unidentified media content item with normalized descriptive text of the known media content item, wherein the normalized descriptive text of the known media content item is based on the additional metadata associated with the known media content item; generating a first similarity score between the normalized descriptive text of the unidentified media content item and the normalized descriptive text of the known media content item; and determining that the similarity score is above a similarity threshold.

6. The method of claim 5, wherein the normalized descriptive text of the known media content item is stored within a normalized text repository of descriptions for the plurality of known media content items, wherein the normalized text repository comprises an inverted index of normalized words and trigrams generated from textual descriptions from metadata attributes associated with the plurality of known media content items.

7. The method of claim 5, wherein generating the first similarity score between the normalized descriptive text of the unidentified media content item and the normalized descriptive text of the known media content item comprises: determining a match count by counting a number of words and trigrams that match between the normalized descriptive text of the unidentified media content item and the normalized descriptive text of the known media content item; and dividing the match count by a greater of a) a number of words from the normalized descriptive text of the unidentified media content item or b) a number of words from the normalized descriptive text of the known media content item.

8. The method of claim 1, wherein determining the set of features for the unidentified media content item comprises: generating a plurality of signal-based vectors from the unidentified media content item, wherein the plurality of signal-based vectors represents at least one of pitch, timbre, or rhythm; determining a beat of the unidentified media content item; dividing the unidentified media content item into a plurality of segments; for each beat in the unidentified media content item, determining a plurality of normalized signal-based vectors from the plurality of signal-based vectors for a subset of the plurality of segments; and generating the set of features from the normalized plurality of signal-based vectors.

9. The method of claim 1, further comprising: responsive to identifying the unidentified media content item as the cover of the known media content item, which results in an identified cover media content item, storing an entry for the identified cover media content item in a covers content repository, the entry comprising a link between the identified cover media content item and the known media content item, wherein the covers content repository maintains relationships between cover-media content items that have been identified as cover performances of known media content items and the known media content items.

10. The method of claim 9, further comprising: receiving a request for cover-media content items that are covers of the known media content item, wherein the request for the cover-media content items comprises an identifier identifying the known media content item; generating a dataset of cover-media content items that are covers of the known media content item; and sending the dataset to a requestor.

11. The method of claim 10, wherein the request for cover-media content items further comprises an indication of a specified genre; and wherein generating the dataset of cover-media content items from the covers content repository comprises selecting a subset of the covers of the known media content item that have the specified genre.

12. The method of claim 11, wherein the specified genre is different from a genre associated with the known media content item.

13. The method of claim 9, further comprising: receiving a request for cover performers associated with the known media content item; determining, from the covers content repository, cover-media content items that are covers of the known media content item; determining cover performers of the cover-media content items; and providing a dataset comprising the cover performers of the cover-media content items.

14. The method of claim 13, wherein the request comprises a request for top cover performers of the known media content item, the method further comprising: selecting a subset of top cover performers from the determined cover performers of the cover-media content items based on one or more performer metrics, the one or more performance metrics comprising at least one of a number of cover-media content items associated with a specific performer or a total number of views of the cover-media items associated with the specific performer.

15. The method of claim 9, further comprising: generating a dataset of cover-media content items that are covers of the known media content item from the covers content repository; determining a rights holder for the known media content item; and determining a resource allocation for the rights holder based upon the dataset of cover-media content items.

16. The method of claim 1, further comprising: determining a dynamic threshold based on the first similarity; determining whether the second similarity exceeds the dynamic threshold; and identifying the unidentified media content item as the cover of the known media content item responsive to determining that the second similarity exceeds the dynamic threshold.

17. A method comprising: receiving an unidentified media content item; determining a set of features of the unidentified media content item; determining, by a processing device, a first similarity between the set of features of the unidentified media content item and an additional set of features associated with a known media content item of a plurality of known media content items from a media content repository; identifying, by the processing device, the unidentified media content item as a cover of the known media content item based on the first similarity, resulting in an identified cover-media content item; determining a publishing rights holder of the known media content item; and determining a publishing resource allocation for the identified cover-media content item.

18. The method of claim 17, further comprising updating metadata of the unidentified media content item to include cover information that identifies the unidentified media content item as a cover of the known media content item.

19. The method of claim 17, wherein determining the set of features for the unidentified media content item comprises: generating a plurality of signal-based vectors from the unidentified media content item, wherein the plurality of signal-based vectors represents at least one of pitch, timbre, or rhythm; determining a beat of the unidentified media content item; dividing the unidentified media content item into a plurality of segments; for each beat in the unidentified media content item, determining a plurality of normalized signal-based vectors from the plurality of signal-based vectors for a subset of the plurality of segments; and generating the set of features from the normalized plurality of signal-based vectors.

20. The method of claim 17, further comprising: responsive to identifying the unidentified media content item as the cover of the known media content item, which results in an identified cover media content item, storing an entry for the identified cover-media content item in a covers content repository, the entry comprising a link between the identified cover media content item and the known media content item, wherein the covers content repository maintains relationships between cover-media content items that have been identified as cover performances of known media content items and the known media content items.

21. A system comprising: a memory; and a processing device operatively coupled with the memory to: receive an unidentified media content item; determine a set of features of the unidentified media content item; determine metadata associated with the unidentified media content item; determine a first similarity between the metadata associated with the unidentified media content item and additional metadata associated with a known media content item of a plurality of known media content items from a media content repository; determine a second similarity between the set of features of the unidentified media content item and an additional set of features associated with the known media content item; and identify the unidentified media content item as a cover of the known media content item based on the first similarity and the second similarity.
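For illustration only, the inverted index recited in claim 6 of Application No. 15/862,433 (and, for lyrics, in claim 7 of the current application) can be sketched as a mapping from each term to the known items that contain it. This is a minimal sketch under stated assumptions, not either application's implementation: the function names and sample data are hypothetical, lowercasing stands in for normalization, and trigrams are assumed to be word-level since the claims do not say whether they are formed over words or characters.

```python
from collections import defaultdict


def terms(text):
    """Normalized words plus word trigrams (word-level is an assumption)."""
    words = text.lower().split()
    trigrams = [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]
    return set(words) | set(trigrams)


def build_inverted_index(known_items):
    """Map each term to the ids of the known items whose text contains it."""
    index = defaultdict(set)
    for item_id, text in known_items.items():
        for term in terms(text):
            index[term].add(item_id)
    return index


def candidates(index, query_text):
    """Count shared terms per known item; items sharing no term never appear."""
    hits = defaultdict(int)
    for term in terms(query_text):
        for item_id in index.get(term, ()):
            hits[item_id] += 1
    return dict(hits)


# Hypothetical sample data, not taken from either application.
known = {"k1": "Hello darkness my old friend", "k2": "Here comes the sun"}
index = build_inverted_index(known)
shared = candidates(index, "hello darkness my friend")
```

A lookup of this kind touches only the known items that share at least one term with the query, which is one way the candidate set of claim 4 could be narrowed before any feature comparison.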


As shown above, claims 1 – 21 of Application no. 15/862,433 in combination recite the limitations of claims 1 – 8, 12 – 17 and 19 of the currently pending application, except for determining a lyrical content and lyrical similarity. However, as discussed below, Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Ochmanek et al. (US 2017/0109504) (“Ochmanek”) disclose determining lyrical content and lyrical content similarity between an unidentified media content item and a known media content item. Furthermore, Correya et al. (“Large-Scale Cover Song Detection in Digital Music Libraries Using Metadata, Lyrics and Audio Features”) discloses the use of metadata and lyrics in combination with audio features to detect cover songs (whole document). Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing that claims 1 – 8, 12 – 17 and 19 of the currently pending application are obvious variants of claims 1 – 21 of Application no. 15/862,433 in combination with Oztaskent, Ochmanek and Correya to deter

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.



Claim 13 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 13 recites “comparing the set of features.” There is insufficient antecedent basis for this limitation in the claims.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1 – 4 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Oztaskent et al. (US 9,507,860) ("Oztaskent").
As to claim 1, Oztaskent teaches a method (Abstract) comprising: receiving an unidentified media content item (performance including a singer singing a song, Fig.4A, 405; column 8 lines 5 - 19); determining lyrical content (text including one or more words that are sung) associated with the unidentified media content item (Fig.4A, 410; column 8 lines 20 – 31); determining, by a processing device (content identification hardware, Fig.1, 130; column 4 lines 15 – 20, column 10 lines 28 – 33), a lyrical similarity between the lyrical content associated with the unidentified media content item and additional lyrical content associated with a known media content item of a plurality of known media content items (Fig.4A, 425 and 430; column 8 lines 55 - 61); and identifying, by the processing device, the unidentified media content item as a cover of the known media content item based at least in part on the lyrical similarity, resulting in an identified cover-media (unknown performance of known content, column 1 lines 29 - 43) content item (Fig.4A and Fig.4B, C; column 4 lines 47 – column 5 line 2; column 8 lines 62 – column 9 lines 11, 57 – column 10 line 27).

As to claim 2, Oztaskent further teaches, wherein the lyrical content comprises one or more words that are lyrics of a musical composition (column 4 lines 47 – 54 and column 8 lines 20 - 23).
As to claim 3, Oztaskent further teaches, wherein determining the lyrical content comprises: determining that the unidentified media content item comprises audio; analyzing the audio of the unidentified media content item to extract one or more words from the audio; and generating the lyrical content (text is obtained using speech-to-text translation, column 8 lines 20 – 27).
As to claim 4, Oztaskent further teaches, wherein analyzing the audio comprises performing an automatic speech recognition (ASR) operation on the audio of the unidentified media content item, wherein an output of the ASR operation is text comprising the lyrical content (text is obtained using speech-to-text translation, column 8 lines 20 – 27).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Kruspe et al. (“Retrieval of Song Lyrics from Sung Queries”).
For claim 5, Oztaskent further discloses wherein determining the lyrical content comprises: determining that the unidentified media content item comprises audio (column 8 lines 5 – 27), yet fails to teach analyzing the audio of the unidentified media content item to extract a sequence of phonemes from the audio; and generating the lyrical content using the extracted sequence of phonemes, wherein the lyrical content comprises a representation of the one or more phonemes.
However, Kruspe discloses a method for retrieving lyrics of a sung recording (Abstract), wherein a sequence of phonemes is extracted from audio to generate lyrical content, and further wherein the lyrical content comprises a representation of one or more phonemes (4.Proposed Approach and 4.1 Symbolic Mapping, pg.112 and 113).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify Oztaskent’s teachings with Kruspe’s teachings so that the lyrical content is further generated by analyzing the unidentified media content item to extract a sequence of phonemes from the audio, wherein the lyrical content comprises a representation of the one or more phonemes for the purpose of increasing user satisfaction by enabling the retrieval of music information using a sung recording (Kruspe, Abstract).

For claim 6, Oztaskent further discloses comparing the lyrical content associated with the unidentified media content item with the additional lyrical content associated with a known media content item (column 4 lines 47 – 54; column 8 lines 55 – 61), yet fails to teach generating a similarity score between the lyrical content associated with the unidentified media content item with the additional lyrical content associated with the known media content item.
However, Kruspe discloses a method for retrieving lyrics of a sung recording (Abstract), wherein a similarity score (distance calculation) between lyrical content associated with a received audio sequence and an additional lyrical content associated with known media content is generated (4.Proposed Approach and 4.2 Distance calculation, pg. 112 - 114).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve Oztaskent’s invention in the same way that Kruspe’s invention has been improved to achieve the predictable results of the similarity criterion (Oztaskent, column 4 lines 47 – 54) further comprising generating a similarity score between the lyrical content associated with the unidentified media content item with the additional lyrical content associated with the known media content item for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Kruspe et al. (“Retrieval of Song Lyrics from Sung Queries”) and further in view of Pan et al. (CN 1038859 A) (“Pan”).
For claim 7, the combination of Oztaskent and Kruspe further discloses, wherein the lyrical content associated with the known media content item is stored within a repository of lyrical content for the plurality of known media content items (Oztaskent, column 4 lines 47 – 54), wherein the repository comprises an index of at least one of sets of words or sets of phonemes generated from transcriptions of lyrical content associated with the plurality of known media content items (Oztaskent, column 4 lines 47 – 54), yet fails to teach that the index is an inverted index.
However, Pan discloses a song search system (Abstract), wherein song lyrics are stored in an inverted index ([0045 - 0051] [0058]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Oztaskent and Kruspe with Pan’s teachings so that the index is an inverted index for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45).
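For illustration only, the notion of an inverted index over lyrics can be sketched as follows. This is a minimal sketch and is not taken from Pan, Oztaskent or the application: the function names, the item identifiers and the lowercase whitespace tokenization are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(lyrics_by_item):
    """Map each word to the set of item ids whose lyrics contain it.

    lyrics_by_item is a hypothetical dict of {item_id: lyrics_text}.
    """
    index = defaultdict(set)
    for item_id, lyrics in lyrics_by_item.items():
        for word in lyrics.lower().split():
            index[word].add(item_id)
    return index


def candidates(index, query_lyrics):
    """Rank known items by how many distinct query words they contain."""
    hits = defaultdict(int)
    for word in set(query_lyrics.lower().split()):
        for item_id in index.get(word, ()):
            hits[item_id] += 1
    # Highest hit count first; ties broken by item id for determinism.
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
```

In such a sketch, a lookup only touches the known items that share at least one word with the query, rather than scanning every stored transcription.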

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Kruspe et al. (“Retrieval of Song Lyrics from Sung Queries”) and further in view of Wold et al. (US 2019/0205467) (“Wold”).
For claim 8, the combination of Oztaskent and Kruspe fails to teach, wherein generating the similarity score between the lyrical content associated with the unidentified media content item and the additional lyrical content associated with a known media content item comprises: determining a match count by counting at least one of a) a number of words and n-grams that match between the lyrical content of the unidentified media content item and the additional lyrical content of the known media content item or b) a number of phonemes and n-grams that match between the lyrical content of the unidentified media content item and the additional lyrical content of the known media content item; and dividing the match count by a greater of a) a number of words or phonemes from the lyrical content of the unidentified media content item or b) a number of words or phonemes from the lyrical content of the known media content item.
However, Wold discloses a method for identifying a music cover (Abstract), wherein a similarity score is generated by determining a match count by counting a number of words and trigrams that match between text associated with an unidentified media content and text associated with a known media content and dividing the match count by a greater of a number of words from the text of the unidentified media content and a number of words of the text of the known media content.
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Oztaskent and Kruspe in the same way that Wold’s invention has been improved to achieve the following predictable results for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45): determining a match count by counting at least one of a) a number of words and n-grams that match between the lyrical content (text) of the unidentified media content item and the additional lyrical content (text) of the known media content item or b) a number of phonemes and n-grams that match between the lyrical content of the unidentified media content item and the additional lyrical content of the known media content item; and dividing the match count by a greater of a) a number of words or phonemes from the lyrical content of the unidentified media content item or b) a number of words or phonemes from the lyrical content of the known media content item.
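For illustration only, the score computation recited in claim 8 (a match count divided by the greater of the two word counts) can be sketched as follows. This is a minimal sketch and is not taken from Wold or from the application: the function names, the pre-tokenized input, the trigram default and the counting of repeated words with multiplicity are all assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lyrical_similarity(unknown_tokens, known_tokens, n=3):
    """Match count divided by the greater of the two token counts.

    The match count adds matching single tokens and matching n-grams.
    Counting with multiplicity is an assumption of this sketch.
    """
    if not unknown_tokens or not known_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared element.
    word_matches = sum((Counter(unknown_tokens) & Counter(known_tokens)).values())
    ngram_matches = sum(
        (Counter(ngrams(unknown_tokens, n)) & Counter(ngrams(known_tokens, n))).values()
    )
    return (word_matches + ngram_matches) / max(len(unknown_tokens), len(known_tokens))
```

Because words and n-grams are both counted in the numerator while only words are counted in the denominator, a score computed this way can exceed 1 for near-identical texts; the tokens may equally be phonemes under alternative b) of the claim.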

Claims 10 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Wold et al. (US 2019/0205467) (“Wold”).
For claim 10, Oztaskent further discloses determining a musical representation of the unidentified media content (column 8 lines 32 – 41); and determining a similarity between the musical representation of the unidentified media content item and an additional musical representation associated with the known media content item (column 5 lines 4 – 35 and column 8 lines 42- 54). Yet, Oztaskent fails to teach the following: the musical representation further comprises a set of features; determining a combined similarity score based on the lyrical similarity and the feature similarity; determining whether the combined similarity score meets or exceeds a similarity threshold; and identifying the unidentified media content item as a cover of the known media content item responsive to determining that the combined similarity score meets or exceeds the similarity threshold.
However, Wold discloses a method for identifying a music cover (Abstract), wherein a set of features of an unidentified media content item is determined ([0062 – 0064] [0081- 0091]); a feature similarity between the set of features of the unidentified media content item and an additional set of features associated with the known media content item is determined ([0065] [0099 – 0105]); a combined similarity score based on a text similarity ([0092 – 0096]) and feature similarity is determined ([0107] [0108]); and the unidentified media content is identified as a cover of the known media content responsive to determining that the combined similarity score meets or exceeds a similarity threshold (similarity scores are compared to thresholds, [0107] [0108]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve Oztaskent’s invention in the same way Wold’s invention has been improved to achieve the following predictable results for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45): the musical representation further comprises a set of features; further and alternatively determining a combined similarity score based on the lyrical similarity and the feature similarity; further and alternatively determining whether the combined similarity score meets or exceeds a similarity threshold; and further and alternatively identifying the unidentified media content item as a cover of the known media content item responsive to determining that the combined similarity score meets or exceeds the similarity threshold.
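For illustration only, the thresholding of a combined similarity score recited in claim 10 can be sketched as follows. This is a minimal sketch and is not taken from Wold, Oztaskent or the application: the weighted-average combination, the weight, the threshold value and the function names are assumptions, since the claim does not specify how the two similarities are combined.

```python
def combined_similarity(lyrical_similarity, feature_similarity, lyric_weight=0.5):
    """Weighted average of the two similarities (an assumed combination rule)."""
    return lyric_weight * lyrical_similarity + (1.0 - lyric_weight) * feature_similarity


def is_cover(lyrical_similarity, feature_similarity, threshold=0.7, lyric_weight=0.5):
    """True when the combined score meets or exceeds the similarity threshold."""
    score = combined_similarity(lyrical_similarity, feature_similarity, lyric_weight)
    return score >= threshold
```

The comparison uses `>=` so that a score equal to the threshold is treated as meeting it, consistent with the "meets or exceeds" language of the claim.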

For claim 11, Wold further discloses determining the set of features for the unidentified media content item comprises: generating a plurality of signal-based vectors from the unidentified media content item, wherein the plurality of signal-based vectors represents at least one of pitch, timbre, or rhythm (Wold, [0082 - 0084]); determining a beat of the unidentified media content item (Wold, [0085]); dividing the unidentified media content item into a plurality of segments (Wold, [0086]); for each beat in the unidentified media content item, determining a plurality of normalized signal-based vectors from the plurality of signal-based vectors for a subset of the plurality of segments (Wold, [0087]); and generating the set of features from the normalized plurality of signal-based vectors (Wold, [0091]).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Kruspe et al. (“Retrieval of Song Lyrics from Sung Queries”) and further in view of Ellis et al. (US 2013/0226957) (“Ellis”).
For claim 13, Oztaskent further discloses comparing the lyrical content associated with the unidentified media content item with additional lyrical content associated with two or more of the plurality of known media content items (Oztaskent, column 4 lines 47 – column 5 line 3; column 8 lines 55 -  67). Yet, Oztaskent fails to teach the following: determining similarity values for each of the two or more of the plurality of known media content items based on the comparing, wherein a similarity value represents a similarity between the additional lyrical content associated with one of the two or more known media content items and the lyrical content associated with the unidentified media content item; determining a set of known media content items having similarity values that meet or exceed a similarity threshold; and comparing the set of features of the unidentified media content item to sets of features of each of the known media content items in the set of known media content items.
However, Kruspe discloses a method for retrieving lyrics of a sung recording (Abstract), wherein similarity scores (distance calculation) between lyrical content associated with a received audio sequence and additional lyrical content associated with known media contents are generated (4.Proposed Approach and 4.2 Distance calculation, pg. 112 - 114).
Moreover, Ellis discloses a method for identifying similar songs (Abstract), wherein a set of features of an unidentified media content item is compared to sets of features of each of known media content items in a set of known media content items ([0041] [0044] [0053 – 0061]). Additionally, a set of known media content items having similarity values that meet or exceed a similarity threshold is determined ([0113] [0114]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify Oztaskent’s teachings with Kruspe’s and Ellis’ teachings so that the method further comprises the following for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45): determining similarity values (distance calculation) for each of the two or more of the plurality of known media content items based on the comparing, wherein a similarity value represents a similarity between the additional lyrical content associated with one of the two or more known media content items and the lyrical content associated with the unidentified media content item; determining a set of known media content items having similarity values that meet or exceed a similarity threshold; and comparing the set of features of the unidentified media content item to sets of features of each of the known media content items in the set of known media content items.

Claims 14 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Ochmanek et al. (US 2017/0109504) (“Ochmanek”).
For claim 14, Oztaskent fails to teach: determining a publishing rights holder of the known media content item; and determining a publishing resource allocation for the identified cover-media content item.
However, Ochmanek discloses a database processing system (Abstract), wherein a database stores data associated with copyrighted work including songs with a list of rights holders and associated share of revenue ([0052] [0054] [0055] [0114] [0115]). Furthermore, the database stores a list of possible methods of reuse of the copyrighted work including licensing terms and revenue shares between owners ([0053] [0059] [0062] [0109] [0110 – 0113]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify Oztaskent’s teachings with Ochmanek’s teachings so that the song database (Oztaskent, column 5 lines 35 – 49) further comprises information about a publishing rights (copyright) holder of the songs (known media content item) and resource allocations (reuse information) for a cover-media content item for the purpose of determining the publishing rights and publishing resource allocations to uphold intellectual property rights, including providing royalty payments to rightsholders (Ochmanek, [0040 – 0042]).

For claim 15, Oztaskent fails to teach: updating metadata of the identified cover-media content item to include cover information that identifies the identified cover-media content item as a cover of the known media content item.
However, Ochmanek discloses a database processing system (Abstract), wherein a database stores data associated with copyrighted work including songs with a list of rights holders and associated share of revenue ([0052] [0054] [0055] [0114] [0115]). Furthermore, the database stores updates which identify a song as a cover of a known song ([0115] [0116]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify Oztaskent’s teachings with Ochmanek’s teachings so that the song database (Oztaskent, column 5 lines 35 – 49) further comprises information about a publishing rights (copyright) holder of the songs (known media content item) and resource allocations (reuse information) for a cover-media content item, wherein metadata for the identified cover-media (song) is updated to include cover information that identifies the identified cover-media content item as a cover of the known media content item for the purpose of determining the publishing rights and publishing resource allocations to uphold intellectual property rights, including providing royalty payments to rightsholders (Ochmanek, [0040 – 0042]).

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Hedgecock (US 2017/0024441).
For claim 16, Oztaskent further discloses generating, for one or more segments, a respective fingerprint from the segment (column 8 lines 33 – 41); comparing each respective fingerprint to a plurality of stored fingerprints, wherein each of the plurality of stored digital fingerprints is associated with a respective known media content item of the plurality of known media content items (column 5 lines 4 – 25; column 8 lines 32 – 49); determining, based on the comparing, that at least a threshold amount of the digital fingerprints of the one or more segments fail to match stored fingerprints of any of the plurality of known media content items (column 8 lines 50 – 54); and determining that the unidentified media content item does not correspond to any of the plurality of known media content items (column 8 lines 50 – 54), wherein the determining of the lyrical content associated with the unidentified media content item is performed after determining that the unidentified media content item does not correspond to any of the plurality of known media content items (Fig.4, 420, 425, 430; column 8 lines 43 – 62). Yet, Oztaskent fails to teach dividing the unidentified media content item into a plurality of segments to generate digital fingerprints, wherein the digital fingerprints are used to determine that the unidentified media content item does not correspond to any of the plurality of known media content items.
However, Hedgecock discloses a method for detecting and identifying songs in a continuous audio stream (Abstract), wherein media content is first divided into segments before generating digital fingerprints to perform song recognition ([0009] [0010] [0022] [0032]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify Oztaskent’s teachings with Hedgecock’s teachings so that the unidentified media content item is further divided into a plurality of segments to generate digital fingerprints, wherein the digital fingerprints are used to determine that the unidentified media content item does not correspond to any of the plurality of known media content items for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45).

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Weinstein et al. (US 2015/0039299) (“Weinstein”) and further in view of Velusamy et al. (US 2008/0147826) (“Velusamy”).
For claim 17, Oztaskent fails to teach: identifying a geographic location associated with the unidentified media content item based on at least one of a) the geographic location associated with a user account that uploaded the unidentified media content item or b) metadata of the unidentified media content item; and determining one or more spoken languages associated with the geographic location; wherein determining the lyrical content associated with the unidentified media content item comprises, for each spoken language of the one or more spoken languages associated with the geographic location, processing the unidentified media content item using a machine learning model trained to perform speech recognition on audio content comprising speech in the spoken language.
However, Weinstein discloses a context based speech recognition system and method (Abstract), wherein context information including a geographic location of a source is used by a neural network to perform speech recognition on audio data by determining a language/dialect associated with geographic location, wherein the neural network is trained to process speech comprising multiple languages and accent/dialects ([0035] [0036] [0078 – 0083] [0085 – 0093]).
Additionally, Velusamy discloses a networked media recording system (Abstract), wherein a geographic location from which a media stream originated is metadata associated with the stream of media ([0033]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve Oztaskent’s invention in the same way that Weinstein’s and Velusamy’s inventions have been improved to achieve the following predictable results for the purpose of improving the speech recognition process used to identify content using a cover version of the content (Oztaskent, column 1 lines 10 – 45 and column 8 lines 20 - 26): further identifying context information, e.g. a geographic location associated with the source including the unidentified media content item based on at least one of a) the geographic location associated with a user account that uploaded the unidentified media content item or b) metadata of the unidentified media content item; and determining one or more spoken languages associated with the geographic location; wherein determining the lyrical content associated with the unidentified media content item comprises, for each spoken language of the one or more spoken languages associated with the geographic location, processing the unidentified media content item using a machine learning model trained to perform speech recognition on audio content comprising speech in the spoken language.

Claims 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Oztaskent et al. (US 9,507,860) ("Oztaskent") in view of Ellis et al. (US 2013/0226957) (“Ellis”).
For claim 19, Oztaskent discloses a non-transitory computer readable medium comprising instructions that, when executed by a processing device, cause the processing device to perform operations comprising (Abstract; column 11 lines 7 - 25): receiving an unidentified media content item (performance including a singer singing a song, Fig.4A, 405; column 8 lines 5 - 19); determining lyrical content (text including one or more words that are sung) associated with the unidentified media content item (Fig.4A, 410; column 8 lines 20 – 31); determining, by a processing device (content identification hardware, Fig.1, 130; column 4 lines 15 – 20, column 10 lines 28 – 33), a lyrical similarity between the lyrical content associated with the unidentified media content item and additional lyrical content associated with a known media content item of a plurality of known media content items from a media content repository (Fig.4A, 425 and 430; column 4 lines 47 – column 5 line 3 and column 8 lines 55 - 61); determining a musical representation (column 8 lines 33 – 41); determining, by the processing device, a similarity between the one or more musical representations of the unidentified media content item and the known media content item (column 5 lines 4 – 35 and column 8 lines 42 - 54); and identifying, by the processing device, the unidentified media content item as a cover of the known media content item based on the lyrical similarity and the musical representation similarity, resulting in an identified cover-media content item (column 9 lines 1 – 10, 57 – column 10 line 27). Yet, Oztaskent fails to teach that the determining the one or more musical representations further comprises determining one or more musical features representing at least one of pitch, timbre or rhythm from the media content item.
However, Ellis discloses a method for identifying similar songs (Abstract), wherein a set of pitch related musical features of an unidentified media content item is compared to sets of pitch related features of each of known media content items in a set of known media content items ([0041] [0044] [0053 – 0061]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify Oztaskent’s teachings with Ellis’ teachings so that the method further comprises the following for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45): the determining the one or more musical representations further comprises determining one or more musical features representing at least one of pitch, timbre or rhythm from the media content item.

For claim 20, Oztaskent discloses: a first computing device comprising a media content sharing platform (content identification hardware, Fig.1, 130; column 4 lines 15 – 20, column 10 lines 28 – 33), wherein the first computing device is to: receive an unidentified media content item uploaded (media delivered from a remote storage device to the content identification hardware, column 3 lines 54 – column 4 line 5) to the media content sharing platform (column 8 lines 5 – 19); determine lyrical content (text including one or more words that are sung) associated with the unidentified media content item (Fig.4A, 410; column 8 lines 20 – 31); and determine a lyrical similarity between a lyrical content associated with the unidentified media content item and additional lyrical content associated with a known media content item (Fig.4A, 425 and 430; column 8 lines 55 - 61); and responsive to determining the lyrical similarity, perform at least one of a) send the media content item to a second computing device connected to the first computing device via a network, or b) extract one or more features of the unidentified media content item (metadata of the unidentified media content item including song title is extracted from a database, column 9 lines 65 – column 10 line 17) and send at least one of the one or more features (Fig.1, 120, 130 and 240; column 10 line 18 – 27) or the lyrical content to the second computing device.
Yet, Oztaskent fails to teach: determine that the lyrical similarity between the lyrical content associated with the unidentified media content item and additional lyrical content associated with a known media content item of a plurality of known media content items exceeds a similarity threshold; and responsive to determining that the lyrical similarity exceeds the similarity threshold, perform at least one of a) send the media content item to a second computing device connected to the first computing device via a network, or b) extract one or more features of the unidentified media content item and send at least one of the one or more features or the lyrical content to the second computing device.
However, Ellis discloses a method for identifying similar songs (Abstract), wherein a set of features of an unidentified media content item is compared to sets of features of each of known media content items in a set of known media content items ([0041] [0044] [0053 – 0061]). Additionally, a set of known media content items having similarity values that meet or exceed a similarity threshold is determined ([0113] [0114]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to apply the thresholding technique disclosed by Ellis to the similarity determination method disclosed by Oztaskent to achieve the predictable results of using a threshold to further determine the lyrical similarity and perform further actions in response to this determination for the purpose of identifying content using a cover version of the content (Oztaskent, column 1 lines 10 – 45).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SONIA L GAY whose telephone number is (571)270-1951. The examiner can normally be reached Monday-Friday 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SONIA L GAY/Primary Examiner, Art Unit 2657