DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/16/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings were submitted on 06/16/2021.  These drawings have been reviewed and are accepted by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jellison Jr. et al. (US 20140214917 A1) in view of Arrasvuori et al. (US 20130290818 A1).

Regarding claims 1, 8, and 15, Jellison Jr. teaches:
“obtaining a first speech audio file scheduled for transmission in a first position of a transmission schedule” (par. 0104; ‘In addition to slots for music, talk shows, programs, and other primary media content, the master logs and station logs usually include slots designated for voice tracks. Voice track slots can be used, by way of example, for DJ (disc jockey) chatter, announcements, station identification, identification of one or more songs or other media played prior to the voice track slot, and identification of songs or other media scheduled to be played after the voice track slot.’);
“obtaining first metadata (title or other identifier) associated with a first content item scheduled for transmission in a second position of the transmission schedule” (par. 0121; ‘Segue editor 2300 includes a first waveform 2305 representing a media item scheduled in a media item slot immediately preceding the empty voice track slot. The title or other identifier 2307 of the song represented by waveform 2305 is shown near the bottom left side of segue editor 2300.’); and
“generating a graphical user interface (GUI) displaying a list of items scheduled for transmission” (par. 0124; ‘Referring next to FIG. 24, another example of a display 2400, which can be used to assist custom voice track recording, is discussed according to various embodiments of the present disclosure. Display 2400 includes a master log portion 2410, a multisite voice track portion 2430, and a voice track editor 2460. Master log portion 2410 shows two voice track slots, 2411 and 2413. Voice track slot 2411 is shaded using a solid fill, which represent green highlighting.’) including:
“a first identifier representing the first speech audio file in the first position” and “a second identifier representing the first content item in the second position” (par. 0124; ‘Referring next to FIG. 24, another example of a display 2400, which can be used to assist custom voice track recording, is discussed according to various embodiments of the present disclosure. Display 2400 includes a master log portion 2410, a multisite voice track portion 2430, and a voice track editor 2460. Master log portion 2410 shows two voice track slots, 2411 and 2413. Voice track slot 2411 is shaded using a solid fill, which represent green highlighting.’).
However, Jellison Jr. does not expressly teach:
“obtaining a first transcript of the first speech audio file”;
“comparing the first metadata to at least a portion of the first transcript to determine whether any of the first metadata matches the at least a portion of the first transcript”;
“in response to determining that at least a portion of the first metadata matches at least a portion of the first transcript creating a linkage between the first speech audio file and the first content item”.
In a similar field of endeavor, Arrasvuori teaches:
“obtaining a first transcript of the first speech audio file” (par. 0035; ‘By way of example, the system 100 retrieves one or more voice recordings of the user, the user's contact, celebrities (e.g., Dr. Martin Luther King, a US Presidential candidate, Warren Buffett, Steve Jobs, etc.), etc. to generate a vocal transition. The information of the voice recordings may be obtained from the metadata thereof to match with the metadata of the two songs and/or the name of a new channel. In another embodiment, the user says the title and artist of the next song, and the system 100 makes the recording into a vocal recording.’; par. 0131; ‘Examples of ASICs include graphics accelerator cards for generating images for display 614, cryptographic boards for encrypting and decrypting messages sent over a network, speech recognition, and interfaces to special external devices, such as robotic arms and medical scanning equipment that repeatedly perform some complex sequence of operations that are more efficiently implemented in hardware.’);
“comparing the first metadata to at least a portion of the first transcript to determine whether any of the first metadata matches the at least a portion of the first transcript” (par. 0035; ‘The information of the voice recordings may be obtained from the metadata thereof to match with the metadata of the two songs and/or the name of a new channel.’);
“in response to determining that at least a portion of the first metadata matches at least a portion of the first transcript creating a linkage between the first speech audio file and the first content item” (par. 0035; ‘The information of the voice recordings may be obtained from the metadata thereof to match with the metadata of the two songs and/or the name of a new channel.’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Jellison Jr.’s music programming, which includes voice tracks and songs, by incorporating Arrasvuori’s speech recognition feature in order to generate a transcript of the voice tracks, generate metadata, and link the voice tracks to upcoming songs accordingly in a music program. The combination would merely involve applying well-known speech recognition technology to the voice tracks taught by Jellison Jr., thus producing text from the speech. The combination would provide a way to verify whether a voice track correctly announces an upcoming song or a previous song.
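For illustration only, the comparing and linking limitations quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Jellison Jr., Arrasvuori, or the application: the names ScheduleItem, matching_fields, and link_if_matched are hypothetical, and the normalized whole-phrase comparison is an assumed matching rule, since neither reference specifies one.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ScheduleItem:
    """Hypothetical entry in a transmission schedule (not from either reference)."""
    position: int
    identifier: str
    metadata: dict = field(default_factory=dict)   # e.g. {"title": ..., "artist": ...}
    transcript: str = ""                           # populated only for speech audio files
    linked_positions: list = field(default_factory=list)


def normalize(text):
    """Lowercase the text and drop punctuation so formatting differences do not block a match."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def matching_fields(metadata, transcript):
    """Return the names of metadata fields whose values appear as whole phrases in the transcript."""
    haystack = f" {normalize(transcript)} "
    matched = []
    for name, value in metadata.items():
        needle = normalize(value)
        # Skip empty values; pad with spaces so "queen" does not match "queens".
        if needle and f" {needle} " in haystack:
            matched.append(name)
    return matched


def link_if_matched(voice_track, content_item):
    """Create a two-way linkage when any metadata field matches the transcript."""
    matched = matching_fields(content_item.metadata, voice_track.transcript)
    if matched:
        voice_track.linked_positions.append(content_item.position)
        content_item.linked_positions.append(voice_track.position)
    return matched


# Example using the announcement quoted from Arrasvuori par. 0067 as a stand-in transcript.
voice_track = ScheduleItem(
    position=1,
    identifier="VT-1",
    transcript="That was Dancing Queen by ABBA, next coming up is Sign o' the Times by Prince.",
)
song = ScheduleItem(
    position=2,
    identifier="SONG-2",
    metadata={"title": "Sign o' the Times", "artist": "Prince"},
)
print(link_if_matched(voice_track, song), voice_track.linked_positions)
```

In this sketch the linkage is simply a pair of cross-referenced schedule positions; an actual system could equally store the linkage in a separate table keyed by item identifiers.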

Regarding claims 2 (dep. on claim 1), 9 (dep. on claim 8), and 16 (dep. on claim 15), the combination of Jellison Jr. in view of Arrasvuori further teaches:
“receiving the first speech audio file from an external media source” (Jellison Jr.: par. 0143; ‘Once recorded, the voice tracks can be transmitted to the subscribing stations.’); and 
“performing a speech-to-text conversion on the first speech audio file” (Arrasvuori: par. 0131; ‘Examples of ASICs include graphics accelerator cards for generating images for display 614, cryptographic boards for encrypting and decrypting messages sent over a network, speech recognition, and interfaces to special external devices, such as robotic arms and medical scanning equipment that repeatedly perform some complex sequence of operations that are more efficiently implemented in hardware.’).

Regarding claims 3 (dep. on claim 1), 10 (dep. on claim 8), and 17 (dep. on claim 15), the combination of Jellison Jr. in view of Arrasvuori further teaches:
“wherein the first metadata includes: at least one of a song title, an artist name, a genre, or an album name” (Arrasvuori: par. 0067; ‘In one embodiment, the vocal module 207 converts text of the metadata of the current song/channel and the next song/channel into speech for a virtual DJ to announce the vocal transition including the titles and artists of the songs as follows: "That was Dancing Queen by ABBA, next coming up is channel "Funky 80s" starting with "Sign o' the Times" by Prince".’).

Regarding claims 4 (dep. on claim 3), 11 (dep. on claim 10), and 18 (dep. on claim 17), the combination of Jellison Jr. in view of Arrasvuori further teaches:
“wherein comparing the first metadata to at least a portion of the first transcript includes: determining whether text included in the first transcript matches any of the song title, the artist name, the genre, or the album name included in the first metadata” (Arrasvuori: par. 0035; ‘The information of the voice recordings may be obtained from the metadata thereof to match with the metadata of the two songs and/or the name of a new channel.’).

Regarding claims 5 (dep. on claim 1), 12 (dep. on claim 8), and 19 (dep. on claim 15), the combination of Jellison Jr. in view of Arrasvuori further teaches:
“wherein generating the GUI further includes: configuring the GUI to display a transcript of the first speech audio file” (Arrasvuori: par. 0067; ‘In one embodiment, the vocal module 207 converts text of the metadata of the current song/channel and the next song/channel into speech for a virtual DJ to announce the vocal transition including the titles and artists of the songs as follows: "That was Dancing Queen by ABBA, next coming up is channel "Funky 80s" starting with "Sign o' the Times" by Prince".’).

Regarding claims 6 (dep. on claim 5), 13 (dep. on claim 12), and 20 (dep. on claim 19), the combination of Jellison Jr. in view of Arrasvuori further teaches:
“receiving user input, via the GUI, indicating that the first speech audio file is to be deleted” (Jellison Jr.: par. 0106; ‘As discussed above with respect to FIGS. 1-19, a master log can include various slots, or positions, that are editable by one or more local stations when copied to the local station log.’ Editable slots or positions suggest that an audio file may be deleted.).

Regarding claims 7 (dep. on claim 5) and 14 (dep. on claim 12), the combination of Jellison Jr. in view of Arrasvuori further teaches:
“receiving user input, via the GUI, indicating alterations to the transcript of the first speech audio file; and updating the first speech audio file based on the user input” (Jellison Jr.: par. 0104; ‘The voice track slot in which the voice track is scheduled can be either partially or fully locked, using techniques previously described, to control which local stations are permitted to change the content of the voice track slot.’; par. 0106; ‘As discussed above with respect to FIGS. 1-19, a master log can include various slots, or positions, that are editable by one or more local stations when copied to the local station log.’ Changing transcript content is a form of editing, which is well known in the art.).
 
Conclusion
Other pertinent prior art is listed in the PTO-892 for consideration.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA whose telephone number is (571) 270-3191. The examiner can normally be reached 10 am - 6 pm EST Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MARK VILLENA
Examiner
Art Unit 2658



/MARK VILLENA/Examiner, Art Unit 2658