DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Election/Restrictions
Applicant’s election without traverse of claims 1 - 23 in the reply filed on June 7, 2021 is acknowledged.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1, 4, and 18 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 3, 3, and 15, respectively, of U.S. Patent No. 10,445,052 B2. Although the claims at issue are not identical, they are not patentably distinct from each other because, while obvious variations in wording are present, claims 3, 3, and 15, respectively, anticipate all of the limitations required of claims 1, 4, and 18, respectively, of the instant application.
In regards to claim 1, U.S. Patent No. 10,445,052 B2 discloses a computer-implemented method comprising: acquiring, by a processor, an audio file (see at least, “receiving an audio file to be used in production of a content-based experience,” claim 1); receiving, by the processor, input indicative of a selection of a transcript with which the audio file is to be associated (see at least, “enabling a user to associate the audio file with a preexisting transcript stored in a database accessible to a media production platform,” claim 1); linking, by the processor, the audio file to the transcript (see at least, “dynamically linking the audio file and the preexisting transcript,” claim 1) on a phoneme level (see at least, “The computer-implemented method of claim 1, wherein said dynamically linking is performed on a word level or a phoneme level,” claim 3)
	In regards to claim 4, U.S. Patent No. 10,445,052 B2 discloses a computer-implemented method comprising: acquiring, by a processor, an audio file (see at least, “receiving an audio file to be used in production of a content-based experience,” claim 1); receiving, by the processor, input indicative of a selection of a transcript with which the audio file is to be associated (see at least, “enabling a user to associate the audio file with a preexisting transcript stored in a database accessible to a media production platform,” claim 1); linking, by the processor, the audio file to the transcript (see at least, “dynamically linking the audio file and the preexisting transcript,” claim 1) on a phoneme level (see at least, “The computer-implemented method of claim 1, wherein said dynamically linking is performed on a word level or a phoneme level,” claim 3); posting, by the processor, the transcript and the audio file to an interface for review by an individual (see at least, “posting the preexisting transcript and the audio file to an interface for review by the user,” claim 1); determining, by the processor, that a change has been made to the transcript (see at least, “responsive to a determination that a change has been made to the preexisting transcript,” claim 1); and effecting, by the processor, the change by automatically making a corresponding change to the audio file in real time (see at least, “globally effecting the change by automatically making a corresponding change to the audio file in real time,” claim 1).
	In regards to claim 18, U.S. Patent No. 10,445,052 B2 discloses a computer-implemented method comprising: acquiring, by a processor, an audio file; performing, by the processor, a recognition .

Claims 2, 19, and 21 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 3, 15, and 15, respectively, of U.S. Patent No. 10,445,052 B2 in view of Sherwani et al. (US 2008/0177536 A1), hereinafter Sherwani.
In regards to claim 2, claim 3 of U.S. Patent No. 10,445,052 B2 discloses the computer-implemented method of claim 1, but does not disclose further comprising: generating, by the processor, the transcript by converting words recognized in the audio file into text. However, Sherwani discloses, with regard to A/V content editing, generating, by the processor, the transcript by converting words recognized in the audio file into text (see at least, “At step 304, words from the speech are recognized by speech recognizer 110 to form a transcript of the audio segment,” Sherwani [0020]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Sherwani in the invention of claim 3 of U.S. Patent No. 10,445,052 B2, thereby allowing for the advantage of automatic conversion.
In regards to claim 19, claim 15 of U.S. Patent No. 10,445,052 B2 discloses the computer-implemented method of claim 18, but does not disclose further comprising: generating, by the processor, the transcript by converting words recognized in the audio file into text. However, Sherwani discloses, with regard to A/V content editing, generating, by the processor, the transcript by converting words recognized in the audio file into text (see at least, “At step 304, words from the speech are recognized by speech recognizer 110 to form a transcript of the audio segment,” Sherwani [0020]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Sherwani in the invention of claim 15 of U.S. Patent No. 10,445,052 B2, thereby allowing for the advantage of automatic conversion.
In regards to claim 21, claim 15 of U.S. Patent No. 10,445,052 B2 discloses the computer-implemented method of claim 18, but does not disclose wherein the first section of the interface includes a word processor that enables the individual to directly edit the transcript. However, Sherwani discloses, with regard to A/V content editing, wherein the first section of the interface includes a word processor that enables the individual to directly edit the transcript (see at least, “Transcript section 406 also allows the user to selectively edit words contained therein. For example, a user can edit the words similar to a word processor or a user can selectively add and/or delete letters of words,” Sherwani [0024], FIG. 4). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Sherwani in the invention of claim 15 of U.S. Patent No. 10,445,052 B2, thereby allowing for the advantage of easily making changes.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Sherwani et al. (US 2008/0177536 A1), hereinafter Sherwani, in view of Phillips et al. (US 2011/0239107 A1), hereinafter Phillips.

Claim 4: Sherwani discloses a computer-implemented method comprising: acquiring, by a processor, an audio file (see at least, “Method 300 begins at step 302 wherein a speech audio segment is accessed,” Sherwani [0020]); receiving, by the processor, input indicative of a selection of a transcript with which the audio file is to be associated (see at least, “At step 304, words from the speech are recognized by speech recognizer 110 to form a transcript of the audio segment,” Sherwani [0020]); linking, by the processor, the audio file to the transcript (see at least, “The words in the transcript are aligned with the speech audio segment at step 306. During alignment, word boundaries within the A/V content 114 are identified,” Sherwani [0020]); posting, by the processor, the transcript and the audio file to an interface for review by an individual (see at least, Sherwani FIG. 4, Sherwani [0022]); determining, by the processor, that a change has been made to the transcript; and effecting, by the processor, the change by automatically making a corresponding change to the audio file in real time (see at least, “More specifically, moving or deleting a sequence of contiguous words causes the associated A/V content to be moved or deleted through the use of the word time alignment against the A/V content,” Sherwani [0022]).
Sherwani does not disclose linking on a phoneme level. However, Phillips discloses a similar transcript editor and further discloses linking on a phoneme level (see at least, “FIG. 1 is a flow diagram showing the main steps in the first stage. The system loads audio dialog track 102 of the media to be edited into an audio recognition system that analyzes the speech in the audio dialog track and identifies the phonemes present within the speech (step 104). This process generates phoneme audio track 106, which contains a sequence of the identified phonemes and the time codes of the locations within the audio dialog track where they occur. Note that it is not necessary to recognize the words or understand the text being spoken-detection of the phonemes is sufficient. Such an engine is available from vendors such as Nexidia Inc. of Atlanta Ga., which offers products that can process audio at rates of 250 or greater times faster than real time,” Phillips [0014]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the aforementioned features of Phillips in the invention of Sherwani, thereby allowing for more accuracy in the invention of Sherwani.

Claims 7 – 11, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Sherwani in view of Kofman et al. (US 2018/0053510 A1), hereinafter Kofman.

Claim 7: Sherwani discloses a non-transitory computer-readable medium with instructions stored thereon that, when executed by a processor, cause the processor to perform operations (see at least, “Concepts presented herein may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer,” Sherwani [0032], FIG. 6) comprising: acquiring an audio file to be used in the production of a content-based experience (see at least, “Method 300 begins at step 302 wherein a speech audio segment is accessed,” Sherwani [0020]); linking the audio file to a transcript (see at least, “The words in the transcript are aligned with the speech audio segment at step 306. During alignment, word boundaries within the A/V content 114 are identified,” Sherwani [0020]); causing display of the transcript in a first section of an interface (see at least, “FIG. 4 is a user interface 400 for editing A/V content. User interface 400 includes… transcript section 406,” Sherwani [0022]); causing display of the audio file in a second section of the interface (see at least, “FIG. 4 is a user interface 400 for editing A/V content. User interface 400 includes… audio wave forms 404,” Sherwani [0022]); detecting input indicative of a modification of content, wherein the modified content is either the transcript or the audio file (see at least, “A user, by editing words in transcript section 406, can alter images 402 as well as audio waveforms 404 automatically,” Sherwani [0022]); and in response to detecting the input, effecting the modification by making a corresponding modification to other content, wherein the other content is whichever of the transcript and the audio file is not the modified content (see at least, “More specifically, moving or deleting a sequence of contiguous words causes the associated A/V content to be moved or deleted through the use of the word time alignment against the A/V content,” Sherwani [0022]).
Sherwani does not disclose monitoring the first and second sections of the interface. However, Kofman discloses, with regard to a similar media generating and editing system, that it is well known to continually monitor first and second windows of a UI (see at least, “In at least some example embodiments, at least some editing functions (for example paragraph and sentence editing, word Kofman [0108]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, since both windows in the invention of Sherwani are time aligned similarly to those in the invention of Kofman, to continually monitor either window for edits as disclosed by Kofman so that changes can be automatically applied to the other side of the UI, i.e., the other window.

Claim 8: Sherwani and Kofman disclose the non-transitory computer-readable medium of claim 7, wherein the operations further comprise: generating the transcript by converting words recognized in the audio file into text (see at least, “Thus, the speech segments can be sent to speech recognizer 110 to recognize words contained therein,” Sherwani [0019]).

Claim 9: Sherwani and Kofman disclose the non-transitory computer-readable medium of claim 7, wherein the operations further comprise: selecting the transcript from amongst multiple transcripts in a database (see at least, “As shown in FIG. 9, each of the edit data 228 files in displayed list 94 are identified by a title field 906 (for example "US Gold Medal Winner", which corresponds to the interview data illustrated in FIGS. 5 to 7), and include the following associated status fields: (1) creation/edit date field 910 which indicates when the edit file 228 was first created and last edited in media editing system 102 (for example the field may display "Created 15 days ago Updated 3 days ago); and (2) Transcription/Edit field 908 which indicates if the edit file 228 is newly transcribed or has been previously edited by a user (for example the field 908 may display "Transcribed" to indicate a file for which media editing system 102 has produced new edit data 228 for but which has not yet been edited by a user, and display "Edited" to indicate a file that has been previously edited by a user. In example embodiments, the information used to display the elements of list 904 is created by media editing system 102 and stored as metadata 229 in the storage 206. Metadata 229 is updated by the media editing system whenever a new edit data 228 file is added the system or an existing file is edited, and functions as an index to the information stored as transcript data 226 by the media editing system 102. In some example embodiments metadata 229 is stored in a separate storage location than the files that make up edit data 228,” Kofman [0058], see also at least Kofman [0028]).

Claim 10: Sherwani and Kofman disclose the non-transitory computer-readable medium of claim 9, wherein said selecting is based on a feature of the audio file, input provided via the interface, or any combination thereof (see at least, “As shown in FIG. 9, each of the edit data 228 files in displayed list 94 are identified by a title field 906 (for example "US Gold Medal Winner", which corresponds to the interview data illustrated in FIGS. 5 to 7), and include the following associated status fields: (1) creation/edit date field 910 which indicates when the edit file 228 was first created and last edited in media editing system 102 (for example the field may display "Created 15 days ago Updated 3 days ago); and (2) Transcription/Edit field 908 which indicates if the edit file 228 is newly transcribed or has been previously edited by a user (for example the field 908 may display "Transcribed" to indicate a file for which media editing system 102 has produced new edit data 228 for but which has not yet been edited by a user, and display "Edited" to indicate a file that has been previously edited by a user. In example embodiments, the information used to display the elements of list 904 is created by media editing system 102 and stored as metadata 229 in the storage 206. Metadata 229 is updated by the media editing system whenever a new edit data 228 file is added the system or an existing file is edited, and functions as an index to the information stored as transcript data 226 by the media editing system 102. In some example embodiments metadata 229 is stored in a separate storage location than the files that make up edit data 228,” Kofman [0058], see also at least Kofman [0028]).

Claim 11: Sherwani and Kofman disclose the non-transitory computer-readable medium of claim 7, wherein said linking is performed on a word level or a phoneme level (see at least, “More specifically, moving or deleting a sequence of contiguous words causes the associated A/V content to be moved or deleted through the use of the word time alignment against the A/V content,” Sherwani [0022]).

Claim 16: Sherwani and Kofman disclose the non-transitory computer-readable medium of claim 7, wherein the operations further comprise: receiving input indicative of a request to initiate playback of the audio file; and in response to detecting the input, initiating playback of the audio file, wherein a current location is visually highlighted in the transcript and the audio file throughout playback of the audio file (see at least, “In some example embodiments, the system provides a cloud-based platform for uploading audio and video (A/V) files, returning in minutes with text that is precisely aligned with the original A/V, making it easily searchable and verifiable. In example embodiments, word-level timings are used to provide an interactive transcript in which the system highlights words as they are spoken and conversely the user can click on them to play that exact part in the A/V file. In various example embodiments, the media generating and editing system provides a platform that can provide users with one or more of precise timings, speaker identification, audio waveform, and a simple text-aligned drag-and-drop edit and export system that allow quick, accurate and efficient turnaround of content,” Kofman [0023]).

Claim 17: Sherwani and Kofman disclose the non-transitory computer-readable medium of claim 7, wherein the operations further comprise: causing display of each word in the transcript in proximity to a corresponding segment of the audio file in the second section of the interface (see at least, Kofman FIG. 13).

Claims 12 – 14 are rejected under 35 U.S.C. 103 as being unpatentable over Sherwani and Kofman in view of Rodriguez et al. (US 2014/0328575 A1), hereinafter Rodriguez.

Claim 12: Sherwani and Kofman disclose the non-transitory computer-readable medium of claim 7, but do not disclose wherein the second section of the interface includes multiple tracks along which multiple audio files are arranged. However, Rodriguez discloses, in a similar invention pertaining to media editing, that it is well known to include a user interface with one or more audio tracks along which one or more audio files are arranged (see at least, “The composite display area 130 includes multiple tracks that span a timeline 160, and displays one or more graphical representations of media clips in the composite presentation. As shown, the composite display area 130 displays a music clip representation 165 and a video clip representation 170. The composite display area 130 also includes a track 180 (that is empty in stages 105-120) for displaying a voice-over clip representation 108,” Rodriguez [0041]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the aforementioned aspects of Rodriguez in the invention of Sherwani and Kofman, thereby expanding the invention to include beneficial features such as voice-over editing.

Claim 13: Sherwani, Kofman, and Rodriguez disclose the non-transitory computer-readable medium of claim 12, wherein the multiple audio files are temporally aligned with respect to a common timeline (see at least, “Since transcript 118 is aligned with the A/V content 114, removing, editing and/or moving of words in the transcript can be used to modify the A/V content associated therewith,” Sherwani [0017], “A user, by editing words in transcript section 406, can alter images 402 as well as audio waveforms 404 automatically. More specifically, moving or deleting a sequence of contiguous words causes the associated A/V content to be moved or deleted through the use of the word time alignment against the A/V content,” Sherwani [0022]).

Claim 14: Sherwani, Kofman, and Rodriguez disclose the non-transitory computer-readable medium of claim 12, wherein the operations further comprise: constructing the content-based experience by combining the multiple audio files into a composite audio file (see at least, “Some embodiments output audio from a different track (e.g., a music track) which will be combined with the voice-over audio in the composite presentation,” Rodriguez [0076]).

Allowable Subject Matter
Claims 3, 5, 6, 15, 20, 22, and 23 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSEPH SAUNDERS whose telephone number is (571)270-1063.  The examiner can normally be reached on Monday-Thursday, 9:00 a.m. - 4 p.m., EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar can be reached on (571)272-7488.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained 






/JOSEPH SAUNDERS JR/Primary Examiner, Art Unit 2652