Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings were received on 3/27/2020.  These drawings are accepted.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 11-14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 11 recites “enabling the user to align the matching portion of the second audio file with;”. The limitation is incomplete because it does not state what the matching portion is to be aligned with, leaving the claim unclear and indefinite. The claim therefore fails to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
	Claims 12-14 depend from independent claim 11 and are therefore rejected for the same reason.


Claims 12-14 recite “the second audio track” and “the first audio track”. There is insufficient antecedent basis for these limitations in claim 11.
	
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-15 are rejected under 35 U.S.C. 103 as being unpatentable over LeVoit (US Publication No.: 20180277132) in view of Eppolito (US Publication No.: 20190363689).
Claim 1, LeVoit discloses
	receiving a first audio file that includes a recording of speech (Fig. 1, label 108 shows a word of a recorded speech included in the audio file. Paragraph 48 discloses the media asset 102 may contain audio with human speech.);
	receiving a second audio file that includes a second recording of speech that is indexed by a second phonetic index (Fig. 1, label 116,118 as the second audio file with a second recorded speech with a different accent. Label 116,118 shows the phonemes of the audio file. Paragraph 48 discloses the media asset 102 may contain audio with human speech.); and
	in response to a user of the audio editing application (Fig. 3,1,7) specifying a portion of the first audio file that is to be replaced (Paragraph 81 discloses “… many users desire a form of media guidance through an interface that allows users to efficiently navigate content selection and easily identify content that they may desire.” Paragraph 82 discloses “… allow users to navigate among and locate many types of content or media assets. …” Fig. 3 shows the interactive media guidance application that allows the user to select content or media assets such as the selection of a portion of the first audio, label 108. In Fig. 1, the speech including label 108, or label 108 itself (media asset), must be selected prior to converting and searching as per the limitations below.), the audio editing application (Fig. 3,1,7) automatically (Fig. 1 shows once the selection is performed, the media guidance application automatically determines matching second audio.):
	using the first phonetic index to identify a phoneme sequence corresponding to the specified portion of the first audio file (Fig. 1, label 110,112 shows the phoneme sequence of 108, label 116,118 shows the phonetic sequence of the second audio file that matches 110,112.);

	in response to locating an occurrence of the phoneme sequence in the phonetic index (Fig. 1, label 116,118 shows the matching portion of the second audio file identified subsequent to locating an occurrence (paragraph 7 discloses comparing the first phoneme sequence with the phonemes of different accent types which indicates a search for or locating the occurrence of the phoneme sequence.).):
	identifying a matching portion of the second audio file corresponding to the located occurrence of the phoneme sequence (Paragraph 7 discloses through comparison, a matching portion of the second audio file is identified. Fig. 1, label 116,118 shows the matching portion of the second audio file corresponding to the located occurrence of the phoneme sequence, label 108.); and
	in response to the user of the audio editing application selecting the matching portion of the second audio file (Paragraph 81-82 discloses an interface that allows the user to navigate content selection and paragraph 7 discloses matching portion of the second audio file with the first audio file, similarly shown in Fig. 1, label 108 as the first portion of the first audio file and labels 116,118 as the second portion of the second audio file. Since the second portion of the second audio file must match the first portion of the first audio file, wherein the user selects the media content, the user also selects the second portion of the second audio file.):

	replacing the specified portion of the first audio file that is to be replaced with the temporally aligned matching portion of the second audio file (Fig. 1, label 116,118 replaces 108 in the speech output to the user. Paragraph 2 discloses “replace phonemes and/or words determined to need modification with phonemes and/or words that are intermediate between two dialects.” Paragraph 23 discloses “… replacement audio for each phoneme of the subset of phonemes, wherein the replacement audio replaces each phoneme of the subset of phonemes with a new phoneme with the similarity greater than the amount ….”. Because the replacement of the portion of the first audio file uses the matching portion of the second audio file, the replacement must occur subsequent to, or in response to, the selection of the matching portion of the second audio file.)

	Eppolito discloses a media editing application comprising enabling a user (paragraph 178 discloses the user indicates a desire to align the highest peak of 3865 of audio waveform 3825 to the second highest peak 3870 of audio waveform 3830.) to align the matching portion of the second audio file (Paragraph 178 discloses the media editing application aligns a point on the first audio waveform to a point on the second audio clip. Paragraph 2 discloses media editing application allows for composition of pieces of media content such as video, audio, etc, where the users have the ability to edit, combine, transition, overlay and piece together different media content in a variety of manners. This indicates the user is enabled to match or combine two audio files as desired.) It would be obvious to one skilled in the art before the effective filing date of the application to modify LeVoit’s media guidance application by allowing the user the ability to align two audio signals as disclosed by … so as to allow the user to place the desired audio in the desired position, hence improving the user’s experience by increasing user autonomy and allowing the user creative freedom.
Claim 2, LeVoit discloses matching the waveform of the matching portion of the second audio file with the waveform of the specified portion of the first audio file (Fig. 2, label 260,270 shows the alignment of a waveform of the portion of the second audio file (Fig. 1, label 116,118) matching a waveform of the specified portion of the first audio file (Fig. 1, label 110,112). Paragraph 75) includes time stretching the waveform of the matching portion of the second audio file (paragraph 75 discloses “the media guidance application may shorten or lengthen one of the audio clips such that they are the same length and also align critical points …”.).

Claim 4, LeVoit discloses the first recording and the second recording as a movie, for example, (paragraph 50,51) and recordings from a database (paragraph 151) were captured during a given take (Paragraph 50,51 discloses the first recording is from a movie, wherein depending on the making of the movie, the first recording can be captured continuously or in a given take. Paragraph 151 discloses the database includes an indicator to a location in memory with audio of a corresponding phoneme of the second accent type, which indicates the second recording is captured during a given take.).
	Claim 5, LeVoit discloses the second recording was captured in a different take from the first recording (Paragraph 50 discloses the first recording is a movie, which is 
Claim 6, LeVoit discloses the first recording was captured on a film set (Paragraph 50 discloses the first audio can be a movie about hockey or a movie in English. This indicates capturing of a first recording on a film set or wherever the movie is made.) and the second recording was captured in a recording studio (Paragraph 151 discloses the second recording is from a database of accent types.). Although LeVoit does not disclose that the recording of the audio in the database (paragraph 151) is performed in a studio, the database includes a plurality of captured audio, each with speech spoken with an accent. This suggests that capturing can occur in any location, and a studio is a location that allows for such capturing of audio with human speech. For this reason, it would be obvious to one skilled in the art before the effective filing date of the application for the recordings in the database disclosed by LeVoit to be captured in a studio, since LeVoit discloses that the database includes captured recordings and a studio enables capturing of speech.
	Claim 7, LeVoit discloses automatically adjusting the matching portion of the second audio file with the specified portion of the first audio file (Fig. 2, label 260,270 shows the alignment of a waveform of the portion of the second audio file (Fig. 1, label 116,118) matching a waveform of the specified portion of the first audio file (Fig. 1, label 110,112). Paragraph 75) by adjusting a temporal offset of the matching portion of the second track so that a feature of a waveform of the matching portion of the second audio track is aligned with a corresponding feature of a waveform of the specified portion of the first audio file. (paragraph 75 discloses “the media guidance application may shorten or lengthen one of the audio clips such that they are the same length and also align critical points …”.)
	Claim 8, LeVoit discloses the feature of the waveform corresponds to a beginning of an utterance (paragraph 75 discloses “the media guidance application may shorten or lengthen one of the audio clips such that they are the same length and also align critical points …”, wherein the critical points can be at the beginning of an utterance. An example of alignment is shown in Fig. 2, label 260,270 where the timing is aligned.)
	Claim 9, LeVoit discloses the feature of the waveform corresponds to a beginning of a sung phrase (Paragraph 48 discloses a media asset can be audio with human speech, wherein a sung phrase is an audio with human speech. Fig. 2, labels 260,270 show the waveform of label 108, wherein the feature of the waveform includes the beginning of the media asset such as the beginning of the utterance “about” with phonemes as shown in Fig. 2,1.)
	Claim 10, LeVoit discloses automatically stretching or shrinking at least a part of the matching portion of the second audio file so as to temporally align a plurality of features of a waveform of the matching portion with a corresponding plurality of features of a waveform of the specified portion of the first audio file. (paragraph 75 discloses “the media guidance application may shorten or lengthen one of the audio clips such that they are the same length and also align critical points …”.)
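For illustration only, the shortening or lengthening and the alignment of critical points quoted from paragraph 75 can be sketched as follows. The sketch is an explanatory assumption and is not taken from LeVoit, Eppolito, or the application: the function names (peak_index, sample_at, warp_to_reference), the choice of the largest-magnitude sample as the single critical point, and the use of linear interpolation are all hypothetical.

```python
from typing import List, Sequence


def peak_index(samples: Sequence[float]) -> int:
    """Index of the sample with the largest absolute value (the assumed critical point)."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))


def sample_at(samples: Sequence[float], pos: float) -> float:
    """Linearly interpolate `samples` at the fractional position `pos`."""
    lo = int(pos)
    hi = min(lo + 1, len(samples) - 1)
    frac = pos - lo
    return samples[lo] * (1.0 - frac) + samples[hi] * frac


def warp_to_reference(clip: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Stretch or shrink `clip` so it has the length of `reference`
    and its peak lands on the same sample index as the reference peak."""
    n = len(reference)
    src_peak, dst_peak = peak_index(clip), peak_index(reference)
    last = len(clip) - 1
    out: List[float] = []
    for i in range(n):
        if i <= dst_peak:
            # Output samples up to the reference peak read from clip[0 .. src_peak].
            pos = src_peak * i / dst_peak if dst_peak else float(src_peak)
        else:
            # Output samples after the reference peak read from clip[src_peak .. last].
            pos = src_peak + (last - src_peak) * (i - dst_peak) / (n - 1 - dst_peak)
        out.append(sample_at(clip, pos))
    return out
```

The two segments on either side of the peak are stretched by different amounts, which is one simple way to make the clips the same length while also aligning a feature; a production tool would more likely use a pitch-preserving time-stretch.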
Claim 11, LeVoit discloses 
	receiving a first audio file that includes a recording of speech (Fig. 1, label 108 shows a word of a recorded speech included in the audio file. Paragraph 48 discloses the media asset 102 may contain audio with human speech.);

	in response to a user of the audio editing application (Fig. 3,1,7) specifying a text string representing speech contained within a portion of the first audio file that is to be replaced (Paragraph 81 discloses “… many users desire a form of media guidance through an interface that allows users to efficiently navigate content selection and easily identify content that they may desire.” Paragraph 82 discloses “… allow users to navigate among and locate many types of content or media assets. …” Fig. 3 shows the interactive media guidance application that allows the user to select content or media assets such as the selection of the text string as shown in Fig. 1. In Fig. 1, the speech including label 108, or label 108 itself (media asset), must be selected prior to converting and searching as per the limitations below.), the audio editing application (Fig. 3,1,7) automatically (Fig. 1 shows once the selection is performed, the media guidance application automatically determines matching second audio.):
	converting the text string into a corresponding phoneme sequence (Fig. 1, label 110,112 shows the phoneme sequence of 108.);
	searching the phonetic index of the second audio file for an occurrence of the phoneme sequence (paragraph 7 discloses “compare the audio properties … of the first phoneme with corresponding phonemes of different accent types.” The different accent types are considered the second audio file such as the audio file, label 116,118. Labels 116,118 correlates with the occurrence of label 108.); and

	identifying a matching portion of the second audio file corresponding to the located occurrence of the phoneme sequence (Paragraph 7 discloses through comparison, a matching portion of the second audio file is identified. Fig. 1, label 116,118 shows the matching portion of the second audio file corresponding to the located occurrence of the phoneme sequence, label 108.); and
	in response to the user of the audio editing application selecting the matching portion of the second audio file (Paragraph 81-82 discloses an interface that allows the user to navigate content selection and paragraph 7 discloses matching portion of the second audio file with the first audio file, similarly shown in Fig. 1, label 108 as the first portion of the first audio file and labels 116,118 as the second portion of the second audio file. Since the second portion of the second audio file must match the first portion of the first audio file, wherein the user selects the media content, the user also selects the second portion of the second audio file.):
	align the matching portion of the second audio file with (Paragraph 75 discloses “The media guidance application may align a first audio clip of each phoneme of the subset of phonemes with a second respective audio clip of the corresponding phoneme of the second accent type. … To correct this, the media guidance application may shorten or lengthen one of the audio clips such that they are the same length and also align critical points (e.g. the global maximum of one audio clip may be at 1 second and another may be at 1.5 seconds).”);
	replacing the portion of the first audio file that is to be replaced with the temporally aligned matching portion of the second audio file (Fig. 1, label 116,118 replaces 108 in the speech output to the user. Paragraph 2 discloses “replace phonemes and/or words determined to need modification with phonemes and/or words that are intermediate between two dialects.” Paragraph 23 discloses “… replacement audio for each phoneme of the subset of phonemes, wherein the replacement audio replaces each phoneme of the subset of phonemes with a new phoneme with the similarity greater than the amount ….”. Because the replacement of the portion of the first audio file uses the matching portion of the second audio file, the replacement must occur subsequent to, or in response to, the selection of the matching portion of the second audio file.)
LeVoit fails to disclose alignment is performed by enabling the user to align the matching portion of the second audio file.
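	Eppolito discloses a media editing application comprising enabling a user (paragraph 178 discloses the user indicates a desire to align the highest peak of 3865 of audio waveform 3825 to the second highest peak 3870 of audio waveform 3830.) to align the matching portion of the second audio file (Paragraph 178 discloses the media editing application aligns a point on the first audio waveform to a point on the second audio clip. Paragraph 2 discloses media editing application allows for composition of pieces of media content such as video, audio, etc, where the users have the ability to edit, combine, transition, overlay and piece together different media content in a variety of manners. This indicates the user is enabled to match or combine two audio files as desired.) It would be obvious to one skilled in the art before the effective filing date of the application to modify LeVoit’s media guidance application by allowing the user the ability to align two audio signals as disclosed by … so as to allow the user to place the desired audio in the desired position, hence improving the user’s experience by increasing user autonomy and allowing the user creative freedom.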
Claim 12, LeVoit discloses adjusting a gain of the portion of the second audio track to match a gain of the portion of the first audio track. (Fig. 3, label 314,316,310,312 indicates the amount of each audio clip that is mixed into the new audio clip. Paragraph 77 discloses “The media guidance application may combine the first audio clip of each phoneme of the subset of phonemes with the second respective audio clip of the corresponding phoneme of the second accent type, wherein the first audio clip is scaled by the mixing value. … The media guidance application may perform pitch modulation, smoothing ….to ensure that the clips are combined to form a cohesive new audio clip.” Such indicates that, depending on the mixing value that can be set by the user (paragraph 78), the gain of the portion of the second audio is adjusted to match a gain of the portion of the first audio in order to form a cohesive new audio.)
Claim 13, LeVoit discloses an average gain of the portion of the second audio track is matched to an average gain of the portion of the first audio track. (Fig. 3, label 314,316,310,312 indicates the amount of each audio clip that is mixed into the new audio clip. Paragraph 77 discloses “The media guidance application may combine the first audio clip of each phoneme of the subset of phonemes with the second respective audio clip of the corresponding phoneme of the second accent type, wherein the first audio clip is scaled by the mixing value. … The media guidance application may perform pitch modulation, smoothing ….to ensure that the clips are combined to form a cohesive new 
Claim 14, LeVoit discloses adjusting a gain of a feature of a waveform of the second audio track to match a corresponding feature of a waveform of the first audio track. (Fig. 3, label 314,316,310,312 indicates the amount of each audio clip that is mixed into the new audio clip. Paragraph 77 discloses “The media guidance application may combine the first audio clip of each phoneme of the subset of phonemes with the second respective audio clip of the corresponding phoneme of the second accent type, wherein the first audio clip is scaled by the mixing value. … The media guidance application may perform pitch modulation, smoothing ….to ensure that the clips are combined to form a cohesive new audio clip.” Such indicates that, depending on the mixing value that can be set by the user (paragraph 78), the gain of a feature (accent type) of the second audio is adjusted to match or blend with the gain of a feature (accent type) of the first audio in order to form a cohesive new audio.)
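For illustration only, matching the average gain of a replacement portion to that of the portion it replaces (the subject of claims 12-13) can be sketched as follows. Neither reference nor the application specifies how gain is measured; the use of root-mean-square level as the "average gain" and the function names rms and match_average_gain are assumptions made for this sketch.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_average_gain(replacement: Sequence[float], original: Sequence[float]) -> List[float]:
    """Scale `replacement` so that its RMS level equals the RMS level of `original`."""
    source_level = rms(replacement)
    if source_level == 0.0:
        # A silent replacement cannot be scaled to any target level; return it unchanged.
        return list(replacement)
    factor = rms(original) / source_level
    return [s * factor for s in replacement]
```

A single scale factor matches the overall level only; matching the gain of an individual waveform feature, as in claim 14, would require applying such a factor to a localized window rather than to the whole portion.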
Claim 15, LeVoit discloses
	a non-transitory computer readable medium with computer readable instructions encoded thereon, wherein the computer readable instructions when processed by a processing device instruct the processing device to perform a method of editing an audio composition using an audio editing application (Paragraph 97 discloses microprocessor and execution of media guidance application stored in memory. Paragraph 83 discloses non-transitory computer readable media with instructions. Fig. 1,2,3,6,7 shows the media guidance application editing an audio composition.), the method comprising:

	receiving a second audio file that includes a second recording of speech that is indexed by a second phonetic index (Fig. 1, label 116,118 as the second audio file with a second recorded speech with a different accent. Label 116,118 shows the phonemes of the audio file. Paragraph 48 discloses the media asset 102 may contain audio with human speech.); and
	in response to a user of the audio editing application (Fig. 3,1,7) specifying a portion of the first audio file that is to be replaced (Paragraph 81 discloses “… many users desire a form of media guidance through an interface that allows users to efficiently navigate content selection and easily identify content that they may desire.” Paragraph 82 discloses “… allow users to navigate among and locate many types of content or media assets. …” Fig. 3 shows the interactive media guidance application that allows the user to select content or media assets such as the selection of a portion of the first audio, label 108. In Fig. 1, the speech including label 108, or label 108 itself (media asset), must be selected prior to converting and searching as per the limitations below.), the audio editing application (Fig. 3,1,7) automatically (Fig. 1 shows once the selection is performed, the media guidance application automatically determines matching second audio.):
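	using the first phonetic index to identify a phoneme sequence corresponding to the specified portion of the first audio file (Fig. 1, label 110,112 shows the phoneme sequence of 108, label 116,118 shows the phonetic sequence of the second audio file that matches 110,112.);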
	searching the second phonetic index for an occurrence of the phoneme sequence in the second phonetic index (paragraph 7 discloses “compare the audio properties … of the first phoneme with corresponding phonemes of different accent types.” The different accent types are considered the second audio file such as the audio file, label 116,118. Labels 116,118 correlates with the occurrence of label 108. Labels 116,118 also shows the phonetic index of the second audio.); and
	in response to locating an occurrence of the phoneme sequence in the phonetic index (Fig. 1, label 116,118 shows the matching portion of the second audio file identified subsequent to locating an occurrence (paragraph 7 discloses comparing the first phoneme sequence with the phonemes of different accent types which indicates a search for or locating the occurrence of the phoneme sequence.).):
	identifying a matching portion of the second audio file corresponding to the located occurrence of the phoneme sequence (Paragraph 7 discloses through comparison, a matching portion of the second audio file is identified. Fig. 1, label 116,118 shows the matching portion of the second audio file corresponding to the located occurrence of the phoneme sequence, label 108.); and
	in response to the user of the audio editing application selecting the matching portion of the second audio file (Paragraph 81-82 discloses an interface that allows the user to navigate content selection and paragraph 7 discloses matching portion of the second audio file with the first audio file, similarly shown in Fig. 1, label 108 as the first portion of the first audio file and labels 116,118 as the second portion of the second audio file. Since the second portion of the second audio file must match the first portion of the first audio file, wherein the user selects the media content, the user also selects the second portion of the second audio file.):
	temporally aligning the matching portion of the second audio file with the specified portion of the first audio file by matching a waveform of the matching portion of the second audio file with a waveform of the specified portion of the first audio file (Fig. 2, label 260,270 shows the alignment of a waveform of the portion of the second audio file (Fig. 1, label 116,118) matching a waveform of the specified portion of the first audio file (Fig. 1, label 110,112). Paragraph 75 discloses “The media guidance application may align a first audio clip of each phoneme of the subset of phonemes with a second respective audio clip of the corresponding phoneme of the second accent type. … To correct this, the media guidance application may shorten or lengthen one of the audio clips such that they are the same length and also align critical points (e.g. the global maximum of one audio clip may be at 1 second and another may be at 1.5 seconds).” Because alignment uses the matching portion of the second audio file, the alignment must occur subsequent to, or in response to, the selection of the matching portion of the second audio file.);
	replacing the specified portion of the first audio file that is to be replaced with the temporally aligned matching portion of the second audio file (Fig. 1, label 116,118 replaces 108 in the speech output to the user. Paragraph 2 discloses “replace phonemes and/or words determined to need modification with phonemes and/or words that are intermediate between two dialects.” Paragraph 23 discloses “… replacement audio for each phoneme of the subset of phonemes, wherein the replacement audio replaces each phoneme of the subset of phonemes with a new phoneme with the similarity greater than the amount ….”. Because the replacement of the portion of the first audio file uses the matching portion of the second audio file, the replacement must occur subsequent to, or in response to, the selection of the matching portion of the second audio file.)
	LeVoit fails to disclose alignment is performed by enabling the user to align the matching portion of the second audio file.
	Eppolito discloses a media editing application comprising enabling a user (paragraph 178 discloses the user indicates a desire to align the highest peak of 3865 of audio waveform 3825 to the second highest peak 3870 of audio waveform 3830.) to align the matching portion of the second audio file (Paragraph 178 discloses the media editing application aligns a point on the first audio waveform to a point on the second audio clip. Paragraph 2 discloses media editing application allows for composition of pieces of media content such as video, audio, etc, where the users have the ability to edit, combine, transition, overlay and piece together different media content in a variety of manners. This indicates the user is enabled to match or combine two audio files as desired.) It would be obvious to one skilled in the art before the effective filing date of the application to modify LeVoit’s media guidance application by allowing the user the ability to align two audio signals as disclosed by … so as to allow the user to place the desired audio in the desired position, hence improving the user’s experience by increasing user autonomy and allowing the user creative freedom.
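For illustration only, the claimed steps of searching a phonetic index for an occurrence of a phoneme sequence and identifying the corresponding matching portion of an audio file can be sketched as follows. The structure of the index is not described in this action; the sketch assumes a hypothetical index that is an ordered list of phonemes with start and end times in seconds, and the names IndexedPhoneme and find_phoneme_sequence are likewise hypothetical.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple


class IndexedPhoneme(NamedTuple):
    """One entry of the assumed phonetic index: a phoneme and its time span in seconds."""
    symbol: str
    start: float
    end: float


def find_phoneme_sequence(
    index: Sequence[IndexedPhoneme], query: Sequence[str]
) -> Optional[Tuple[float, float]]:
    """Return the (start, end) time span of the first exact occurrence
    of `query` in `index`, or None if the sequence does not occur."""
    symbols: List[str] = [entry.symbol for entry in index]
    wanted = list(query)
    width = len(wanted)
    if width == 0:
        return None
    for i in range(len(symbols) - width + 1):
        if symbols[i:i + width] == wanted:
            # The matching portion spans from the first matched phoneme to the last.
            return index[i].start, index[i + width - 1].end
    return None
```

The sketch performs an exact match only; a comparison across accent types of the kind the action attributes to LeVoit's paragraph 7 would need a similarity measure rather than strict equality.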

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINDA WONG whose telephone number is (571)272-6044.  The examiner can normally be reached on 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on (571) 272-7453.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LINDA WONG/Primary Examiner, Art Unit 2656