DETAILED ACTION
This communication is in response to the Amendments and Arguments filed on 04/08/2022.
Claims 1, 11, 16, and 21-23 are pending and have been examined.
All previous objections/rejections not mentioned in this Office Action have been withdrawn by the examiner. 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 11, and 16 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Please see the new mappings for further detail.
Claim Objections
Claims 1, 11, 16, and 23 are objected to because of the following informalities:  
Claim 1 - recites “the first audio file” in limitation 7, which has no antecedent basis in the preceding limitations;
Claims 11 and 16 - 
recite an electronic device in the preamble and several limitations, then switch to “the terminal”. In the interest of compact prosecution, the “electronic device” and “terminal” are interpreted as referring to the same device. However, it is suggested that the terminology be amended for consistency;
recite “the first audio file”, which is lacking in antecedent basis;
Claim 23 - recites “to-be-placed audio segment”, which is lacking in antecedent basis. In the interest of compact prosecution, the “to-be-replaced audio segment” and “to-be-placed audio segment” are interpreted as referring to the same audio segment. However, it is suggested that the terminology be amended for consistency.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 11, 16, and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over MacDonald (U.S. PG Pub No. 2018/0330756), hereinafter MacDonald, in view of Rossano et al. (U.S. PG Pub No. 2016/0021334), hereinafter Rossano, and further in view of Neven et al. (U.S. Patent No. 6948131), hereinafter Neven.

Regarding claims 1, 11, and 16, MacDonald teaches
(claim 1) An audio file processing method for an electronic device via a user interface of a terminal in communication with a processing device (a method for devices to edit videos, including allowing users to record takes, i.e. an audio file processing method for an electronic device, where one or more devices are connected through a network, such as a mobile device, i.e. terminal, connected to cloud hardware, such as a server, i.e. in communication with a processing device [0016],[0024],[0037], and there is a simplified user interface for the user to perform the editing, i.e. user interface [0031:1-17]), the method comprising:
(claim 11) An electronic device (one or more computer devices [0016:4-5]), comprising:
a processor (the computer devices include microprocessors [0044]); and
a memory (the computer devices include memory storage [0044]), the memory storing computer instructions executable by the processor, wherein the processor is configured to execute the computer instruction to perform a method via a user interface of the electronic device in communication with a processing device (a memory is used to provide machine instructions to a processor, i.e. memory storing computer instructions executable by the processor, where the processor executes the program, i.e. processor is configured to execute the computer instruction [0148], and where the method is for devices to edit videos, where one or more devices are connected through a network, such as a mobile device, i.e. electronic device, connected to cloud hardware, such as a server, i.e. in communication with a processing device [0016],[0024],[0037], and there is a simplified user interface for the user to perform the editing, i.e. user interface [0031:1-17]), the method including:
(claim 16) A non-transitory computer-readable storage medium storing computer program instructions executable by at least one processor to perform a method via a user interface of the electronic device in communication with a processing device (a memory is used to provide machine instructions to a processor, i.e. non-transitory computer-readable storage medium storing computer program instructions executable by at least one processor, where the processor executes the program [0148], and where the method is for devices to edit videos, where one or more devices are connected through a network, such as a mobile device, i.e. electronic device, connected to cloud hardware, such as a server, i.e. in communication with a processing device [0016],[0024],[0037], and there is a simplified user interface for the user to perform the editing, i.e. method via a user interface [0031:1-17]), the method including:

receiving, via the user interface of the terminal, a voice replacement request for a target role in a first video file (a user is allowed to select the roles, i.e. for a target role, and audio lines that they want to insert, i.e. voice replacement request, into the video, i.e. in a first video file [0018-9], where the user performs actions through the application on their device, such as a phone or other computing device, i.e. receiving, via the user interface of the terminal [0081]), by:
receiving, via the user interface of the terminal, a right-click trigger on the user interface (the user can navigate to access a drop down menu that allows for further selections, i.e. a trigger on the user interface [0088:1-22], where the user performs actions through the application on their device, such as a phone or other computing device, i.e. receiving, via the user interface of the terminal [0081]);
displaying, on the user interface, a --menu--, the --menu-- showing a list of actor roles whose voice is replaceable (the drop down menu, i.e. displaying, on the user interface, a menu, lets the user select roles to play, i.e. showing a list of actor roles, where the user can record their own take of the audio lines of the role, i.e. roles whose voice is replaceable [0018-20],[0088]), and
receiving, via the --menu-- on the user interface, a user selection of the target role from the list of actor roles as displayed (the user can navigate to a drop down menu to select, i.e. receiving, via the --menu-- on the user interface, a user selection, the role or roles they wish to play, i.e. target role from the list of actor roles as displayed [0018-20],[0088]);
sending, by the terminal, the voice replacement request to the processing device, the voice replacement request including an identifier of the first video file and an identifier of the target role as selected (the user may click a button to render the scene by inserting their recorded audio line takes into the original video, i.e. voice replacement request including an identifier of the first video file [0018-23],[0088], where a take is associated with the user selected role, i.e. identifier of the target role as selected [0089], where the server can combine the user takes recorded on the mobile device with the videos stored in the cloud, i.e. sending, by the terminal, the voice replacement request to the processing device [0037]);
receiving, by the processing device, the voice replacement request from the terminal (where a client, i.e. terminal, and server, i.e. processing device, are remote from each other and are in communication [0080], the server can combine the user takes recorded on the mobile device with the videos stored in the cloud, i.e. voice replacement, based on the user’s selection of a scene, role, and lines, i.e. receiving...the request from the terminal [0087],[0089]);
determining, by the processing device, ... a to-be-replaced audio segment in the first audio file according to the voice replacement request (the user selects the roles to play, where the user can record their own take of the audio lines of the role, i.e. according to the voice replacement request, to be inserted into the original video, i.e. in the first audio file [0018-20],[0088], and when rendering occurs, and the audio is set for dub over, the recorded audio track will replace the related track, i.e. determining...a to-be-replaced audio segment [0131:53-59], where the process can be performed on a server, i.e. by the processing device [0130:1-9]); and
replacing data in the to-be-replaced audio segment with to-be-dubbed audio data..., to obtain a second audio file (when rendering occurs, and the audio is set for dub over, the recorded audio track, i.e. to-be-dubbed audio data, will replace the related track, i.e. replacing data in the to-be-replaced audio segment, where the final output is saved for the user to review, which includes both video and audio files, i.e. obtain a second audio file [0026],[0131:53-66]); and
sending, by the processing device, the second audio file to the terminal (the final output, i.e. second audio file, is saved for the user to review [0131:53-66], where the process can be performed on a server, i.e. processing device, that is connected to a user through a computer display monitor that can display the rendered video, i.e. sending...the second audio file to the terminal [0130:1-28]).
While MacDonald provides the ability for a user to choose through a user interface which roles and lines will have the audio replaced and the time syncing of the audio, MacDonald does not specifically teach the detection of time information for the lines, and thus does not teach
determining, by the processing device, time frame information of a to-be-replaced audio segment in the first audio file...; and
replacing data in the to-be-replaced audio segment with to-be-dubbed audio data according to the time frame information....
Rossano, however, teaches determining, by the processing device, time frame information of a to-be-replaced audio segment in the first audio file ... (the prosody analysis unit compares the original audio, i.e. first audio file, with the generated baseline voice, to identify the exact speech beginning timing and speed of the sound segment, i.e. determining time frame information of each to-be-replaced audio segment [0052]); and
replacing data in the to-be-replaced audio segment with to-be-dubbed audio data according to the time frame information... (the dubbing unit ‘speaks’ the local language using TTS, using the relevant voice, on top of the video’s audio, i.e. replacing data in the to-be-replaced audio segment with the to-be-dubbed audio data, where additional adjustments are performed to comply with the given timing of the original audio, such as stretching or shrinking the dubbed speech audio, i.e. according to the time frame information [0053-5]).
MacDonald and Rossano are analogous art because they are from a similar field of endeavor in dubbing audiovisual content. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the ability for a user to choose through a user interface which roles and lines will have the audio replaced and the time syncing of the audio of MacDonald with the determination of timing for each sound segment as taught by Rossano. The motivation to do so would have been to achieve a predictable result of enabling adjustment of the timing of the dubbing audio to fit in with the original movie’s timing (Rossano [0055]).
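For illustration only, the combined operation discussed above (replacing a to-be-replaced audio segment with dubbed audio according to time frame information, with the dubbed audio stretched or shrunk to fit the original timing) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from MacDonald, Rossano, or the application: the function names `stretch_to_length` and `replace_segment`, the representation of audio as a list of samples, and the nearest-neighbour resampling are all hypothetical choices made for clarity.

```python
def stretch_to_length(dub, target_len):
    """Stretch or shrink `dub` to exactly `target_len` samples.

    Hypothetical helper: uses nearest-neighbour resampling, which is a
    simplification and not the adjustment method of any cited reference.
    An empty dub is treated as silence (zeros).
    """
    if target_len <= 0:
        return []
    if not dub:
        return [0] * target_len
    return [dub[i * len(dub) // target_len] for i in range(target_len)]


def replace_segment(samples, sample_rate, start_s, end_s, dub):
    """Return a new sample list with [start_s, end_s) replaced by `dub`.

    `start_s` and `end_s` play the role of the time frame information;
    the dubbed audio is fitted to that window so the overall length of
    the audio is unchanged.
    """
    start = int(round(start_s * sample_rate))
    end = int(round(end_s * sample_rate))
    if not (0 <= start <= end <= len(samples)):
        raise ValueError("time frame lies outside the audio")
    fitted = stretch_to_length(dub, end - start)
    return samples[:start] + fitted + samples[end:]


# Example: 1 second of silence at 10 samples/s, with 0.2 s to 0.6 s replaced.
original = [0] * 10
second_audio = replace_segment(original, 10, 0.2, 0.6, [1, 2])
```

In this sketch the two-sample dub is stretched to the four-sample window, so the output keeps the length of the original audio.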
While MacDonald in view of Rossano provides the ability for a user to choose through a user interface which roles and lines will have the audio replaced, MacDonald in view of Rossano does not specifically teach the user interface reacting to a right-click trigger to provide a pop up window, and thus does not teach
receiving, via the user interface of the terminal, a right-click trigger on the user interface;
displaying, on the user interface, a pop up window, the pop up window showing a list ....
Neven, however, teaches receiving, via the user interface of the terminal, a right-click trigger on the user interface (the application window for the communicator, i.e. terminal, of a system, has a user interface (03:26-35),(6:46-57), where the user may right click the mouse over a media target, i.e. receiving...a right-click trigger on the user interface (4:33-39));
displaying, on the user interface, a pop up window, the pop up window showing a list ... (the application window has a user interface (03:26-35),(6:46-57), where, when the user right clicks the mouse over a media target, a pop-up menu is invoked, i.e. displaying, on the user interface, a pop up window, with a list of options, i.e. showing a list (4:33-39)).
MacDonald, Rossano, and Neven are analogous art because they are from a similar field of endeavor in enabling a user to edit audiovisual media. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the user choosing through a drop-down menu of a user interface which roles and lines will have the audio replaced teachings of MacDonald, as modified by Rossano, with the use of a right-click to invoke a pop-up menu presenting options to a user as taught by Neven. The motivation to do so would have been to achieve a predictable result of enabling a user to create media for presentation to others (Neven (3:11-25)).

Regarding claim 21, MacDonald in view of Rossano and Neven teaches claim 1, and Rossano further teaches
	extracting, ..., first candidate time frame information from the to-be-replaced audio segment based on a short sentence division principle (a speech-to-text module can be run to generate text from the video sound track, i.e. audio segment [0060], and a text analysis unit identifies the next subtitle text, which can be one or more lines of text where the text is a sentence and the length of gaps and words are recognized, i.e. a short sentence division principle, and the sentence will be translated into a target language and then passed to the TTS generation unit [0042],[0050],[0057], where the prosody analysis unit compares the original audio with the generated baseline voice to identify the exact speech beginning timing and speed of the sound segment, and the post-processing unit identifies the length of gaps and length of words in a sound segment, i.e. extracting...first candidate time frame information from the to-be-replaced audio segment [0052]);
extracting, ..., second candidate time frame information from the to-be-replaced audio segment based on a long sentence division principle (a speech-to-text module can be run to generate text from the video sound track, i.e. audio segment [0060], and a text analysis unit identifies the next subtitle text, which can be one or more lines of text where the text is a sentence, i.e. a long sentence division principle, and the sentence will be translated into a target language and then passed to the TTS generation unit [0042],[0050],[0057], where the prosody analysis unit compares the original audio with the generated baseline voice to identify the exact speech beginning timing and speed of the sound segment, and the length of a sound segment in comparison to the original movie timing, such as longer or shorter, i.e. extracting...second candidate time frame information from the to-be-replaced audio segment [0052-6]); and
determining the time frame information of the to-be-replaced audio segment according to the first candidate time frame information and the second candidate time frame information (the target language speech audio, i.e. to-be-replaced audio segment, in order to match the original movie’s timing, i.e. determining the time frame information, can be homogenously stretched or shrunk as a segment, i.e. according...to the second candidate time frame information, and can have the gaps between words shortened or widened and manipulated on a different scale than that used on the actual words, i.e. according to the first candidate time frame information [0053-7]).  
Where MacDonald teaches that the processing occurs at a server [0130:1-9].
And where the motivation to combine is the same as previously presented.

Claim(s) 23 is/are rejected under 35 U.S.C. 103 as being unpatentable over MacDonald, in view of Rossano, in view of Neven, and further in view of Zhang et al. (CN105959773A), as found in the IDS, hereinafter Zhang.

	Regarding claim 23, MacDonald in view of Rossano and Neven teaches claim 1.
While MacDonald in view of Rossano and Neven provides the identification of start times for audio segments, MacDonald in view of Rossano and Neven does not specifically teach the determination of an offset of a segment start time, and thus does not teach
extracting a start time tN-1 of an N-1th audio segment before the to-be-placed audio segment; and
determining a start time tN of the Nth audio segment as tN-1 plus an offset due to a time deviation during extraction.
(Attorney Docket No. 00144.0851.OQUS; Application No. 16/844,283)
Zhang, however, teaches extracting a start time tN-1 of an N-1th audio segment before the to-be-placed audio segment (where there are 2 adjacent dubbing subsections, the starting time, i.e. start time tN-1, of the first dubbing sub-segment, i.e. an N-1th audio segment before the to-be-placed audio segment, is determined (pg 3, para 2)); and
determining a start time tN of the Nth audio segment as tN-1 plus an offset due to a time deviation during extraction (the endpoint of the first dubbing sub-segment is also determined, where the difference between the starting point and the endpoint may be interpreted as a duration, and if the end time point of the first dubbing sub-segment is not equal, i.e. due to a time deviation during extraction, to the start time point of the second dubbing sub-segment, i.e. tN of the Nth audio segment, the starting time point of the second dubbing sub-segment is determined where the two segments are connected by a preset dubbing sub-segment that is equal to the length of time between the end time point of the first dubbing sub-segment, which is a duration after the starting time of the first dubbing sub-segment, and the start time of the second dubbing sub-segment, i.e. determining...tN-1 plus an offset (pg 3, para 2)).
MacDonald, Rossano, Neven, and Zhang are analogous art because they are from a similar field of endeavor in editing audiovisual media. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the identification of start times for audio segments teachings of MacDonald, as modified by Rossano and Neven, with the use of a preset dubbing segment with a particular length to determine the start time of a second dubbing segment as taught by Zhang. The motivation to do so would have been to achieve a predictable result of sorting the timing information of multiple dubbing segments (Zhang (pg 3, para 2)).
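For illustration only, the timing relationship mapped above can be sketched as simple arithmetic: the start time tN of the Nth segment is the start time tN-1 of the preceding segment plus an offset, where, under the reading of Zhang given above, the offset is the duration of the preceding segment plus the length of the gap that the preset dubbing sub-segment fills. The function names and the example values are hypothetical and are not drawn from Zhang or from the application.

```python
def offset_from_gap(prev_start, prev_end, gap):
    """Offset between consecutive start times (all values in seconds).

    Assumption for this sketch: the offset equals the duration of the
    preceding segment plus the gap bridged by a filler segment.
    """
    return (prev_end - prev_start) + gap


def next_start_time(prev_start, offset):
    """Start time tN of the Nth segment as tN-1 plus an offset."""
    return prev_start + offset


# Example: the preceding segment runs from 10.0 s to 12.5 s and is followed
# by a 0.5 s gap before the next segment begins.
offset = offset_from_gap(10.0, 12.5, 0.5)
t_n = next_start_time(10.0, offset)
```

With these example values the offset is 3.0 seconds, so the next segment starts at 13.0 seconds.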
Allowable Subject Matter
Claim 22 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. More specifically, none of the prior art teaches or makes obvious the averaging of two start times of the same audio segment, where the start times are determined using different methods, in order to determine a final start time of the audio segment. 
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICOLE A K SCHMIEDER whose telephone number is (571)270-1474. The examiner can normally be reached 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NICOLE A K SCHMIEDER/           Examiner, Art Unit 2659   

/PIERRE LOUIS DESIR/           Supervisory Patent Examiner, Art Unit 2659