DETAILED ACTION
Claims 1-20 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 21 January 2021 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the Examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 12 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhan et al (CN 106373558 A (from the filed IDS): hereafter – Zhan (see attached English translation)) in view of Deshmukh et al (US 2012/0323570 A1: hereafter – Deshmukh).
For claim 1, Zhan discloses a method for processing a speech, comprising:
acquiring an original speech (Zhan: Abstract — acquiring to-be-recognized speech data as an original speech);
performing speech recognition on the original speech, to obtain an original text corresponding to the original speech (Zhan: Abstract — conducting speech recognition on the speech data);
associating a speech segment in the original speech with a text segment in the original text (Zhan: Abstract — conducting abnormal speech detection on the speech data to determine abnormal speech, and marking the part corresponding to the abnormal speech, as well as providing the marked recognition text to the user (indicating an association of an abnormal speech segment in the speech with its text segment));
recognizing an abnormal segment in at least one of the original speech or the original text (Zhan: Abstract — conducting abnormal speech detection on the speech data to determine abnormal speech).
The reference of Zhan provides teaching for the detection of an abnormal segment in an original speech. It fails to provide teaching for generating a final speech after processing the text segment or the speech segment indicated by the abnormal segment.
This isn’t new to the art as the reference of Deshmukh provides:
processing at least one of the text segment indicated by the abnormal segment in the original text or the speech segment indicated by the abnormal segment in the original speech, to generate a final speech (Deshmukh: [0019] — modifying stutter regions in speech, removing them, in order to reconstruct the speech as a smooth speech signal version).
Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found it obvious to incorporate the teaching of Deshmukh into that of Zhan.
As for claim 12, apparatus claim 12 and method claim 1 are related as apparatus and the method of using same, with each claimed element’s function corresponding to the claimed method step. Deshmukh in [0043] provides a processor of a computer to implement the techniques of the claimed apparatus, and [0044] provides the required storage memory. Accordingly, claim 12 is similarly rejected under the same rationale as applied above with respect to method claim 1.
As for claim 20, computer program product claim 20 and method claim 1 are related as computer program product storing executable instructions required for performing the claimed method steps on a computer. Deshmukh in [0044] provides storage memory necessary for reading upon the limitations of this claim. Accordingly, claim 20 is similarly rejected under the same rationale as applied above with respect to method claim 1.
Claims 2, 4, 13 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zhan (CN 106373558 A) in view of Deshmukh (US 2012/0323570 A1) as applied to claim 1, further in view of RAMASWAMY et al (US 2001/0056344 A1: hereafter – Ramaswamy).
For claim 2, claim 1 is incorporated but the combination of Zhan in view of Deshmukh fails to provide teaching for this claim, for which the reference of Ramaswamy is now introduced to teach as:
the method, wherein performing the speech recognition on the original speech to obtain the original text corresponding to the original speech comprises:
using the language model in speech recognition to indicate a period of silence in the text of recognised speech, with particular tokens (the tokens being the symbols)).
The combination of Zhan in view of Deshmukh provides teaching for performing speech recognition on an original speech, but differs from the claimed invention in that the claimed invention further provides teaching for recognising a blank speech segment as a first preset symbol. This isn’t new to the art as the reference of Ramaswamy provides teaching for indicating a period of silence encountered in speech as a particular token. Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found it obvious to incorporate the teaching of Ramaswamy into that of the combination, given the predictable result that assigning symbols to a period of blank speech would make the text produced from speech recognition easy to read, while still conveying that a period of blank speech is present, as opposed to not indicating it at all.
For claim 4, claim 2 is incorporated and the combination of Zhan in view of Deshmukh further in view of Ramaswamy discloses the method, wherein the recognizing the blank speech segment as the first preset symbol, and/or recognizing the elongated tone speech segment as the second preset symbol comprises:
determining, based on a ratio of a duration of the blank speech segment to a first preset duration, a number of the first preset symbol recognized from the blank speech segment (Ramaswamy: [0078] — checking the duration for which the silence period lasts (3 seconds or more, or as chosen), to determine that a silence period token should be included (a ratio of the duration of the blank speech segment to a preset duration, such as 3 seconds, can be determined from here)); and/or
determining, based on the ratio of a duration of the elongated tone speech segment to a second preset duration, a number of the second preset symbol recognized from the elongated tone speech segment.
As for claim 13, apparatus claim 13 and method claim 2 are related as apparatus and the method of using same, with each claimed element’s function corresponding to the claimed method step. Accordingly, claim 13 is similarly rejected under the same rationale as applied above with respect to method claim 2.
As for claim 15, apparatus claim 15 and method claim 4 are related as apparatus and the method of using same, with each claimed element’s function corresponding to the claimed method step. Accordingly, claim 15 is similarly rejected under the same rationale as applied above with respect to method claim 4.
Claims 5, 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Zhan (CN 106373558 A) in view of Deshmukh (US 2012/0323570 A1) further in view of Ramaswamy (US 2001/0056344 A1) as applied to claim 1, and further in view of Ju et al (US 2020/0342860 A1: hereafter – Ju).
For claim 5, claim 4 is incorporated but the combination of Zhan in view of Deshmukh further in view of Ramaswamy fails to disclose the limitations of this claim, for which Ju is now introduced to teach as:
the method, wherein processing at least one of the text segment indicated by the abnormal segment in the original text or the speech segment indicated by the abnormal segment in the original speech to generate the final speech comprises:
scrubbing (deleting) identifying information (taken as abnormal segments) from both transcribed text and speech data representations, and then generating final speech).
The combination of Zhan in view of Deshmukh further in view of Ramaswamy provides teaching for recognising abnormal segments in at least one of an original speech or text. It differs from the claimed invention in that the claimed invention further provides asynchronously deleting the abnormal segments from both the text and speech representations before generating a final speech. This isn’t new to the art as the reference of Ju provides teaching for deleting sensitive information from both text and audio representations, and then generating a final audio. Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found it obvious to incorporate the teaching of Ju into that of the combination, given the predictable result of properly aligning the clean version of the generated transcribed text with the clean version of the audio.
For claim 7, claim 5 is incorporated and the combination of Zhan in view of Deshmukh further in view of Ramaswamy and further in view of Ju discloses the method, wherein after generating the final speech, the method further comprises:
smoothing the final speech (Deshmukh: [0019] — reconstructing a smooth speech signal).
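As for claim 16, apparatus claim 16 and method claim 5 are related as apparatus and the method of using same, with each claimed element’s function corresponding to the claimed method step. Accordingly, claim 16 is similarly rejected under the same rationale as applied above with respect to method claim 5.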
Claims 8, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhan (CN 106373558 A) in view of Deshmukh (US 2012/0323570 A1) further in view of Ramaswamy (US 2001/0056344 A1), further in view of Ju (US 2020/0342860 A1), as applied to claim 7, and further in view of XIE et al (CN 103647880 A: hereafter – Xie, see the attached English translation).
For claim 8, claim 7 is incorporated but the combination of Zhan in view of Deshmukh further in view of Ramaswamy and further in view of Ju fails to disclose the limitations of this claim, for which Xie is now introduced to teach as the method, wherein the smoothing the final speech comprises:
determining, based on a speech feature of the final speech, a dialect category corresponding to the final speech (Xie: page 5 lines 8-12 — using special syllable block areas to make corrections to a dialect pronunciation); and
correcting, based on the dialect category corresponding to the final speech, syllables in the final speech, and adjusting accents of the final speech (Xie: page 5 lines 8-17 — outputting accent modified sound signal based on making corrections to the accent and dialect pronunciations at special syllable block areas).
The combination of Zhan in view of Deshmukh further in view of Ramaswamy and further in view of Ju provides teaching for smoothing a generated final speech. It differs from the claimed invention in that the claimed invention further provides that the smoothing includes correcting and adjusting accents of the final speech. This isn’t new to the art as the reference of Xie provides such teaching above. Hence, at the time the application was effectively filed, one of ordinary skill in the art 
As for claim 18, apparatus claim 18 and method claims 7 and 8 are related as apparatus and the method of using same, with each claimed element’s function corresponding to the claimed method step. Accordingly, claim 18 is similarly rejected under the same rationale as applied above with respect to method claims 7 and 8.
As for claim 19, apparatus claim 19 and method claim 8 are related as apparatus and the method of using same, with each claimed element’s function corresponding to the claimed method step. Accordingly, claim 19 is similarly rejected under the same rationale as applied above with respect to method claim 8.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Zhan (CN 106373558 A) in view of Deshmukh (US 2012/0323570 A1) further in view of Ramaswamy (US 2001/0056344 A1) further in view of Ju (US 2020/0342860 A1) as applied to claim 5, and further in view of Kirby (GB 2 326 516 A).
For claim 10, claim 5 is incorporated, but the combination of Zhan in view of Deshmukh further in view of Ramaswamy and further in view of Ju fails to disclose the limitation of this claim, for which Kirby is now introduced to teach as the method, wherein the method further comprises:
synchronously revising, in response to detecting a revision operation on at least part of the text segment in the original text, at least part of the speech segment in the original speech associated with the revised at least part of the text segment (Kirby: Page performing an edit on the text, which automatically edits the text segment in the audio).
The combination of Zhan in view of Deshmukh further in view of Ramaswamy and further in view of Ju provides teaching for deleting a text segment of the original text and synchronously deleting the speech segment in the original speech. It differs from the claimed invention in that the claimed invention further provides that a revision of the text segment leads to a synchronous revision of the segment in the original speech. This isn’t new to the art as the reference of Kirby goes to show above. Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found it obvious to incorporate the teaching of Kirby into that of the combination, given the predictable result of performing only one series of edits that is reflected in both the text and audio data, instead of having to separately edit each of the text and audio data.
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Zhan (CN 106373558 A) in view of Deshmukh (US 2012/0323570 A1) as applied to claim 1, further in view of Lv et al (US 2010/0169096 A1: hereafter – Lv).
For claim 11, claim 1 is incorporated but the combination of Zhan in view of Deshmukh fails to disclose the limitations of this claim, for which Lv is now introduced to teach as the method, wherein the original speech is sent by a first user in an instant message application (Lv: [0015] — a system that supports instant messaging between a first user and a second user; the speech server can receive speech data from one user); and
the method further comprises:
sending the final speech to a server of the instant message application, so that the server of the instant message application sends the final speech to a second user of the speech server can forward the speech data to another user terminal through instant messaging).
The combination of Zhan in view of Deshmukh provides teaching for acquiring an original speech, but differs from the claimed invention in that the claimed invention further provides teaching for an instant messaging exchange of the speech data between a first and second user, with the speech passing through a server of the instant messaging application. This isn’t new to the art as the reference of Lv goes to show the instant messaging exchange of speech data between two users. Hence, at the time the application was effectively filed, one of ordinary skill in the art would have found it obvious to incorporate the teaching of Lv into that of the combination, given the predictable result of fostering an instant communication between different users, while ensuring that undesired speech sounds aren’t delivered to either of the users.
Allowable Subject Matter
Claims 3, 6, 9, 14 and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure.
Cassidy et al (US 2018/0136794 A1) teaches of determining sentiment based on volume and accent [0074].
Danieli (US 2006/0095262 A1) provides teaching for bleeping out offending words in live audio input data stream [0059].

Kimoto et al (US 2013/0182147 A1) teaches of defining a silence period as one that exceeds a predetermined period of time, and having a sound pressure smaller than a predetermined pressure [0115].
Miller et al (US 2019/0378499 A1) teaches of being able to determine that a region is classified as silence based on an attempt to detect human voice using an acoustic model that corresponds to human voice [0185].
Phillips et al (US 10,579,835 B1) provides teaching for associating a detected term such as ‘uhm’ with a semantic-type disfluency.
Ruby (US 2011/0144993 A1) teaches of applying semantic processing to determining disfluent utterances.
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to OLUWADAMILOLA M. OGUNBIYI whose telephone number is (571)272-4708. The Examiner can normally be reached Monday - Thursday (8:00 AM - 5:30 PM Eastern Standard Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s Supervisor, DANIEL C WASHBURN can be reached on (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
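Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.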


/OLUWADAMILOLA M OGUNBIYI/Examiner, Art Unit 2657


/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657