Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on February 03, 2021 and August 02, 2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 4-5, 9, and 12-13 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Huffman (U.S. Publication No. 20180342258).
Regarding claim 1, Huffman discloses an audio signal processing evaluation method, comprising:
performing an audio signal processing on a synthesized audio signal so as to generate a processed audio signal, wherein the synthesized audio signal is generated by adding a second signal to a main signal, the main signal is merely a speech signal, and the audio signal processing is related to filtering the secondary signal from the synthesized audio signal (Figure 3 – 304 - Filter speech sample 105 into analytical audio segments [0034] - the source voice 102 may be a synthesized voice. [0097] - the target voice 104 is refined through training or the addition of more voices to the vector space 112);
obtaining a sound characteristics of the processed audio signal and the main signal respectively, wherein the sound characteristics comprises text content, and the text content is generated by performing speech-to-text on the processed audio signal and the main signal ([0009] - at least one voice characteristic of the timbre data as a variable [0114] - the generator 140 can provide data that represents a speech segment (e.g., from a text input));
and evaluating the audio signal processing based on a comparison result between the sound characteristics of a processed audio signal and the main signal, wherein the comparison result comprises correctness of the text content of the processed audio signal corresponding to the main signal (Figure 11 – 1108 - Compare authentic voice profile to voice profiles in vector space 112 [0099] - The discriminator 142 may do this by comparing the candidate speech segment 146 to the target timbre 104 and other voices mapped in the vector space 112 (i.e., with reference to a plural timbre data of a plurality of different voices)).
Regarding claim 4, Huffman discloses the method for evaluating audio signal processing, wherein the sound characteristics further comprises a voiceprint feature, and evaluating the audio signal processing comprises: comparing a voiceprint similarity of the voiceprint feature between the processed audio signal and of the main signal, wherein the comparison result further comprises the voiceprint similarity (Figure 11 – 1108 - Compare authentic voice profile to voice profiles in vector space 112 [0099] - The discriminator 142 may do this by comparing the candidate speech segment 146 to the target timbre 104 and other voices mapped in the vector space 112 (i.e., with reference to a plural timbre data of a plurality of different voices)).
Regarding claim 5, Huffman discloses the audio signal processing evaluation method, wherein evaluating the audio signal processing comprises:
determining that the higher the voiceprint similarity and the higher the correctness of the text content corresponding to a better evaluation result ([0101] - The discriminator 142 may, for example, start by comparing the fundamental frequency of certain phones to see which possible timbre is most clearly (i.e., has the highest probability of being) the match. As described previously, there are more characteristics that define the timbre other than fundamental frequency);
and determining that the lower the voiceprint similarity or the lower the correctness of the text content corresponding to a poorer evaluation result ([0101] - The discriminator 142 may, for example, start by comparing the fundamental frequency of certain phones to see which possible timbre is most clearly (i.e., has the highest probability of being) the match. As described previously, there are more characteristics that define the timbre other than fundamental frequency).
Regarding claim 9, Huffman discloses an audio signal apparatus for processing evaluation, comprising:
a storage, storing program code ([0133] - such instructions may be stored in any memory device);
and a processor, coupled to the storage, loading the program code to be configured for ([0046] - the voice extractor 112 may be implemented using a plurality of microprocessors executing firmware):
performing an audio signal processing on a synthesized audio signal so as to generate a processed audio signal, wherein the synthesized audio signal is generated by adding a second signal to a main signal, the main signal is merely a speech signal, and the audio signal processing is related to filtering the secondary signal from the synthesized audio signal (Figure 3 – 304 - Filter speech sample 105 into analytical audio segments [0034] - the source voice 102 may be a synthesized voice. [0097] - the target voice 104 is refined through training or the addition of more voices to the vector space 112);
obtaining a sound characteristics of the processed audio signal and the main signal respectively, wherein the sound characteristics comprises text content, and the text content is generated by performing speech-to-text on the processed audio signal and the main signal ([0009] - at least one voice characteristic of the timbre data as a variable [0114] - the generator 140 can provide data that represents a speech segment (e.g., from a text input));
and evaluating the audio signal processing based on a comparison result between the sound characteristics of a processed audio signal and the main signal, wherein the comparison result comprises correctness of the text content of the processed audio signal corresponding to the main signal (Figure 11 – 1108 - Compare authentic voice profile to voice profiles in vector space 112 [0099] - The discriminator 142 may do this by comparing the candidate speech segment 146 to the target timbre 104 and other voices mapped in the vector space 112 (i.e., with reference to a plural timbre data of a plurality of different voices)).
Dependent claims 12-13 are analogous in scope to claims 4-5, and are rejected according to the same reasoning.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-3, 6-7, 10-11, and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Huffman (U.S. Publication No. 20180342258) in view of Xie (U.S. Publication No. 20210152534).
Regarding claim 2, Huffman discloses all of the limitations as in claim 1, above.
However, Huffman does not disclose the method for evaluating audio signal processing, wherein evaluating the audio signal processing comprises:
comparing a character difference between the text content of the processed audio signal and of the main signal, wherein the character difference is related to whether corresponding characters in the text content are the same;
and determining a text correctness rate of the processed audio signal relative to the main signal based on the character difference, wherein the correctness of the text content is related to the text correctness rate.
Xie does teach the method for evaluating audio signal processing, wherein evaluating the audio signal processing comprises:
comparing a character difference between the text content of the processed audio signal and of the main signal, wherein the character difference is related to whether corresponding characters in the text content are the same ([0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms);
and determining a text correctness rate of the processed audio signal relative to the main signal based on the character difference, wherein the correctness of the text content is related to the text correctness rate ([0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Huffman to incorporate the teachings of Xie in order to implement the method for evaluating audio signal processing, wherein evaluating the audio signal processing comprises: comparing a character difference between the text content of the processed audio signal and of the main signal, wherein the character difference is related to whether corresponding characters in the text content are the same; and determining a text correctness rate of the processed audio signal relative to the main signal based on the character difference, wherein the correctness of the text content is related to the text correctness rate. Doing so allows the authentication of users when there is a high accuracy of the content (Xie [0094]).
Regarding claim 3, Huffman in view of Xie teaches all of the limitations as in claim 2, above.
However, Huffman does not disclose the method for evaluating audio signal processing, wherein the text correctness rate is a ratio of a number of identical texts to all characters of the text content.
Xie does teach the method for evaluating audio signal processing, wherein the text correctness rate is a ratio of a number of identical texts to all characters of the text content ([0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Huffman to incorporate the teachings of Xie in order to implement the method for evaluating audio signal processing, wherein the text correctness rate is a ratio of a number of identical texts to all characters of the text content. Doing so allows the authentication of users when there is a high accuracy of the content (Xie [0094]).
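For illustration only, the two character-level measures discussed for claims 2-3 may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application, Huffman, or Xie: the function names text_correctness_rate and levenshtein are hypothetical, the rate is assumed to be the number of position-wise identical characters divided by the number of characters in the reference text, and the Levenshtein distance is the standard edit distance of the kind named in Xie [0093].

```python
def text_correctness_rate(reference: str, hypothesis: str) -> float:
    """Assumed reading of the claimed ratio: position-wise identical
    characters divided by all characters of the reference text."""
    if not reference:
        # Assumption: two empty texts agree fully; otherwise no agreement.
        return 1.0 if not hypothesis else 0.0
    # zip stops at the shorter text, so missing characters count as mismatches.
    matches = sum(1 for r, h in zip(reference, hypothesis) if r == h)
    return matches / len(reference)


def levenshtein(a: str, b: str) -> int:
    """Standard edit distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

Under these assumptions, the position-wise ratio penalizes every character after an insertion or deletion, whereas the edit distance counts such a shift as a single difference; which behavior the claimed "text correctness rate" intends is not settled by this sketch.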
Regarding claim 6, Huffman discloses all of the limitations as in claim 5, above.
However, Huffman does not disclose the audio signal processing evaluation method, wherein the voiceprint similarity is related to a distance between a characteristic vector of the processed audio signal and of the main signal, the characteristic vector is converted by the voiceprint feature, and evaluating the audio signal processing comprises:
determining the closer the distance as the higher the voiceprint similarity;
and determining the farther the distance as the lower the voiceprint similarity.
Xie does teach the audio signal processing evaluation method, wherein the voiceprint similarity is related to a distance between a characteristic vector of the processed audio signal and of the main signal, the characteristic vector is converted by the voiceprint feature, and evaluating the audio signal processing comprises:
determining the closer the distance as the higher the voiceprint similarity (Figure 7 - Perform conversation with user to confirm each sub-phrase determined to have low correctness 750 [0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms [0094] - If the comparison indicates high accuracy of the information provided by the user during the conversation, the authentication module 150 authenticates the user);
and determining the farther the distance as the lower the voiceprint similarity (Figure 7 - Perform conversation with user to confirm each sub-phrase determined to have low correctness 750 [0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms [0094] - If the comparison indicates high accuracy of the information provided by the user during the conversation, the authentication module 150 authenticates the user).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Huffman to incorporate the teachings of Xie in order to implement the audio signal processing evaluation method, wherein the voiceprint similarity is related to a distance between a characteristic vector of the processed audio signal and of the main signal, the characteristic vector is converted by the voiceprint feature, and evaluating the audio signal processing comprises: determining the closer the distance as the higher the voiceprint similarity; and determining the farther the distance as the lower the voiceprint similarity. Doing so allows the authentication of users when there is a high accuracy of the content (Xie [0094]).
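For illustration only, the claim 6 relationship between vector distance and voiceprint similarity (closer means more similar, farther means less similar) may be sketched as follows. This is a sketch under stated assumptions and is not drawn from the application or the cited references: the names euclidean_distance and voiceprint_similarity are hypothetical, Euclidean distance is assumed as the distance between characteristic vectors, and the mapping 1 / (1 + distance) is merely one monotonically decreasing choice.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Distance between two characteristic vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("characteristic vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def voiceprint_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Assumed mapping from distance to a similarity in (0, 1]:
    the closer the vectors, the higher the similarity."""
    return 1.0 / (1.0 + euclidean_distance(u, v))
```

Any strictly decreasing function of the distance would satisfy the claimed ordering equally well; the particular mapping above was chosen only because identical vectors then yield a similarity of exactly 1.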
Regarding claim 7, Huffman in view of Xie teaches all of the limitations as in claim 6, above.
However, Huffman does not disclose the audio signal processing evaluation, wherein evaluating the audio signal processing comprises:
comparing a character difference between the text content of the processed audio signal and of the main signal, wherein the character difference is related to whether corresponding characters in the text content are the same;
determining a text correctness rate of the processed audio signal relative to the main signal based on the character difference, wherein the correctness of the text content is related to the text correctness rate;
determining the closer the distance and the higher the text correctness rate as the higher the voiceprint similarity;
and determining the farther the distance or the lower the text correctness rate as the lower the voiceprint similarity.
Xie does teach the audio signal processing evaluation, wherein evaluating the audio signal processing comprises:
comparing a character difference between the text content of the processed audio signal and of the main signal, wherein the character difference is related to whether corresponding characters in the text content are the same ([0018] - The system determines correctness of each sub-phrase (correctness of a phrase/sub-phrase is also referred to herein as the accuracy of the phrase/sub-phrase) [0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms);
determining a text correctness rate of the processed audio signal relative to the main signal based on the character difference, wherein the correctness of the text content is related to the text correctness rate ([0018] - The system determines correctness of each sub-phrase (correctness of a phrase/sub-phrase is also referred to herein as the accuracy of the phrase/sub-phrase) [0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms);
determining the closer the distance and the higher the text correctness rate as the higher the voiceprint similarity (Figure 7 - Perform conversation with user to confirm each sub-phrase determined to have low correctness 750 [0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms [0094] - If the comparison indicates high accuracy of the information provided by the user during the conversation, the authentication module 150 authenticates the user);
and determining the farther the distance or the lower the text correctness rate as the lower the voiceprint similarity (Figure 7 - Perform conversation with user to confirm each sub-phrase determined to have low correctness 750 [0093] - The authentication module 150 determines a measure of the overall correctness of the phrase provided by the user based on the conversation with the user… The authentication module 150 may determine a degree of mismatch between two phrases based on a distance metric, for example, Levenshtein distance or any aggregate value representing differences in characters of the two terms [0094] - If the comparison indicates high accuracy of the information provided by the user during the conversation, the authentication module 150 authenticates the user).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Huffman to incorporate the teachings of Xie in order to implement the audio signal processing evaluation, wherein evaluating the audio signal processing comprises: comparing a character difference between the text content of the processed audio signal and of the main signal, wherein the character difference is related to whether corresponding characters in the text content are the same; determining a text correctness rate of the processed audio signal relative to the main signal based on the character difference, wherein the correctness of the text content is related to the text correctness rate; determining the closer the distance and the higher the text correctness rate as the higher the voiceprint similarity; and determining the farther the distance or the lower the text correctness rate as the lower the voiceprint similarity. Doing so allows the authentication of users when there is a high accuracy of the content (Xie [0094]).
Dependent claims 10-11 and 14-15 are analogous in scope to claims 2-3 and 6-7 respectively, and are rejected according to the same reasoning.
Allowable Subject Matter
Claims 8 and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. 
The following is a statement of reasons for the indication of allowable subject matter: The prior art of record does not teach or render obvious the limitation “determining a completeness of the processed audio signal based on [equation image media_image1.png, not reproduced in this text], a is a variable adjustment parameter, I is the completeness and related to an evaluation result of the audio signal processing, d1 is the distance, and d2 is the text correctness rate” as claimed.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Lee (U.S. Publication No. 20190385608) teaches intelligent voice recognizing method, apparatus, and intelligent computing device. Ishikawa (U.S. Publication No. 20120022676) teaches audio signal processing apparatus, audio coding apparatus, and audio decoding apparatus. Ossowski (U.S. Publication No. 20190051299) teaches method and system of audio false keyphrase rejection using speaker recognition. Shukla (U.S. Publication No. 20190279642) teaches system and method for speech understanding via integrated audio and visual based speech recognition. Yamamoto (U.S. Publication No. 20150120307) teaches signal processing apparatus and signal processing method, encoder and encoding method, decoder and decoding method, and program.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETHAN DANIEL KIM whose telephone number is (571) 272-1405.  The examiner can normally be reached on Monday - Friday 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ETHAN DANIEL KIM/
Examiner, Art Unit 2658
/RICHEMOND DORVIL/
Supervisory Patent Examiner, Art Unit 2658