Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is responsive to the application filed on 12/11/2020.
Claims 1-23 are pending.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4 and 15-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yasa et al. (US Patent 10,861,446 B2).
Regarding Claims 1 and 15, Yasa teaches a control method for speech interaction (see Fig.1, Fig.8 and Col.28, Line 20-22), comprising:
collecting an audio signal (see Fig.8 (802) and Col.28, Line 22-23);
detecting a wake-up word in the audio signal to obtain a wake-up word result (see Fig.8 (804,806,808) and Col.28, Line 23-31);
and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result (see Fig.7 (265,710), Fig.8 (810,812,814,820,822) and Col.28, Line 38-61, generating output data or playing a prompt to a user).
Regarding Claims 2 and 16, Yasa further teaches wherein the wake-up word result comprises a first confidence, the first confidence is configured to represent a reliability that the audio signal comprises a target wake-up word (see Fig.8 (808,810) and Col.2, Line 28-42);
and playing the prompt tone and/or executing the speech instruction in the audio signal based on the wake-up word result comprises:
executing the speech instruction in a case that the first confidence reaches a first confidence threshold (see Fig.8 (810,812) and Col.28, Line 38-42);
and playing the prompt tone in a case that the first confidence fails to reach the first confidence threshold (see Fig.8 (814,820,822), Col.2, Line 1-8 and Col.28, Line 42-53, audio prompt requesting user to repeat).
Regarding Claims 3 and 17, Yasa further teaches withholding from playing the prompt tone when executing the speech instruction in the audio signal based on the wake-up word result (see Fig.8 (810,812) and Col.28, Line 38-42, generating output data without an audio prompt to the user when the confidence score is above a threshold).
Regarding Claims 4 and 18, Yasa further teaches wherein the wake-up word result comprises a second confidence, the second confidence is configured to represent a reliability that the audio signal comprises an ordinary wake-up word (see Fig.8 (814,820,822), Col.2, Line 1-8 and Col.28, Line 42-53, audio prompt requesting user to repeat when the confidence score is below a first threshold);
and playing the prompt tone based on the wake-up word result comprises: playing the prompt tone in a case that the second confidence reaches a second confidence threshold and the first confidence fails to reach the first confidence threshold (see Fig.8 (814,820,822), Col.2, Line 1-8 and Col.28, Line 42-53, audio prompt requesting user to repeat when the confidence score is below a first threshold).
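
For illustration only, the decision logic recited in Claims 2-4 may be sketched as follows. This is a minimal sketch and is not taken from the application or from Yasa: the names WakeWordResult and decide_action, the numeric threshold values, and the "no_action" fallback are hypothetical assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class WakeWordResult:
    # Reliability that the audio signal comprises a target wake-up word.
    first_confidence: float
    # Reliability that the audio signal comprises an ordinary wake-up word.
    second_confidence: float


def decide_action(result: WakeWordResult,
                  first_threshold: float = 0.8,    # hypothetical value
                  second_threshold: float = 0.6):  # hypothetical value
    """Map a wake-up word result to an action, per the language of Claims 2-4."""
    # Claims 2 and 3: execute the speech instruction, with no prompt tone,
    # when the first confidence reaches the first confidence threshold.
    if result.first_confidence >= first_threshold:
        return "execute_instruction"
    # Claim 4: play the prompt tone when the second confidence reaches the
    # second threshold and the first confidence fails to reach the first.
    if result.second_confidence >= second_threshold:
        return "play_prompt_tone"
    # Assumption: the cited claims do not recite this case.
    return "no_action"
```

Under these assumed thresholds, a result with confidences (0.9, 0.9) leads to execution without a prompt tone, and a result with confidences (0.5, 0.7) leads to the prompt tone.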

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Yasa et al. (US Patent 10,861,446 B2) in view of Nadkar et al. (US Patent 11,164,570 B2).
Regarding Claims 5 and 19, Yasa teaches wherein the ordinary wake-up word comprises at least one target wake-up word (see Fig.2A (220) and Col.7, Line 31-37);
and detecting the wake-up word in the audio signal comprises:
performing a primary detection on the target wake-up word in the audio signal by employing a wake-up word detection model to obtain a first detection result (see Fig.2A (220) and Col.7, Line 31-37);
performing a secondary detection on the target wake-up word after the primary detection to obtain a second detection result (see Fig.8 (810,814,820) and Col.28, Line 38-52);
and determining the first confidence and the second confidence based on the first detection result and the second detection result (see Fig.8 (810,814,820) and Col.28, Line 38-52).
Yasa fails to teach performing a secondary detection on the target wake-up word with a set period.
Nadkar, however, teaches ending a user interaction with a voice assistant system after failing to receive a user voice input after a set period (see Col.6, Line 60 – Col., Line 1).
It would have been obvious for one skilled in the art, before the effective filing date of the application, to include in Yasa’s method the step of performing a secondary detection on the target wake-up word with a set period. The motivation would be to provide a set period for allowing a user to provide a voice input for the secondary detection.

Claims 6-7, 10-14 and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Yasa et al. (US Patent 10,861,446 B2) in view of Hughes et al. (US Patent 10,276,161 B2).
Regarding Claims 6 and 20, Yasa teaches the method of Claim 1, but fails to teach wherein the speech instruction is obtained by detecting a part subsequent to the wake-up word in the audio signal.
Hughes, however, teaches receiving a speech instruction by obtaining speech data subsequent to the wake-up word in the audio signal (see Fig.1 (104) and Col.9, Line 51-61).
It would have been obvious for one skilled in the art, before the effective filing date of the application, to include in Yasa’s method the step of receiving a speech instruction by obtaining speech data subsequent to the wake-up word in the audio signal. The motivation would be to process a speech command after detecting a wake or trigger word.
Regarding Claims 7 and 21, Hughes further teaches wherein the method is executed by a speech interaction terminal (see Fig.1 (104,F,G) and Col.9, Line 20-37), and executing the speech instruction in the case that the first confidence reaches the first confidence threshold comprises:
sending the audio signal comprising the target wake-up word and the speech instruction subsequent to the target wake-up word to a server, such that the server detects the wake-up word at a front part of the audio signal and the speech instruction subsequent to the wake-up word (see Fig.1 (104,120,C), Col.6, Line 20-33 and Col.9, Line 51-61);
and obtaining the speech instruction from the server and executing the speech instruction (see Fig.1 (104,120,E,G), Col.8, Line 3-10 and Col.9, Line 33-37).
Regarding Claim 10, Yasa teaches a method for controlling a speech interaction (see Fig.1, Fig.8 and Col.28, Line 20-22), comprising:
obtaining an audio signal (see Fig.8 (802) and Col.28, Line 22-23);
detecting a wake-up word at a front part of an audio signal to obtain a wake-up word result and a speech instruction result (see Fig.8 (804,806,808,810,812) and Col.28, Line 23-31);
and controlling a speech interaction terminal to play a prompt tone and/or to execute the speech instruction based on at least one of the wake-up word result and the speech instruction result (see Fig.7 (265,710), Fig.8 (810,812,814,820,822) and Col.28, Line 38-61, generating output data or playing a prompt to a user).
Yasa fails to teach detecting a speech instruction subsequent to the wake-up word.
Hughes, however, teaches receiving a speech instruction by obtaining speech data subsequent to the wake-up word in the audio signal (see Fig.1 (104) and Col.9, Line 51-61).
It would have been obvious for one skilled in the art, before the effective filing date of the application, to include in Yasa’s method the step of detecting a speech instruction subsequent to the wake-up word. The motivation would be to process a speech command after detecting a wake or trigger word.
Regarding Claim 11, Yasa further teaches wherein the wake-up word result comprises a third confidence, the third confidence is configured to represent a reliability that the audio signal comprises a target wake-up word (see Fig.8 (808,810) and Col.2, Line 28-42);
and playing the prompt tone and/or executing the speech instruction in the audio signal based on the wake-up word result comprises:
executing the speech instruction in a case that the third confidence reaches a third confidence threshold (see Fig.8 (810,812) and Col.28, Line 38-42);
and playing the prompt tone in a case that the third confidence fails to reach the third confidence threshold (see Fig.8 (814,820,822), Col.2, Line 1-8 and Col.28, Line 42-53, audio prompt requesting user to repeat).
Regarding Claim 12, Yasa further teaches wherein the wake-up word result comprises a fourth confidence, the fourth confidence is configured to represent a reliability that the front part of the audio signal comprises an ordinary wake-up word (see Fig.8 (814,820,822), Col.2, Line 1-8 and Col.28, Line 42-53, audio prompt requesting user to repeat when the confidence score is below a first threshold);
and controlling the speech interaction terminal to play the prompt tone and/or to execute the speech instruction based on the at least one of the wake-up word result and the speech instruction result comprises:
controlling the speech interaction terminal to execute the speech instruction and/or to play the prompt tone based on the speech instruction result in a case that the fourth confidence reaches the fourth confidence threshold (see Fig.8 (814,820,822), Col.2, Line 1-8 and Col.28, Line 42-53, audio prompt requesting user to repeat when the confidence score is below a first threshold);
and controlling the speech interaction terminal to send a dummy instruction in a case that the fourth confidence fails to reach the fourth confidence threshold and the third confidence fails to reach the third confidence threshold (see Fig.8 (814,816) and Col.28, Line 42-46).
Regarding Claim 13, Yasa further teaches performing wake-up word detection on a front part of a recognition text of the audio signal to obtain a wake-up word detection result (see Fig. (808) and Col.28, Line 20-31);
determining an interaction confidence of the audio signal based on a textual representation associated with the recognition text of the audio signal, the interaction confidence indicating a reliability that the audio signal is taken as the speech instruction for interacting with the speech interaction terminal (see Fig.8 (802,804,806,808,812) and Col.28, Line 20-42);
determining a match condition between the recognition text and the audio signal, the match condition indicating a level that the recognition text correctly reflects information comprised in the audio signal (see Fig.2B (258), Fig.8 (808) and Col.9, Line 28-42);
and obtaining the wake-up word result and the speech instruction result based on the interaction confidence, the match condition and the wake-up word detection result of the front part (see Fig.8 (812,814,816) and Col.28, Line 38-52).
Regarding Claim 14, Yasa further teaches wherein the method is executed by a server (see Fig.2A (120) and Col.4, Line 29-41); and obtaining the audio signal comprises: receiving the audio signal sent by the speech interaction terminal (see Fig.2A (110) and Col.4, Line 11-15).

Claims 8-9 and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Yasa et al. (US Patent 10,861,446 B2) in view of Smith et al. (US Patent 11,361,756 B2).
Regarding Claims 8 and 22, Yasa teaches the method of Claim 1, but fails to teach wherein the target wake-up word is a word with less than four syllables; and the ordinary wake-up word is a word with four or more syllables.
Smith, however, teaches differentiating keywords or phrases based on the number of syllables (see Col.30, Line 42-48).
It would have been obvious for one skilled in the art, before the effective filing date of the application, to include in Yasa’s method the step of distinguishing the target wake-up word and the ordinary wake-up word based on the number of syllables. The motivation would be to set a criterion for differentiating the target wake-up word and the ordinary wake-up word.
Regarding Claims 9 and 23, the rationale provided for the rejection of Claim is incorporated herein.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VU B HANG whose telephone number is (571)272-0582.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mohammad H. Ghayour, can be reached at (571)272-3021. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VU B HANG/
Primary Examiner, Art Unit 2672