DETAILED ACTION

Notice of AIA Status

The present application, filed on September 3rd, 2021, is being examined under the first inventor to file provisions of the AIA.

Priority

Acknowledgment is made of applicant's claim for foreign priority based on an application filed in the Republic of China on 09/08/2020. It is noted that applicant has filed a certified copy of the TW109130731 application as required by 37 CFR 1.55.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1, 3, 6, 7, 9, 11, 14, 15 and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Taleb et al. (US 20120232896 A1), hereinafter referenced as Taleb.

Regarding Claim 1, Taleb teaches a voice activity detection (VAD) device capable of referring to an environment detection result and thereby selecting one of multiple VAD results as a basis for determining whether a voice activity occurs, the VAD device comprising: 
an environment detection circuit configured to process an audio input signal and thereby generate the environment detection result (Para.[0076], line 3, Fig.1, item #2 is the input audio signal. Para.[0077], lines 1-5, Fig.1, item #3 is the signal condition analyzing unit, which detects the signal condition/environment. Further, Para.[0108], lines 1-3, discloses that the voice activity detection device in Fig.1 can be formed by an integrated circuit);
a VAD circuit configured to analyze the audio input signal with multiple VAD algorithms and thereby generate the multiple VAD results (Para.[0077], lines 5-9, Fig.1, the voice activity detection apparatus comprises several VAD units, item #4-i, where i is an integer. Further, Para.[0108], lines 1-3, discloses that the voice activity detection device in Fig.1 can be formed by an integrated circuit);
and a voice activity decision circuit configured to select one of the multiple VAD results according to the environment detection result (Para.[0077], lines 14-20, Fig.1, item #5 is the voice activity decision unit).
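
For illustration only, the three-part arrangement recited in claim 1 (environment detection, multiple VAD algorithms, and selection of one result) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the thresholds, the "quiet"/"noisy" labels and the two VAD algorithms are hypothetical and are not taken from Taleb or from the application.

```python
from typing import Dict, Sequence, Tuple

Frame = Sequence[float]


def energy_vad(frame: Frame, threshold: float = 0.01) -> bool:
    """Hypothetical VAD algorithm A: mean-square energy above a threshold."""
    energy = sum(x * x for x in frame) / len(frame)
    return energy > threshold


def zero_crossing_vad(frame: Frame, max_rate: float = 0.3) -> bool:
    """Hypothetical VAD algorithm B: a low zero-crossing rate is treated as voice."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(frame) - 1, 1) < max_rate


def detect_environment(frame: Frame, noisy_level: float = 0.05) -> str:
    """Hypothetical environment detector based on mean absolute amplitude."""
    mean_abs = sum(abs(x) for x in frame) / len(frame)
    return "noisy" if mean_abs > noisy_level else "quiet"


def decide_voice_activity(frame: Frame) -> Tuple[str, str, bool]:
    """Run every VAD algorithm, then select one result by environment."""
    environment = detect_environment(frame)
    results: Dict[str, bool] = {
        "energy": energy_vad(frame),
        "zcr": zero_crossing_vad(frame),
    }
    # Assumed selection rule: trust the zero-crossing result in noisy
    # conditions and the energy result otherwise.
    selected = "zcr" if environment == "noisy" else "energy"
    return environment, selected, results[selected]
```

In this sketch all VAD results are computed for every frame and the environment label only determines which result is passed on, which mirrors the claim language of selecting one of the multiple VAD results.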

Regarding Claim 3, the VAD device of claim 2, wherein the signal analysis circuit includes at least one filter circuit configured to generate the M frequency band signal(s) of each of the L frame(s) according to the audio input signal, or the signal analysis circuit includes at least one conversion circuit configured to generate M frequency domain signal(s) of each of the L frame(s) according to the audio input signal (Para.[0096], lines 7-9, the first VAD unit can divide the received input frame into a number of sub frequency bands by using a filter).
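
For illustration only, the "conversion circuit" alternative of claim 3 (generating M frequency domain signals per frame) can be sketched with a naive discrete Fourier transform whose bins are grouped into M bands. This is a sketch under stated assumptions: the function name, the equal-width band grouping and the use of band power as the per-band signal are hypothetical and are not taken from Taleb or from the application.

```python
import cmath
from typing import List, Sequence


def band_energies(frame: Sequence[float], num_bands: int) -> List[float]:
    """Split one frame into num_bands equal-width frequency bands.

    Returns the summed DFT power of each band. Only bins below the
    Nyquist frequency are used; leftover bins that do not fill a whole
    band are dropped.
    """
    n = len(frame)
    half = n // 2
    if num_bands < 1 or num_bands > half:
        raise ValueError("num_bands must be between 1 and len(frame) // 2")

    # Naive O(n^2) DFT power spectrum for bins 0 .. half-1.
    power = []
    for k in range(half):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame)
        )
        power.append(abs(coeff) ** 2)

    size = half // num_bands
    return [sum(power[b * size:(b + 1) * size]) for b in range(num_bands)]
```

A bank of band-pass filters, as in the "filter circuit" alternative, would produce time-domain band signals instead; the DFT grouping above is chosen only because it is compact.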

Regarding Claim 6, the VAD device of claim 2, wherein the voice activity decision circuit selects one of the multiple VAD results according to a predetermined rule and a variation in the L comparison result(s) (Para.[0101], Fig.1, lines 23-27, decision unit 5 selects one of the results from VAD units 4-1, 4-2, based on a predetermined value);
the predetermined rule instructs the voice activity decision circuit to select a detection result from the multiple VAD results when the variation in the L comparison result(s) exceeds a predetermined variation range (Para.[0101], lines 8-10, if the result exceeds the threshold value, the result is set to 1);
and the predetermined rule instructs the voice activity decision circuit to select another detection result from the multiple VAD results when the variation in the L comparison result(s) does not exceed the predetermined variation range (Para.[0101], lines 8-11, if the result does not exceed the threshold value, the result is set to 0).

Regarding Claim 7, the VAD device of claim 1, wherein the environment detection circuit includes: a characteristic extraction circuit configured to process the audio input signal according to at least one characteristic extraction algorithm and thereby generate at least one noise characteristic (Para.[0012], different characteristics can be obtained by different VAD algorithms. Para.[0034], Fig.1, signal condition analyzing unit 3 analyzes the input signal with background noise fluctuation);
and a classification circuit configured to determine at least one noise type as the environment detection result according to the at least one noise characteristic (Para.[0083], Fig.1, signal condition analyzing unit 3 further analyzes a background noise fluctuation of the input signal to detect a signal condition and/or signal type of the received input signal).

Claims 9 and 15 are device and method claims performing the steps in device claim 1 above and, as such, are similar in scope and content to claim 1. Therefore, claims 9 and 15 are rejected under a rationale similar to that presented against claim 1 above.

Claim 11 is a device claim performing the steps in device claim 3 above and, as such, is similar in scope and content to claim 3. Therefore, claim 11 is rejected under a rationale similar to that presented against claim 3 above.

Claims 14 and 17 are device and method claims performing the steps in device claim 7 above and, as such, are similar in scope and content to claim 7. Therefore, claims 14 and 17 are rejected under a rationale similar to that presented against claim 7 above.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 5, 8, 10 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Taleb et al. (US 20120232896 A1), hereinafter referenced as Taleb, in view of Sehlstedt et al. (US 20120215536 A1), hereinafter referenced as Sehlstedt.

Regarding Claim 2, Taleb teaches the VAD device of claim 1. Taleb further teaches wherein the environment detection circuit includes: a signal analysis circuit configured to generate M processed signal(s) of each of L frame(s) according to the audio input signal, wherein the M processed signal(s) are M frequency band signal(s) or M frequency domain signal(s), the M is a positive integer, and the L is a frame number (Taleb: Para.[0016], line 3, input signals are divided into several sub frequency bands. Para.[0082], lines 7-11, signal condition analyzing unit 3 analyzes the input signals of each frame);
Taleb fails to explicitly teach an energy variation detection circuit configured to calculate according to the M processed signal(s) of each of the L frame(s) and thereby generate X energy variation value(s) of the L frame(s), wherein the X is equal to a product of the M and the L; and a variation information decision circuit configured to process the X energy variation value(s) to generate L energy variation detection value(s), then compare each of the L energy variation detection value(s) with a variation threshold to generate L comparison result(s), and then generate the environment detection result according to the L comparison result(s).

However, Sehlstedt does teach the claimed energy variation detection circuit configured to calculate according to the M processed signal(s) of each of the L frame(s) and thereby generate X energy variation value(s) of the L frame(s), wherein the X is equal to a product of the M and the L (Sehlstedt: Para.[0058], lines 2-4, Etot_v is the energy variation value);
and a variation information decision circuit configured to process the X energy variation value(s) to generate L energy variation detection value(s), then compare each of the L energy variation detection value(s) with a variation threshold to generate L comparison result(s), and then generate the environment detection result according to the L comparison result(s) (Sehlstedt: Para.[0017], lines 8-15).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Sehlstedt’s teaching of a method of detecting voice activity by using threshold adaptation into the method and apparatus of voice activity detection of Taleb, because this would effectively improve the handling of non-stationary background noise while maintaining the quality for speech input (Sehlstedt [0020]).

Regarding Claim 5, Taleb in view of Sehlstedt teaches the VAD device of claim 2. Sehlstedt further teaches wherein the variation information decision circuit adds up the M energy variation value(s) of each of the L frame(s) in connection with the X energy variation value(s) and thereby generates the L energy variation detection value(s) (Sehlstedt: Para.[0058], lines 1-4, energy variation between frames is measured).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Sehlstedt’s teaching of a method of detecting voice activity by using threshold adaptation into the method and apparatus of voice activity detection of Taleb, because this would effectively improve the handling of non-stationary background noise while maintaining the quality for speech input (Sehlstedt [0020]).
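
For illustration only, the limitations of claims 2 and 5 discussed above (X = M x L energy variation values, summed per frame into L detection values, each compared with a variation threshold) can be sketched as follows. This is a sketch under stated assumptions: the function name, the "varying"/"steady" labels and the majority rule that maps the L comparison results to an environment detection result are hypothetical and are not taken from Taleb, Sehlstedt or the application.

```python
from typing import List, Sequence, Tuple


def environment_from_variations(
    variation: Sequence[Sequence[int]],
    variation_threshold: float,
) -> Tuple[List[int], List[bool], str]:
    """variation holds L rows (frames) of M band values, X = L * M in total."""
    # Claim 5: add up the M energy variation values of each frame,
    # giving L energy variation detection values.
    detection_values = [sum(row) for row in variation]

    # Claim 2: compare each detection value with the variation threshold,
    # giving L comparison results.
    comparisons = [value > variation_threshold for value in detection_values]

    # Assumed final step: report a varying environment when more than
    # half of the frames exceed the threshold.
    varying = 2 * sum(comparisons) > len(comparisons)
    return detection_values, comparisons, "varying" if varying else "steady"
```

With four frames of three bands each, for example `[[1, 1, 0], [0, 0, 0], [1, 1, 1], [1, 0, 1]]` and a threshold of 1, three of the four frames exceed the threshold and the sketch reports a varying environment.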

Regarding Claim 8, Taleb teaches the VAD device of claim 7. Taleb further teaches wherein the voice activity decision circuit selects one of the multiple VAD results according to a predetermined rule and the at least one noise type (Taleb: Para.[0101], Fig.1, lines 23-27, decision unit 5 selects one of the results from VAD units 4-1, 4-2, based on a predetermined value);
Taleb fails to explicitly teach the predetermined rule instructs the voice activity decision circuit to select a detection result from the multiple VAD results when the at least one noise type includes a non-stationary noise type; and the predetermined rule instructs the voice activity decision circuit to select another detection result from the multiple VAD results when the at least one noise type includes a stationary noise type.

However, Sehlstedt does teach the claimed predetermined rule that instructs the voice activity decision circuit to select a detection result from the multiple VAD results when the at least one noise type includes a non-stationary noise type (Sehlstedt: Para.[0017], lines 10-13, a predetermined value named adaptive threshold is disclosed. Para.[0019], with a reliable VAD threshold adaptation function it is possible to better characterize the input noise. Further, in Para.[0029], lines 6-7, the noise type can be non-stationary);
and the predetermined rule instructs the voice activity decision circuit to select another detection result from the multiple VAD results when the at least one noise type includes a stationary noise type (Sehlstedt: Para.[0017], lines 10-13, a predetermined value named adaptive threshold is disclosed. Para.[0019], with a reliable VAD threshold adaptation function it is possible to better characterize the input noise. Further, in Para.[0030], line 5, the noise type can be stationary).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Sehlstedt’s teaching of a method of detecting voice activity by using threshold adaptation into the method and apparatus of voice activity detection of Taleb, because this would effectively improve the handling of non-stationary background noise while maintaining the quality for speech input (Sehlstedt [0020]).
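
For illustration only, the noise-type classification of claim 7 and the noise-type-based selection rule of claim 8 can be sketched as follows. This is a sketch under stated assumptions: the coefficient-of-variation characteristic, the 0.5 limit, the rule table and the names `vad_a` and `vad_b` are hypothetical and are not taken from Taleb, Sehlstedt or the application.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def noise_characteristic(frame_energies: Sequence[float]) -> float:
    """Hypothetical characteristic: coefficient of variation of frame energy."""
    average = mean(frame_energies)
    return pstdev(frame_energies) / average if average > 0 else 0.0


def classify_noise(characteristic: float, limit: float = 0.5) -> str:
    """Map the characteristic to a noise type (assumed two-class scheme)."""
    return "non_stationary" if characteristic > limit else "stationary"


# Assumed predetermined rule: which VAD result to use for each noise type.
RULE: Dict[str, str] = {"non_stationary": "vad_b", "stationary": "vad_a"}


def select_result(vad_results: Dict[str, bool], noise_type: str) -> bool:
    """Select one of the multiple VAD results according to the noise type."""
    return vad_results[RULE[noise_type]]
```

Keeping the rule in a lookup table makes the claimed behavior explicit: one detection result is selected for a non-stationary noise type and another for a stationary noise type.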

Claims 10 and 13 are device and method claims performing the steps in device claims 2 and 5 above and, as such, are similar in scope and content to claims 2 and 5. Therefore, claims 10 and 13 are rejected under a rationale similar to that presented against claims 2 and 5 above.



Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Taleb et al. (US 20120232896 A1), hereinafter referenced as Taleb, in view of Sehlstedt et al. (US 20120215536 A1), hereinafter referenced as Sehlstedt, further in view of Gao et al. (US 20150372723 A1), hereinafter referenced as Gao.

Regarding Claim 4, Taleb in view of Sehlstedt teaches the VAD device of claim 2. Taleb further teaches wherein the energy variation detection circuit performs a plurality of steps including: calculating according to the M processed signal(s) of each of the L frame(s) and thereby obtaining X signal energy value(s) (Taleb: Para.[0037], the signal condition analyzing unit determines the input signal and an energy metric);
Taleb in view of Sehlstedt fails to explicitly teach calculating X short-term energy value(s) according to the X signal energy value(s) and a short-term frame number, and calculating X long-term energy value(s) according to the X signal energy value(s) and a long-term frame number; obtaining X energy correlation value(s) according to the X short-term energy value(s) and the X long-term energy value(s); and comparing each of the X energy correlation value(s) with an energy threshold and thereby generating the X energy variation value(s).

However, Gao does teach the claimed calculating X short-term energy value(s) according to the X signal energy value(s) and a short-term frame number, and calculating X long-term energy value(s) according to the X signal energy value(s) and a long-term frame number (Gao: Para.[0031], lines 32-36, Fig.5, in process 522, short-term and long-term energies are calculated);
obtaining X energy correlation value(s) according to the X short-term energy value(s) and the X long-term energy value(s); and comparing each of the X energy correlation value(s) with an energy threshold and thereby generating the X energy variation value(s) (Gao: Para.[0026], lines 4-7, two energy ratios of high and low band are determined. Lines 11-16, the ratios are compared to energy threshold).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gao’s teaching of a method and apparatus for mitigating feedback in a digital radio receiver by determining an energy level in each of a plurality of frequency bands of the digital audio signal into the method and apparatus of voice activity detection of Taleb in view of Sehlstedt, because this would effectively improve the detection of voice activity by determining the time-smoothed maximum and minimum energies in the very low band (Gao [0016], [0029]).
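
For illustration only, the steps recited in claim 4 (short-term and long-term energy over different frame numbers, an energy correlation value, and a comparison with an energy threshold) can be sketched as follows. This is a sketch under stated assumptions: the moving-average form of the short-term and long-term energies, the use of their ratio as the "energy correlation value", the default parameter values and all names are hypothetical and are not taken from Taleb, Sehlstedt, Gao or the application.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int, index: int) -> float:
    """Average of up to `window` values ending at position `index`."""
    start = max(0, index - window + 1)
    chunk = values[start:index + 1]
    return sum(chunk) / len(chunk)


def energy_variation_values(
    energies: Sequence[Sequence[float]],
    short_frames: int = 2,
    long_frames: int = 8,
    energy_threshold: float = 2.0,
) -> List[List[int]]:
    """energies holds L rows (frames) of M band energies, X = L * M values.

    Returns an L x M grid of 0/1 energy variation values.
    """
    num_frames = len(energies)
    num_bands = len(energies[0]) if num_frames else 0
    output: List[List[int]] = []
    for frame in range(num_frames):
        row = []
        for band in range(num_bands):
            history = [energies[k][band] for k in range(frame + 1)]
            short_term = moving_average(history, short_frames, frame)
            long_term = moving_average(history, long_frames, frame)
            # Assumed correlation value: ratio of short-term to long-term energy.
            ratio = short_term / long_term if long_term > 0 else 0.0
            row.append(1 if ratio > energy_threshold else 0)
        output.append(row)
    return output
```

A sudden jump in one band raises the short-term average faster than the long-term average, so the ratio crosses the threshold only in the frames where the energy changes abruptly.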

Claims 12 and 16 are device and method claims performing the steps in device claim 4 above and, as such, are similar in scope and content to claim 4. Therefore, claims 12 and 16 are rejected under a rationale similar to that presented against claim 4 above.

Conclusion

Listed below is the prior art made of record and not relied upon but considered pertinent to applicant's disclosure.
Wahab et al. (US 20080249771 A1) An efficient voice activity detection method and system suitable for real-time operation in low SNR (signal-to-noise) environments corrupted by non-Gaussian non-stationary background noise. The method utilizes rank order statistics to generate a binary voice detection output based on deviations between a short-term energy magnitude signal and a short-term noise reference signal. The method does not require voice-free training periods to track the background noise nor is it susceptible to rapid changes in overall noise level making it very robust. In addition, a long-term adaptation mechanism is applied to reject harmonic or tonal interference. [Abstract]
Wang et al. (US 20100088094 A1) A voice activity detection (VAD) device and method are disclosed, so that the VAD threshold can be adaptive to the background noise variation. The VAD device includes: a background analyzing unit, adapted to: analyze background noise features of a current signal according to an input VAD judgment result, obtain parameters related to the background noise variation, and output these parameters; a VAD threshold adjusting unit, adapted to: obtain a bias of the VAD threshold according to the parameters output by the background analyzing unit, and output the bias of the VAD threshold; and a VAD judging unit, adapted to: modify a VAD threshold to be modified according to the bias of the VAD threshold output by the VAD threshold adjusting unit, judge the background noise by using the modified VAD threshold, and output a VAD judgment result. [Abstract]
Mattila et al. (US 6,810,273 B1) A method of noise suppression to suppress noise in a signal containing background noise in a communications path between a cellular communications network and a mobile terminal. The method comprises the steps of: estimating and up-dating a spectrum of the background noise; using the background noise spectrum to suppress noise in the signal; generating an indication to indicate the operation of at least one of a discontinuous transmission unit (DTX) and a bad frame handling unit (BFI); and freezing estimating and up-dating of the spectrum of the background noise when the indication is present. [Abstract]
Tang et al. (US 11450336 B1) A system and method are described for automatic acoustic feedback cancellation in real time. In some implementations, the system may receive audio data describing an audio signal, which the system may use to determine a set of frames of the audio signal. Spectral analysis may be performed on the one or more frames of the audio to detect spectral patterns of two or more frames indicative of acoustic feedback. An additional delay identification test may be performed to identify a consistent delay indicative of acoustic feedback. In some implementations, a state machine is advanced based in part on accumulated delay votes. Decisions can be made to mute the acoustic feedback and cease the muting operation when silence is detected. [Abstract]

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NADIRA SULTANA whose telephone number is (571)272-4048. The examiner can normally be reached M-F, 7:30 am-5:00 pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/N.S./Examiner, Art Unit 2658

/VIJAY B CHAWAN/Primary Examiner, Art Unit 2658