Zusammenfassung
In this paper we propose a voice activity detection (VAD) algorithm for improving speech recognition performance in noisy environments. The approach is based on statistical tests applied to multiple observation window based on the determination of the speech/non-speech bispectra by means of third order auto-cumulants. This algorithm differs from many others in the way the decision rule is ...
Zusammenfassung
In this paper we propose a voice activity detection (VAD) algorithm for improving speech recognition performance in noisy environments. The approach is based on statistical tests applied to multiple observation window based on the determination of the speech/non-speech bispectra by means of third order auto-cumulants. This algorithm differs from many others in the way the decision rule is formulated (detection tests) and the domain used in this approach (bispectrum). It is shown that application of statistical detection test leads to a better separation of the speech and noise distributions, thus allowing a more effective discrimination and a tradeoff between complexity and performance. The experimental analysis carried out on the AURORA databases and tasks provides an extensive performance evaluation together with an exhaustive comparison to the standard VADs such as ITU G.729, GSM AMR and ETSI AFE for distributed speech recognition (DSR), and other recently reported VADs. Clear improvements in Speech Recognition are obtained when the proposed VAD is used as a part of a ASR system.