Górriz, J. M. and Ramírez, J. and Turias, I. and Puntonet, Carlos G. and González, J. and Lang, Elmar (2006) C-Means Clustering Applied to Speech Discrimination. In: Alexandrov, Vassil N., (ed.) Computational science - ICCS 2006: 6th international conference, Reading, UK, May 28 - 31, 2006; Proceedings, Part I. Lecture Notes in Computer Science, 3991. Springer, Berlin, pp. 649-656. ISBN 978-3-540-34380-6 ; 978-3-540-34379-0.
Full text not available from this repository.
An effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The proposed speech/pause discrimination method is based on a hard-decision clustering approach built on a set of subband log-energies. The presence of speech frames (a new cluster) is detected using a basic sequential algorithmic scheme (BSAS) with a given “distance” (in this case, geometrical distance) and a suitable threshold. The accuracy of the Cl-VAD algorithm stems from a decision function defined over a multiple-observation (MO) window of averaged subband log-energies and from modeling the noise subspace as cluster prototypes. The clustering approach is also time-efficient, which is essential in real-time VAD applications such as speech recognition. An exhaustive analysis of the Spanish SpeechDat-Car databases is conducted to assess the performance of the proposed method and to compare it with existing standard VAD methods. The results show improvements in detection accuracy over standard VADs and over a representative set of recently reported VAD algorithms.
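The sequential clustering step mentioned in the abstract can be illustrated with a minimal, generic BSAS sketch in Python. This is not the authors' implementation: the function name `bsas`, the parameters `threshold` and `max_clusters`, the running-mean prototype update, and the toy two-dimensional "log-energy" vectors are all assumptions made for illustration. The MO window and the noise-prototype modeling described in the abstract are not reproduced here.

```python
import math


def bsas(vectors, threshold, max_clusters):
    """Generic one-pass BSAS with Euclidean distance (illustrative only).

    Each vector joins its nearest cluster unless it lies farther than
    `threshold` from every prototype, in which case it opens a new
    cluster (as long as `max_clusters` has not been reached).
    """
    prototypes = []  # running mean of each cluster
    counts = []      # number of members in each cluster
    labels = []      # hard cluster decision for each input vector

    for x in vectors:
        if prototypes:
            dists = [math.dist(x, p) for p in prototypes]
            k = min(range(len(dists)), key=dists.__getitem__)
        else:
            k = None  # no cluster exists yet

        if k is None or (dists[k] > threshold and len(prototypes) < max_clusters):
            # Open a new cluster with x as its prototype.
            prototypes.append(list(x))
            counts.append(1)
            labels.append(len(prototypes) - 1)
        else:
            # Assign x to cluster k and update its prototype incrementally.
            counts[k] += 1
            prototypes[k] = [p + (xi - p) / counts[k]
                             for p, xi in zip(prototypes[k], x)]
            labels.append(k)

    return labels, prototypes


# Hypothetical feature vectors: low-energy frames near (1, 1),
# high-energy frames near (5, 5).
frames = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (0.8, 1.2), (5.2, 4.8)]
labels, prototypes = bsas(frames, threshold=2.0, max_clusters=4)
print(labels)
```

In this toy run, the third frame lies farther than the threshold from the only existing prototype, so it opens a second cluster; in the setting the abstract describes, the appearance of such a new cluster is what signals the presence of speech.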
|Item Type:||Book Section|
|Institutions:||Biology, Preclinical Medicine > Institut für Biophysik und physikalische Biochemie > Prof. Dr. Elmar Lang|
|Subjects:||500 Science > 570 Life sciences|
|Created at the University of Regensburg:||Unknown|
|Deposited On:||13 Oct 2010 05:47|
|Last Modified:||13 Oct 2010 05:47|