Isolated word recognition from in-ear microphone data using Hidden Markov Models (HMM)
Kurcan, Remzi Serdar
Fargues, Monique P.
This thesis is part of an ongoing larger-scale research effort, begun in 2004 at the Naval Postgraduate School (NPS), that aims to develop a speech-driven human-machine interface for operating semi-autonomous military robots in noisy operational environments. Earlier work collected a small database of isolated utterances of seven words from 20 adult subjects using an in-ear microphone. The research conducted here develops a speaker-independent isolated word recognizer for these acoustic signals based on a discrete-observation Hidden Markov Model (HMM). The recognizer is implemented in three steps. The first step performs endpoint detection and speech segmentation using short-term temporal analysis. The second step extracts speech features as static and dynamic MFCC parameters and vector-quantizes the continuous-valued feature vectors. The final step applies the discrete-observation HMM-based classifier for isolated word recognition. Experimental results show an average classification accuracy around 92.77%. The most significant result of this study is that acoustic signals originating from the speech organs and collected within the external ear canal via the in-ear microphone can be used for isolated word recognition. A second dataset, collected under low signal-to-noise ratio conditions with additive noise, yields 79% recognition accuracy with the HMM-based classifier. We also compared classification results for data collected within the ear canal and outside the mouth using the same microphone.
Average classification rates obtained for the data collected outside the mouth show significant performance degradation (down to 63%) relative to the data collected from within the ear canal (down to 86%). The ear canal dampens high frequencies; as a result, the HMM models trained on data with dampened higher frequencies do not accurately fit the data collected outside the mouth, degrading recognition performance.
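The three-step pipeline described in the abstract (endpoint detection via short-term temporal analysis, feature extraction with vector quantization, and discrete-observation HMM classification) can be sketched as follows. This is a minimal illustrative sketch, not the thesis implementation: the frame lengths, toy codebook, and toy HMM parameters are assumptions, and real MFCC extraction with static and dynamic coefficients would replace the simple energy and quantization steps shown here.

```python
import numpy as np

def short_term_energy(x, frame_len=160, hop=80):
    """Frame-wise short-term energy; simple endpoint detection keeps
    frames whose energy exceeds a threshold (step 1)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.array([np.sum(x[i * hop : i * hop + frame_len] ** 2)
                     for i in range(n)])

def vq_assign(features, codebook):
    """Vector quantization (step 2): map each continuous feature vector
    to the index of its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def forward_loglik(obs, A, B, pi):
    """Scaled forward algorithm (step 3): log-likelihood of a discrete
    observation sequence under an HMM with transition matrix A,
    emission matrix B, and initial state distribution pi."""
    alpha = pi * B[:, obs[0]]
    scale = alpha.sum()
    loglik = np.log(scale)
    alpha = alpha / scale
    for t in range(1, len(obs)):
        alpha = (alpha @ A) * B[:, obs[t]]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale
    return loglik

# Classification: score one quantized utterance against a per-word HMM
# and pick the word whose model gives the highest log-likelihood.
# The two toy word models below are illustrative, not trained parameters.
models = {
    "go":   (np.array([[0.7, 0.3], [0.0, 1.0]]),   # left-to-right transitions A
             np.array([[0.8, 0.2], [0.2, 0.8]]),   # emission probabilities B
             np.array([1.0, 0.0])),                # initial distribution pi
    "stop": (np.array([[0.9, 0.1], [0.0, 1.0]]),
             np.array([[0.2, 0.8], [0.8, 0.2]]),
             np.array([1.0, 0.0])),
}
obs = [0, 0, 1, 1]  # a vector-quantized symbol sequence
best = max(models, key=lambda w: forward_loglik(obs, *models[w]))
```

In practice one trained HMM is estimated per vocabulary word (here, per each of the seven words) via Baum-Welch, and the classifier returns the word whose model maximizes the forward likelihood of the quantized observation sequence.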
Showing items related by title, author, creator and subject.
Bulbuller, Gokhan. (Monterey California. Naval Postgraduate School, 2006-03);Speech collected through a microphone placed in front of the mouth has been the primary source of data collection for speech recognition. There are only a few speech recognition studies using speech collected from the ...
Koliousis, Dimitrios S. (Monterey California. Naval Postgraduate School, 2007-06);This study is part of an ongoing research started in 2004 at the Naval Postgraduate School (NPS) investigating the development of a human-machine interface command-and-control package for controlling robotic units in ...
Zambartas, Michail (Monterey, California. Naval Postgraduate School, 1999-09-01);This thesis presents an introduction to Hidden Markov models (HMM) and their applications to classification problems. HMMs have been used extensively to model the temporal structure and variability of speech and other ...