Recognition of in-ear microphone speech data using multi-layer neural networks

Author
Bulbuller, Gokhan.
Date
2006-03
Advisor
Fargues, Monique P.
Vaidyanathan, Ravi
Abstract
Speech collected through a microphone placed in front of the mouth has been the primary source of data for speech recognition, and only a few studies have used speech collected from the human ear canal. This study presents a speech recognition system, specifically an isolated word recognizer that uses speech collected from the subjects' external auditory canals via an in-ear microphone. Currently, the vocabulary is limited to seven words that can be used as control commands for a wide variety of applications. Speech segmentation is performed using the short-time signal energy parameter and the short-time energy-entropy feature (EEF), together with some heuristic assumptions. Multi-layer feedforward neural networks in two-layer and three-layer configurations are selected for the word recognition task, using real cepstral coefficients (RCs) and mel-frequency cepstral coefficients (MFCCs) extracted from each segmented utterance as characteristic features. Results show that the neural network configurations investigated are viable choices for this recognition task: with MFCCs as input features, the two-layer and three-layer networks achieve average recognition rates of 94.731% and 94.61%, respectively, on the data investigated. With RCs as features, the same configurations achieve average recognition rates of 86.252% and 86.7%, respectively.
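For illustration, the pipeline described in the abstract (heuristic endpoint detection with short-time energy and the EEF, followed by a small feedforward network over cepstral features) can be sketched as follows. This is a minimal sketch assuming NumPy, librosa, and scikit-learn as stand-ins for the thesis implementation; the frame sizes, the exact EEF formulation, the thresholds, and the hidden-layer sizes are illustrative assumptions, not values taken from the thesis.

```python
# Minimal sketch of the segmentation + word-recognition pipeline described in the
# abstract. Frame sizes, thresholds, the EEF formulation, and the network sizes
# are illustrative assumptions, not the values reported in the thesis.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def short_time_energy(x, frame_len=256, hop=128):
    """Per-frame signal energy (assumed 256-sample frames with 50% overlap)."""
    starts = range(0, len(x) - frame_len, hop)
    return np.array([np.sum(x[i:i + frame_len] ** 2) for i in starts])

def energy_entropy_feature(x, frame_len=256, hop=128, eps=1e-10):
    """Short-time energy-entropy feature (EEF): combines frame energy with the
    frame's spectral entropy so that speech (concentrated spectral energy, low
    entropy) separates from broadband noise. This is one common formulation;
    the thesis may define the combination differently."""
    eef = []
    for i in range(0, len(x) - frame_len, hop):
        frame = x[i:i + frame_len]
        energy = np.sum(frame ** 2)
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        p = spectrum / (np.sum(spectrum) + eps)      # normalized spectral distribution
        entropy = -np.sum(p * np.log2(p + eps))      # spectral entropy of the frame
        eef.append(np.sqrt(1.0 + np.abs(energy - entropy)))
    return np.array(eef)

def segment_utterance(x, frame_len=256, hop=128, threshold_ratio=0.1):
    """Heuristic endpoint detection: keep the span where both the short-time
    energy and the EEF exceed a fraction of their peaks (the 10% thresholds
    stand in for the thesis's heuristic assumptions)."""
    energy = short_time_energy(x, frame_len, hop)
    eef = energy_entropy_feature(x, frame_len, hop)
    if eef.size == 0:
        return x
    active = np.where((energy > threshold_ratio * energy.max()) &
                      (eef > threshold_ratio * eef.max()))[0]
    if active.size == 0:
        return x
    start = active[0] * hop
    end = min(len(x), (active[-1] + 1) * hop + frame_len)
    return x[start:end]

def mfcc_features(x, sr=8000, n_mfcc=13):
    """Fixed-length MFCC summary per utterance (mean over frames) so every word
    maps to the same input dimension for the network."""
    m = librosa.feature.mfcc(y=x.astype(float), sr=sr, n_mfcc=n_mfcc)
    return m.mean(axis=1)

def train_word_recognizer(utterances, labels, sr=8000):
    """Two-layer feedforward network (one hidden layer plus output layer) over
    MFCC features, standing in for the thesis's word recognizer."""
    X = np.vstack([mfcc_features(segment_utterance(u), sr) for u in utterances])
    clf = MLPClassifier(hidden_layer_sizes=(40,), max_iter=2000)
    return clf.fit(X, labels)
```

A three-layer configuration would correspond to adding a second hidden layer (e.g. hidden_layer_sizes=(40, 20)), and swapping mfcc_features for a real-cepstrum feature extractor would give the RC variant of the recognizer.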
Related items
Showing items related by title, author, creator and subject.
- Methods for phonemic recognition in speech processing.
  Hollabaugh, Jon Dale (Monterey, California: U.S. Naval Postgraduate School, 1963); Speech is one of the most inefficient methods of communication. Therefore, there has been a continuing effort to devise means to reduce the redundancy, that is, compress the bandwidth required for speech communication ...
- Diphone-based speech recognition using neural networks
  Cantrell, Mark E. (Monterey, California. Naval Postgraduate School, 1996-06); Speaker-independent automatic speech recognition (ASR) is a problem of long-standing interest to the Department of Defense. Unfortunately, existing systems are still too limited in capability for many military purposes. ...
- Isolated word recognition from in-ear microphone data using Hidden Markov Models (HMM)
  Kurcan, Remzi Serdar (Monterey, California. Naval Postgraduate School, 2006-03); This thesis is part of an ongoing larger scale research study started in 2004 at the Naval Postgraduate School (NPS) which aims to develop a speech-driven human-machine interface for the operation of semi-autonomous ...