dc.contributor.advisor: McGhee, Robert B.
dc.contributor.advisor: Boger, Dan C.
dc.contributor.author: Cantrell, Mark E.
dc.date: June, 1996
dc.date.accessioned: 2013-04-30T22:04:54Z
dc.date.available: 2013-04-30T22:04:54Z
dc.date.issued: 1996-06
dc.identifier.uri: http://hdl.handle.net/10945/32063
dc.description.abstract: Speaker-independent automatic speech recognition (ASR) is a problem of long-standing interest to the Department of Defense. Unfortunately, existing systems are still too limited in capability for many military purposes. Most large-vocabulary systems use phonemes (individual speech sounds, including vowels and consonants) as recognition units. This research explores the use of diphones (pairings of phonemes) as recognition units. Diphones are acoustically easier to recognize because coarticulation effects between the diphone's phonemes become recognition features, rather than confounding variables as in phoneme recognition. Also, diphones carry more information than phonemes, giving the lexical analyzer two chances to detect every phoneme in the word. Research results confirm these theoretical advantages. In testing with 4490 speech samples from 163 speakers, 70.2% of 157 test diphones were correctly identified by one trained neural network. In the same tests, the correct diphone was one of the top three outputs 89.0% of the time. During word recognition tests, the correct word was detected 85% of the time in continuous speech. Of those detections, the correct diphone was ranked first 41.6% of the time and among the top six 74% of the time. In addition, new methods of pitch-based frequency normalization and network feedback-based time alignment are introduced. Both of these techniques improved recognition accuracy on male and female speech samples from all eight dialect regions in the U.S. In one test set, frequency normalization reduced errors by 34%. Similarly, feedback-based time alignment reduced another network's test set errors from 32.8% to 11.0%. [en_US]
dc.description.uri: http://archive.org/details/diphonebasedspee1094532063
dc.format.extent: 340 p. [en_US]
dc.language.iso: en_US
dc.publisher: Monterey, California. Naval Postgraduate School [en_US]
dc.title: Diphone-based speech recognition using neural networks [en_US]
dc.type: Thesis [en_US]
dc.description.recognition: NA [en_US]
dc.description.service: U.S. Marine Corps (U.S.M.C.) author [en_US]
etd.thesisdegree.name: M.S. in Systems Technology; M.S. in Computer Science [en_US]
etd.thesisdegree.level: Masters [en_US]
etd.thesisdegree.discipline: Systems Technology [en_US]
etd.thesisdegree.discipline: Computer Science [en_US]
etd.thesisdegree.grantor: Naval Postgraduate School [en_US]
dc.description.distributionstatement: Approved for public release; distribution is unlimited.