Methodologies for improving the g2p conversion of Dutch names
|Title||Methodologies for improving the g2p conversion of Dutch names|
|Year of Publication||2006|
|Conference Name||Summer Meeting on Corpus-based Research|
|Authors||van den Heuvel, Henk|
|Publisher||Nederlandse Vereniging voor Fonetische Wetenschappen|
|Conference Location||Nijmegen, The Netherlands|
Names pose particular problems for grapheme-to-phoneme (g2p) converters because their orthography is often non-standard, owing to foreign origin or the fossilisation of older spelling forms. The Autonomata project studies a variety of techniques to improve the g2p conversion of Dutch names, more specifically first names, second names, street names and town names. In Autonomata, a standard g2p converter is augmented with a name-specific phoneme-to-phoneme (p2p) converter that captures the peculiarities of names. The p2p converter is trained on large collections of names with manually verified phonetic transcriptions, which supply the specific information it requires. Various inductive and deductive approaches are studied to achieve this goal. We will exemplify our approach by showing results on the g2p conversion of Dutch first names.
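The cascade described in the abstract (a generic g2p pass followed by a name-specific p2p correction learned from verified transcriptions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the Autonomata system: the letter-to-phoneme table `TOY_G2P`, the one-to-one alignment with `_` marking a deleted phoneme, the left-context rule format, and the three toy names with their transcriptions.

```python
from collections import Counter, defaultdict

# Hypothetical toy g2p table: one default phoneme per letter.
# A real g2p converter is far richer; this only stands in for "the standard g2p".
TOY_G2P = {"c": "k", "h": "h", "r": "r", "i": "i", "s": "s",
           "t": "t", "a": "a", "e": "e", "l": "l"}


def g2p(name):
    """Generic first pass: map each letter to its default phoneme."""
    return [TOY_G2P.get(ch, ch) for ch in name.lower()]


def train_p2p(examples):
    """Learn corrections from (name, verified transcription) pairs.

    Assumption: the g2p output and the verified transcription have the same
    length, so they align one-to-one ("_" marks a phoneme to be deleted).
    Each rule maps (previous g2p phoneme, current g2p phoneme) to the
    verified phoneme seen most often in that context.
    """
    counts = defaultdict(Counter)
    for name, verified in examples:
        predicted = g2p(name)
        if len(predicted) != len(verified):
            continue  # the toy alignment cannot handle this pair
        prev = "#"  # word-boundary marker
        for p, v in zip(predicted, verified):
            counts[(prev, p)][v] += 1
            prev = p
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def p2p(phonemes, rules):
    """Second pass: rewrite the g2p output with the learned rules."""
    out = []
    prev = "#"
    for p in phonemes:
        target = rules.get((prev, p), p)  # unseen context: keep the phoneme
        if target != "_":                 # "_" means: drop this phoneme
            out.append(target)
        prev = p
    return out


def transcribe(name, rules):
    """Full cascade: generic g2p followed by name-specific p2p."""
    return p2p(g2p(name), rules)


# Invented training data, for illustration only.
training = [
    ("Chris", ["k", "_", "r", "i", "s"]),
    ("Christa", ["k", "_", "r", "i", "s", "t", "a"]),
]
rules = train_p2p(training)
print(transcribe("Christel", rules))
```

In this sketch the p2p stage learns that the "h" produced by the generic pass should be dropped after "k", and applies that correction to a name it has not seen. The generic converter stays untouched, and all name-specific knowledge lives in the learned rules.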
Autonomata is carried out within the framework of the STEVIN programme.
The project partners are Radboud University Nijmegen, Ghent University, Utrecht University, Nuance, and TeleAtlas.