Listeners learn and predict talker-specific prosodic cues in speech perception

Title: Listeners learn and predict talker-specific prosodic cues in speech perception
Publication Type: Presentation
Year of Publication: 2020
Conference Name: Middag van de Fonetiek
Authors: Severijnen, Giulio G. A., Hans Rutger Bosker, Vitoria Piai, and James M. McQueen
Publisher: Nederlandse Vereniging voor Fonetische Wetenschappen
Conference Location: online
Abstract

One of the challenges in speech perception is that listeners must deal with considerable segmental and suprasegmental variability in the acoustic signal due to differences between talkers. Most previous studies have focused on how listeners deal with segmental variability. In this EEG experiment, we investigated how listeners track talker-specific usage of suprasegmental cues to lexical stress to correctly recognize spoken words. In a 3-day training phase, Dutch participants learned to map non-word minimal stress pairs onto different object referents (e.g., USklot means “lamp”; usKLOT means “train”). These non-words were produced by two male talkers. Critically, each talker used only one suprasegmental cue to signal lexical stress (e.g., Talker A used only F0, Talker B only amplitude). We expected participants to learn which talker used which cue to signal stress. In the test phase, participants indicated whether spoken sentences including these non-words were correct (“The word for ‘lamp’ is...”). We recorded participants’ response times and EEG patterns, targeting an ERP related to phonological prediction: the N200. We found that participants were slower to indicate that a stimulus was correct if the non-word was produced with the unexpected cue (e.g., Talker A using amplitude). That is, if in training Talker A used F0 to signal stress, participants experienced a mismatch between predicted and perceived phonological word-forms if, at test, Talker A unexpectedly used amplitude as a cue to stress. This illustrates talker-specific prediction of suprasegmental cues, picked up through perceptual learning in training. In contrast, the N200 amplitude was not modulated by the mismatch. Theoretical implications of these results are discussed.