“Knock, Knock. Who's there?” - Speaker Tracking in the BATS Project

Title: “Knock, Knock. Who's there?” - Speaker Tracking in the BATS Project
Publication Type: Presentation
Year of Publication: 2010
Authors: Huijbregts, Marijn
Publisher: Nederlandse Vereniging voor Fonetische Wetenschappen
Conference Location: Nijmegen, The Netherlands

Creating large digital multimedia archives is no problem. For example, with an investment of less than two hundred euros, it is possible to record the Dutch public television broadcast channels every single day for about a year. This archive would fill a 1.5-terabyte hard drive and would contain over 7000 hours of video. Creating such an archive is easy; efficiently finding information in it is the challenge.

An effective method of searching multimedia archives and collections is to run automatic speech recognition on each file and to apply standard text search techniques to the resulting speech transcriptions. This makes it possible to find video fragments on the basis of what has been said.
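As a rough illustration of searching on transcriptions, the sketch below builds a simple inverted index over time-stamped transcript segments and returns the fragments in which all query words occur. The segment data, file names, and the functions `build_index` and `search` are hypothetical and only serve as an example; they are not the search system used in the project.

```python
from collections import defaultdict

# Hypothetical output of a speech recognizer: (file, start time in seconds, text).
segments = [
    ("news_0412.mpg", 12.0, "the amstel gold race starts in maastricht"),
    ("news_0412.mpg", 47.5, "the weather will be sunny tomorrow"),
    ("sport_0413.mpg", 3.2, "the race was decided in the final climb"),
]

def build_index(segments):
    """Map each word to the set of segment ids in which it occurs."""
    index = defaultdict(set)
    for seg_id, (_, _, text) in enumerate(segments):
        for word in text.lower().split():
            index[word].add(seg_id)
    return index

def search(index, segments, query):
    """Return (file, start time) of every segment containing all query words."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return [(segments[i][0], segments[i][1]) for i in sorted(hits)]

index = build_index(segments)
print(search(index, segments, "amstel gold race"))
```

A search like this only sees the words that were spoken; it has no notion of who spoke them.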

By applying speech recognition it is possible to search an archive for content words, but a transcription records what was said, not who said it. It is therefore not possible to answer queries such as: “Find a video fragment where Armstrong talks about the Amstel Gold Race”. In the BATS project we attempt to answer these kinds of queries by applying speaker tracking (“Armstrong”) and topic detection (“Amstel Gold Race”).
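To make the idea concrete, the sketch below assumes that every video fragment has already been given a speaker label and a set of topic labels, and answers the example query by filtering on both. The `Fragment` class, the `find_fragments` function, and all data are hypothetical; how the labels are actually produced is exactly the hard part and is not shown here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fragment:
    file: str
    start: float        # start time in seconds
    speaker: str        # label assumed to come from speaker tracking
    topics: frozenset   # labels assumed to come from topic detection

# Hypothetical, hand-made annotations for illustration only.
fragments = [
    Fragment("sport_0413.mpg", 3.2, "Armstrong", frozenset({"Amstel Gold Race"})),
    Fragment("sport_0413.mpg", 60.0, "Reporter", frozenset({"Amstel Gold Race"})),
    Fragment("news_0412.mpg", 12.0, "Armstrong", frozenset({"Tour de France"})),
]

def find_fragments(fragments, speaker, topic):
    """Return the fragments in which the given speaker talks about the given topic."""
    return [f for f in fragments if f.speaker == speaker and topic in f.topics]

for f in find_fragments(fragments, "Armstrong", "Amstel Gold Race"):
    print(f.file, f.start)
```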

BATS, Topic and Speaker tracking in Broadcast Archives, is a joint project of the University of Leuven and the Radboud University Nijmegen, funded by ICTRegie and IBBT. In my talk I will focus on the speaker tracking task. I will explain why it is a challenge to automatically determine the identity of every speaker in a collection, and I will describe our approach to solving it.