Hackerman Hall B17 @ 3400 N Charles St, Baltimore, MD 21218, USA
The CHiME challenge series has been aiming to advance robust automatic speech recognition technology by promoting research at the interface of speech and language processing, signal processing and machine learning. This talk presents the 5th CHiME Challenge, which has considered the task of distant multi-microphone conversational speech recognition in domestic environments. The talk will present an overview of the CHiME-5 dataset, a fully transcribed audio-video dataset that has captured 50 hours of audio from 20 separate dinner parties held in real homes, each with 6 video channels and 32 audio channels. The talk will discuss the design of the lightweight recording setup that allowed highly natural data to be recorded. It will then present some analysis of the data, highlighting the major sources of difficulty it presents for recognition systems, followed by the outcomes of the challenge itself, which attracted submissions from 19 teams submitting systems to single-device or multiple-device tracks. In particular, we will look at which techniques worked and which did not, and use the outcomes to identify priorities for future research and future challenges.
Jon Barker is a Professor in the Computer Science Department at the University of Sheffield. He received his degree from the University of Cambridge (1991) and his Ph.D. from the University of Sheffield (1998). His research interests include human speech processing, speech intelligibility modelling and human-inspired approaches to speech separation and recognition. He has made significant contributions to the development of missing data speech recognition and statistical auditory scene analysis. In more recent years this has led to an interest in robust processing for distant microphone speech recognition. In 2011 he co-founded the CHiME series of workshops and evaluations for speech recognition and separation, which are now in their 5th iteration.