October 4, 2019

12:00 pm / 1:15 pm


Hackerman B17 @ 3400 N. Charles Street, Baltimore, MD 21218

Title: Recognizing Sound Events
Abstract: The Sound Understanding team at Google has been developing automatic sound classification tools with the ambition to cover all possible sounds: speech, music, and environmental. I will describe our application of vision-inspired deep neural networks to the classification of our ‘AudioSet’ ontology of ~600 sound events, as well as related applications in bioacoustics and cross-modal learning. With UPF Barcelona, we recently ran a Kaggle competition (part of DCASE 2019) with over 800 participants, and we will shortly release a pretrained model to make state-of-the-art generic sound recognition widely available.
Bio: Dan Ellis joined Google in 2015 after 15 years as a faculty member in the Electrical Engineering department at Columbia University, where he headed the Laboratory for Recognition and Organization of Speech and Audio (LabROSA). He has over 150 publications in the areas of audio processing, speech recognition, and music information retrieval.
Joint work with Eduardo Fonseca, Frederic Font, Matt Harvey, Shawn Hershey, Aren Jansen, Caroline Liu, Jiayang Liu, Channing Moore, Ratheet Pandya, Manoj Plakal, Rif A. Saurous