Hackerman Hall B17 @ 3400 N. Charles Street, Baltimore, MD
We tackle the challenge of understanding voice queries posed against the Comcast Xfinity X1 entertainment platform, where consumers direct speech input at their "voice remotes". Such queries range from specific program navigation (e.g., watch a movie) to requests with vague intents and even queries that have nothing to do with watching TV. We present successively richer neural network architectures to address this challenge, based on two key insights. The first is that session context can be exploited to disambiguate queries and recover from ASR errors, which we operationalize with hierarchical recurrent neural networks. The second is that query understanding requires evidence integration across multiple related tasks, which we identify as program prediction, intent classification, and query tagging. We present a novel multi-task neural architecture that jointly learns to accomplish all three tasks. Our initial model, already deployed in production, serves millions of queries daily with an improved user experience.
Ferhan Ture leads the Natural Language Processing team at the Comcast Applied AI Research group, where he blends the latest advancements in the field into a suite of voice-activated Xfinity products used by millions every day. Prior to that, he was a scientist at BBN Technologies, developing novel algorithms for multilingual question answering as part of the DARPA BOLT program. From 2008 to 2013, he worked on search and language translation problems under the supervision of Jimmy Lin, as part of his doctoral studies at the Department of Computer Science at the University of Maryland.