Hackerman Hall B17 @ 3400 N Charles St, Baltimore, MD 21218, USA
In this presentation, I will talk about a new scheme for training a seq2seq ASR model that integrates a pre-trained LM. The proposed fusion method updates the memory cell and hidden state of the LSTM in the seq2seq decoder using information from the pre-trained LM, so that the memory retained by the main seq2seq model is adjusted by the external LM. Experimental results show the effectiveness of the proposed methods in a monolingual ASR setup on the Librispeech corpus and in a transfer learning setup from a multilingual ASR (MLASR) base model to a low-resource language. On Librispeech, our best model reduced WER by 3.7% and 2.4% relative on test-clean and test-other, respectively, compared with the shallow fusion baseline. In transfer learning from an MLASR base model to the IARPA Babel Swahili model, the best scheme improved the transferred model on the eval set by 9.9% and 9.8% relative in CER and WER, respectively, compared with the 2-stage transfer baseline.
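For readers unfamiliar with the baseline and the reporting convention, the following is a minimal sketch, not the method presented in the talk. It assumes the common formulation of shallow fusion, in which the decoder's per-token log-probability is combined with a weighted LM log-probability at decoding time, and the usual definition of relative error reduction. The function names, the `lm_weight` value, and all numbers are hypothetical and chosen only for illustration; the abstract does not give absolute error rates or a fusion weight.

```python
import math


def shallow_fusion_scores(s2s_log_probs, lm_log_probs, lm_weight=0.3):
    """Combine per-token log-probabilities from the seq2seq decoder and an
    external LM: score(tok) = log p_s2s(tok) + lm_weight * log p_lm(tok).
    lm_weight is a hypothetical tuning parameter, not a value from the talk."""
    return {
        tok: s2s_log_probs[tok] + lm_weight * lm_log_probs[tok]
        for tok in s2s_log_probs
    }


def relative_reduction(baseline_rate, new_rate):
    """Relative error reduction in percent: 100 * (baseline - new) / baseline."""
    return 100.0 * (baseline_rate - new_rate) / baseline_rate


if __name__ == "__main__":
    # Toy distributions over a two-token vocabulary (illustrative only).
    s2s = {"a": math.log(0.6), "b": math.log(0.4)}
    lm = {"a": math.log(0.1), "b": math.log(0.9)}

    fused = shallow_fusion_scores(s2s, lm, lm_weight=1.0)
    print("best token after fusion:", max(fused, key=fused.get))

    # Toy error rates (not results from the talk).
    print("relative reduction (%):", relative_reduction(10.0, 9.0))
```

Shallow fusion only rescores tokens at decoding time, whereas the scheme described above uses the LM during training to adjust the decoder's LSTM state, which is why shallow fusion serves as the comparison point.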