Neural Machine Translation (NMT) is a new paradigm in data-driven machine translation. Previous-generation Statistical Machine Translation (SMT) systems are built from a collection of heuristic models, typically combined in a log-linear model with a small number of parameters. In NMT, the entire translation process is posed as an end-to-end supervised classification problem in which the training data consists of sentence pairs. In SMT systems, word alignment is carried out first and then fixed, and the various sub-models are then estimated from the word-aligned data. NMT uses no fixed word alignments; instead, the full sequence-to-sequence task is handled by a single model.
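As a rough illustration of this contrast, the two formulations can be written in standard textbook notation. The symbols are assumptions for illustration and are not taken from the course materials: f is the source sentence and e a candidate translation in the SMT rule, h_m are the sub-model feature functions with weights \lambda_m, and in the NMT equations x and y are a source/target sentence pair from the training set D, with \theta the parameters of the single network.

```latex
% SMT: log-linear combination of M separately estimated sub-models
% (a small number of weights \lambda_m; notation is an assumption)
\hat{e} = \arg\max_{e} \sum_{m=1}^{M} \lambda_m \, h_m(e, f)

% NMT: one model predicts each target word given the source sentence
% and the previously generated target words
p(y \mid x; \theta) = \prod_{t=1}^{T} p(y_t \mid y_{<t}, x; \theta)

% Training as supervised classification over the target vocabulary:
% minimize the negative log-likelihood of the sentence pairs in D
\mathcal{L}(\theta) = - \sum_{(x, y) \in \mathcal{D}} \sum_{t=1}^{T} \log p(y_t \mid y_{<t}, x; \theta)
```

In the first equation the weights are the only jointly tuned parameters, while in the last one every parameter in \theta is trained against the same objective, with no separate word-alignment step.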
The course will work backwards from the current state of the art in NMT, which is the "ensemble" system submitted by the Bengio group in Montreal to the 2015 shared task on machine translation (Jean et al. 2015, see below, with some additional details to be published). Depending on the background of the participants, some basics of SMT may also be covered.
Email Address: SubstituteLastName@cis.uni-muenchen.de
Email Address: SubstituteFirstName.SubstituteLastName@gmail.com
JHU and LMU
August 11th, 2015 | Concluding discussion, plans for next semester |
August 4th, 2015 | Gers, Felix A., Jürgen Schmidhuber, Fred Cummins (2000). Learning to Forget: Continual Prediction with LSTM. Neural Computation, October 2000. ftp://ftp.idsia.ch/pub/juergen/FgGates-NC.pdf |
July 28th, 2015 | Gers, Felix A., Jürgen Schmidhuber, Fred Cummins (2000). Learning to Forget: Continual Prediction with LSTM. Neural Computation, October 2000. ftp://ftp.idsia.ch/pub/juergen/FgGates-NC.pdf |
July 21st, 2015 | Sutskever, Ilya, Oriol Vinyals, and Quoc V Le (2014). Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems. http://arxiv.org/abs/1409.3215 |
July 14th, 2015 | Gulcehre, Caglar, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, Yoshua Bengio (2015). On Using Monolingual Corpora in Neural Machine Translation. http://arxiv.org/abs/1503.03535 |
July 7th, 2015 | Bahdanau, Dzmitry, Kyunghyun Cho, Yoshua Bengio (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. http://arxiv.org/abs/1409.0473 |
June 30th, 2015 | Jean, Sébastien, Kyunghyun Cho, Roland Memisevic, Yoshua Bengio (2015). On Using Very Large Target Vocabulary for Neural Machine Translation. http://arxiv.org/abs/1412.2007 |
June 23rd, 2015 | Introduction to Neural Machine Translation |
June 16th, 2015 | Organizational Meeting |
Further literature:
See the NMT reading list and the short list of LSTM papers recommended by David Kaumanns.