Statistical Machine Translation (SMT) was the dominant approach to online translation until around 2015; Neural Machine Translation (NMT) has since become the dominant approach.
Neural Machine Translation (NMT) is a new paradigm in data-driven machine translation. Previous-generation Statistical Machine Translation (SMT) systems are built from a collection of heuristic sub-models, typically combined in a log-linear model with a small number of parameters. In NMT, the entire translation process is posed as a single end-to-end supervised learning problem whose training data consists of pairs of sentences. In SMT, word alignment is carried out and then fixed, and the various sub-models are estimated from the word-aligned data; NMT does not use fixed word alignments, and instead handles the full sequence-to-sequence task in one model.
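To make the contrast concrete, here is a minimal sketch of the two standard formulations; the symbols (feature functions h_m, weights \lambda_m, network parameters \theta) are generic textbook notation, not tied to any specific system covered in the seminar. An SMT system selects the translation e of a source sentence f that maximizes a weighted sum of a handful of feature functions (translation model, language model, reordering model, ...), while an NMT system models the conditional probability of the target sentence directly with a single neural network trained end-to-end on sentence pairs:

\hat{e} = \arg\max_{e} \sum_{m=1}^{M} \lambda_m \, h_m(e, f)    (SMT: log-linear model)

p(e \mid f; \theta) = \prod_{i=1}^{|e|} p(e_i \mid e_1, \ldots, e_{i-1}, f; \theta)    (NMT: end-to-end)

In SMT only the few weights \lambda_m are tuned directly on the translation task (the feature functions themselves are estimated separately, e.g. from word-aligned data), whereas in NMT all parameters \theta are learned jointly.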
Content:
The seminar will begin with the basics of Statistical Machine Translation and then briefly introduce Deep Learning before covering the basics of Neural Machine Translation.
Goals:
The goal of the seminar is to understand the basics of SMT and NMT. A critical aspect, and a particular focus of study, is the varying role of the lexicon (and of lexical representations) in the two approaches.
Email Address: SubstituteMyLastName@cis.uni-muenchen.de
Room U139, Tuesdays, 16:00 to 18:00 (c.t.)
Date | Topic | Reading (DO BEFORE THE MEETING!) | Slides |
October 18th | Introduction to Statistical Machine Translation | | ppt pdf |
October 25th | Bitext alignment (extracting lexical knowledge from parallel corpora) | | ppt pdf |
November 8th | Many-to-many alignments and the phrase-based model | | ppt pdf |
November 15th | Log-linear model and Minimum Error Rate Training (student presentation) | | ppt pdf (Fraser, Braune/Huck) |
November 22nd | Decoding (guest lecture by Tsuyoshi Okita) | | |
November 29th | Introduction to Linear Models | | pptx pdf (updated) |
December 6th | Neural Networks (and Word Embeddings), Fabienne Braune | | |
December 13th | Recurrent Neural Networks, Tsuyoshi Okita | | |
December 20th | SMT: Advanced Word Alignment, Morphology, Syntax | | ppt pdf |
January 24th | Neural Machine Translation, Matthias Huck | | |
Presentation topics (name: topic)
Date | Topic | Materials | Term Paper Received |
January 10th | Palchik: Word-Sense Disambiguation and WSD for SMT | | yes |
January 10th | Deck: Computer-Aided Translation | | yes |
January 17th | Bilan: Cross-Lingual Lexical Substitution | | yes |
January 17th | Sedinkina: Wikification of Ambiguous Entities | | yes |
January 24th | SEE ABOVE | | |
January 31st | Poerner: System Combination | | yes |
January 31st | Krachenfels: Neural Parsing with Gated Recursive Convolutional Networks | | yes |
Literature:
Philipp Koehn's book Statistical Machine Translation
Kevin Knight's tutorial on SMT (pay particular attention to IBM Model 1)
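As orientation for the Knight tutorial: IBM Model 1 is the simplest of the IBM word-alignment models. In the standard textbook notation (e = e_1 \ldots e_l and f = f_1 \ldots f_m are a sentence pair, e_0 is the empty/NULL word, and \epsilon is a constant), every word f_j is assumed to be generated independently by exactly one word of e, using only lexical translation probabilities t(f_j \mid e_i):

P(f \mid e) = \frac{\epsilon}{(l+1)^{m}} \prod_{j=1}^{m} \sum_{i=0}^{l} t(f_j \mid e_i)

The table t is estimated with the EM algorithm from sentence-aligned parallel text, which is exactly the kind of lexical knowledge extraction from parallel corpora covered in the bitext alignment session.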