Deep Learning is a branch of machine learning in which neural networks consisting of multiple layers have shown new generalization capabilities. The seminar will look at advances both in general deep learning approaches and in the specific case of Neural Machine Translation (NMT). NMT is a new paradigm in data-driven machine translation: the entire translation process is posed as an end-to-end supervised classification problem, where the training data consists of pairs of sentences and the full sequence-to-sequence task is handled by a single model.
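To make the end-to-end framing concrete, here is a minimal sketch of a sequence-to-sequence model trained as next-token classification, written in PyTorch. The model, vocabulary sizes, and random "sentence pairs" are toy examples invented purely for illustration; they are not taken from the seminar materials or any of the papers below.

```python
# Minimal illustrative sketch: encoder-decoder NMT trained end-to-end.
# All names and hyperparameters here are hypothetical toy choices.
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt_in):
        # Encode the source sentence into a hidden state ...
        _, h = self.encoder(self.src_emb(src))
        # ... then decode the target conditioned on it (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)
        return self.out(dec_out)  # per-position scores over the target vocabulary

# Toy training step on random token ids: the whole translation task is
# optimized in one model as classification over the next target token.
model = TinySeq2Seq(src_vocab=100, tgt_vocab=100)
src = torch.randint(0, 100, (8, 10))   # batch of source sentences
tgt = torch.randint(0, 100, (8, 12))   # batch of target sentences
logits = model(src, tgt[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, 100), tgt[:, 1:].reshape(-1))
loss.backward()
```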
Here is a link to last semester's seminar.
There is a Munich interest group for Deep Learning with an associated mailing list; paper announcements are sent out on this list. See the link here.
Email address: [last name]@cis.uni-muenchen.de
Thursdays, 14:45 (s.t.); location: Zoom (online)
You can install the Zoom client, or click Cancel and join in your browser (this might not work in all browsers).
Contact Alexander Fraser if you need the Zoom link.
New attendees are welcome. Read the paper and bring a paper or electronic copy with you; you will need to refer to it during the discussion.
Click here for directions to CIS.
If this page appears to be out of date, use the refresh button of your browser.
Date | Paper | Links | Discussion Leader |
April 27th, 2023 | Maarten Sap et al. (2022). Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection. NAACL | paper | Xingpeng Wang |
May 11th, 2023 | Saurav Kadavath, Tom Conerly, et al. (2022). Language Models (Mostly) Know What They Know. arXiv | paper | Abdullatif Köksal |
May 25th, 2023 | Benjamin Minixhofer, Fabian Paischer, Navid Rekabsaz (2022). WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. NAACL | paper | Robert Litschko |
June 15th, 2023 | Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza (2023). Are Character-level Translations Worth the Wait? Comparing Character- and Subword-level Models for Machine Translation. arXiv | paper | Lukas Edman |
June 22nd, 2023 | Caleb Ziems, William Held, Jingfeng Yang, Jwala Dhamala, Rahul Gupta, Diyi Yang (2023). Multi-VALUE: A Framework for Cross-Dialectal English NLP. ACL | paper | Verena Blaschke |
July 6th, 2023 | Isaac Caswell, Theresa Breiner, Daan van Esch, Ankur Bapna (2020). Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus. COLING | paper; also maybe look at: paper, paper | Amir Kargaran |
July 20th, 2023 | Md Mahfuz Ibn Alam, Sina Ahmadi, Antonios Anastasopoulos (2023). CODET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation. arXiv | paper | Katya Artemova |
July 27th, 2023 | Tianjian Li and Kenton Murray (2023). Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution. Findings of ACL | paper | Ercong Nie |
August 17th, 2023 | Alisa Liu, Zhaofeng Wu, et al. (2023). We're Afraid Language Models Aren't Modeling Ambiguity. arXiv | paper | Leon Weber |
September 28th, 2023 | Sheng Lu, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Iryna Gurevych (2023). Are Emergent Abilities in Large Language Models just In-Context Learning? arXiv | paper | Yihong Liu |
Further literature:
You can go back through the previous semesters by clicking on the link near the top of the page.