Deep Learning is a branch of machine learning in which neural networks consisting of multiple layers have shown new generalization capabilities. The seminar looks at advances both in general deep learning approaches and in the specific case of Neural Machine Translation (NMT), a data-driven paradigm in machine translation. In NMT, the entire translation process is posed as an end-to-end supervised classification problem: the training data consists of pairs of sentences, and the full sequence-to-sequence task is handled by a single model, which predicts the target sentence one token at a time.
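To make the sequence-to-sequence framing concrete, here is a minimal sketch of an encoder-decoder translation model. This is an illustrative assumption, not the model from any paper in the schedule: it uses PyTorch, a GRU encoder and decoder, and toy vocabulary sizes and batches chosen only for the example.

```python
# Minimal seq2seq sketch (assumption: PyTorch; sizes and data are toy placeholders).
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.src_emb(src))          # encode source into a hidden state
        dec, _ = self.decoder(self.tgt_emb(tgt_in), h)  # decode target conditioned on it
        return self.out(dec)                            # logits over the target vocabulary

model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (8, 12))   # a toy batch of source sentences (token ids)
tgt = torch.randint(0, 1000, (8, 12))   # the paired target sentences
logits = model(src, tgt[:, :-1])        # teacher forcing: feed the gold target prefix
# One classification per target position: cross-entropy against the gold next token.
loss = nn.functional.cross_entropy(
    logits.reshape(-1, logits.size(-1)), tgt[:, 1:].reshape(-1)
)
loss.backward()
```

Training with cross-entropy at every target position is exactly the end-to-end supervised classification view described above; production NMT systems additionally use attention or Transformer architectures, subword vocabularies, and beam-search decoding.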
Here is a link to last semester's seminar.
There is a Munich interest group for Deep Learning with an associated mailing list; some announcements relevant to this seminar are sent out on this list. See the link here.
Email Address: <last name>@cis.uni-muenchen.de
Thursdays 14:45 (s.t.), location: ZOOM ONLINE
You can install the Zoom client, or click cancel and use browser support (which might not work in all browsers).
Contact Alexander Fraser if you need the Zoom link.
New attendees are welcome. Read the paper and bring a paper or electronic copy with you; you will need to refer to it during the discussion.
Click here for directions to CIS.
If this page appears to be out of date, use your browser's refresh button.
| Date | Paper | Links | Discussion Leader |
| --- | --- | --- | --- |
| October 12th, 2023 | Kaitlyn Zhou, Dan Jurafsky, Tatsunori Hashimoto (2023). Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models. arXiv | paper | Siyao (Logan) Peng |
| November 2nd, 2023 | Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov (2023). From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models. ACL | paper | Faeze Ghorbanpour |
| November 9th, 2023 | Grégoire Delétang, Anian Ruoss et al. (2023). Language Modeling Is Compression. arXiv | paper | Xingpeng Wang |
| November 23rd, 2023 | Yizhong Wang, Hamish Ivison et al. (2023). How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. arXiv | paper | Peiqin Lin |
| November 30th, 2023 | Zhenghao Lin, Yeyun Gong et al. (2023). Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise. ICML | paper | Viktor Hangya |
| December 7th, 2023 | Cancelled (EMNLP) | | |
| December 14th, 2023 | Jirui Qi, Raquel Fernández, Arianna Bisazza (2023). Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models. EMNLP | paper | Kathy Hämmerl |
| December 21st, 2023 | Sireesh Gururaja, Amanda Bertsch, Clara Na, David Gray Widder, Emma Strubell (2023). To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing. EMNLP | paper | Leonie Weissweiler |
| January 18th, 2024 | Gati Aher, Rosa Arriaga, Adam Kalai (2023). Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies. ICML | paper | Philipp Wicke |
| February 1st, 2024 | Vineel Pratap, Andros Tjandra et al. (2023). Scaling Speech Technology to 1,000+ Languages. arXiv | paper | Verena Blaschke |
| February 22nd, 2024 | Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe (2024). The Unreasonable Effectiveness of Easy Training Data for Hard Tasks. arXiv | paper | Andreas Stephan |
Further literature:
You can go back through the previous semesters by clicking on the link near the top of the page.