Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
Abstract:
Most of today's large language models (LLMs) are English-centric, having been pretrained predominantly on English text. Yet, to meet user expectations, models need to respond appropriately in multiple languages once deployed in downstream applications. Given limited exposure to other languages during pretraining, cross-lingual transfer is important for achieving decent performance in non-English settings. In this talk, I present findings from a recent study that aims to determine just how much multilinguality is required during finetuning to elicit strong cross-lingual generalisation across a range of tasks and target languages. Compared to English-only finetuning, multilingual instruction tuning with as few as three languages can significantly improve a model's cross-lingual transfer abilities on long-form, generative tasks that assume input/output language agreement, while being of less importance for highly structured, classification-style tasks.
Speaker:
Tannon Kew is a PhD student in the Department of Computational Linguistics at the University of Zurich. His current research focuses on methods for steering text-to-text generation in different applications, such as response generation, dialogue modelling and text simplification. Tannon holds a BA in Languages and Linguistics from the University of Adelaide and an MA in Computational Linguistics from the University of Zurich. Prior to relocating to Switzerland, Tannon worked for the Mobile Language Team, where he focused on the digitalisation of language learning resources for Indigenous languages of South Australia's Far West Coast.
Venue:
Online: https://uqz.zoom.us/j/89174093328