Paper Presentation

Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

- By Lucas Bandarkar, PhD student, UCLA


In this session, we dive into recent advances in cross-lingual transfer and model merging. As demand grows for AI systems that perform well across many languages, the shortage of task-specific training data in non-English languages has become a central challenge in the field. Our speaker, Lucas Bandarkar, is at the forefront of this work.

Lucas, a researcher at the University of California, Los Angeles, and his collaborators have developed a methodology that addresses this challenge. His paper, Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models, presents a model merging technique that significantly improves a model's ability to reason in non-English languages, particularly on tasks like mathematical reasoning where multilingual training data is scarce.

In this session, Lucas explains how layer swapping works: two "experts" are fine-tuned from the same base model, one on math instruction data in English and one on generic instruction data in the target language, and the top and bottom transformer layers of the math expert are then replaced with the corresponding layers of the language expert. The merged model can then solve math problems in the target language without any in-language math data. Whether you're working in AI research or applying AI solutions in global contexts, this presentation offers valuable insight into the future of multilingual AI capabilities.
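To make the mechanics concrete, here is a minimal sketch of this kind of layer swap at the state-dict level. It is not the paper's released code: the parameter-key pattern (`model.layers.<i>.`, as in Llama-style checkpoints), the `swap_layers` helper, and the choice of two layers at each end are assumptions for illustration.

```python
# Sketch of layer swapping between two experts fine-tuned from the same
# base model. Key names below follow a hypothetical Llama-style layout;
# adjust the pattern for the checkpoint format you actually use.
import re

def swap_layers(math_sd: dict, lang_sd: dict,
                num_layers: int, num_swapped: int) -> dict:
    """Return the math expert's parameters, with its bottom and top
    `num_swapped` transformer layers replaced by the language expert's."""
    swapped = set(range(num_swapped)) | set(range(num_layers - num_swapped, num_layers))
    merged = dict(math_sd)
    for key, tensor in lang_sd.items():
        m = re.match(r"model\.layers\.(\d+)\.", key)
        if m and int(m.group(1)) in swapped:
            merged[key] = tensor  # take this layer from the language expert
    return merged

# Toy demonstration with strings standing in for weight tensors.
math_expert = {f"model.layers.{i}.weight": f"math_{i}" for i in range(8)}
lang_expert = {f"model.layers.{i}.weight": f"lang_{i}" for i in range(8)}
merged = swap_layers(math_expert, lang_expert, num_layers=8, num_swapped=2)
# Layers 0-1 and 6-7 now come from the language expert; 2-5 stay math.
```

Because both experts are fine-tuned from the same base model, their layers remain parameter-aligned, which is what makes this direct replacement viable at all.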


Meet our Speaker:

Lucas Bandarkar

Lucas Bandarkar is a machine learning researcher pursuing a PhD in computer science at UCLA, with recurring affiliations with Meta/Facebook AI. At Meta, he worked as a data scientist on the machine translation product as well as on numerous other multilingual services, such as OCR, language identification, and automatic content moderation. His research focuses on multilinguality in language models, with work spanning evaluation, training data, model interpretability, and cross-lingual transfer. He holds a B.A. in Statistics and Data Science from UC Berkeley.