Paper Presentation

OLMoE: Open Mixture-of-Experts Language Models

- By Niklas Muennighoff, PhD student, Stanford

Read the Paper

Despite significant advances in Large Language Models (LLMs), balancing performance against training and inference cost remains a challenge. Many high-performing LLMs stay inaccessible to academics and open-source developers because of their prohibitive costs.

To tackle this issue, a paper titled 'OLMoE: Open Mixture-of-Experts Language Models' was published, presenting a fully open Mixture-of-Experts language model designed to deliver state-of-the-art performance among similarly sized models.

Join the author of the paper, Niklas Muennighoff, as he presents OLMoE and shares key insights from his research, including:

  • Introduction to OLMoE: Discover how OLMoE represents a significant step towards making high-performance language models more accessible through its fully open release of model weights, training data, code, and logs.
  • Efficiency of Mixture-of-Experts: Understand the mechanics behind MoEs, which activate only a subset of their parameters for each input token, leading to greater efficiency than traditional dense models (a brief routing sketch follows this list).
  • Q&A Session: A dedicated session for participants to ask questions, engage with the speaker, and discuss the research findings in depth.
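As a taste of the mechanics covered in the talk, here is a minimal sketch of top-k expert routing, the idea that lets an MoE layer compute only a few experts per token. The class name (TinyMoELayer), layer sizes, and expert count below are illustrative assumptions, not the configuration used in OLMoE.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Routes each token to its top-k experts; only those experts are computed."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                              # (n_tokens, n_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)  # keep only k experts per token
        weights = F.softmax(top_vals, dim=-1)                # normalize over the chosen experts

        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            tokens, slots = (top_idx == e).nonzero(as_tuple=True)
            if tokens.numel() == 0:
                continue                                     # this expert receives no tokens
            out[tokens] += weights[tokens, slots].unsqueeze(-1) * expert(x[tokens])
        return out

x = torch.randn(16, 64)            # 16 tokens with hidden size 64
print(TinyMoELayer()(x).shape)     # torch.Size([16, 64])

Because each token activates only top_k of the n_experts experts, the compute per token stays close to that of a small dense model while the total parameter count grows with the number of experts.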

Meet our Speaker:

Niklas Muennighoff

Niklas Muennighoff is a PhD student at Stanford. His research focuses on improving large language models across pretraining, instruction finetuning, and retrieval through work such as OLMo, BLOOM, and StarCoder. He completed his Bachelor's degree at Peking University.