Keynote Speakers

Below are the keynote speakers for EACL 2026, along with each talk's scheduled date and time, location, title, and abstract.

Nizar Habash

Professor of Computer Science, New York University Abu Dhabi
Date: Wednesday, March 25, 2026   Time: 10:00–11:00   Location: Salle Le Riad (Level 1)
Title: Arabic and Technology: A 40-Year Perspective
Abstract: Over the past four decades, work at the intersection of Arabic and technology has evolved alongside major technical breakthroughs and shifting regional and global political dynamics. In this talk, I revisit the development of Arabic NLP and AI, from early foundational efforts to today’s large-scale generative systems, and highlight how each phase has engaged with distinct aspects of Arabic’s linguistic complexity. I reflect on the language’s ambiguous orthography, rich morphology, diglossic landscape, and wide geographic and historical reach, and on the growing research community committed to ensuring that AI technologies meaningfully support Arabic and its diverse cultural expressions. I conclude with a forward-looking vision for building a cohesive and sustainable ecosystem that advances Arabic in AI through strengthened training, deeper collaboration, and sustained innovation for the next generation of researchers and practitioners.

Nizar Habash is a Professor of Computer Science at New York University Abu Dhabi (NYUAD). He is also the director of the Computational Approaches to Modeling Language (CAMeL) Lab. Before joining NYUAD in 2014, he was a research scientist at Columbia University's Center for Computational Learning Systems. He received his PhD in Computer Science from the University of Maryland, College Park in 2003. He has two bachelor's degrees from Old Dominion University, one in Computer Engineering and one in Linguistics and Languages. His research includes extensive work on machine translation, morphological analysis, and computational modeling of Arabic and its dialects. Professor Habash has been a principal investigator or co-investigator on over 30 research grants and has over 300 publications, including the book "Introduction to Arabic Natural Language Processing". He is the founding and current President of SIGARAB, the ACL Special Interest Group on Arabic NLP. He is a recipient of the King Salman Academy for Arabic Language Award (2022) and the Antonio Zampolli Prize (2024), and he is a Fellow of the Association for Computational Linguistics (2025).

Marta R. Costa-jussà

Research Scientist, FAIR at Meta
Date: Thursday, March 26, 2026   Time: 16:30–17:30   Location: Salle Le Riad (Level 1)
Title: Omnilinguality: Scaling AI to Any Language
Abstract: Communication across languages has historically been a "holy grail" of AI. In a world with more than 7,000 languages, AI still falls short in coverage and in serving different languages equitably. For decades we have focused on high-resource languages, since models have largely been resource-dependent. Today we are witnessing the evolution of techniques that can better represent a larger number of languages; we are beginning to break through the "digital desert". In this talk, I will give an overview of the path towards omnilinguality in AI through the lens of machine translation (MT), starting with basic but complex definitions such as "language" and "long tail". MT is one of the most popular and well-explored applications in multilinguality, and it has been shown to scale to a massive number of languages, with flagship specialised models covering from hundreds up to thousands of languages. Progress in MT has been largely driven by the WMT community effort, which has organised competitions and benchmarks and constantly posed new challenges. Other key contributions have been open-sourced evaluation datasets such as FLORES, which covers hundreds of languages in the Wikipedia domain, and more recently BOUQuET, which builds on originally created non-English data and is expanding to more domains. These initiatives are complemented by open initiatives that allow the community to contribute more languages. In my personal opinion, while MT has served as a platform to reach this progress, it may not always remain the main one. Even though we have not yet solved this task, the rise of LLMs shows the opportunity to solve many tasks at once. MT will remain a relevant part of the puzzle as a source of synthetic data generation and a representative evaluation in breadth. However, I will argue that omnilinguality should no longer be approached task-specifically; instead, it should be taken as a great arena for testing advanced LLM techniques more broadly and an opportunity to come up with new methods to massively scale general-purpose evaluation.

Marta R. Costa-jussà has been a research scientist at Meta AI since February 2022. She received her PhD from the Universitat Politècnica de Catalunya (UPC, Barcelona) in 2008. Her research experience is mainly in machine translation. She has worked at LIMSI-CNRS (Paris), the Barcelona Media Innovation Center, the Universidade de São Paulo, the Institute for Infocomm Research (Singapore), the Instituto Politécnico Nacional (Mexico), the University of Edinburgh, and UPC. She has received an ERC Starting Grant and two Google Faculty Research Awards. Recently, she has participated in the No Language Left Behind (NLLB) and Seamless projects, both of which have been published in Nature. She has published hundreds of scientific papers and is co-author of the novel El sueño de Mia.

Mariya Toneva

Faculty Member, Max Planck Institute for Software Systems
Date: Friday, March 27, 2026   Time: 14:00–15:00   Location: Salle Le Riad (Level 1)
Title: Large Language Models as Model Organisms of Language in the Human Brain
Abstract: Language is one of the richest and most complex human cognitive capacities. Yet, we lack a model organism to study its underlying neural mechanisms: unlike other important cognitive capacities, such as vision or memory, language does not have a clear counterpart in non-human animals, leaving a gap in our ability to develop and test mechanistic hypotheses. In recent years, large language models (LLMs) have emerged as the closest computational analogs we have, but how can we use them effectively as model organisms for language in the human brain? In this talk, I will discuss the promise and challenges of this approach. I will present our recent work on brain-tuning, which uses naturalistic brain recordings to refine LLMs so that their internal representations and processing better align with human neural data. But beyond representational similarity, a key question remains: do LLMs rely on mechanisms that are similar to those in the brain? And at what level of abstraction should we assess this similarity? This research direction aims to transform LLMs from mere engineering artifacts into powerful scientific tools for uncovering how the brain supports our most distinctive cognitive ability.

Mariya Toneva is a faculty member at the Max Planck Institute for Software Systems, where she leads the Bridging AI and Neuroscience (BrAIN) group. Her research bridges natural language processing, machine learning, and cognitive neuroscience to develop computational models that deepen our understanding of how the brain processes language and guide the creation of more human-aligned AI systems. Her pioneering work at this intersection has been recognized and supported by the U.S. National Science Foundation (NSF), the German Research Foundation (DFG), and the European Research Council through an ERC Starting Grant.

Andrew Yates

Karen Spärck Jones Award Lecture
Senior Research Scientist, Johns Hopkins University
Date: Friday, March 27, 2026   Time: 16:00–17:00   Location: Salle Le Riad (Level 1)
Title: Search with Complex Topics and Learned Sparse Retrieval
Abstract: Today, search is used not only by humans but also by LLMs to support retrieval-augmented generation. In this new setting, how does what we want from a search engine change, and what are promising approaches to take? I will describe work on two themes related to these questions: (1) retrieving documents to support complex information needs in the context of long-form retrieval-augmented generation, and (2) exploring learned sparse retrieval as a promising paradigm for first-stage retrieval. In the former line of work, I will discuss search systems aimed at maximizing coverage of relevant information and approaches for evaluating their ability to do so independently of a generation system. In the latter line of work, I will describe research positioning learned sparse retrieval as a compelling alternative to dense retrieval.

Andrew Yates is a Senior Research Scientist at Johns Hopkins University, where his research focuses on developing content-based neural ranking methods and leveraging them to improve search and downstream tasks in challenging scenarios. He has co-authored a variety of papers on neural ranking methods as well as a book on transformer-based neural methods: "Pretrained Transformers for Text Ranking: BERT and Beyond." Previously, Andrew was an Assistant Professor at the University of Amsterdam and a Senior Researcher at the Max Planck Institute for Informatics.