Student Research Workshop (SRW) Accepted Papers
- From Sentences to Proof Trees: Leveraging Language Models for Structured Reasoning
Aayushee Gupta
- Understanding Subliminal Learning: Generality, Sensitivity, and Token-Level Explanations
Yagnesh Veeraraghavan, Keanu Lim, Jacob Lipner, Saanvi Ibrahimpatnam, Kevin Zhu, Madhur Panwar
- Active Learning for Corpus Refinement: Cost-Effective Preprocessing to Improve Validity of Applied Quantitative Text Analysis
Jakob Steglich, Stephan Poppe
- You Didn’t Have to Say It Like That: Subliminal Learning from Faithful Paraphrases
Isaia Gisler, Zhonghao He, Tianyi Alex Qiu
- Broken Chains: The Cost of Incomplete Reasoning in LLMs
Ian Su, Gaurav Purushothaman, Jey Narayan, Ruhika Goel, Kevin Zhu, Sunishchal Dev, Yash More, Maheep Chaudhary
- Pushing the Boundaries of Multiple Choice Evaluation to One Hundred Options
Nahyun Lee, Guijin Son
- Colorism in Large Vision-Language Models: An Empirical Exploration of Socioeconomic Linguistic Bias
Raj Gaurav Maurya, Vaibhav Shukla, Sreedath Panat
- Hospitality-VQA: Decision-Oriented Informativeness Evaluation for Vision–Language Models
Jeongwoo Lee, Baek Duhyeong, Eungyeol Han, Soyeon Shin, Gukin Han, Seungduk Kim, Jaehyun Jeon, Taewoo Jeong
- TimeRes: A Turkish Benchmark For Evaluating Temporal Understanding of Large Language Models
Habib Yağız Demir, Susan Üsküdarlı, Ümit Atlamaz
- What the Router Sees Matters: Funnel Pooling for Fast, Content Driven Expert Routing
Josef Pichlmeier, Sebastian Nicolas Mueller, Jakob Sturm, Josef Dräxl, Andre Luckow
- FluffInjector: Diagnosing Logical Consistency Failures in Chain-of-Thought Reward Models
Varshith Vijjapu, Krishiv Ray, Archana Vaidheeswaran
- Emergent Misalignment: Tracking the Emergence and Evolution of Misaligned traits throughout Model Training
Geunwoo Park, Pranay Chauhan, Haihao Liu
- Beep boop: Bot Detection as a Preprocessing Step for Polish Reddit
Karmela Matyjaszek
- An Evaluation of Classifiers for Mapping Generative LLM Responses to Answer Options of Multiple-choice Questionnaires
Alisea Stroligo, Anna Shamray, Julian Schelb, Andreas Spitz
- Bring the Apple, Not the Sofa: Impact of Irrelevant Context in Embodied AI Commands on VLA Models
Andrey Moskalenko, Daria Pugacheva, Denis Shepelev, Andrey Kuznetsov, Vlad Shakhuro, Elena Tutubalina
- Thesis Proposal: Comparing Human and Model Perception of Writing Style under Controlled Perturbations
Ewelina Paulina Księżniak
- Automatic Generation of a Compositional QA Benchmark for Geospatial Reasoning under Spatial and Entity Constraints
Tetsuhisa Suizu, Shohei Higashiyama, Hiroyuki Shindo, Hiroki Ouchi, Sakriani Sakti
- lrnn-lib: A library for Linear RNNs
Karan Bania, Soham Kalburgi, Manit Tanwar, Dhruthi, Aditya Nagarsekar, Harshvardhan Mestha, Naman Chibber, Raj Deshmukh, Anish Sathyanarayanan, Aarush Rathore, Pratham Chheda
- Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation
Julia Belikova, Danila Rozhevskii, Dennis Svirin, Konstantin Polev, Alexander Panchenko
- Thesis Proposal: Stability-Aware, Evidence-Grounded Knowledge Graphs for Substance Use Disorders and Social Determinants of Health
Gautham Vijay Kumar
- Energy Matching based Preference Learning for Diffusion Language Models
Shiv Shankar
- Thesis Proposal: Measuring Prejudice at Scale
Zoran Fijavž, Senja Pollak, Veronika Bajt
- Evaluating Cost-Efficiency of LLMs in a RAG Setup on Polish Wikipedia: Quality vs. Energy Consumption
Patrycja Smits, Tomasz Walkowiak
- How Do Lexical Senses Correspond Between Spoken German and German Sign Language?
Melis Çelikkol, Wei Zhao
- From Detection to Explanation: Modeling Fine-Grained Emotional Social Influence Techniques with LLMs and Human Preferences
Maciej Markiewicz, Wiktoria Mieleszczenko-Kowszewicz, Beata Bajcar, Tomasz Adamczyk, Aleksander Szczęsny, Jolanta Babiak, Przemyslaw Kazienko
- Beyond Bias Scores: Unmasking Vacuous Neutrality in Small Language Models
Sumanth Manduru, Carlotta Domeniconi
- LLMs Exhibit Performative Fairness When Generating Profiles with Complex Geopolitical Identities
Maida Aizaz, Quang Minh Nguyen
- Learning Nested Named Entity Recognition from Flat Annotations
Igor Rozhkov, Natalia V Loukachevitch
- Efficient Low-Resource Language Model Using Tokenizer Transfer
Gustaf Gren, Murathan Kurfali
- DRAGOn: Designing RAG On Periodically Updated Corpus
Fedor Chernogorskii, Sergei Averkiev, Liliya Kudraleeva, Zaven Martirosian, Maria Tikhonova, Valentin Malykh, Alena Fenogenova
- Fake News Detection Strategies under Dataset Bias: Using Large-scale Coarse-grained Labels
Yuki Kishi, Yuji Arima, Hitoshi Iyatomi
- Thesis Proposal: A Multi-Agent System for Ontology-Based Perspective-Aware Knowledge Extraction
Luiz do Valle Miranda, Grzegorz J. Nalepa
- A Computational Forensic Linguistic Analysis of Narrative and Question-Answer Structures in Italian Police Interrogation Transcripts
Romane Werner, Thomas François, Sonja Bitzer
- Annotation-Efficient Vision-Language Model Adaptation to the Polish Language Using the LLaVA Framework
Grzegorz Statkiewicz, Alicja Dobrzeniecka, Karolina Seweryn, Aleksandra Krasnodębska, Karolina Piosek, Katarzyna Bogusz, Sebastian Cygert, Wojciech Kusa
- Evaluating the Impact of SAE-based Language Steering on LLM Performance
Sebastian Zwirner, Wentao Hu, Koshiro Aoki, Daisuke Kawahara
- Towards Singable Lyrics Translation Using Large Language Models
Liu Hanze, Yusuke Sakai, Taro Watanabe
- Thesis Proposal: Development of End-to-End Speech Translation Models for Indian Languages
Jamaluddin
- Probabilistic Bilingual Subword Segmentation with Latent Subword Alignment
Shoto Nishida, Daiki Matsui, Takashi Ninomiya, Isao Goto, Akihiro Tamura
- Text-to-Text Automatic Story Generation: A Survey
Yuan Ma, Hanna Suominen, Patrik Haslum, Richard Susilo
- In-Image Machine Translation. A Preliminary Modular Approach
Sergio Gomez Gonzalez, Miguel Domingo, Francisco Casacuberta
- Plasticity vs. Rigidity: The Impact of Low-Rank Adapters on Reasoning on a Micro-Budget
Zohaib Khan, Omer Tafveez, Zoha Hayat Bhatti
- Scale Is All You Need 🙄: Analyzing Modality Interaction and Speaker Intent Without Fine-Tuning
Animesh Gurjar, Nikhil Krishnaswamy
- Token Pruning for Improving Graph-Generating State Space Model Performance
Monish Beegamudre, Jack Zheng, Margaret Capetz
- GraphRAG-Rad: Concept-Aware Radiology Report Generation via Latent Visual-Semantic Retrieval
Faezeh Safari, Hang Dong, ZEYU FU, Aline Villavicencio
- When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models
Zafir Shamsi, Nikhil Chekuru, Zachary Guzman, Shivank Garg
- Chronocept: Instilling a Sense of Time in Machines
Krish Goel, Sanskar Pandey, Mahadevan KS, Harsh Kumar, Vishesh Khadaria
- Acceleration of Backpropagation in Linear Layers of Transformer Models Based on Gradient Structure
Dmitrii Topchii, Alexander Panchenko, Viktoriia A. Chekalina
- The Clinical Fingerprint: Comparing the Rhetorical Integrity and Epistemic Safety of Human Physicians and Large Language Models
Bayram Ayadi
- Communication as a Complex System: Modeling the Feedback Dynamics of Trust and Credibility
Swaptik Chowdhury, Samuel D. Allen, Jung Hee Hyun
- Thesis Proposal: Multimodal Benchmark for Music Understanding in Large Language Models
Tomáš Sourada
- Who Plays Which Role? Protagonist Detection and Classification in Moral Discourse
Mirko Sommer, Maria Becker
- A Benchmark and Evaluation of Automated Language of Study Extraction from Computational Linguistics Publications
Ashwin Kirubakaran, Henry Gagnier
- Kahaani: A Multimodal Co-Creative Storytelling System
Samee Arif, Taimoor Arif, Muhammad Saad Haroon, Aamina Jamal Khan, Agha Ali Raza, Awais Athar
- Exploring the Semantic Space of Second Language Learners
Trisha Godara, Rui He, Wolfram Hinzen, Yan Cong
- CAPID: Context-Aware PII Detection for Question-Answering Systems
Mariia Ponomarenko, Sepideh Abedini, Masoumeh Shafieinejad, D. B. Emerson, Shubhankar Mohapatra, Xi He
- Generalising LLM Routing using Past Performance Retrieval: A Few-Shot Router is Sufficient
Clovis Varangot-Reille, Christophe Bouvard, Antoine Gourru
- Call, Reward, Repeat: Advancing Dialog State Tracking with GRPO and Function Calling
Timur Ionov, Anna Marshalova, Valentin Malykh
- Different Time, Different Language: Revisiting the Bias Against Non-Native Speakers in GPT Detectors
Adnan Al Ali, Jindřich Helcl, Jindřich Libovický
- Trainable, Multiword-aware Tokenization Using Modern Neural Networks
Clara Boesenberg, Kilian Evang
- LEMUR: Robust Fine-Tuning for Multilingual Embedding Models for Retrieval
Narges Baba Ahmadi, Jan Strich, Martin Semmann, Chris Biemann
- Comprehensive Comparison of RAG Methods Across Multi-Domain Conversational QA
Klejda Alushi, Jan Strich, Chris Biemann, Martin Semmann
- Comparing Text Compression Capabilities of Large Language Models with Traditional Compression Algorithms
Mehran Haddadi, William John Teahan
- Construction of an Evaluation Dataset for Hallucination Detection in Japanese Summarization Task
Hikari Tanaka, Atsushi Keyaki, Mamoru Komachi
- Thesis Proposal: Are We Losing Textual Diversity to Natural Language Processing?
Josef Jon, Ondřej Bojar
- Beyond One-Step Distillation: Bridging the Capacity Gap in Small Language Models via Multi-Step Knowledge Transfer
Gaeun Yim, Nayoung Ko, Manasa Bharadwaj
- Thesis Proposal: COGNILENS: Analyzing Cognitive Decline in Language Models for Alzheimer’s Monitoring
Jonathan Guerne
- Thesis Proposal: Efficient KV Cache Reuse for Multi-Document Retrieval-Augmented Generation
Zhipeng Zhang, Dmitry Ilvovsky
- Voice Identification of 1960s Tamil Singers Using Transfer Learning for Preserving Cultural Heritage
Sathiyakugan Balakrishnan, Uthayasanker Thayasivam
- What Persona Are We Missing? Identifying Unknown Relevant Personas for Faithful User Simulation
Weiwen SU, Yuhan Zhou, Zihan Wang, Naoki Yoshinaga, Masashi Toyoda
- Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+
York Hay Ng, Aditya Khan, Xiang Lu, Matteo Salloum, Michael Zhou, Phuong Hanh Hoang, A. Seza Doğruöz, En-Shiun Annie Lee
- Rethinking the Evaluation of Alignment Methods: Insights into Diversity, Generalisation, and Safety
Denis Janiak, Julia Moska, Dawid Motyka, Karolina Seweryn, Paweł Walkowiak, Bartosz Żuk, Arkadiusz Janz
- Machine Translation for Low-Resource Languages through Monolingual Data and LLM: A Case Study of English-to-Basque
Nam Luu, Aitor Soroa, German Rigau, Ondřej Bojar
- Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer
Sinoué GAD, Maxence Lasbordes
- PATCH Dataset: Empowering Traditional Chinese Safety Classifiers for Lightweight LLM
Chi-Wei Chang, Chiung-Jui Chen, Richard Tzong-Han Tsai
- Do Multi-Agents Solve Better Than Single? Evaluating Agentic Frameworks for Diagram-Grounded Geometry Problem Solving and Reasoning
Mahbub E Sobhani, Md. Faiyaz Abdullah Sayeedi, Mohammad Nehad Alam, Proma Hossain Progga, Swakkhar Shatabda
- Domain Adaptation of Image Encoder for Multimodal Manga Translation
Kota Manabe, Tomoyuki Kajiwara, Takashi Ninomiya, Isao Goto, Shonosuke Ishiwatari, Hiroshi Noji
- Mask What Matters: Mitigating Object Hallucinations in Large Vision–Language Models with Object-Aligned Visual Contrastive Decoding
Boqi Chen, Xudong Liu, Jianing Qiu