Overview
Recent advancements in Large Language Models (LLMs) have significantly impacted evaluation methodologies in Information Retrieval (IR), reshaping how relevance, quality, and user satisfaction are assessed. Having first demonstrated their potential for query-document relevance judgments, LLMs are now being applied to more complex tasks, including relevance label generation, assessment of retrieval-augmented generation systems, and evaluation of text-generation quality. As IR systems evolve toward more sophisticated and personalized user experiences that integrate search, recommendations, and conversational interfaces, new evaluation methodologies become necessary.
Building upon the success of our previous workshops, this third iteration of the LLM4Eval workshop at SIGIR 2025 seeks submissions exploring new opportunities, limitations, and hybrid approaches involving LLM-based evaluations.
Important Dates
- Paper submission deadline: April 23, 2025 (AoE)
- Notification of acceptance: May 21, 2025 (AoE)
- Workshop date: July 17, 2025
Topics of Interest
We invite submissions on topics including, but not limited to:
- LLMs for query-document relevance assessment
- Evaluating conversational IR and recommendation systems with LLMs
- Hybrid evaluation frameworks combining LLM and human annotations
- Identifying failure modes and limitations of LLM annotations
- Prompt engineering strategies for improving LLM annotation quality
- Standardizing protocols for reliable LLM-based evaluations
- Bias, fairness, and ethical considerations in LLM evaluations
- LLM annotation robustness, reliability, and reproducibility
- User-centric evaluations, personalization, and subjective assessments with LLMs
- Case studies and lessons from industry applications of LLM-based evaluations
Special Themes
As a core part of this year’s workshop, we will host three breakout sessions, each focused on a special theme that we believe is fundamental to advancing research and practice in LLM-based evaluation. We especially encourage submissions that speak to one of the following themes:
- Using LLMs to Evaluate Different Output: We invite work that explores how LLMs can evaluate complex outputs beyond traditional query-document pairs, such as sets or ranked lists, conversational turns, sessions, or full conversations. This theme excludes Cranfield-style qrel generation.
- Using LLMs to Simulate People: This theme focuses on using LLMs to simulate interactions, capture user context, or impersonate personas in the evaluation process. This may include modeling user behavior, simulating dialogue, or understanding how well LLMs can approximate human judgments.
- Synthesizing a Corpus: We encourage contributions that explore synthetic data generation to address data scarcity in IR evaluation, especially in low-resource or privacy-sensitive contexts. This includes techniques and insights into corpus creation when collecting real-world data is difficult or expensive.
We welcome position papers, opinion pieces, short abstracts, and published or unpublished work that can foster rich discussion within these themes. These contributions will be presented during the workshop and serve as the basis for the breakout discussions.
Submission Guidelines
- We accept:
  - Research papers (up to 9 pages, excluding references)
  - Position papers, opinion pieces, and demo papers
- Papers must follow the SIGIR format
- All papers will undergo double-blind peer review and will be judged on relevance and their potential to spark discussion
- Previously published papers can be submitted in their original format and will be evaluated solely for relevance
- All submissions must be in English and in PDF format
- Submit via EasyChair: https://easychair.org/conferences/?conf=llm4evalsigir25
Publication Option
All accepted papers will be non-archival. Authors are encouraged to upload their papers to platforms such as arXiv.org; these versions will be linked from the workshop website, and the papers remain eligible for submission elsewhere.
Presentation
Details about presentation format will be updated soon. We aim to create an inclusive and engaging environment for sharing your work and fostering discussion.
Contact
For any questions about paper submission, please contact the organizers at:
📩 llm4eval@easychair.org