CLARIN 2025 - 2 October

Post-conference Workshop on LLMs for SSH

Benchmarking and testing of LLMs, and establishing CLARIN as an entity that carries out such assessment, with a strong focus on hands-on sessions and in-depth discussion.

About the Workshop

Welcome to the LLMs for SSH workshop page!

The Post-conference Workshop on LLMs for SSH is an official component of the broader programme of the CLARIN2025 Annual Conference, which serves as the primary yearly gathering for those engaged with the CLARIN infrastructure and its application within the humanities and social sciences.

The workshop aims to produce concrete results in establishing cooperation around building, gathering, and curating collections of test datasets for the evaluation of LLMs within CLARIN. It is primarily targeted at CLARIN centres and associated institutions, especially those participating in the LLMs4SSH K-Centre, but it is also open to newcomers.

The aim is to discuss the creation of a CLARIN-associated portfolio of such resources, potentially as a new collection within the CLARIN Resource Families. Special attention will be given to language- and culture-specific benchmarking and evaluation, as well as to achieving comprehensive coverage. Any discussion of resources for LLM evaluation must also take into account evaluation techniques, procedures, and systems, as well as the computational resources required.

An important challenge is to organise cooperation in such a way that key test resources (e.g. those forming the basis for the most reliable evaluation) are kept, to some extent, within limited-access areas (not fully open), while still enabling inter-centre and pan-CLARIN collaboration on their harmonisation toward a CLARIN-wide, comprehensive, distributed evaluation suite.

The workshop is strongly focused on hands-on discussions, moderated by the workshop organisers from the LLMs4SSH K-Centre, and should conclude with a work plan for the next year. Implementation of the plan, led by the K-Centre, should result in the concrete milestones specified in it.

Programme Skeleton

Time: 14:00 - 15:30
Session (Workshop): Test datasets used in LLM evaluation – quick reports from the participants: types and coverage vs evaluation techniques
Special focus: open resources (benchmarks) and the methodology of their construction, use and maintenance.
Goal: Towards CLARIN’s comprehensive collection of datasets for LLM evaluation and the LLMs4SSH benchmarking set (e.g. seed list example)

Time: 15:30 - 16:00
Session (Break): Coffee break

Time: 16:00 - 17:00
Session (Workshop): Cooperation, synchronisation, and harmonisation in developing non-open datasets for LLM testing
Outcome: Work plan and practical milestones (led by LLMs4SSH) with assigned responsibilities.
Future outlook: Potential CLARIN cooperation with ALT-EDIC Distributed LLM Evaluation Centre (in progress)

Target audience

The workshop targets individuals who are directly involved in the evaluation and development of LLMs, rather than PIs or managers of LLM-related projects. The presence of practitioners will be key to the success of the workshop.

Practical Information

Date
2 October 2025, from 14:00 (3+ hours incl. coffee break)
Location
Eventhotel Pyramide, Vienna, Austria
Attendance
On-site only · 20–30 participants (first come, first served). The number is limited to keep the discussion fruitful and make on-site cooperation possible. If interest during registration is high, the program committee will select participants based on their expressions of interest.

Organising Team

Krzysztof Hwaszcz
Vincent Vandeghinste
Julia Misersky

Programme Committee

Vincent Vandeghinste
Henk van den Heuvel
Maciej Piasecki

The organisation of this workshop would not have been possible without the dedicated support of the CLARIN ERIC team.