Labs


CLEF promotes the systematic evaluation of information access systems, primarily through experimentation on shared tasks.

CLEF 2021 consists of a set of 12 Labs designed to test different aspects of multilingual and multimedia IR systems:

  1. Answer Retrieval for Questions on Math
  2. BioASQ: Large-scale biomedical semantic indexing and question answering
  3. CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News
  4. Cheminformatics Elsevier Melbourne University lab
  5. CLEF eHealth
  6. Early Risk prediction on the Internet
  7. ImageCLEF Multimedia Retrieval Challenge in CLEF
  8. LifeCLEF: Multimedia Life Species Identification
  9. Living Labs for Academic Search
  10. PAN Lab on Digital Text Forensics and Stylometry
  11. SimpleText: (Re)Telling right scientific stories to non-specialists via text simplification
  12. Touché: Argument Retrieval

Labs Publications:

  • Lab Overviews published in LNCS Proceedings
  • Labs Working Notes published in CEUR-WS Proceedings
  • Best of Lab Papers will be nominated for submission to the CLEF 2022 LNCS proceedings

Labs Participation:


Important Dates:

  • Labs registration opens: 16 November 2020

Answer Retrieval for Questions on Math ARQMath

ARQMath aims to advance math-aware search and the semantic analysis of mathematical notation and texts.
  • Task 1 – Answer Retrieval: Given a math question post, return relevant answer posts.
  • Task 2 – Formula Search: Given a formula in a math question post, return relevant formulas from both question and answer posts.
  • Lab Coordination: Richard Zanibbi (Rochester Institute of Technology, USA), Douglas Oard (University of Maryland, USA), Anurag Agarwal (Rochester Institute of Technology, USA), Behrooz Mansouri (Rochester Institute of Technology, USA).
  • Contact:
  • https://www.cs.rit.edu/~dprl/ARQMath
  • @ARQMath1

BioASQ: Large-scale biomedical semantic indexing and question answering

The aim of the BioASQ Lab is to push the research frontier towards systems that use the diverse and voluminous information available online to respond directly to the information needs of biomedical scientists.
  • Task 1 – Task A: Large-Scale Online Biomedical Semantic Indexing. In this task, the participants are asked to classify new PubMed documents, before PubMed curators annotate (in effect, classify) them manually. The classes come from the MeSH hierarchy. As new manual annotations become available, they are used to evaluate the classification performance of participating systems.
  • Task 2 – Task B: Biomedical Semantic Question Answering. This task uses benchmark datasets containing development and test questions, in English, along with gold standard (reference) answers constructed by a team of biomedical experts. The participants have to respond with relevant concepts, articles, snippets and RDF triples, from designated resources, as well as exact and 'ideal' answers.
  • Task 3 – Task MESINESP: Medical Semantic Indexing In Spanish. In this task, the participants are asked to classify new IBECS and LILACS documents, before curators annotate them manually. The classes come from the MeSH hierarchy through the DeCS vocabulary. As new manual annotations become available, they are used to evaluate the classification performance of participating systems.
  • Task 4 – Task Synergy: Question Answering for COVID-19. In this task, biomedical experts pose unanswered questions for the developing problem of COVID-19. Participating systems are required to provide answers, which will in turn be assessed by the experts and fed back to the systems, together with updated questions. Through this process, this task aims to facilitate the incremental understanding of COVID-19 and contribute to the discovery of new solutions.
  • Lab Coordination: Anastasia Krithara (National Center for Scientific Research "Demokritos", Greece), Anastasios Nentidis (National Center for Scientific Research "Demokritos", Greece & Aristotle University of Thessaloniki, Greece), George Paliouras (National Center for Scientific Research "Demokritos", Greece), Martin Krallinger (Barcelona Supercomputing Center, Spain), Marta Villegas (Barcelona Supercomputing Center, Spain).
  • Contact:
  • http://www.bioasq.org/workshop2021
  • @BioASQ

CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News

The CheckThat! Lab aims to fight misinformation and disinformation in social media, in political debates, and in the news, with a focus on three tasks (in English, Arabic, German and Spanish): check-worthiness estimation, detecting previously fact-checked claims, and fake news detection.
  • Task 1 - Check-worthiness estimation. Given a piece of text, detect whether it is worth fact-checking. The text can be a tweet or a sentence from a political debate, and the task is offered in English, Arabic, and Spanish.
  • Task 2 - Detecting previously fact-checked claims. Given a check-worthy claim in the form of a tweet, and a set of previously-checked claims, determine whether the claim has been previously fact-checked. The text can be a tweet or a sentence from a political debate, and the task is offered in English, Arabic, and Spanish.
  • Task 3 - Fake news detection. This task has two subtasks: multi-class fake news detection, and domain identification of fake news articles. The task is offered in English.
  • Lab Coordination: Preslav Nakov (Qatar Computing Research Institute, HBKU, Qatar), Giovanni da San Martino (Qatar Computing Research Institute, HBKU, Qatar), Tamer Elsayed (Qatar University, Qatar), Alberto Barron-Cedeno (Università di Bologna, Italy), Ruben Miguez (Newtral Media Audiovisual, Spain), Shaden Shaar (Qatar Computing Research Institute, HBKU, Qatar), Firoj Alam (Qatar Computing Research Institute, HBKU, Qatar), Fatima Haouari (Qatar University, Qatar), Maram Hasanain (Qatar University, Qatar), Nikolay Babulkov (Sofia University, Bulgaria), Alex Nikolov (Sofia University, Bulgaria), Thomas Mandl (University of Hildesheim, Germany), Julia Maria Struß (University of Applied Sciences Potsdam, Germany), Gautam Kishore Shahi (University of Duisburg-Essen, Germany), Sandip Modha (LDRP Institute of Technology and Research, India).
  • Contact:
  • https://sites.google.com/view/clef2021-checkthat

Cheminformatics Elsevier Melbourne University lab ChEMU

The ChEMU (Cheminformatics Elsevier Melbourne University) lab aims to advance the state of the art in information extraction over chemical patents, and to provide an opportunity to address a fundamental NLP task, anaphora resolution, in the context of chemical patents.
  • Task 1 - Chemical Reaction Reference Resolution. This task focuses on capturing paragraph-level reference relationships in chemical patents. Given a description of a chemical reaction, the task requires identification of other chemical reactions or general conditions that it refers to.
  • Task 2 - Anaphora Resolution. This task focuses on expression-level references, and requires identification of reference relationships between expressions in chemical patents. It considers both the coreference relationship, where two expressions refer to the same entity in the real world, and the bridging relationship, where two expressions do not refer to the same entity but are related semantically.
  • Lab Coordination: Karin Verspoor (The University of Melbourne, Australia), Tim Baldwin (The University of Melbourne, Australia), Trevor Cohn (The University of Melbourne, Australia), Saber Akhondi (Elsevier, Australia), Jiayuan He (The University of Melbourne, Australia), Christian Druckenbrodt (Elsevier, Australia), Camilo Thorne (Elsevier, Australia), Biaoyan Fang (The University of Melbourne, Australia), Hiyori Yoshikawa (The University of Melbourne, Australia), Zenan Zhai (The University of Melbourne, Australia).
  • Contact:
  • http://chemu2021.eng.unimelb.edu.au/

CLEF eHealth

This lab aims to support the development of techniques to aid laypeople, clinicians and policy-makers in easily retrieving and making sense of medical content to support their decision making.
  • Task 1 - Information Extraction from Noisy Text: Participants will identify and classify Named Entities in written ultrasonography reports, containing misspellings and inconsistencies, from a major public hospital in Argentina. Named Entities include the challenging Finding class, which encompasses anomalies and presents great variations in its form of expression.
  • Task 2 - Consumer Health Search: Participants must retrieve web pages that fulfil a given patient’s personalised information need. The retrieved pages must satisfy the following criteria: information credibility, quality, and suitability.
  • Lab Coordination: Laura Alonso Alemany (Univ. Nacional de Córdoba, Argentina), Viviana Cotik (Univ. de Buenos Aires, Argentina), Lorraine Goeuriot (Univ. J. Fourier, France), Liadh Kelly (Maynooth University, Ireland), Gabriella Pasi (University of Milano-Bicocca, Italy), Hanna Suominen (Australian National Univ., Australia).
  • Contact:
  • https://clefehealth.imag.fr/
  • @clefhealth

Early Risk prediction on the Internet eRisk

The main purpose of eRisk is to explore issues of evaluation methodology, effectiveness metrics and other processes related to early risk detection.
  • Task 1 - Early detection of pathological gambling. According to WHO, up to 6% of the population suffers from pathological gambling issues. This task will pursue early detection models for emitting alerts on users starting to develop this mental health problem.
  • Task 2 - Early detection of self-harm. Recent studies estimate the prevalence of self-harm in the population at around 6.4%, rising to more than 19% among young people. This task will pursue early detection models for emitting alerts on users starting to develop this mental health problem.
  • Task 3 - Depression-level estimation. According to WHO, more than 320 million people suffer from depression. We present the third edition of the depression-level estimation task. In this task, participants are expected to develop models to automatically fill a standard depression questionnaire based on the user's writings.
  • Lab Coordination: Javier Parapar (Universidade da Coruña, Spain), Patricia Martín-Rodilla (Universidade da Coruña, Spain), David E. Losada (Universidade de Santiago de Compostela, Spain), Fabio Crestani (University of Lugano, Switzerland).
  • Contact:
  • https://erisk.irlab.org/
  • @earlyrisk

ImageCLEF Multimedia Retrieval Challenge in CLEF

ImageCLEF 2021, the Multimedia Retrieval Challenge in CLEF, is set to promote the evaluation of technologies for annotation, indexing, classification and retrieval of multi-modal data, with the objective of providing information access to large collections of images in various usage scenarios and domains.
  • Task 1 - Participants will be requested to develop solutions for automatically identifying individual components from which captions are composed in Radiology Objects in COntext images and to combine natural language processing and computer vision for answering a medical question based on visual image content. The tuberculosis task is on texture analysis in 3D CT images with the objective of generating reports with quantitative data on image findings.
  • Task 2 - Requires participants to automatically segment and label a collection of coral images that can be used in combination to create 3-dimensional models of an underwater environment.
  • Task 3 - Requires participants to detect rectangular bounding boxes corresponding to website User Interface elements from hand-drawn and real-world images. The objective is to be able to automatically generate websites out of proposed layouts.
  • Task 4 - Participants are required to provide automatic rankings of photographic user profiles in a series of real-life situations such as searching for a job, accommodation, insurance, bank loans, etc. The ranking will be based on an automatic analysis of profile images and the aggregation of individual results.
  • Lab Coordination: Bogdan Ionescu (University Politehnica of Bucharest, Romania), Henning Müller (University of Applied Sciences Western Switzerland, Switzerland), Renaud Péteri (University of La Rochelle, France).
  • Contact:
  • https://www.imageclef.org/2021
  • @imageclef

LifeCLEF: Multimedia Life Species Identification

The LifeCLEF lab aims to boost research on the identification and prediction of living organisms in order to close the taxonomic gap and improve our knowledge of biodiversity.
  • Task 1 - PlantCLEF: image-based plant species identification from herbarium sheets.
  • Task 2 - BirdCLEF: bird species identification from bird calls and songs in audio soundscapes.
  • Task 3 - GeoLifeCLEF: location-based species prediction based on environmental and occurrence data.
  • Task 4 - SnakeCLEF: image-based and location-based snake identification.
  • Lab Coordination: Alexis Joly (Inria, LIRMM, Montpellier, France), Henning Müller (HES-SO, Sierre, Switzerland), Hervé Goëau (CIRAD, UMR AMAP, Montpellier, France).
  • Contact:
  • https://www.imageclef.org/LifeCLEF2021

Living Labs for Academic Search LiLAS

In the LiLAS lab, we would like to bring together IR researchers interested in the online evaluation of academic search systems. The goal is to foster knowledge on improving the search for academic resources like literature (ranging from short bibliographic records to full-text papers), research data, and the interlinking between these resources. The employed online evaluation approach in this workshop allows the direct connection to existing academic search systems from the Life Sciences and the Social Sciences. Participants are invited to either submit pre-computed runs based on previously compiled or seed documents or Docker containers of full running retrieval/recommendation systems that run within our evaluation framework called STELLA.
  • Task 1 - Ad-hoc Search Ranking. In Task 1 the participants get access to the LIVIVO corpus that consists of about 80 million documents from more than 50 data sources in multiple languages from the Life Sciences.
  • Task 2 - Research Data Recommendations. The main task here is to provide recommendations of research data that are relevant and related to the publication the user is currently viewing. The data for this task is taken from the academic search system GESIS Search from the Social Sciences.
  • Lab Coordination: Philipp Schaer (TH Köln - University of Applied Sciences, Germany), Johann Schaible (GESIS - Leibniz Institute for the Social Sciences, Germany), Leyla Jael Garcia Castro (ZB MED - Information Centre for Life Sciences, Germany).
  • Contact:
  • https://clef-lilas.github.io

PAN Lab on Digital Text Forensics and Stylometry

PAN is a series of scientific events and shared tasks on digital text forensics and stylometry, studying how to quantify writing style and improve authorship technology.
  • Task 1 - Cross-domain Authorship Verification: Given two texts, determine if they are written by the same author.
  • Task 2 - Multi-Author Writing Style Analysis: Given a text written by two or more authors, find all positions of writing style change.
  • Task 3 - Profiling Haters on Twitter: Given a user's Twitter timeline, determine if they spread hate speech.
  • Lab Coordination: Martin Potthast (Leipzig University, Germany), Paolo Rosso (Universitat Politècnica de València, Spain), Efstathios Stamatatos (University of the Aegean, Greece), Benno Stein (Bauhaus-Universität Weimar, Germany).
  • Contact:
  • https://pan.webis.de/
  • @webis_de

SimpleText: (Re)Telling right scientific stories to non-specialists via text simplification SimpleText-2021

SimpleText aims at contributing to making science more open & accessible via automatic generation of simplified summaries of scientific documents.
  • Pilot Task 1 - Searching for background information: Given a scientific text, provide background information from an external source to help a user to understand it (definitions, context, etc).
  • Pilot Task 2 - Text Simplification: Given a short scientific text (e.g., abstract), generate its simplified version.
  • Lab Coordination: Liana Ermakova (Université de Bretagne Occidentale, France), Patrice Bellot (Université de Aix-Marseille, France), Pavel Braslavski (Ural Federal University, JetBrains Research, Russia), Jaap Kamps (University of Amsterdam, Netherlands), Josiane Mothe (Université de Toulouse, France), Diana Nurbakova (INSA, Lyon, France), Irina Ovchinnikova (Sechenov University, Russia), Eric San-Juan (Institut de technologie de Avignon, France).
  • Contact:
  • https://www.irit.fr/simpleText/
  • @SimpletextW

Touché: Argument Retrieval

Touché has been designed with the goal of establishing a collaborative platform for researchers in the field of argument retrieval and providing tools for developing and evaluating argument retrieval approaches.
  • Task 1: Conversational Argument Retrieval. The goal is to support users who search for arguments to be used in conversations (e.g., getting an overview of pros and cons). Given a query on a controversial topic, the task is to retrieve relevant arguments from a focused crawl of online debate portals.
  • Task 2: Comparative Argument Retrieval. The goal is to support users facing some choice problem from "everyday life". Given a comparative question, the task is to retrieve and rank documents from ClueWeb12 that help to answer the comparative question with arguments and help to make an informed decision.
  • Lab Coordination: Alexander Bondarenko (Martin-Luther-Universität Halle-Wittenberg, Germany), Lukas Gienapp (Leipzig University, Germany), Maik Fröbe (Martin-Luther-Universität Halle-Wittenberg, Germany), Meriem Beloucif (Universität Hamburg, Germany), Yamen Ajjour (Martin-Luther-Universität Halle-Wittenberg, Germany), Alexander Panchenko (Skolkovo Institute of Science and Technology, Russia), Chris Biemann (Universität Hamburg, Germany), Benno Stein (Bauhaus-Universität Weimar, Germany), Henning Wachsmuth (Paderborn University, Germany), Martin Potthast (Leipzig University, Germany), Matthias Hagen (Martin-Luther-Universität Halle-Wittenberg, Germany).
  • Contact:
  • https://touche.webis.de
  • @webis_de