CLEF 2009



CLEF 2009 offered a series of evaluation tracks to test different aspects of cross-language information retrieval system development. The aim was to promote research into the design of user-friendly, multilingual, multimodal retrieval systems.


There were eight main evaluation tracks in 2009:

Multilingual Document Retrieval (Ad-Hoc)

This track tested mono- and cross-language text retrieval. Tasks in 2009 tested both cross-language (CL) and information retrieval (IR) aspects in a multilingual context.


The track was coordinated by ISTI-CNR (IT), U. Padua (IT), U. Tehran (IR), and U. Basque Country (ES). Further information was available on the track website.

Interactive Cross-Language Retrieval (iCLEF)

Interactive retrieval of images from the Flickr database was again studied. Flickr is a dynamic image database whose labels are provided by creators and viewers in a self-organizing ontology of tags; this labelling activity is naturally multilingual, reactive, and cooperative. The focus was on measuring relevance, user confidence/satisfaction, and user behaviour on a large scale. To serve this purpose, a single multilingual interface to Flickr was used by all participants. Coordinators were UNED (ES), SICS (SE) and U. Sheffield (UK). Further details were available on the track website.

Multiple Language Question Answering (QA@CLEF)

QA@CLEF 2009 proposed three separate exercises: ResPubliQA, QAST and GikiCLEF.


The track was coordinated by CELCT (IT) and UNED (ES). Further details were available on the central track website.

Cross-Language Image Retrieval (ImageCLEF)

This track evaluated retrieval from visual collections; both text and visual retrieval techniques were exploitable. A number of tasks were offered.


Track coordinators were U. Sheffield (UK), U. Applied Sciences Western Switzerland (CH), Oregon Health and Science U. (US), RWTH Aachen (DE), U. Geneva (CH), CWI (NL), and IDIAP (CH). Further details were available on the track website.


Multilingual Information Filtering (INFILE@CLEF)


INFILE (information filtering evaluation) extended the TREC 2002 filtering track as follows: it used a corpus of 100,000 Agence France-Presse comparable newswires in Arabic, English and French, and evaluation was performed by automatic querying of test systems with simulated user feedback. Each system could use the feedback at any time to improve its performance. Test systems provided a boolean decision for each document and filter profile. A curve of the evolution of efficiency was computed, along with the more classical measures used in TREC. INFILE was also open to monolingual participation. Coordinators were CEA (FR), U. Lille (FR) and ELDA (FR). Further details were available on the track website.
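The evaluation protocol described above can be sketched as a small simulation. The rule for accepting a document, the way feedback enriches the profile, and all names below are illustrative assumptions, not the actual INFILE implementation; only the overall shape (boolean decision per document, feedback on accepted documents, a running efficiency curve) follows the track description.

```python
# Hypothetical sketch of the INFILE protocol: a filtering system makes a
# boolean accept/reject decision for each incoming newswire against a
# profile, receives simulated user feedback on accepted documents, and a
# curve of efficiency (here, running F-measure) is computed as the
# stream progresses.

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; 0 when undefined."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def run_filtering(stream, profile_terms, relevant_ids):
    """Filter a document stream against one profile.

    stream        -- iterable of (doc_id, text) pairs
    profile_terms -- set of terms defining the filter profile
    relevant_ids  -- ground truth, used only to simulate user feedback
    """
    terms = set(profile_terms)
    tp = fp = fn = 0
    curve = []  # efficiency curve: F-measure after each document
    for doc_id, text in stream:
        words = set(text.lower().split())
        accept = len(terms & words) >= 2  # naive boolean decision
        if accept:
            if doc_id in relevant_ids:
                tp += 1
                terms |= words  # use positive feedback to enrich profile
            else:
                fp += 1
        elif doc_id in relevant_ids:
            fn += 1
        curve.append(f_measure(tp, fp, fn))
    return curve
```

The point of returning the whole curve rather than a single score is that it shows how a system's use of feedback changes its efficiency over the life of the stream.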

Cross-Language Video Retrieval (VideoCLEF)


VideoCLEF offered classification and retrieval tasks on a video collection containing episodes of dual-language television programming. The collection extended the Dutch/English corpus used for the 2008 VideoCLEF pilot track. Task participants were provided with speech recognition transcripts, metadata and shot-level keyframes for the video data. Two classification tasks were offered: "Subject Classification", which involves automatically tagging videos with subject labels, and "Affect and Appeal", which involves classifying videos according to characteristics beyond their semantic content. A semantic keyframe extraction task and an exercise on identifying related English-language resources to support viewer comprehension of Dutch-language video were also planned. The track was coordinated by Dublin City University (IE) and Delft University of Technology (NL). Further details were available on the track website.

Intellectual Property (CLEF-IP) New this Year


The CLEF-IP track in 2009 used a collection of more than 1M patent documents, mainly derived from EPO sources; the collection covered English, French and German, with at least 100,000 documents in each language. Queries and relevance judgements were produced by two methods. The first used queries produced by intellectual property experts and reviewed by them in a fairly conventional way. The second was an automatic method using patent citations from seed patents. Search results were reviewed to ensure that the majority of test and training queries produced results in more than one language. In 2009 the track kept to the Cranfield evaluation model; refined retrieval process models and assessment tools were expected in subsequent years.
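The citation-based method can be illustrated with a short sketch: each seed patent becomes a topic, the patents it cites form its relevance judgements, and topics whose cited patents cover only one language are filtered out, mirroring the review step that kept most topics multilingual. The data shapes and field names here are assumptions for the example, not the actual track format.

```python
# Hypothetical sketch of automatic topic creation from patent citations.

def build_topics(seed_patents, citations, languages, min_langs=2):
    """Return (seed_id, relevant_ids) pairs for multilingual topics.

    seed_patents -- iterable of seed patent ids
    citations    -- dict: patent id -> list of cited patent ids
    languages    -- dict: patent id -> language code ('EN'/'FR'/'DE')
    """
    topics = []
    for seed in seed_patents:
        cited = citations.get(seed, [])
        langs = {languages[p] for p in cited if p in languages}
        if len(langs) >= min_langs:  # keep topics spanning >1 language
            topics.append((seed, set(cited)))
    return topics
```

The appeal of this method is scale: judgements come for free from the citation record, with human review needed only to check the resulting topic pool.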

The track was coordinated by the Information Retrieval Facility and Matrixware (AT). Further details were available on the track website.

Log File Analysis (LogCLEF) New this Year


LogCLEF dealt with the analysis of queries as expressions of user behaviour. The goal was the analysis and classification of queries in order to improve search systems. LogCLEF had two tasks.
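The kind of query classification studied here can be sketched as reading a search log and grouping queries into coarse categories. The log format, the category names and the rule set below are assumptions made for the example, not the actual LogCLEF data or task definition.

```python
# Illustrative sketch of coarse query classification over a search log.

from collections import Counter

def classify_query(query):
    """Assign a query to a coarse category using simple heuristics."""
    q = query.strip().lower()
    if not q:
        return "empty"
    if q.startswith(("http://", "www.")):
        return "navigational"
    if any(ch.isdigit() for ch in q):
        return "contains-number"
    return "informational"

def summarise_log(queries):
    """Count queries per category across a whole log."""
    return Counter(classify_query(q) for q in queries)
```

Even a crude breakdown like this shows how query classes are distributed, which is the sort of signal a search system can use to adapt its behaviour.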


The coordinators were U. Hildesheim (DE), U. Padua (IT) and Mitre Corp. (US). Further details were available on the track website.


CLEF 2009 also offered a new pilot task:

Grid Experiments (Grid@CLEF)
Multilingual information access (MLIA) is increasingly part of many complex systems, such as digital libraries, enterprise portals, and Web search engines. But do we really know how MLIA components (stemmers, IR models, relevance feedback, translation techniques, etc.) behave with respect to languages? Grid@CLEF launched a cooperative effort in which a series of large-scale, systematic grid experiments aims at improving our comprehension of MLIA systems and gaining an exhaustive picture of their behaviour with respect to languages. Participants were asked to take part in a series of experiments carefully designed to ensure that the tested MLIA components are really comparable across groups, so that differences come only from the languages and tasks at hand. Task coordinators were U. Padua (IT) and NIST (US). Further details were available on the track website.
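The grid design described above can be sketched in a few lines: every combination of language, stemmer and weighting model yields one run configuration, so that any difference between two runs can be attributed to a single varied component. The component names below are illustrative, not the actual Grid@CLEF experiment definitions.

```python
# Minimal sketch of laying out a grid of MLIA experiments as the full
# cross-product of the components under study.

from itertools import product

def experiment_grid(languages, stemmers, models):
    """Enumerate one run configuration per component combination."""
    return [
        {"language": lang, "stemmer": stem, "model": model,
         "run_id": f"{lang}-{stem}-{model}"}
        for lang, stem, model in product(languages, stemmers, models)
    ]

grid = experiment_grid(["en", "fr", "de"],
                       ["none", "porter"],
                       ["bm25", "tfidf"])
# 3 languages x 2 stemmers x 2 models = 12 runs
```

Because every component appears in every context, comparing all runs that differ only in, say, the stemmer isolates the stemmer's effect per language, which is exactly the kind of systematic comparison the track set out to enable.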