CLEF 2006
Agenda
CLEF 2006 offered a series of evaluation tracks to test different aspects of information retrieval system development. The aim was to promote research into the design of user-friendly, multilingual, multimodal retrieval systems. Information on the test collections available for each track can be found in the instructions on How to Participate.
The ad-hoc track tested mono- and cross-language textual document retrieval. As in 2005, the 2006 track offered mono- and bilingual tasks on target collections in French, Portuguese, Bulgarian and Hungarian (possibly also Polish). Topics (i.e. statements of information needs from which queries are derived) were prepared in a wide range of European languages. We also offered a bilingual task aimed at encouraging system testing with non-European languages against an English target collection; topics were supplied in a variety of languages including Amharic, Oromo, Hindi, Telugu and Indonesian.
In addition, a new “robust” task was offered; it emphasized stable performance across languages rather than high average performance in mono-, cross-language and multilingual IR. The robust task was essentially an ad-hoc task that reused test collections previously developed at CLEF. The evaluation methodology considered the geometric average as well as the mean average precision over all topics; the geometric average has proven to be a stable measure of robustness. Document collections were provided in six languages, with a topic set of some 150 topics. In the long term, we are interested in topic difficulty and in failure analysis for hard topics. The track was coordinated jointly by ISTI-CNR and U. Padua (Italy) and U. Hildesheim (Germany). For further details, see the Ad-Hoc website.
Domain-specific retrieval was studied using the GIRT-4 German/English social science and economics database (a pseudo-parallel corpus with identical documents in two languages), together with Russian sociology data from the Russian Social Science Corpus (RSSC) and the ISSIC database. Multilingual controlled vocabularies (German-English, German-Russian, English-Russian) were available. Mono- and cross-language tasks were offered, with topics in English, German and Russian. Participants could use the indexing terms in the documents and/or the social science thesaurus provided, not only for translation but also to tune the relevance decisions of their systems. The track was coordinated by IZ Bonn (Germany). See the Domain-Specific website for more information.
For CLEF 2006, the interactive track joined forces with the image track on a new type of interactive image retrieval task, designed to better capture the interplay between images and the multilingual reality of the internet for the public at large. The task was based on the popular photo-sharing community Flickr (www.flickr.com), a dynamic and rapidly changing database of images with textual comments, captions and titles in many languages, annotated cooperatively by image creators and viewers in a self-organizing ontology of tags (a so-called “folksonomy”). The track was coordinated by UNED (Spain), U. Sheffield (UK) and SICS (Sweden). See the iCLEF website.
Multiple Language Question Answering
This track, which has received increasing interest at CLEF since 2003, evaluated both monolingual (non-English) and cross-language QA systems. Questions were posed in a source language and answers searched for in a document collection in a target language. The languages were Bulgarian, Dutch, English, French, German, Italian, Portuguese and Spanish, and all combinations between them could be explored. The main task evaluated open-domain QA systems that find exact answers to factoid and definition questions; in addition, a pilot task evaluated cross-language QA systems in a real, user-oriented scenario. There was also a pilot task assessing question answering over Wikipedia, the online encyclopedia, as well as an Answer Validation Exercise. The track was organised by several institutions (one for each language) and coordinated by ITC-irst and CELCT, Trento (Italy). Information for participants was available at the QA@CLEF website.
This track evaluated the retrieval of images described by text captions, based on queries in a different language; both text and image matching techniques were potentially exploitable. Five tasks were offered in 2006:
bilingual ad hoc retrieval (collection in English, queries in a range of languages)
interactive cross-language image retrieval task
medical image retrieval (collection with case notes in English, French and German; queries derived from short text plus image(s), covering visual, mixed and semantic queries)
an automatic image annotation task for medical images (fully categorized collection, categories available in English and German)
an annotation task for non-medical images (new this year)
The tasks offered different and challenging retrieval problems for cross-language image retrieval. The first task was also envisaged as an entry-level task for newcomers to CLEF and to CLIR. Image analysis was not required for all tasks; a default visual image retrieval system was made available to participants, as well as results from a basic text retrieval system. Four test collections were made available. The track was coordinated by the University of Sheffield (UK) and the University and University Hospitals of Geneva (Switzerland); Oregon Health and Science University (USA), Victoria University, Melbourne (Australia), RWTH Aachen University (Germany) and Vienna University of Technology (Austria) collaborated in the task organisation. For more information see the ImageCLEF2006 flyer and the ImageCLEF website.
In 2005, the CL-SR track built a reusable test collection for searching spontaneous conversational English speech, using queries in five languages (Czech, English, French, German and Spanish); the collection included speech recognition output for spoken words, manually and automatically assigned controlled-vocabulary descriptors for concepts, dates and locations, manually assigned person names, and hand-written segment summaries. The 2006 CL-SR track extended that collection with additional English speech (about 900 hours), additional resources (word lattices and more accurate speech recognition), and a no-boundary evaluation condition. A second test collection containing at least 500 hours of Czech speech was also created. Multilingual topic sets with 25 topics were created for each language in 2006. The track was coordinated by the University of Maryland (USA) and Dublin City University (Ireland). See the CL-SR website for more information.