CLEF AGENDA for 2001

Task Description

There were three main evaluation tracks in CLEF 2001, testing multilingual, bilingual and monolingual (non-English) information retrieval systems.  There was also a special sub-task for domain-specific cross-language evaluation and an experimental track testing interactive cross-language systems.

  1. Multilingual Information Retrieval

    The main task in CLEF 2001 required searching a multilingual document collection for relevant documents. The multilingual collection contained English, German, French, Italian and Spanish documents. Using a selected topic (query) language, the goal was to retrieve relevant documents in all languages of the collection, rather than just for a given language pair, and to list the results in a single merged, ranked list.

    The official topic languages for CLEF 2001 were English, French, German, Italian, Spanish, Dutch and Japanese. However, topics were also made available in Finnish, Russian, Swedish, Chinese and Thai.

  2. Bilingual Information Retrieval

    CLEF 2001 offered two distinct bilingual tracks. As in the previous year, the first consisted of querying a document collection of English texts using any of the other available topic languages. In addition, to assist participants who wanted to test their systems on a less familiar language, a second task provided the opportunity to query a Dutch document collection, again using any other topic language. A stopword list and stemmer for Dutch, plus a small Dutch-English bilingual lexicon, were made available to CLEF participants to assist them in this task.

  3. Monolingual (non-English) Information Retrieval

    We provided the opportunity for monolingual system testing and tuning in Dutch, French, German, Italian and Spanish.

  4. Domain-Specific Mono- and Cross-Language Information Retrieval

    In addition to the three main tasks, there was a special sub-task for CLEF 2001. This task was based on a data collection from a vertical domain (social sciences): the GIRT collection. This collection contains nearly 80,000 German documents in a structured database. Topics were made available in English, German and Russian.

  5. Interactive Cross-Language Information Retrieval

    The goal of the interactive track at CLEF 2001 was to explore evaluation methods for interactive CLIR and to establish baselines against which future research progress can be measured. The track focused on interactive selection of documents that have been automatically translated from a language that the searcher would otherwise have been unable to read. 


The CLEF test collection for 2001 consisted of SGML-formatted newspaper and news agency documents in English, French, German, Italian, Spanish and Dutch, all from the same time period.
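
As an illustration of the "merged, ranked list" required by the multilingual task, the sketch below shows one simple way a participant might combine per-language result lists. It is only an assumption for illustration: CLEF did not prescribe a merging method, and the function names, document identifiers and cutoff value used here are hypothetical. Scores from each language run are min-max normalized so that they become roughly comparable, and the runs are then sorted together.

```python
from typing import Dict, List, Tuple

# One run = list of (document id, retrieval score) for a single language.
Run = List[Tuple[str, float]]


def normalize(run: Run) -> Run:
    """Min-max normalize the scores of one run into the range [0, 1]."""
    if not run:
        return []
    scores = [score for _, score in run]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal; treat every document as equally good.
        return [(doc, 1.0) for doc, _ in run]
    return [(doc, (score - lo) / (hi - lo)) for doc, score in run]


def merge_runs(runs: Dict[str, Run], k: int = 1000) -> Run:
    """Merge per-language runs into one ranked list of at most k documents.

    The cutoff k is an arbitrary illustrative default, not a CLEF rule.
    """
    merged: Run = []
    for lang in sorted(runs):
        merged.extend(normalize(runs[lang]))
    # Highest normalized score first; ties are broken by document id.
    merged.sort(key=lambda pair: (-pair[1], pair[0]))
    return merged[:k]


if __name__ == "__main__":
    # Hypothetical document ids and scores, for demonstration only.
    example = {
        "de": [("de1", 10.0), ("de2", 5.0)],
        "fr": [("fr1", 3.0), ("fr2", 2.0), ("fr3", 1.0)],
    }
    for doc, score in merge_runs(example):
        print(doc, score)
```

Normalizing before merging is a design choice: raw scores from separate language indexes are usually not on the same scale, so sorting them directly would favour whichever run happens to produce larger numbers.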

Important Dates

Data Release - 1 March 2001
Topic Release - from 9 April 2001
Receipt of results from participants - 10 June 2001
Release of relevance assessments and individual results - 25 July 2001
Submission of papers for Working Notes - 6 August 2001
Workshop and Working Notes - 3-4 September 2001