Task Description
There were five evaluation tracks in CLEF 2002, testing different aspects of mono- and cross-language information retrieval system performance.
The main track in CLEF 2002 required searching a multilingual document collection for relevant documents. Starting from a selected topic language, the goal was to retrieve documents in all languages of the collection, rather than for just a given language pair, and to present the results in a single merged, ranked list.
The CLEF 2002 document collection for this track contains English, German, French, Italian and Spanish documents. A common set of topics (i.e. structured statements of information needs from which queries are extracted) was prepared in twelve languages: Dutch, English, Finnish, French, German, Italian, Spanish, Swedish, Russian, Portuguese, Japanese and Chinese.
CLEF 2002 also offered a series of additional tracks designed to test different aspects of information retrieval system development.
In the bilingual track, any topic language could be used to search target document collections in Dutch, Finnish, French, German, Italian, Spanish or Swedish. Only first-time CLEF participants could choose to search the English document collection using a European topic language.
Until recently, most IR system evaluation focused on English. CLEF provides the opportunity for monolingual system testing and tuning, and for building test suites in other European languages. CLEF 2002 offered tasks for Dutch, Finnish, French, German, Italian, Spanish and Swedish.
Mono- and Cross-Language Information Retrieval for Scientific Collections
This track offered two distinct tasks:
AMARYLLIS: This task studied system performance in searching a multi-disciplinary scientific database of approximately 150,000 French bibliographic documents. A controlled vocabulary in English and French was provided as a tool for use in the retrieval task. The task was coordinated by Patrick Kremer and Laurent Schmitt, INIST-CNRS, France.
GIRT: This task was based on the GIRT collection, which contains nearly 80,000 German social science documents in a structured database. A German/English/Russian thesaurus and English translations of the document titles were available. The rationale for this task was to study cross-language information retrieval (CLIR) in a vertical domain (i.e. social science). The task was coordinated by Michael Kluck, IZ-Bonn, Germany.
A special interest interactive track was offered again this year. Participating teams used a common experiment design to explore interactive formulation of cross-language queries and/or cross-language document selection. The coordinators were Julio Gonzalo, UNED, Madrid, Spain, and Douglas Oard, University of Maryland, USA. For details, see the iCLEF Web site.