CLEF 2001 - INFORMATION FOR PARTICIPANTS

Procedure

Those wishing to take part in CLEF 2001 should send an e-mail to Carol Peters (carol@iei.pi.cnr.it), indicating the task(s) in which they intend to participate (see Task Description for 2001), and then complete and submit the appropriate Data Agreement forms (see below).

Multilingual Test Collection

The main test collection for CLEF 2001 consists of SGML-formatted documents from national newspapers, journals and news agencies in English, French, German, Italian, Spanish and Dutch, all from the same time period:
Los Angeles Times - 1994
Le Monde - 1994
Frankfurter Rundschau - 1994, 1995
Der Spiegel - 1994
La Stampa - 1994
Agencia EFE S.A. (Spanish news agency) - 1994
NRC Handelsblad - 1994, 1995
Algemeen Dagblad - 1994, 1995
SDA (Swiss news agency; German, French and Italian data) - 1994

The domain-specific collection consists of a structured database of nearly 80,000 social science documents in German, known as GIRT. A German/English/Russian thesaurus and English translations of the document titles are available.

Data Release Forms

To gain access to the test collections, you must first complete and sign the relevant data release agreement forms and send them (by express mail) to Carol Peters at the address below. On receipt of the forms, you will be sent information on how to download the data.

For copyright reasons, you must fill out two sets of forms: one for the English data, the other for the Dutch, French, Italian, Spanish and German (including GIRT) data. Please indicate clearly which task(s) you will be performing, as we are only authorised to provide data to those who will actually use it in CLEF.

N.B.: While the English data collection is the same as last year's, the collections for the other languages are new or have been enlarged. Thus, if you already have both usage rights and access to the LA Times data, you do not need to submit a Data Release form for this collection, but all participants must complete the Data Release Forms for the other collections.

Everyone should fill in the CLEF 2001 Agreement concerning Dissemination of CLEF Results. The Organisation Applications should be sent to the CLEF Coordinator at the address indicated. The Individual Applications should be sent to, and kept by, the applicant's Organisation. The Organisation will retain the applications of all persons ever granted access to the information and make them available upon request by or on behalf of any of the copyright holders.
A Help File is available to assist you in filling out all the forms.

Data Release Forms for English Data

Organisation Application

Individual Application

Data Release Forms for Dutch, French, German, Italian, Spanish Data

Organisation Application

Individual Application

Access to the Data

When you have submitted the Data Agreement forms and have the necessary password(s), data can be downloaded here. You will also be able to access the "Guidelines for Participation" in the Working Area for Active Participants.

Topics

The topic set will be available in seven main languages (Dutch, English, French, German, Italian, Spanish and Japanese) and additional languages (possibly Greek, Finnish, Swedish and Russian) in the Working Area for Active Participants, from 9 April.

Acknowledgments

We gratefully acknowledge the support of all the data providers and copyright holders, and in particular:

  • The Los Angeles Times, for the English data collection.
  • Le Monde S.A. and ELDA: European Language Resources Distribution Agency, for the French data.
  • Frankfurter Rundschau, Druck und Verlagshaus Frankfurt am Main; Der Spiegel, Spiegel Verlag, Hamburg, for the German newspaper collections.
  • InformationsZentrum Sozialwissenschaften, Bonn, for the GIRT database.
  • Hypersystems Srl, Torino and La Stampa, for the Italian data.
  • Agencia EFE S.A., for the Spanish data.
  • NRC Handelsblad, Algemeen Dagblad and PCM Landelijke dagbladen/Het Parool, for the Dutch newspaper data.
  • Schweizerische Depeschenagentur, Switzerland, for the French, German and Italian Swiss news agency data.

Without their help, this evaluation activity would be impossible.

Contact

Carol Peters - IEI-CNR
Area della Ricerca di San Cataldo, 56100 PISA (Italy)
Tel: +39 050 315 2897 - Fax: +39 050 315 2810
E-mail: carol@iei.pi.cnr.it