CLEF 2008 | How to Participate

How to Participate

To participate in CLEF 2008 and gain access to the test collections, you must first complete and sign a registration form and (for most tracks) a data release form, and send them to Carol Peters at the address below. On receipt of the forms, you will be sent information on how to download the data. Some information is repeated on the different forms because the forms will be kept at different sites. All forms must be signed by a person authorised by your organisation for such signatures (e.g. Department or Administrative Head or similar).
Please note that previous CLEF participants must resubmit End-User Agreements each year in order to renew their authorisation to access the data covered by the agreement.

For copyright reasons, some tracks require you to sign separate data release forms. In these cases, the forms and data access are provided directly by the track coordinators, as indicated on the track website.

Please complete the forms carefully, inserting all relevant information. First, however, read the task descriptions in the Agenda for CLEF 2008 to see which data collections you should request; this depends on the tasks you will be performing. Remember that access to the data will be provided only once we have received signed, original copies of the forms (not electronic or fax versions).

Registration Form

The Registration Form is your statement of intention to participate in CLEF 2008. All participating groups must submit this form. Please fill it in carefully, providing full details of the tasks in which you intend to participate and the languages you will be using. Registration remains open until 1 May 2008.

End-User Agreement

The End-User Agreement must be submitted in two original signed copies. Even if you have already participated in a CLEF campaign and have access to some of the data, you must have this form (in two copies) completed and signed by the appropriate person in your organisation in order to be authorised to continue to use the data for CLEF 2008. Indicate on this form only the data sets you need, depending on the tasks that you intend to perform. On receipt, both copies will be signed on behalf of CLEF; one will be kept in our archives and the other will be returned to you.

CLEF Document Collections

The following collections are made available through the End-User Agreement. Not all will be used in CLEF 2008.

TEL Data

These collections consist of library catalog records from The European Library (www.theeuropeanlibrary.org).

English: Data provided by The European Library; Copyright British Library (BL)

French: Data provided by The European Library; Copyright Bibliothèque Nationale de France (BnF)

German: Data provided by The European Library; Copyright Austrian National Library (ONB)

News Documents

The CLEF document collection consists of SGML or XML-formatted documents from national newspapers, journals and news agencies from two distinct time periods (1994/1995 and 2002).

1994-5
Dutch: NRC Handelsblad - 1994, 1995; Algemeen Dagblad - 1994, 1995
English: Los Angeles Times - 1994; The Herald - 1995 (also in WSD versions)
Finnish: Aamulehti - 1995 (+ small amount of 1994 data)
French: Le Monde - 1994, 1995; SDA French Swiss news agency data - 1994, 1995
German: Frankfurter Rundschau - 1994, 1995; Der Spiegel - 1994; SDA German Swiss news agency data - 1994, 1995
Italian: La Stampa - 1994; SDA Italian Swiss news agency data - 1994, 1995
Portuguese: Público - 1994, 1995 (Portuguese newspaper); Folha de São Paulo (FSP) - 1994, 1995 (Brazilian newspaper)
Russian: Izvestia - 1995
Spanish: Agencia EFE S.A. (Spanish news agency data) - 1994, 1995
Swedish: Tidningarnas Telegrambyrå - 1994, 1995

1996-2002
Persian: Hamshahri newspaper corpus - 1996-2002

2000, 2001, 2002
Basque: Egunkaria - 2000, 2001, 2002

2002
Bulgarian: Sega - 2002; Standart - 2002; Novinar - 2002
Czech: Mlada fronta DNES - 2002; Lidové Noviny - 2002
English: Los Angeles Times - 2002
Hungarian: Magyar Hirlap - 2002

Domain-Specific Collections
Cambridge Sociological Abstracts (CSA): Bibliographical data from the Sociological Abstracts Database
GIRT4: German Indexing and Retrieval Test Database
RSSC: Russian Social Science Corpus
ISISS: Russian Economic and Social science bibliographic data

Image Collections
ImageCLEFmed Radiological Medical Database
IRMA Database of Medical Images
IAPR-TC12 Benchmark (travel photographs)
