CLEF 2009 | How to Participate
In order to participate in CLEF 2009 and have access to the test collections, all participants must first complete and sign a Registration Form and an End User Agreement, and send them to Carol Peters at the address below. On receipt of the forms, you will be sent information on how to download the data. Some information is repeated on the different forms because they will be kept on different sites. All forms must be signed by a person authorised by your organisation for such signatures (e.g. Department or Administrative Head or similar).
Please note that previous CLEF participants must resubmit End User Agreements each year in order to renew their authorisation to access the data covered by the agreement.
For copyright reasons, some tracks also require you to sign separate data release forms. In these cases, the forms and access to the data are provided directly by the track coordinators. This is indicated on the track website.
Please complete the forms carefully, inserting all relevant information. First, however, read the task descriptions in the Agenda for CLEF 2009 to see which data collections you should request; this depends on the tasks you will be performing. Remember that access to the data will only be provided when we have received signed, original copies of the End User Agreement (not electronic or fax versions).
The Registration Form is your statement of intention to participate in CLEF 2009. All participating groups must submit this form. Please fill it in carefully, providing full details of the tasks in which you intend to participate. Registration remains open until 1 May 2009.
The End User Agreement must be submitted in two original signed copies. Even if you have already participated in a CLEF campaign and have access to some of the data, you must have this form (in two copies) completed and signed by the appropriate person in your organisation in order to be authorised to continue to use the data for CLEF 2009. You should indicate on this form only the data sets you need, depending on the tasks that you intend to perform. When we receive the two copies of this form, both will be signed on behalf of CLEF; one will be kept in our archives and the other will be returned to you.
When you have submitted the Registration and End User Agreement forms, you will be given the necessary information to access the data and the Workspace for Registered Participants.
The topic/question sets will be made accessible via the track websites from 1 April on. Topics will be prepared in a number of languages depending on the task. Please refer to track/task descriptions. Other languages may be added depending on demand.
Acknowledgments
We gratefully acknowledge the support of all the data providers and copyright holders, and in particular:
The Los Angeles Times, for the American-English news data collections;
SMG Newspapers (The Herald) for the British-English news data collection
Le Monde S.A. and ELDA: Evaluations and Language resources Distribution Agency, for the French news data.
Frankfurter Rundschau, Druck und Verlagshaus Frankfurt am Main; Der Spiegel, Spiegel Verlag, Hamburg, for the German newspaper collections.
InformationsZentrum Sozialwissenschaften, Bonn, for the GIRT social science database
SocioNet system for the Russian Social Science Corpora
Institute of Scientific Information for Social Sciences of the Russian Academy of Science (ISISS RAS) for the ISISS database
Hypersystems Srl, Torino and La Stampa, for the Italian news data.
Agencia EFE S.A. for the Spanish news data.
NRC Handelsblad, Algemeen Dagblad and PCM Landelijke dagbladen/Het Parool for the Dutch newspaper data.
Aamulehti Oyj and Sanoma Osakeyhtiö for the Finnish newspaper data
Russika-Izvestia for the Russian newspaper data
Hamshahri newspaper and DBRG, Univ. Tehran, for the Persian newspaper data
Público, Portugal, and Linguateca for the Portuguese (PT) newspaper collection
Folha de São Paulo, Brazil, and Linguateca for the Portuguese (BR) newspaper collection
Tidningarnas Telegrambyrå (TT) SE-105 12 Stockholm, Sweden for the Swedish newspaper data
Schweizerische Depeschenagentur, Switzerland, for the French, German and Italian Swiss news agency data
Ringier Kiadoi Rt. [Ringier Publishing Inc.] and the Research Institute for Linguistics, Hungarian Acad. Sci. for the Hungarian newspaper documents
Sega AD, Sofia; Standart Nyuz AD, Sofia, Novinar OD, Sofia, and the BulTreeBank Project, Linguistic Modelling Laboratory, IPP, Bulgarian Acad. Sci, for the Bulgarian newspaper documents
Mafra a.s. and Lidové Noviny a.s. for the Czech newspaper data
The British Library, Bibliothèque Nationale de France and Austrian National Library for the library catalog records forming part of The European Library (TEL)
University and University Hospitals, Geneva, Switzerland and Oregon Health and Science University for the ImageCLEFmed Radiological Medical Database
Aachen University of Technology (RWTH), Germany for the IRMA database of annotated medical images
The Radiology Dept. of the University Hospitals of Geneva for the Casimage database, PEIR (Pathology Education Image Resource) for the images, and HEAL (Health Education Assets Library) for the annotation of the PEIR dataset
Mallinckrodt Institute of Radiology for permission to use their nuclear medicine teaching file
University of Basel's Pathopic project for their Pathology teaching file
Michael Grubinger, administrator of the IAPR Image Benchmark, Clement Leung who initiated and supervised the IAPR Image Benchmark Project, and André Kiwitz, the Managing Director of Viventura for granting access to the image database and the raw image annotations of the tour guides.
USC Shoah Foundation Institute, and IBM (English) and The Johns Hopkins University Center for Language and Speech Processing (Czech) for the speech transcriptions
AFP Agence France Presse for the English, French and Arabic newswire data used in the INFILE track
LIACS Medialab at Leiden University, The Netherlands and Fraunhofer IDMT, Ilmenau, Germany, for use of the MIRFLICKR 25000 Image Collection
ELDA - Evaluation and language resource Distribution Agency for the EPPS 2005/2006 EN Corpus and EPPS 2005/2006 ES Corpus; Manual and automatic transcriptions of European Parliament Plenary Sessions in English & Spanish
ELDA - Evaluation and language resource Distribution Agency for the ESTER Corpus. Manual and automatic transcriptions of French broadcast news
European Patent Office and Matrixware Information Services GmbH for the CLEF-IP test collection of more than 1M patent documents in English, French and German from the European Patent Office.
Netherlands Institute for Sound and Vision for the VideoCLEF data.
Without their help, this evaluation activity would be impossible.
Please mail all forms by express or priority post to the address below. Access to the data will be provided only on receipt of the relevant forms.
Carol Peters - CLEF Coordinator
ISTI-CNR
Area della Ricerca CNR
Via Moruzzi, 1 - 56124 Pisa, Italy
Fax: +39 050 315 3464/2810 Tel: +39 050 315 2897