Guidelines for Participation in CLEF 2009 Ad-Hoc Track (Preliminary version)
In these Guidelines, we provide information on the CLEF 2009 test collections, tasks, data manipulation, query construction and results submission for the Ad-Hoc tracks.
The main task offers monolingual and cross-language search on library catalog records in English, French, and German, organised in collaboration with The European Library (TEL@CLEF). The second task is an Ad-Hoc retrieval task on a Persian newspaper corpus (Persian@CLEF). The third task is the robust task (Robust-WSD), which aims at assessing whether word sense disambiguated (WSD) data has an impact on IR system performance.
TEL@CLEF
Two tasks are offered, monolingual and bilingual, each with a "+" variant: Monolingual, Monolingual+, Bilingual, and Bilingual+.
The "+" tasks are those in which the participating group also attempts to use additional tools to cater for the multilinguality of the collections. Groups must state whether their runs are to be considered as "+" runs.
The objective is to query the selected target collection using topics in the same language (monolingual run) or topics in a different language (bilingual run) and to submit the results in a list ranked in decreasing order of relevance.
The topic sets for these two tasks consist of 50 topics and are prepared in English, French, and German. Other topic languages can be offered on request.
Bilingual runs may use topics in any language against the English, French, or German target collections.
Conditions for participation: all groups submitting bilingual runs must also submit at least one baseline monolingual run in the chosen target language(s).
Persian@CLEF
This task is run in collaboration with the Database Research Group of the University of Tehran. It uses the Hamshahri corpus of newspaper articles from 1996-2002; a complete description can be found on the Hamshahri website. Monolingual and bilingual (English to Persian, EN->FA) tasks will be offered, and training and test topics will be made available. The objective is to query the target collection using topics in the same language (monolingual run) or topics in English (bilingual run) and to submit the results in a list ranked in decreasing order of relevance.
Robust WSD
The robust task will bring semantic and retrieval evaluation together. The participants will be offered topics and document collections from previous CLEF campaigns which were annotated by systems for word sense disambiguation (WSD). The goal of the task is to test whether WSD can be used beneficially for retrieval systems.
The robust task will use two languages often used in previous CLEF campaigns (English, Spanish). Documents will be in English, and topics in both English and Spanish.
Full details on this task can be found at http://ixa2.si.ehu.es/clirwsd/.
CONSTRUCTING AND MANIPULATING THE SYSTEM DATA STRUCTURES FOR THE AD-HOC TRACK
1. The system data structures may not be modified in response to CLEF 2009 topics. For example, you cannot add topic words that are not in your dictionary. The CLEF tasks represent the real-world problem of an ordinary user posing a question to a system. In the case of the cross-language tasks, the question is posed in one language and relevant documents must be retrieved whatever the language in which they have been written. If an ordinary user could not make the change to the system, you should not make it after receiving the topics.
3. Only the following fields may be used for automatic retrieval:
LA TIMES 1994: HEADLINE, TEXT only
LA TIMES 2002: HD, LD, TE only
Glasgow Herald: HEADLINE, TEXT only
TELBL: all fields
TELBNF: all fields
TELONB: all fields
Hamshahri: TEXT only
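For illustration only, the following Python sketch shows one way to restrict indexing to the permitted fields. It assumes TREC/CLEF-style SGML markup (e.g. <DOCNO>, <HEADLINE>, <TEXT> elements), and the collection keys and function name are hypothetical; check the actual collection DTDs before relying on it. The TEL collections are omitted because all of their fields may be used.

    import re

    # Fields that may be used for automatic retrieval, per the list above.
    # The collection keys below are illustrative labels, not official names.
    ALLOWED_FIELDS = {
        "LATIMES94": ("HEADLINE", "TEXT"),
        "LATIMES02": ("HD", "LD", "TE"),
        "GLASGOW_HERALD": ("HEADLINE", "TEXT"),
        "HAMSHAHRI": ("TEXT",),
    }

    def extract_indexable_text(doc_sgml, collection):
        # Return the DOCNO and the concatenated content of the allowed fields.
        # Assumes TREC/CLEF-style markup such as <DOCNO>...</DOCNO>; adjust the
        # patterns to the real collection DTDs.
        docno = re.search(r"<DOCNO>\s*(.*?)\s*</DOCNO>", doc_sgml, re.S).group(1)
        parts = []
        for field in ALLOWED_FIELDS[collection]:
            for content in re.findall(rf"<{field}>(.*?)</{field}>", doc_sgml, re.S):
                parts.append(content.strip())
        return docno, " ".join(parts)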
WHAT TO DO WITH YOUR RESULTS
Your results must be sent to the DIRECT results server (address to be communicated), respecting the submission deadlines (see below).
Results have to be submitted in ASCII format, with one line per document retrieved. The lines have to be formatted as follows:
10.2452/451-AH Q0 document.00072 0 0.017416 runidex1
The fields must be separated by ONE blank and have the following meanings:
1) Query identifier. Please use the complete DOI identifier of the topic (e.g. 10.2452/451-AH, not only 451).
INPUT MUST BE SORTED NUMERICALLY BY QUERY NUMBER.
2) Query iteration (will be ignored. Please choose "Q0" for all experiments).
3) Document number (content of the <DOCNO> tag).
4) Rank 0..n (0 is the best matching document. If you retrieve 1000 documents per query, rank will be 0..999, with 0 best and 999 worst). Note that rank starts at 0 (zero) and not 1 (one).
MUST BE SORTED IN INCREASING ORDER PER QUERY.
5) RSV value (a system-specific value that expresses how relevant your system deems a document to be. This is a floating point value. High relevance should be expressed with a high value). If a document D1 is considered more relevant than a document D2, this must be reflected in the fact that RSV1 > RSV2. If RSV1 = RSV2, the documents may be randomly reordered during calculation of the evaluation measures. Please use a decimal point ".", not a comma, and do not use any form of separator for thousands. RSV values must NOT be negative numbers. The only legal characters for RSV values are 0-9 and the decimal point.
MUST BE SORTED IN DECREASING ORDER PER QUERY.
6) Run identifier (please choose a unique ID for each experiment you submit). Only use a-z, A-Z and 0-9; no special characters, accents, etc.
The fields are separated by a single space. The file contains nothing but lines formatted in the way described above.
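For illustration only, here is a minimal Python sketch of how such a file could be produced; the write_run function, the results mapping, and the run identifier are hypothetical names, not part of the official guidelines. It writes one space-separated line per retrieved document, keeps queries in numerical order, starts ranks at 0, and emits at most 1000 documents per query in decreasing RSV order.

    # Illustrative sketch only (not an official CLEF tool).
    # "results" maps a topic DOI (e.g. "10.2452/451-AH") to a list of
    # (document_number, rsv) pairs produced by your retrieval system.

    def topic_sort_key(topic_doi):
        # Sort topics numerically by query number: "10.2452/451-AH" -> 451.
        return int(topic_doi.split("/")[1].split("-")[0])

    def write_run(results, run_id, path, max_docs=1000):
        with open(path, "w", encoding="ascii") as out:
            for topic in sorted(results, key=topic_sort_key):
                # Best documents first: decreasing RSV, rank starting at 0.
                ranked = sorted(results[topic], key=lambda pair: pair[1], reverse=True)
                for rank, (docno, rsv) in enumerate(ranked[:max_docs]):
                    out.write(f"{topic} Q0 {docno} {rank} {rsv:.6f} {run_id}\n")

    # Example:
    # write_run({"10.2452/451-AH": [("document.00072", 0.017416)]}, "runidex1", "myrun.txt")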
You are expected to retrieve 1000 documents per query. An experiment that retrieves a maximum of 1000 documents each for 20 queries therefore produces a file that contains a maximum of 20000 lines.
You should know that the effectiveness measures used in CLEF evaluate the performance of systems at various points of recall. Participants must thus return at most 1000 documents per query in their results. Please note that, by its nature, the average precision measure does not penalize systems that return extra irrelevant documents at the bottom of their result lists. Therefore, you will usually want to use the maximum number of allowable documents in your official submissions.
If you knowingly retrieved fewer than 1000 documents for a topic, please take note of that and check your numbers against those reported by the system during submission.
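A rough local sanity check along the following lines can help catch formatting problems before uploading. This is a Python sketch, not the official input checker, and the check_run function is an illustrative assumption; it verifies only the constraints spelled out above (six space-separated fields, "Q0" in the second field, ranks starting at 0 and increasing, non-negative RSVs in decreasing order within each topic, an alphanumeric run identifier, at most 1000 documents per topic) and prints per-topic document counts for comparison with the numbers reported during submission.

    from collections import defaultdict

    def check_run(path, max_docs=1000):
        # Not the official checker: a minimal local validation of the run format.
        counts = defaultdict(int)
        last_rank, last_rsv = {}, {}
        with open(path, encoding="ascii") as f:
            for line_no, line in enumerate(f, 1):
                fields = line.split()
                assert len(fields) == 6, f"line {line_no}: expected 6 fields"
                topic, q0, docno, rank, rsv, run_id = fields
                rank, rsv = int(rank), float(rsv)
                assert q0 == "Q0", f"line {line_no}: second field must be Q0"
                assert rsv >= 0, f"line {line_no}: RSV must not be negative"
                assert run_id.isalnum(), f"line {line_no}: run id must be a-z, A-Z, 0-9 only"
                assert rank == last_rank.get(topic, -1) + 1, \
                    f"line {line_no}: ranks must start at 0 and increase by 1"
                assert rsv <= last_rsv.get(topic, float("inf")), \
                    f"line {line_no}: RSVs must be in decreasing order per topic"
                last_rank[topic], last_rsv[topic] = rank, rsv
                counts[topic] += 1
                assert counts[topic] <= max_docs, f"line {line_no}: more than {max_docs} documents for {topic}"
        # Print per-topic counts so they can be compared with what DIRECT reports.
        for topic, n in sorted(counts.items()):
            print(topic, n)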
You will have to submit each run through the DIRECT system. An e-mail will be sent to you explaining how to submit your results.
N.B. Please read the following very carefully
TEL Tasks: Bilingual: We accept up to a maximum of 4 runs per language pair.
TEL Tasks: Monolingual: We will also accept a maximum of 4 runs per language for the monolingual task (there are three languages to choose from).
Persian Tasks: We accept up to a maximum of 4 runs for both monolingual and bilingual tasks (a total of no more than 8 runs).
Robust tasks: Participants are required to submit at least one baseline run without WSD and one run using the WSD data. They can submit four further baseline runs without WSD and four runs using WSD in various ways.
In all of the above tasks, in order to facilitate comparison between results, a mandatory run using the Title + Description topic fields is required (per experiment, per topic language).
The absolute deadline for submission of results for all Ad-Hoc tasks is midnight (24.00) Central European Time, Tuesday, 9 June. Detailed information on how and where to submit your results will be communicated shortly.
An input checker program, used by TREC and modified to meet the requirements of CLEF, will be made available shortly.
WORKING NOTES
A clear description of the strategy adopted and the resources you used for each run MUST be given in your paper for the Working Notes. The deadline for receipt of these papers is 30 August 2009. The Working Notes will be distributed to all participants on registration at the Corfu Workshop (30 September - 2 October 2009). This information is considered of great importance; the point of the CLEF activity is to give participants the opportunity to compare system performance with respect to variations in approaches and resources. Groups that do not provide such information risk being excluded from future CLEF experiments.