Revision as of 08:05, 21 October 2008
GikiCLEF 2009: Cross-language Geographic Information Retrieval from Wikipedia
GikiCLEF 2009 is an evaluation task for CLEF 2009, succeeding the GikiP 2008 pilot task. The task is being co-organized by the following people (listed alphabetically by last name; the list is not yet complete):
- Sören Auer
- Gosse Bouma
- Nuno Cardoso
- Paula Carvalho
- Iustin Dornescu
- Corina Forascu
- Sven Hartrumpf
- Johannes Leveling
- Constantin Orasan
- Diana Santos
- Yvonne Skalban
Call for participation
We see GikiCLEF as a joint evaluation task, in which all participants may contribute to improving the task so that it suits their needs. We invite all participants to join the GikiCLEF mailing list through the following form.
GikiCLEF 2009 task description
GikiCLEF intends to evaluate systems on finding Wikipedia entries / documents that answer a particular information need which requires geographical reasoning of some sort.
GikiCLEF participants must build systems capable of answering a group of geographically challenging topics, using Wikipedia collections and returning, for each topic, a list of document URIs that contain the correct answers (the exact output is still an open subject, as some topics might require, for example, a list of answers ordered by date, location or a given storyline). Examples of the GikiP 2008 topics include:
- Which African capitals have more than two million inhabitants?
- List places where Goethe lived.
- What wars occurred on Greek soil?
The GikiCLEF open issues are the following:
Languages
We will use Dutch, English, German, Norwegian, Portuguese and Romanian as the languages for topics and collections in GikiCLEF 2009. Participants can suggest other languages; if you want to add another language, please contact the GikiCLEF organizers.
Collection
The organizers will make the Wikipedia static dumps for all GikiCLEF languages available for pre-processing and for testing the systems with the new collections. We intend to release the collections by the end of 2008. Two further topics of discussion are whether we should pre-process the collections with the WikiXML tool, in order to provide a collection in the same format as GikiP's collection, and whether we could use some DBpedia data.
Topics
Our goal is to create topics that are as close as possible to a true information need, and well described in natural language. The topics should be of several kinds and drawn from different cultures, so that their coverage differs across the Wikipedia collections.
Open issues include the number of topics, topic difficulty (for instance, whether there will be some anaphoric questions or not), and topic gathering. Participants should contribute to an initial pool of topics, so that it reflects the kinds of questions they are currently tackling in their research. We encourage participants to inform the GikiCLEF organizers of the kinds of topics they are particularly interested in.
Evaluation
Only answers / documents of the correct type are accepted: for example, names of people (painters and scientists) or names of countries (not of wars or kings). In GikiP 2008, each system's results were evaluated per topic according to the number of correct hits (N) and precision, by the simple formula mult*N*N/total, where total is the number of answers returned for the topic and mult rewards multilinguality. A system's final score was the average of its individual topic scores.
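To make the GikiP 2008 formula concrete, here is a minimal Python sketch of it. The function names are hypothetical, and two points are assumptions not stated above: how mult is derived (it is simply passed in as a number here), and that a topic with no returned answers scores zero.

```python
from statistics import mean


def topic_score(correct: int, total: int, mult: float = 1.0) -> float:
    """Score one topic as mult * N * N / total.

    correct -- N, the number of correct answers returned
    total   -- the number of answers returned for the topic
    mult    -- multilinguality reward (assumption: supplied by the caller)
    """
    if total == 0:
        # Assumption: a topic with no returned answers scores zero.
        return 0.0
    return mult * correct * correct / total


def final_score(topic_results) -> float:
    """Average the per-topic scores.

    topic_results -- iterable of (correct, total, mult) tuples, one per topic
    """
    scores = [topic_score(n, t, m) for n, t, m in topic_results]
    return mean(scores) if scores else 0.0
```

Since N*N/total equals N multiplied by the precision N/total, the score rewards systems that return many correct answers while penalizing those that pad their lists with incorrect ones.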
For GikiCLEF 2009, we need to develop new measures to evaluate the performance of the systems in a way that encourages multilinguality and diversity of answers. Any suggestions on this subject are welcome.
Important dates
We aim for early topic development and release, and for an earlier submission deadline than the other CLEF tracks, to avoid the 'rush months' of the CLEF tracks. The final dates are still being decided among the organizers.
- 9 October 2008 - GikiCLEF mailing list open, call for participation and guideline discussion
- 28-30 October 2008 - Promoting GikiCLEF at the GIR workshop held at CIKM 2008, Napa Valley, CA, USA.
- Until the end of 2008 - Wikipedia collections made available to all participants.
- November 2008 - January 2009 - Discussion among participants and organizers on the GikiCLEF evaluation task methodology.
- January - February 2009 - Final definition of the GikiCLEF task. Publication of the details of the task. Topic Release.
- March - August 2009 - Submission of the results. Release of the results and the assessments. GikiCLEF paper submission for the CLEF 2009 working notes.
- September 2009 - CLEF workshop at Corfu, Greece.
GikiP 2008 pilot task
The GikiP task was accepted as a pilot task for the GeoCLEF 2008 main track. The GikiP organization (including topic development, assessments and evaluation of the results) was carried out by Linguateca.
Please visit the main page of GikiP 2008 for more information regarding GikiP 2008.
Acknowledgements
GikiCLEF is organized under the scope of CLEF, an activity of the TrebleCLEF Coordination Action. Other related evaluation tasks: QA@CLEF, GeoCLEF.
So far, GikiCLEF is being funded by Linguateca, which is jointly funded by the Portuguese Government and the European Union (FEDER and FSE) under contract ref. POSC/339/1.3/C/NAC.
Other material
- Diana Santos, Nuno Cardoso, Paula Carvalho, Iustin Dornescu, Sven Hartrumpf, Johannes Leveling & Yvonne Skalban. "Getting geographical answers from Wikipedia: the GikiP pilot at CLEF". In Francesca Borri, Alessandro Nardi & Carol Peters (eds.), CLEF 2008 Working notes (Aarhus, 17-19 September 2008). Working notes PDF, Local copy PDF.
- Diana Santos, Nuno Cardoso, Paula Carvalho, Yvonne Skalban, Iustin Dornescu, Johannes Leveling & Sven Hartrumpf. Getting geographical answers from Wikipedia: the GikiP pilot at CLEF. In Working Notes of CLEF 2008, Århus, Denmark, 19-21 September 2008.
- Diana Santos & Nuno Cardoso. "GikiP: Evaluating geographical answers from Wikipedia". In 5th Workshop on Geographic Information Retrieval (GIR'08) (Napa Valley, CA, USA, November 1 2008).