Revision as of 14:50, 29 September 2008
GikiCLEF 2009: Cross-language Geographic Information Retrieval from Wikipedia
GikiCLEF 2009 is an evaluation task in the Question Answering track of the CLEF 2009 campaign, succeeding the GikiP 2008 pilot task and following its main guidelines. The task is being co-organized by (listed alphabetically by last name; the list is not yet complete):
- Gosse Bouma
- Nuno Cardoso
- Paula Carvalho
- Iustin Dornescu
- Corina Forascu
- Johannes Leveling
- Diana Santos
Are you interested in participating in GikiCLEF 2009? Join the mailing list.
Overview of GikiP 2008 pilot task
The GikiP task was accepted as a pilot task for the GeoCLEF 2008 main track. The GikiP organization (including topic development, assessments and evaluation of the results) was carried out by Linguateca.
The GikiP overview paper in the CLEF 2008 Working Notes details the pilot task and the experiments carried out by the three participants: i) Johannes Leveling and Sven Hartrumpf, from the University of Hagen, Germany (Presentation in PDF), ii) Iustin Dornescu, from the University of Wolverhampton, UK (Presentation in PDF), and iii) Nuno Cardoso, from the University of Lisbon, Portugal (Presentation in PDF).
On the main page of GikiP 2008 you can find the 15 topics used, in English, German and Portuguese, together with the assessments and the results achieved by the participants.
GikiCLEF 2009 task description
The GikiCLEF task description is the same as that of the GikiP pilot task:
Find Wikipedia entries / documents that answer a particular information need which requires geographical reasoning of some sort.
GikiCLEF participants must build systems capable of answering a set of geographically challenging topics using the Wikipedia collection(s) from the QA@CLEF main track (information on how to obtain them is provided upon CLEF registration), returning the URIs of the documents that contain the correct answers for each topic. Examples of the GikiP 2008 topics include:
- Which African capitals have more than two million inhabitants?
- List places where Goethe lived.
- What wars occurred on Greek soil?
Call for participation
We see GikiCLEF as a joint evaluation task, where all participants can contribute to improving the task so that it suits their needs. We are currently asking all participants to join the mailing list, take an active part and make suggestions for the GikiCLEF task. The topics we want to address are the following:
Languages
English, German and Portuguese were used in 2008, and should be used again in GikiCLEF 2009. Maybe Dutch could be used as well? To add a language, we need someone who has it as a mother tongue and who is available for topic translation and for assessing results on the Wikipedia snapshot for that language.
Collection
Since Wikipedia releases periodic snapshots of its static pages and SQL databases, and the results are a simple list of URLs, we can use more up-to-date Wikipedia snapshots as the collection. This has the advantage of making it even easier to add new languages. We could also use some hosting space to store some of the Wikipedia snapshots, to ensure that the collection used in GikiCLEF is always available and unambiguously defined.
Topics
The 15 topics of GikiP 2008 were created by the organizers. They have a degree of geographical complexity, require some sort of geographic reasoning from the systems (as befits a GeoCLEF-motivated pilot task), and are formulated as questions (as befits a QA-flavored task). The main goal is to create topics that are as close as possible to a true information need and as well described as possible in natural language. The topics should span several types, such as those discussed in Gey et al. (2006), given that these facts are bound to be joined in entries about relevant subjects.
We are opening the discussion on the total number of topics, and on whether they should be formulated by the organizers or whether all participants should contribute to a pool of topics that broadly represents the kinds of questions they are currently tackling in their research.
Evaluation
Only answers / documents of the correct type are accepted. For example, names of people (painters, scientists) or names of countries (not of wars or kings). The systems' results in GikiP were evaluated according to the number of correct hits (N) and precision, by the simple formula mult*N*N/total for each topic (that is, N times the precision N/total), where mult rewards multilinguality. A system's final score will be given by the average of the individual topic scores.
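The scoring formula above can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of any GikiCLEF tool. The sketch assumes that total is the number of answers the system returned for the topic, that a topic with no returned answers scores zero, and that mult is supplied by the caller, since its exact values are not defined on this page.

```python
from typing import Iterable, Tuple


def topic_score(correct: int, total: int, mult: float = 1.0) -> float:
    """Score for one topic: mult * N * N / total.

    correct -- N, the number of correct hits
    total   -- assumed here to be the number of answers returned
    mult    -- multilinguality reward (value scheme is an assumption)
    """
    if total == 0:
        # Assumption: a topic with no returned answers scores zero.
        return 0.0
    return mult * correct * correct / total


def final_score(per_topic: Iterable[Tuple[int, int, float]]) -> float:
    """Average of the individual topic scores.

    per_topic -- (correct, total, mult) triples, one per topic
    """
    scores = [topic_score(n, total, mult) for n, total, mult in per_topic]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Two illustrative topics: 3 correct out of 4 returned (mult 1),
    # and 2 correct out of 2 returned (mult 2).
    print(final_score([(3, 4, 1.0), (2, 2, 2.0)]))
```

Because N appears squared, the score rewards both returning many correct answers and keeping precision high: a system returning 3 correct answers out of 4 scores higher on a topic than one returning 1 correct answer out of 1.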
Important dates
- 1 October 2008 - GikiCLEF mailing list open, call for participation and guideline discussion
- 28-30 October 2008 - Promoting GikiCLEF at the GIR workshop held at CIKM 2008, Napa Valley, CA, USA.
- November 2008-February 2009 - Discussion among participants and organizers on the GikiCLEF evaluation format.
- March 2009 - Final definition of the GikiCLEF task. Publication of the details of the task.
- May 2009 - Topic Release.
- June 2009 - Submission of the results.
- July 2009 - Release of the results and the assessments.
- August 2009 - GikiCLEF paper submission for the CLEF 2009 working notes.
- September 2009 - CLEF workshop in Corfu, Greece.
Acknowledgements
References
- Diana Santos, Nuno Cardoso, Paula Carvalho, Iustin Dornescu, Sven Hartrumpf, Johannes Leveling & Yvonne Skalban. "Getting geographical answers from Wikipedia: the GikiP pilot at CLEF". In Francesca Borri, Alessandro Nardi & Carol Peters (eds.), CLEF 2008 Working notes (Aarhus, 17-19 September 2008). Working notes PDF, Local copy PDF.
- Diana Santos, Nuno Cardoso, Paula Carvalho, Yvonne Skalban, Iustin Dornescu, Johannes Leveling & Sven Hartrumpf. Getting geographical answers from Wikipedia: the GikiP pilot at CLEF. In Working Notes of CLEF 2008, Århus, Denmark, 19-21 September 2008.
- Johannes Leveling & Sven Hartrumpf. A fully-automatic approach to answer geographic queries: GIRSA-WP at GikiP. In Working Notes of CLEF 2008, Århus, Denmark, 19-21 September 2008.
- Iustin Dornescu. Digging for information WikipediaQAList@wlv at GikiP. In Working Notes of CLEF 2008, Århus, Denmark, 19-21 September 2008.
- Nuno Cardoso. Towards semantic flavored queries for GIR systems: RENOIR at the GikiP pilot task. In Working Notes of CLEF 2008, Århus, Denmark, 19-21 September 2008.
- Fredric Gey, Ray Larson, Mark Sanderson, Kerstin Bischoff, Thomas Mandl, Christa Womser-Hacker, Diana Santos, Paulo Rocha, Andres Montoyo, Giorgio M. Di Nunzio & Nicola Ferro. Challenges to Evaluation of Multilingual Geographic Information Retrieval in GeoCLEF. In Workshop on Evaluation of Information Access (EVIA) (Tokyo, Japan, May 15 2007), unpaginated.
- Diana Santos & Nuno Cardoso. "GikiP: Evaluating geographical answers from Wikipedia". In 5th Workshop on Geographic Information Retrieval (GIR'08) (Napa Valley, CA, USA, November 1 2008).