GikiCLEF - Cross-language Geographic Information Retrieval from Wikipedia

Revision as of 10:54, 11 November 2008

GikiCLEF 2009: Cross-language Geographic Information Retrieval from Wikipedia

GikiCLEF 2009 is an evaluation task for CLEF 2009, succeeding the GikiP 2008 pilot task. The task is being co-organized by the following people (listed in alphabetical order by last name; the list is not yet complete):

* Iustin Dornescu
* Corina Forascu
* Pamela Forner
* Danilo Giampiccolo
* Sven Hartrumpf
* Ray Larson
* Johannes Leveling
* Constantin Orasan
* Petya Osenova
* Diana Santos
* Yvonne Skalban

Call for participation


We see GikiCLEF as a joint evaluation task, in which all participants may help shape the task so that it suits their needs. We are currently inviting all participants to join the GikiCLEF mailing list through the following form.

GikiCLEF 2009 task description


GikiCLEF intends to evaluate systems on finding Wikipedia entries / documents that answer a particular information need which requires geographical reasoning of some sort.

GikiCLEF participants must build systems capable of answering a group of geographically challenging topics using the Wikipedia collections. For each topic, a system returns a list of URIs of the documents that contain the correct answers. The exact form of the answer list is still an open subject, as some topics might require, for example, a list of answers ordered by date, by location or along a given storyline. Examples of the GikiP 2008 topics include:

  1. Which African capitals have more than two million inhabitants?
  2. List places where Goethe lived.
  3. What wars occurred on Greek soil?

The GikiCLEF open issues are the following:

Languages

We will use Bulgarian, Dutch, English, German, Italian, Norwegian, Portuguese, Romanian and Spanish for the topics and collections of GikiCLEF 2009. Participants may suggest other languages; if you want to add another language, please contact the GikiCLEF organizers.

Collection

The organizers will make the Wikipedia static dumps for all GikiCLEF languages available, so that participants can pre-process them and test their systems with the new collections. We intend to release the collections by the end of 2008. The Wikipedia snapshots will be converted to XML with the WikiXML tool.

Topics

There will be 50 topics for GikiCLEF 2009, spanning several kinds and different cultures, to cover all GikiCLEF collections. The topics will be released in early March 2009, and after the release, participants will have two weeks to return a list of answers.

We encourage participants to tell the GikiCLEF organizers which kinds of topics they are particularly interested in, so that the final topics may reflect those interests.

Evaluation

GikiCLEF accepts only answers / documents of the correct type. For example, names of people (painters and scientists), names of countries (not of wars or kings), etc. In GikiP 2008, each system's results were evaluated, for each topic, according to the number of correct hits (N) and precision, by the simple formula mult*N*N/total, that is, N times the precision N/total, where total is the number of answers returned and mult rewards multilinguality. The system's final score was given by the average of the individual topic scores.
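
As an illustration, the GikiP 2008 scoring described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the organizers' evaluation code: `total` is read as the number of answers a system returned for the topic, the value of `mult` is supplied by the caller because the text does not say how it is computed, a topic with no returned answers is assumed to score 0, and the names `TopicResult`, `topic_score` and `final_score` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TopicResult:
    """Outcome of one system on one topic (hypothetical container)."""
    correct: int       # N: number of correct hits
    returned: int      # total: number of answers returned (assumed meaning)
    mult: float = 1.0  # multilinguality reward; its derivation is not given in the text


def topic_score(result: TopicResult) -> float:
    """Per-topic score mult * N * N / total, i.e. mult * N * precision."""
    if result.returned == 0:
        # Assumption: a topic with no returned answers scores 0.
        return 0.0
    return result.mult * result.correct * result.correct / result.returned


def final_score(results: Sequence[TopicResult]) -> float:
    """Final system score: the average of the per-topic scores."""
    if not results:
        return 0.0
    return sum(topic_score(r) for r in results) / len(results)


if __name__ == "__main__":
    run = [
        TopicResult(correct=3, returned=4, mult=2.0),
        TopicResult(correct=0, returned=5),
        TopicResult(correct=2, returned=2),
    ]
    print(final_score(run))
```

Because N is multiplied by the precision, a system that returns many answers gains only if most of them are correct: 3 correct hits out of 4 returned score higher than 3 correct hits out of 12.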

For GikiCLEF 2009, we need to develop new measures of system performance that encourage multilinguality and diversity of answers. Any suggestions on this subject are welcome.

Important dates


We aim for early topic development and release, and for a submission deadline earlier than those of the other CLEF tracks, to avoid the 'rush months' of the CLEF tracks. The final dates are still being decided among the organizers.

  1. 9 October 2008 - GikiCLEF mailing list open, call for participation and guideline discussion.
  2. 28-30 October 2008 - Promotion of GikiCLEF at the GIR workshop held at CIKM 2008, Napa Valley, CA, USA.
  3. Until the end of 2008 - Wikipedia collections made available to all participants.
  4. November 2008 - February 2009 - Final definition of the GikiCLEF task. Publication of the details of the task.
  5. March 2009 - Topic release.
  6. (2 weeks after topic release) - Deadline for run submission.
  7. June 2009 - Assessment and results made available.
  8. September 2009 - CLEF workshop in Corfu, Greece.

GikiP 2008 pilot task


The GikiP task was accepted as a pilot task for the GeoCLEF 2008 main track. The organization of GikiP (including topic development, assessment and evaluation of the results) was carried out by Linguateca.

Please visit the main page of GikiP 2008 for more information.

Acknowledgements


GikiCLEF is organized under the scope of CLEF, an activity of the TrebleCLEF Coordination Action. Other related evaluation tasks: QA@CLEF, GeoCLEF.

So far, GikiCLEF is funded by Linguateca, which is jointly funded by the Portuguese Government and the European Union (FEDER and FSE) under contract ref. POSC/339/1.3/C/NAC.

Other material