Revision as of 15:13, 14 November 2008
GikiCLEF 2009 is an evaluation task under the scope of CLEF. Its aim is to evaluate systems that find Wikipedia entries / documents answering a particular information need, one that requires geographical reasoning of some sort.
GikiCLEF is the successor of the GikiP 2008 pilot task which ran in 2008 under GeoCLEF.
Call for participation
Prospective participants are requested to join the GikiCLEF mailing list through the following form.
Task description
For GikiCLEF, systems will need to answer or address geographically challenging topics on the Wikipedia collections, returning Wikipedia documents as a list of answers.
For example, in GikiP 2008, topics/questions were:
- Which African capitals have a population larger than two million inhabitants?
- List places where Goethe lived.
- Which wars occurred on Greek soil?
And answers, more precisely answer lists, would be:
- Algiers, Cairo, Nairobi, Harare, etc.
- Germany, Darmstadt, Strasbourg, Frankfurt, etc.
- Aetolian War, Cretan War, World War II, etc.
GikiCLEF Languages
So far we have organizers or participants interested in the following languages:
Bulgarian, Dutch, English, German, Italian, Norwegian, Portuguese, Romanian and Spanish.
If you are specifically interested in adding another language, please let us know.
GikiCLEF collections
The Wikipedia collections for all GikiCLEF languages will be made available for pre-processing and testing by the end of 2008. The Wikipedia snapshots for all languages were taken in June 2008.
The Wikipedia snapshots will be converted to XML with the WikiXML tool.
Topics
Fifty (50) topics will be prepared for GikiCLEF. The topic choice committee will strive to devise topics with crosslingual and cultural interest, so that the need for looking in Wikipedia in different languages is real and not artificial.
The topics will be released early March 2009.
Participants are warmly encouraged to tell GikiCLEF organizers about the kind of topics they are particularly interested in, in order for GikiCLEF to reflect the real needs of the community.
Evaluation
Only answers / documents of the correct type are expected (and will therefore be rewarded). All answers will be pooled and then manually assessed by the organizers.
In GikiP 2008, systems were evaluated according to the number of correct hits (N) and precision, by the simple formula mult*N*N/total, for each topic, where mult rewarded multilinguality. The system's final score was given by the average of the individual scores.
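The GikiP 2008 formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "total" is taken to be the number of answers a system returned for the topic (so that N/total is the precision), a topic with no returned answers is given a score of zero, and the function names are hypothetical rather than part of any official scoring tool.

```python
from typing import Iterable, Tuple


def gikip_topic_score(correct: int, returned: int, mult: float = 1.0) -> float:
    """Score one topic as mult * N * N / total.

    Assumption: a topic with no returned answers scores zero.
    """
    if returned == 0:
        return 0.0
    return mult * correct * correct / returned


def gikip_final_score(topics: Iterable[Tuple[int, int, float]]) -> float:
    """Average the per-topic scores; each item is (correct, returned, mult)."""
    scores = [gikip_topic_score(n, total, mult) for n, total, mult in topics]
    return sum(scores) / len(scores) if scores else 0.0
```

For instance, a topic with 3 correct answers out of 4 returned and a multilinguality factor of 2 would score 2*3*3/4 = 4.5 under these assumptions.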
We believe that this was too simple an approach, feasible for three languages but in obvious need of improvement. So, for GikiCLEF 2009, we suggest the following:
Answers for each language are scored separately according to N*precision, and the final score of any system is given by the sum of the scores for each individual language.
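As a rough sketch of this proposal, the snippet below computes N*precision for each language and sums the results. It assumes that N and the number of returned answers are counted per language over the whole run, which the text does not spell out; the function names and the language codes in the usage note are hypothetical.

```python
from typing import Dict, Tuple


def language_score(correct: int, returned: int) -> float:
    """Return N * precision for one language (zero if nothing was returned)."""
    if returned == 0:
        return 0.0
    precision = correct / returned
    return correct * precision


def gikiclef_score(per_language: Dict[str, Tuple[int, int]]) -> float:
    """Sum the per-language scores; values are (correct, returned) pairs."""
    return sum(language_score(n, total) for n, total in per_language.values())
```

With 5 correct answers out of 10 returned for one language and 4 out of 8 for another, the sketch yields 2.5 + 2.0 = 4.5.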
Submission format
Submissions should be encoded in UTF-8, with each line representing an answer given by: i) topic id; ii) space/tab separator; iii) file URI.
For example:
GC34 de/e/x/a/example.xml
GC34 bg/o/t/h/other_example.xml
GC35 pt/d/i/f/different_topic.xml
GC35 es/s/t/i/still_on_the_second_topic.xml
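A submission in this format could be read with a short Python routine such as the one below. It is a minimal sketch, not an official validator: it assumes that neither the topic id nor the file URI contains whitespace, it skips blank lines, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def parse_run_line(line: str) -> Tuple[str, str]:
    """Split one answer line into (topic id, file URI)."""
    parts = line.split()  # accepts spaces or tabs as the separator
    if len(parts) != 2:
        raise ValueError(f"expected 'topic_id file_uri', got: {line!r}")
    return parts[0], parts[1]


def parse_run(text: str) -> Dict[str, List[str]]:
    """Group the file URIs of a whole submission by topic id."""
    answers: Dict[str, List[str]] = defaultdict(list)
    for line in text.splitlines():
        if line.strip():  # ignore blank lines
            topic_id, file_uri = parse_run_line(line)
            answers[topic_id].append(file_uri)
    return dict(answers)
```

When reading a run from disk, opening the file with encoding="utf-8" matches the encoding requested above.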
Further improvements/tracks
As a result of the reflection after GikiP, the following suggestions for the future were made. We request feedback from participants on whether they would also be interested in these issues:
- Presentation of the results
- Geographical diversity
Important dates
We intend to avoid the 'rush-months' of CLEF.
- Until the end of 2008 - Wikipedia collections made available to participants.
- November 2008 - February 2009 - Discussion on the definition of the GikiCLEF final task, and corresponding publication of participation guidelines.
- March 2009 - Topic release and run submission.
- (2 weeks after topic release) - Deadline for run submission.
- June 2009 - Assessment and GikiCLEF results made available.
- September 2009 - CLEF workshop at Corfu, Greece.
Already past
- 9 October 2008 - GikiCLEF mailing list open, call for participation and guideline discussion
- 30 October 2008 - Presenting GikiCLEF at the GIR workshop held at CIKM 2008, Napa Valley, CA, USA
Organization committee
GikiCLEF is being co-organized by (in alphabetical order):
- Sören Auer
- Gosse Bouma
- Nuno Cardoso
- Paula Carvalho
- Iustin Dornescu
- Corina Forascu
- Pamela Forner
- Danilo Giampiccolo
- Sven Hartrumpf
- Ray Larson
- Johannes Leveling
- Thomas Mandl
- Constantin Orasan
- Petya Osenova
- Diana Santos
- Yvonne Skalban
- Christa Womser-Hacker
GikiP 2008 pilot task
The GikiP task was accepted as a pilot task for the GeoCLEF 2008 main track, and its organization (including topic development, assessments and evaluation of the results) was carried out by Linguateca.
Please visit the main page of GikiP 2008 for more information.
Further material
- [Santos et al. 2008]
- Diana Santos, Nuno Cardoso, Paula Carvalho, Iustin Dornescu, Sven Hartrumpf, Johannes Leveling & Yvonne Skalban. "Getting geographical answers from Wikipedia: the GikiP pilot at CLEF". In Francesca Borri, Alessandro Nardi & Carol Peters (eds.), Cross Language Evaluation Forum: Working Notes for the CLEF 2008 Workshop (Aarhus, Denmark, 17-19 September 2008), unpaginated. http://www.linguateca.pt/Diana/download/SantosetalWNCLEF2008.pdf
- [Santos & Cardoso 2008]
- Diana Santos & Nuno Cardoso. "GikiP: Evaluating geographical answers from Wikipedia". In 5th Workshop on Geographic Information Retrieval (GIR'08) (Napa Valley, CA, USA, 30 October 2008), pp. 59-60. http://www.linguateca.pt/Diana/download/SantosCardosoGIR08.pdf Slides
- [Santos & Cardoso 2009]
- Diana Santos & Nuno Cardoso. "REMando para o futuro: reconhecimento de entidades mencionadas e não só". Escola de Verão Belinda Maia (Edv 2009) (FLUP, Porto, Portugal, 29 June - 3 July 2009). Slides
- [Dornescu 2009]
- Iustin Dornescu. "EQUAL - Encyclopaedic QA for Lists". GikiCLEF overview session at CLEF workshop (GikiCLEF) (Corfu, Greece, 30 September - 2 October). Slides
- [Santos & Cabral 2009]
- Diana Santos & Luís Miguel Cabral. "GikiCLEF: Crosscultural issues in an international setting: asking non-English-centered questions to Wikipedia". In Francesca Borri, Alessandro Nardi & Carol Peters (eds.), Cross Language Evaluation Forum: Working notes for CLEF 2009 (Corfu, Greece, 30 September - 2 October), Springer. Slides http://www.linguateca.pt/Diana/download/SantosCabralCLEF2009WN.pdf
- [Hartrumpf & Leveling 2009]
- Sven Hartrumpf & Johannes Leveling. "GIRSA-WP at GikiCLEF: Integration of Structured Information and Decomposition of Questions". GikiCLEF overview session at CLEF workshop (GikiCLEF) (Corfu, Greece, 30 September - 2 October). Slides
- [Larson 2009]
- Ray R. Larson. "Interactive Probabilistic Search for GikiCLEF". GikiCLEF overview session at CLEF workshop (GikiCLEF) (Corfu, Greece, 30 September - 2 October). Slides
- [Cardoso 2009]
- Nuno Cardoso. "GikiCLEF topics and Wikipedia articles: did it blend?". CLEF2009 (Corfu, Greece, 30 September - 2 October). Poster
- [Santos et al. 2009]
- Diana Santos, Nuno Cardoso, Paula Carvalho, Iustin Dornescu, Sven Hartrumpf, Johannes Leveling & Yvonne Skalban. "GikiP at GeoCLEF 2008: Joining GIR and QA forces for querying Wikipedia". In Carol Peters, Tomas Deselaers, Nicola Ferro, Julio Gonzalo, Gareth J.F. Jones, Mikko Kurimo, Thomas Mandl, Anselmo Peñas & Viviane Petras (eds.), Evaluating Systems for Multilingual and Multimodal Information Access: 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17-19, 2008, Revised Selected Papers, 2009, Springer, pp. 894-905. http://www.linguateca.pt/Diana/download/SantosetalGikiPCLEF2008Springer2009.pdf
- [Santos et al. 2010]
- Diana Santos, Nuno Cardoso & Luís Miguel Cabral. "How geographic was GikiCLEF? A GIR-critical review". (FCUL, Lisbon, 26 January 2010). Slides
- [Santos et al. 2010]
- Diana Santos, Nuno Cardoso & Luís Miguel Cabral. "How geographical was GikiCLEF? A GIR-critical review". In 6th Workshop on Geographic Information Retrieval (GIR'10) (Zurich, 18-19 February). http://www.linguateca.pt/Diana/download/SantosCardosoCabralGIR2010.pdf
- [Santos et al. 2010]
- Diana Santos, Luís Miguel Cabral, Corina Forascu, Pamela Forner, Fredric Gey, Katrin Lamm, Thomas Mandl, Petya Osenova, Anselmo Peñas, Alvaro Rodrigo, Julia Schulz, Yvonne Skalban & Erik Tjong Kim Sang. "GikiCLEF: Crosscultural issues in multilingual information access". In Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner & Daniel Tapias (eds.), Proceedings of the International Conference on Language Resources and Evaluation (LREC 2010) (Valletta, Malta, 17-23 May 2010), European Language Resources Association, pp. 2346-2353. http://www.linguateca.pt/Diana/download/SantosetalGikiCLEF.pdf
- [Santos et al. 2010]
- Diana Santos, Luís Miguel Cabral, Pamela Forner, Corina Forascu, Fredric Gey, Katrin Lamm, Thomas Mandl, Petya Osenova, Anselmo Peñas, Alvaro Rodrigo, Julia Schulz, Yvonne Skalban, Erik Tjong Kim Sang & Nuno Cardoso. "GikiCLEF: Crosscultural issues in multilingual information access". Proceedings of the International Conference on Language Resources and Evaluation (LREC 2010) (Valletta, Malta, 17-23 May 2010). Poster
- [Santos & Cabral 2010]
- Diana Santos & Luís Miguel Cabral. "GikiCLEF: Expectations and lessons learned". In Carol Peters, Giorgio Di Nunzio, Mikko Kurimo, Thomas Mandl, Djamel Mostefa, Anselmo Peñas & Giovanna Roda (eds.), Multilingual Information Access Evaluation, Vol. I, September 2010, Springer, pp. 212-222. http://www.linguateca.pt/Diana/download/SantosCabralSpringer2010.pdf
- [Cardoso 2010]
- Nuno Cardoso. "GikiCLEF topics and Wikipedia articles: Did they blend?". In Carol Peters, Giorgio Di Nunzio, Mikko Kurimo, Thomas Mandl, Djamel Mostefa, Anselmo Peñas & Giovanna Roda (eds.), Multilingual Information Access Evaluation, Vol. I, September 2010, Springer.
- [Costa et al. 2012]
- Luís Costa, Cristina Mota & Diana Santos. "SIGA, a System to Manage Information Retrieval Evaluations". In Computational processing of the Portuguese language (PROPOR2012) (Coimbra, April 2012), pp. 173-184. http://www.linguateca.pt/Diana/download/CostaetalPROPOR2012.pdf
- [Mota et al. 2012]
- Cristina Mota, Alberto Simões, Cláudia Freitas, Luís Costa & Diana Santos. "Págico: Evaluating Wikipedia-based information retrieval in Portuguese". In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk & Stelios Piperidis (eds.), Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12) (Istanbul, 23-25 May 2012), pp. 2015-2022. pdf poster pdf