Guidelines for topic creation
=============================

(from the GikiCLEF webpage)

The topic choice committee strove to devise topics of crosslingual and cultural interest, so that the need to look in Wikipedias in different languages is real and not artificial. GikiCLEF topics should conform to the following criteria:

* realistic topics which can be answered in some Wikipedia covered by GikiCLEF
* most topics will be chosen with a cultural bias, so that not every Wikipedia will have the relevant information
* topics may require knowledge of the culture to understand how they should be answered (or, better, what is being sought); this may mean that translation into other languages requires lengthy explanations
* answers have to be justified in at least one Wikipedia (that is, the answer string may be found as an entry in all Wikipedias, but the rest of the information has to be found in at least one)
* questions may include ambiguous concepts or names. In that case, participant systems have to accept that only answers related to the proper disambiguation will be considered correct, e.g. ''Which countries did Bush visit in the first two years of his mandate?'' will not be correctly answered by Kate Bush's travels
* if the topic formulation contains ambiguities that have not been discussed or clarified in the narrative, and that admit more than one acceptable interpretation (e.g. each has a user model), the assessment will accept all such interpretations. E.g. in ''Award-winning Romanian actresses in international cinema festivals'', one would have to accept actresses who actually received prizes, who were just in the audience, or who even hosted the event.
* different answers about the same subject are welcome, though, as in ''Who is (considered to be) the founder of mathematics?'' or ''Name the greatest scientific breakthroughs in the XIXth century''

The GikiCLEF topic management system
====================================

At Linguateca we have developed a system to help a large group cooperatively create topics, define their narratives, translate them and discuss them. We defined the notion of topic owner (currently equated with language group), namely the one who proposed the topic and should know its answers best, because the subject was among his or her interests.

In the system, topic owners were responsible for the narratives of their topics, and could change all translations and remove all answers, irrespective of who had added them. All others (non-owners) could only add or change text in their own language and in English, add and remove the answers they had created themselves, and set or change the SelfJustified bit.

After the initial topic choice and ownership decision, which was done through email and whose results were entered into the GikiCLEF topic management system, members of the topic group were asked to perform the following tasks:

* edit the narrative field and fill in both further clarifications and a user model
* add translations in their languages to the other topics
* add known answers to their topics
* add answers in their language to the other topics
* for all answers added, set the Self_justified bit to Yes or No

The system also allowed statistics gathering, as well as creation of the XML topic set.

Why add answers
==================

The purpose of having the topic group collect (correct) answers to a topic should be further explained, and the limitations of the topic management system in that regard should also be made clear.

The idea of filling in answers was to do (part of) the job beforehand, in order to help later assessment (and also to collect some statistics).
So, members of the topic group were expected to enter correct answers and, in addition, to tell the system whether each answer was self-justified (and to provide only this small piece of justification information). They were not to enter other answers whose type was incorrect, nor more complex justifications.

Both these decisions can obviously be argued against. The first is closely tied to the way the task was defined: answers have to be of the correct kind, not other kinds of pages, however useful. This has been part and parcel of the task definition since the beginning. The second was a design option of the topic management system: it was thought that it would be overkill to ask the members of the topic group to enter long chains of justification, and that it would be quite improbable that systems would find the same chains as they did. So the assessment of justification was left to assessment time.

================================
Main author/editor: Diana Santos