Diana Santos

I work in natural language processing and language engineering at SINTEF Information and Communication Technology, in Oslo, Norway.

My main achievement was launching Linguateca (2000-), a distributed resource center for the processing of the Portuguese language, as a follow-up to the Computational Processing of Portuguese project (1998-2000).

In this connection, I have been involved in organizing several evaluation contests for Portuguese (Morfolimpíadas, HAREM), as well as in adding Portuguese to CLEF, the main international forum for cross-lingual information retrieval.

Also in the scope of Linguateca, I have deployed or helped deploy several important corpus resources for Portuguese, such as AC/DC, COMPARA and the Floresta Sintá(c)tica treebank. I have also supervised the creation of CETEMPúblico, CETENFolha and the CHAVE collection.

To learn about my background, see a short CV or an extended one in Portuguese (not updated since 2004). See also my 251 publications.

Scientific interests

I have worked in corpus processing and analysis, morphological analysis, parsing, tense and aspect modelling, contrastive studies, machine translation, alignment, and Web interfaces to corpora. I am currently also interested in question answering, information retrieval, unobtrusive usability studies, and the future of the Web: Web 2.0 and Web services.

See also my interests page, written in 1997 and slightly updated in 2003.

(Science) political views

I do not believe in anonymous reviews. See why.

For NLP, I believe in language communities (all those who speak Portuguese), not geographical communities (such as Iberian, European or Latin-American).

I believe that people should teach and learn in their own native language, and that scientific publishing in English only is fundamentally wrong (for non-native English speakers). Scientists have the duty to translate and mediate science in their own language, instead of betraying it.


Last modified 17 August 2007.