I work in natural language processing and language engineering at SINTEF Information and Communication Technology, in Oslo, Norway.
My main achievement was launching a distributed resource center for the processing of the Portuguese language, Linguateca (2000-), as a follow-up to the Computational Processing of Portuguese project (1998-2000).
In this connection, I have been involved in the organization of several evaluation contests for Portuguese (Morfolimpíadas, HAREM) as well as in adding Portuguese to CLEF, the main international forum for crosslingual information retrieval.
Also in the scope of Linguateca, I have deployed or helped deploy several important corpus resources for Portuguese, such as AC/DC, COMPARA and the Floresta Sintá(c)tica treebank. I have also supervised the creation of CETEMPúblico, CETENFolha and the CHAVE collection.
To learn more about my past, see a short CV or an extended one, in Portuguese (not updated since 2004). See also my 251 publications.
See also my interests page, written in 1997 and slightly updated in 2003.
For NLP, I believe in language communities (all those who speak Portuguese), not geographical communities (such as Iberian, European or Latin-American).
I believe that people should teach and learn in their own native language, and that scientific publishing in English only is fundamentally wrong (for non-native English speakers). Scientists have the duty to translate and mediate science in their own language, instead of betraying it.