
COMPARA corpus contents
version 2.2

(Last updated: 29-Jan-2003)

Text-pairs available
Quantitative summary of currently available texts
Other text-pairs to be included in COMPARA
Previous versions

Text-pairs available

Publication dates in bold refer to first editions

EBDL1T1 Lodge, David

1995 Therapy London: Secker & Warburg, pp 3-97. Copyright © 1995 David Lodge. Used by permission of David Lodge.
SOURCE: English, UK; 43764 tokens, 40464 words, 6873 types. 2148 alignment units.

1995 Terapia translated by Maria do Carmo Figueira. Lisboa: Gradiva, pp 11-88. Copyright © 1995 Gradiva Publicações, Lda. Used by permission of Gradiva Publicações, Lda.
TRANSLATION: Portuguese, Portugal; 45152 tokens, 42383 words, 7932 types.

EBDL1T2 Lodge, David

1995 Therapy London: Secker & Warburg, pp 3-97. Copyright © 1995 David Lodge. Used by permission of David Lodge.
SOURCE: English, UK; 43765 tokens, 40463 words, 6870 types. 2148 alignment units.

1997 Terapia translated by Lídia Cavalcante-Luther. São Paulo: Scipione, pp 11-115. Copyright © 1997 Scipione. Used by permission of Editora Scipione.
TRANSLATION: Portuguese, Brazil; 45194 tokens, 42485 words, 7550 types.

EBDL2 Lodge, David

1989 Nice Work London: Penguin, pp 13-89. Copyright © 1988 by David Lodge. Used by permission of David Lodge.
SOURCE: English, UK; 28167 tokens, 26546 words, 5921 types. 1469 alignment units.

1996 Um almoço nunca é de graça translated by Maria Carlota Pracana. Lisboa: Gradiva, pp 13-67. Copyright © Gradiva Publicações, Lda. Used by permission of Gradiva Publicações, Lda.
TRANSLATION: Portuguese, Portugal; 28988 tokens, 27194 words, 6470 types.

EBDL3T1 Lodge, David

1975 Changing Places London: Secker & Warburg, pp 3-71. Copyright © David Lodge 1975. Used by permission of David Lodge.
SOURCE: English, UK; 30222 tokens, 27683 words, 6044 types. 1479 alignment units.

1995 A Troca translated by Helena Cardoso. Porto: Edições Asa, pp 9-67. Copyright © 1995 Edições Asa. Used by permission of Edições Asa.
TRANSLATION: Portuguese, Portugal; 29252 tokens, 26635 words, 6612 types.

EBDL3T2 Lodge, David

1975 Changing Places London: Secker & Warburg, pp 3-71. Copyright © David Lodge 1975. Used by permission of David Lodge.
SOURCE: English, UK; 30222 tokens, 27683 words, 6044 types. 1479 alignment units.

1998 Invertendo os Papéis translated by Lídia Cavalcante-Luther. São Paulo: Scipione, pp 15-93. Copyright © 1998 Scipione. Used by permission of Scipione.
TRANSLATION: Portuguese, Brazil; 30860 tokens, 29106 words, 6349 types.

EBJB1 Barnes, Julian

1984 Flaubert's parrot Corpus text based on London: Picador, 1985, pp 11-65. Copyright © 1984 by Julian Barnes. Used by permission of Julian Barnes.
SOURCE: English, UK; 21316 tokens, 19805 words, 4605 types. 1191 alignment units.

1988 O papagaio de Flaubert translated by Ana Maria Amador. Corpus text based on Lisboa: Quetzal Editores, 1988, pp 11-74. Copyright © 1988 by Quetzal Editores. Used by permission of Quetzal Editores.
TRANSLATION: Portuguese, Portugal; 20932 tokens, 19005 words, 5044 types.

EBJT1 Trollope, Joanna

1996 Next of kin Corpus text based on London: Black Swan, 1997, pp 7-87. Copyright © 1996 Joanna Trollope. Used by permission of Joanna Trollope.
SOURCE: English, UK; 32470 tokens, 30335 words, 4598 types. 2078 alignment units.

1998 Parentes próximos translated by Ana Falcão Bastos. Corpus text based on Lisboa: Gradiva, 1998, pp 7-77. Copyright © 1998 Gradiva Publicações, Lda. Used by permission of Gradiva Publicações, Lda.
TRANSLATION: Portuguese, Portugal; 32552 tokens, 30300 words, 5716 types.

ESNG1 Gordimer, Nadine

1990 My son's story Corpus text extracted from digital text prepared by the ENPC based on London: Penguin, 1991, pp 3-46. Copyright © 1990 Nadine Gordimer. Used by permission of Nadine Gordimer.
SOURCE: English, South Africa; 15921 tokens, 15046 words, 3053 types. 674 alignment units.

1992 A história do meu filho translated by Geraldo Galvão Ferraz. Corpus text extracted from digital text prepared by the ENPC based on São Paulo: Editora Siciliano, 1992, pp 11-49. Copyright © 1992 Siciliano. Used by permission of Editora Siciliano.
TRANSLATION: Portuguese, Brazil; 15035 tokens, 14134 words, 3595 types.

EUHJ1 James, Henry

1880 Washington Square, http://www.ibiblio.org/gutenberg/etext01/wassq10.txt, Project Gutenberg, 2001, pp 6-43.
SOURCE: English, USA; 23433 tokens, 20922 words, 3018 types. 1083 alignment units.

1990 A Herdeira translated by M. F. Gonçalves de Azevedo. Lisboa: Editorial Estampa, pp 7-77. Copyright © 1990 Editorial Estampa. Used by permission of Editorial Estampa.
TRANSLATION: Portuguese, Portugal; 21127 tokens, 19459 words, 3881 types.

PBAA1 Azevedo, Aluísio

1890 O Cortiço, http://vbookstore.uol.com.br/nacional/aluisioazevedo/cortico.shtml [06/12/1999].
SOURCE: Portuguese, Brazil; 8324 tokens, 7900 words, 2376 types, 291 alignment units.

2000 The slum: a novel translated by David Rosenthal. New York: Oxford University Press, pp 1-17. Copyright © 2000 by Oxford University Press. Used by permission of Oxford University Press, Inc.
TRANSLATION: English, US; 7451 tokens, 6934 words, 2025 types.

PBJA1T1 Alencar, José de

1865 Iracema http://vbookstore.uol.com.br/nacional/josedealencar/iracema.shtml, pp 1-12 [06/12/1999]. Corpus text extracted from digital text prepared by the Biblioteca Virtual do Estudante Brasileiro, based on 24th ed., São Paulo: Ática, 1991.
SOURCE: Portuguese, Brazil; 3741 tokens, 3470 words, 1249 types, 219 alignment units.

2000 Iracema translated by Clifford Landers. Corpus text based on New York: Oxford University Press, 2000, pp 1-17. Copyright © 2000 by Oxford University Press. Used by permission of Oxford University Press.
TRANSLATION: English, US; 4199 tokens, 3862 words, 1177 types.

PBJA1T2 Alencar, José de

1865 Iracema http://vbookstore.uol.com.br/nacional/josedealencar/iracema.shtml, pp 1-12 [06/12/1999]. Corpus text extracted from digital text prepared by the Biblioteca Virtual do Estudante Brasileiro, based on 24th ed., São Paulo: Ática, 1991.
SOURCE: Portuguese, Brazil; 3741 tokens, 3470 words, 1249 types, 219 alignment units.

1886 Iracema, the honey lips: a legend of Brazil translated by Lady Isabel Burton. Corpus text based on London: Bickers, 1886, pp 1-17.
TRANSLATION: English, UK; 4292 tokens, 3939 words, 1303 types.

PBMA1 Machado de Assis, J.

1886 Quincas Borba http://www.vbookstore.com.br/nacional/machadodeassis/quincas.shtml pp 1-21 [06/12/1999] Corpus text extracted from digital text prepared by Núcleo de Pesquisas em Informática, Literatura e Lingüística da Universidade Federal de Santa Catarina, based on Obra Completa, by Machado de Assis, vol. I, Nova Aguilar, Rio de Janeiro, 1994.
SOURCE: Portuguese, Brazil; 13726 tokens, 12614 words, 3184 types, 841 alignment units.

1998 Quincas Borba edited by Celso Favaretto and David T. Haberly; translated by Gregory Rabassa. Corpus text based on New York: Oxford University Press, 1998, pp 5-45. Copyright © 1998 by Oxford University Press Inc. Used by permission of Oxford University Press, Inc.
TRANSLATION: English, US; 15788 tokens, 14307 words, 2659 types.

PBMA2 Machado de Assis, Joaquim

1881 Memórias póstumas de Brás Cubas http://vbookstore.uol.com.br/nacional/machadodeassis/cubas.shtml pp 1-24 [06/12/1999] Corpus text extracted from digital text prepared by Renato Lima for Virtual bookstore, livros e literatura.
SOURCE: Portuguese, Brazil; 12332 tokens, 11554 words, 3367 types, 491 alignment units.

1998 The posthumous Memoirs of Brás Cubas translated by Gregory Rabassa. Corpus text based on New York: Oxford University Press, 1998, pp 5-34. Copyright © 1998 by Oxford University Press Inc. Used by permission of Oxford University Press, Inc.
TRANSLATION: English, US; 13324 tokens, 12453 words, 2905 types.

PBMA3 Machado de Assis, Joaquim

1899 Dom Casmurro http://vbookstore.uol.com.br/nacional/machadodeassis/casmurro.shtml pp 1-18 [06/12/1999] Corpus text extracted from digital text prepared by Núcleo de Pesquisas em Informática, Literatura e Lingüística da Universidade Federal de Santa Catarina.
SOURCE: Portuguese, Brazil; 11919 tokens, 11021 words, 2958 types, 661 alignment units.

1997 Dom Casmurro translated by John Gledson. Corpus text based on New York: Oxford University Press, 1997, pp 3-39. Copyright © 1997 by Oxford University Press Inc. Used by permission of Oxford University Press, Inc.
TRANSLATION: English, US; 13978 tokens, 12796 words, 2543 types.

PBPC1 Coelho, Paulo

1988 O alquimista Corpus text based on Rio de Janeiro: Rocco, 2000, pp 21-79. Copyright © 1988 by Paulo Coelho. Used by permission of Paulo Coelho.
SOURCE: Portuguese, Brazil; 11742 tokens, 10775 words, 2242 types, 772 alignment units.

1993 The alchemist translated by Alan Clarke. Corpus text based on London: Thorsons, 1997, pp 3-50. Copyright © by Paulo Coelho & Alan Clarke. Used by permission of Paulo Coelho and Alan Clarke.
TRANSLATION: English, US; 13078 tokens, 11897 words, 1787 types.

PBPM1 Melo, Patrícia

1998 O elogio da mentira Corpus text based on São Paulo: Companhia das Letras, 1998, pp 11-61. Copyright © 1998 by Patrícia Melo. Used by permission of Patrícia Melo.
SOURCE: Portuguese, Brazil; 15152 tokens, 14036 words, 3456 types, 1025 alignment units.

1999 In praise of lies translated by Clifford Landers. Corpus extract based on London: Bloomsbury, 1999, pp 1-53. Copyright © 1999 by Clifford Landers. Used by permission of Clifford Landers.
TRANSLATION: English, US; 16670 tokens, 15377 words, 2957 types.

PBRF1 Fonseca, Rubem

1988 Vastas emoções e pensamentos imperfeitos São Paulo: Companhia das Letras, pp 7-102. Copyright © Rubem Fonseca. Used by permission of Rubem Fonseca.
SOURCE: Portuguese, Brazil; 34654 tokens, 29641 words, 5891 types, 2860 alignment units.

1997 The lost manuscript translated by Clifford Landers. London: Bloomsbury, pp 3-107. Copyright © 1997 by Clifford Landers. Used by permission of Clifford Landers.
TRANSLATION: English, US; 35469 tokens, 32320 words, 4766 types.

PMMC1 Couto, Mia

1987 Vozes Anoitecidas. Lisboa: Editorial Caminho, pp 23-64. Copyright © Editorial Caminho 1987. Used by permission of Editorial Caminho.
SOURCE: Portuguese, Mozambique; 7599 tokens, 6774 words, 2190 types, 750 alignment units.

1990 Voices Made Night translated by David Brookshaw. Oxford: Heinemann, pp 1-28. Copyright © 1990 by David Brookshaw. Used by permission of David Brookshaw.
TRANSLATION: English, UK; 9201 tokens, 8403 words, 1977 types.

PPCC1 Castelo Branco, Camilo

1862 Amor de Perdição corpus extract taken from Obras integrais de Camilo Castelo Branco em CD-ROM. Projecto Vercial 2000, pp 60-92.
SOURCE: Portuguese, Portugal; 19018 tokens, 17499 words, 3960 types, 1210 alignment units.

2000 Doomed Love translated by Alice Clemente. Providence: Gávea Brown, pp 96-144. Copyright © 2000 Alice Clemente and Gávea-Brown Publications. Used by permission of Alice Clemente and Gávea-Brown Publications.
TRANSLATION: English, USA; 20420 tokens, 18099 words, 3007 types.

PPEQ1 Eça de Queirós, J. M.

1880 O mandarim http://www.ipn.pt/opsis/litera/queiros.htm pp 2-10 [10/12/1999]. Corpus text extracted from digital text prepared by Deolinda Cabrera for Projecto Vercial.
SOURCE: Portuguese, Portugal; 7274 tokens, 6754 words, 2499 types, 303 alignment units.

1993 The mandarin in The mandarin and other stories translated by Margaret Jull Costa. Corpus text based on Sawtry: Dedalus Ltd, 1993, pp 5-25. Used by permission of Margaret Jull Costa and Dedalus Press.
TRANSLATION: English, UK; 8195 tokens, 7788 words, 2413 types.

PPMC1 Carvalho, Mário de

1994 Um deus passeando pela brisa da tarde Corpus text based on Lisboa: Caminho, 1995, pp 13-87. Copyright © 1994 Editorial Caminho. Used by permission of Editorial Caminho.
SOURCE: Portuguese, Portugal; 25012 tokens, 23499 words, 6463 types, 1394 alignment units.

1997 A god strolling in the cool of the evening translated by Gregory Rabassa. Corpus text based on London: Phoenix, 1999, pp 1-79. Copyright © 1997 by Louisiana State University Press. Used by permission of Louisiana State University Press.
TRANSLATION: English, US; 27120 tokens, 25305 words, 4892 types.

PPSC1 Sá-Carneiro, Mário de

1914 A confissão de Lúcio http://www.ipn.pt/opsis/litera/carneiro.htm pp 2-19 [10/12/1999]. Corpus text extracted from digital text prepared by Deolinda Cabrera for Projecto Vercial, based on orthographically updated 1st edition, Lisbon, 1914.
SOURCE: Portuguese, Portugal; 10252 tokens, 9440 words, 2796 types, 615 alignment units.

1993 Lucio's confession translated by Margaret Jull Costa. Corpus text based on Sawtry: Dedalus Ltd, 1993, pp 19-50. Copyright © 1993 by Dedalus Ltd. Used by permission of Dedalus Ltd.
TRANSLATION: English, UK; 11406 tokens, 10752 words, 2419 types.
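
The SOURCE and TRANSLATION lines in the entries above follow a regular pattern (language, variety, then token, word and type counts, with an alignment-unit count on SOURCE lines only). The following is a minimal sketch of how such lines could be read into records, assuming they are copied verbatim from this page. The pattern is inferred from the entries listed here and is not an official COMPARA format; the function name `parse_stats_line` and the dictionary keys are hypothetical.

```python
import re

# Pattern inferred from the entries on this page; not an official COMPARA format.
# SOURCE lines separate the type count from the alignment units with "." or ",".
STATS_RE = re.compile(
    r"^(?P<role>SOURCE|TRANSLATION):\s*"
    r"(?P<language>[^,;]+),\s*(?P<variety>[^;]+);\s*"
    r"(?P<tokens>\d+) tokens,\s*(?P<words>\d+) words,\s*(?P<types>\d+) types"
    r"(?:[.,]\s*(?P<units>\d+) alignment units)?\.?\s*$"
)


def parse_stats_line(line):
    """Return a dict of the fields in one statistics line, or None if it does not match."""
    m = STATS_RE.match(line.strip())
    if m is None:
        return None
    return {
        "role": m["role"],
        "language": m["language"].strip(),
        "variety": m["variety"].strip(),
        "tokens": int(m["tokens"]),
        "words": int(m["words"]),
        "types": int(m["types"]),
        # Only SOURCE lines carry an alignment-unit count.
        "alignment_units": int(m["units"]) if m["units"] else None,
    }


source = parse_stats_line(
    "SOURCE: English, UK; 43764 tokens, 40464 words, 6873 types. 2148 alignment units."
)
translation = parse_stats_line(
    "TRANSLATION: Portuguese, Portugal; 45152 tokens, 42383 words, 7932 types."
)
print(source)
print(translation)
```

Lines without counts, such as those for text-pairs still in preparation, do not match the pattern and yield None.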


Quantitative summary of present corpus

Aligned texts          Source texts   Translations
Portuguese             13             9
English                7              14
Total                  20             23

Translation notes
Portuguese 42
English 86
Total 128

Alignment units
Portuguese to English direction 11651
English to Portuguese direction 13749
Total 25400

Words                  Source texts   Translations   Source texts & translations
Portuguese             168447         250701         419152
English                248947         184232         433184
Portuguese & English   417394         434933         852336

Words/alignment unit   Source texts   Translations   Source texts & translations
Portuguese             14.46          18.23          16.50
English                18.11          15.81          17.05
Portuguese & English   16.43          17.12          33.56
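
The source-text and translation columns of the words/alignment unit table can be reproduced from the "Alignment units" and "Words" tables on this page. The sketch below assumes that each word count is divided by the number of alignment units in the direction the texts belong to (Portuguese sources and English translations in the Portuguese to English direction, and the reverse for the other direction). This reading is inferred from the figures, not stated on the page, and the names used are hypothetical.

```python
# Figures copied from the "Alignment units" and "Words" tables on this page.
UNITS = {"pt_to_en": 11651, "en_to_pt": 13749}
WORDS = {
    "Portuguese": {"source": 168447, "translation": 250701},
    "English": {"source": 248947, "translation": 184232},
}


def words_per_unit(words, units):
    """Average number of words per alignment unit, rounded to two decimals."""
    return round(words / units, 2)


# Assumption: Portuguese sources and English translations share the
# Portuguese to English units; English sources and Portuguese translations
# share the English to Portuguese units.
table = {
    "Portuguese": {
        "source": words_per_unit(WORDS["Portuguese"]["source"], UNITS["pt_to_en"]),
        "translation": words_per_unit(WORDS["Portuguese"]["translation"], UNITS["en_to_pt"]),
    },
    "English": {
        "source": words_per_unit(WORDS["English"]["source"], UNITS["en_to_pt"]),
        "translation": words_per_unit(WORDS["English"]["translation"], UNITS["pt_to_en"]),
    },
}

for language, row in table.items():
    print(language, row["source"], row["translation"])
```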

Types                  Source texts   Translations   Source texts & translations
Portuguese             24531          24499          37948
English                17458          15343          25743

Tokens                 Source texts   Translations   Source texts & translations
Portuguese             184486         269092         453582
English                269280         200591         469876
Portuguese & English   453766         469683         923458

Type/token ratio       Source texts   Translations   Source texts & translations
Portuguese             0.1456         0.0977         0.0905
English                0.0701         0.0833         0.0594
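
The ratios in this table match the "Types" figures divided by the "Words" figures, not by the "Tokens" figures. This is an observation from the numbers on this page, not a documented definition, and the sketch below only checks it. The names used are hypothetical.

```python
# Figures copied from the "Types", "Words" and "Tokens" tables on this page.
TYPES = {
    "Portuguese": {"source": 24531, "translation": 24499, "both": 37948},
    "English": {"source": 17458, "translation": 15343, "both": 25743},
}
WORDS = {
    "Portuguese": {"source": 168447, "translation": 250701, "both": 419152},
    "English": {"source": 248947, "translation": 184232, "both": 433184},
}
TOKENS = {
    "Portuguese": {"source": 184486, "translation": 269092, "both": 453582},
    "English": {"source": 269280, "translation": 200591, "both": 469876},
}


def ratio(types, total):
    """Number of distinct types per running item, rounded to four decimals."""
    return round(types / total, 4)


# Assumption being checked: the published ratio is types divided by words.
by_words = {
    lang: {col: ratio(TYPES[lang][col], WORDS[lang][col]) for col in TYPES[lang]}
    for lang in TYPES
}
# For comparison: the same types divided by the token counts.
by_tokens = {
    lang: {col: ratio(TYPES[lang][col], TOKENS[lang][col]) for col in TYPES[lang]}
    for lang in TYPES
}

for lang in TYPES:
    print(lang, "types/words:", by_words[lang], "types/tokens:", by_tokens[lang])
```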


Text-pairs in preparation

PPCP1 Cardoso Pires, José

1983 Balada da Praia dos Cães Lisboa: Edições "O Jornal", pp 13-60. First published by Publicações Dom Quixote, 1982. Copyright © 1982 by José Cardoso Pires. Used by permission of Maria Edite Cardoso Pires.
SOURCE: Portuguese, Portugal

1986 Ballad of Dog's Beach translated by Mary Fitton. London: John M. Dent, pp 5-38. Copyright © 1986 by John M. Dent. Used by permission of The Orion Publishing Group Ltd.
TRANSLATION: English, UK.


Other text-pairs to be included in COMPARA


Author Year
Source text Place Publisher
Translator Year
Translation Place Publisher

Agualusa, José Eduardo 1990
D. Nicolau Água Rosada e outras histórias verdadeiras Lisboa Vega
Levitin, Alexis 1995
The incredible but true story of Prince Nicolau Água Rosada Madison The Literary Review

Agualusa, José Eduardo 1992
A feira dos assombrados Lisboa Vega
Zenith, Richard 1994
Shadow Town Prague Trafika

Azevedo, Aluísio 1881
O mulato WWW Virtual Bookstore, Livros e Literatura
MacNicoll, Graeme 1990
Mulatto Cranbury, NJ Associated University Presses

Barnes, Julian 1989
The history of the world in 10 1/2 chapters London Picador


Uma história do mundo em 10 1/2 Capítulos Lisboa Quetzal Editores

Buarque, Chico 1991
Estorvo São Paulo Companhia das Letras
Bush, Peter 1992
Turbulence London Bloomsbury

Buarque, Chico 1995
Benjamim São Paulo Companhia das Letras
Landers, Clifford 1997
Benjamin London Bloomsbury

Carroll, Lewis 1871
Through the looking glass WWW Project Gutenberg
Arriaga, Yolanda & N. Videira & L. Lobo
Alice do outro lado do espelho Lisboa Editorial Estampa

Coelho, Paulo 1987
Diário de um mago Rio de Janeiro Rocco
Clarke, Alan 1995
The pilgrimage: a contemporary quest for ancient wisdom New York HarperCollins

Coelho, Paulo 1996
O monte cinco Rio de Janeiro Objetiva
Landers, Clifford 1998
The fifth mountain London Harper

Conrad, Joseph 1902
Heart of darkness WWW Project Gutenberg
Fernandes, Aníbal 1983
O coração das trevas Lisboa Editorial Estampa

Couto, Mia 1990
Cada homem é uma raça Lisboa Editorial Caminho
Brookshaw, David 1993
Every man is a race Oxford Heinemann

Dourado, Autran 1975
Os sinos da agonia Rio de Janeiro Editora Expressão e Cultura
Parker, John 1988
The bells of agony London Peter Owen

Dourado, Autran 1973
O risco do bordado Rio de Janeiro Editora Expressão e Cultura
Parker, John 1984
Pattern for a tapestry London Peter Owen

Eça de Queirós, J. M.
Alves e Co. WWW Projecto Vercial
Vetch, John 1993
The yellow sofa Manchester Carcanet Press Limited

Eça de Queirós, J. M. 1887
A relíquia WWW Projecto Vercial
Costa, Margaret Jull 1994
The relic Sawtry Dedalus, Ltd.

Fonseca, Rubem 1983
A grande arte São Paulo Companhia das Letras
Watson, Ellen 1987
High art London Collins

Gordimer, Nadine 1974
The conservationist London Picador

1992
O conservador Porto Edições Asa

Gordimer, Nadine 1981
Burger's daughter London Penguin

1992
A filha de Burger Porto Edições Asa

Gordimer, Nadine 1992
July's people Bath Chivers Press

1986
A gente de July Lisboa Editorial Teorema

Lodge, David 1984
Small world London Secker & Warburg

1996
O mundo é pequeno Porto Edições Asa

Lodge, David 1992
Paradise news London Penguin
Babo, Carlos Grifo 2000
Notícias do paraíso Lisboa Gradiva Publicações, Lda

Lodge, David 1993
How far can you go? London Secker & Warburg
Cardoso, Helena 1997
Até onde se pode ir? Lisboa Gradiva Publicações, Lda

Machado de Assis, J. 1876
Helena
Caldwell, Helen 1987
Helena Berkeley University of California Press

Machado de Assis, J. 1908
Memorial de Aires WWW Virtual Bookstore, Livros e Literatura
Caldwell, Helen 1972
Counselor Ayres' Memorial Berkeley University of California Press

Melo, Patrícia 1995
O matador São Paulo Companhia das Letras
Landers, Clifford 1997
The killer London Bloomsbury

Rey, Marcos 1986
Memórias de um gigolô São Paulo Ática Editorial
Landers, Clifford 1987
Memoirs of a Gigolo New York Avon Books

Sá-Carneiro, Mário 1956
A grande sombra Lisboa Ática
Costa, Margaret Jull 1996
The great shadow (and other stories) Sawtry Dedalus, Ltd.

Saramago, José 1989
História do cerco de Lisboa Lisboa Editorial Caminho
Pontiero, Giovanni 1996
The History of the siege of Lisbon London Harvill Press

Saramago, José 1995
Ensaio sobre a cegueira Lisboa Editorial Caminho
Pontiero, Giovanni 1997
Blindness London Harvill Press

Sena, Jorge de 1978
Sinais de fogo Lisboa Edições 70, Lda.
Byrne, John 1999
Signs of fire Manchester Carcanet Press Limited

Shelley, Mary 1818
Frankenstein WWW Project Gutenberg


Frankenstein Lisboa Editorial Estampa

Soares, Jô 1995
O xangô de Baker Street São Paulo Companhia das Letras
Landers, Clifford
A samba for Sherlock New York Pantheon Books

Trollope, Joanna 1993
A Spanish lover London Bloomsbury
Bastos, Ana Falcão 1999
O amante espanhol Lisboa Gradiva Publicações, Lda

Trollope, Joanna 1996
The best of friends London Bloomsbury

1996
Os melhores amigos Lisboa Gradiva Publicações, Lda

Wilde, Oscar 1897
De profundis WWW Project Gutenberg

1991
De profundis Lisboa Editorial Estampa

Wilde, Oscar 1890
The picture of Dorian Gray WWW Project Gutenberg

1990
O retrato de Dorian Gray Lisboa Editorial Estampa

Zimler, Richard 1998
The last kabbalist of Lisbon London Arcadia Books, Ltd.
Lima, José 1999
O último cabalista de Lisboa Lisboa Quetzal Editores

Zimler, Richard 2000
The angelic darkness London Arcadia Books, Ltd.
Lima, José 1983
Trevas de luz Lisboa Quetzal Editores

Previous versions



Comments and feedback to compara@informatics.sintef.no