To make it usable, we first had to solve two problems: removing duplicate texts and reconciling the multiple author identifiers used in the different corpora.
We therefore produced a list of unique authors or, more precisely, a way to assign a unique identifier to every author description found in our corpora.
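As an illustration only, the idea of giving one identifier to several author descriptions can be sketched as follows. This is a minimal sketch, not the procedure actually used for Literateca: the normalization rule (lowercasing and removing accents), the identifier format `A0001`, the function names and the sample data are all assumptions made for the example.

```python
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace (assumed matching rule)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def build_author_ids(descriptions):
    """Map each (corpus, author description) pair to a shared identifier.

    Descriptions whose normalized form is identical receive the same
    identifier. The 'A0001' format is hypothetical.
    """
    ids = {}      # normalized name -> identifier
    mapping = {}  # (corpus, original description) -> identifier
    for corpus, description in descriptions:
        key = normalize(description)
        if key not in ids:
            ids[key] = f"A{len(ids) + 1:04d}"
        mapping[(corpus, description)] = ids[key]
    return mapping


# Hypothetical sample data, not taken from the actual corpora.
sample = [
    ("Vercial", "Eça de Queirós"),
    ("OBras", "ECA DE QUEIROS"),
    ("Colonia", "Machado de Assis"),
]
author_ids = build_author_ids(sample)
```

In this sketch the first two descriptions end up with the same identifier because they differ only in case and accents. Real author descriptions vary in more ways (abbreviations, pseudonyms, word order), so such a rule could only be a first pass before manual checking.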
These identifiers are not authoritative. We believe that defining authors is a prerogative of the national libraries of the Lusophone countries. The identifiers are purely descriptive, so that our users know what they refer to in our corpora. So, in line with Linguateca's philosophy in all areas of activity, this is just documentation of the author identifiers used in Literateca. We would nonetheless be grateful for reports of any flaws or inaccuracies that users may detect.
Basically, for each work, we measured some features that could be used in exploratory studies and for visualization of the material.
Chronological description of the material:
Colour-coded by corpus: Vercial (rose), OBras (green), NOBRE (red), Tycho Brahe (light blue), Colonia (black) and PANTERA (dark blue), which was the (initial) order in which we included the works.
Until then, we are willing to produce any figures one may require from Literateca, which will also help us understand user requirements. Some examples:
Last update: 30 March 2019. Contact Linguateca's team of corpus-based grammar.