Computerized Mediaeval Corpus of the Galician Language

Team members: 
César Osorio Peláez

The Computerized Mediaeval Corpus of the Galician Language (Tesouro Medieval Informatizado da Lingua Galega, or TMILG) forms part of the Tesouro Medieval Galego-Portugués, which covers the whole mediaeval linguistic corpus of Galicia and Portugal. Within that project it constitutes the Xelmírez Section, the Galician component. It contains mediaeval texts written in Galician Romance, covering practically every published edition and collection of documents in mediaeval Galician: verse, literary prose, historical writings, religious tracts, technical treatises and notarial documents. Because the texts are encoded to a common standard and made available online, they can be consulted freely and yield a large amount of information.

The targets for the coming years are lemmatization and grammatical parsing. This work will be coordinated with the ILG’s corpus of modern and contemporary Galician (TILG) and the other inter-university online corpora of the RILG network, to which the TMILG has been affiliated since 2006.

Execution date: 
1992 to 2018
Funded by: 
Secretaría Xeral de Política Lingüística da Consellería de Cultura, Educación e Ordenación Universitaria (agreement with the USC)