Corpus Search
In CORTEGAL, it is possible to carry out text searches, document searches, or combine both options to filter the results.
To search for a word in the corpus, type it into the CQL query box or into the “Student’s final version” field. Searches can also be run across the different standardization layers: by the codes that identify deviations from the standard, the codes that indicate the source of these deviations, the POS of the word written by the student, the POS of the standard form, the lemma corresponding to the student’s word (“original lemma”), and the standard lemma. In the "Multiword annotations" layer, you can search for discourse connectors, which are classified according to their function, and for certain other deviation codes that apply to or may affect sequences of words. The values of the deviation codes, their sources, the connector types with examples, and the EAGLES tags used for POS are listed in the document EAGLES codes and labels used in the annotation of the texts (only in Galician).
In “Document search,” texts can be filtered by file name, topic, exam date, or delegated commission (see Teaching centers associated with the delegated commissions of the ABAU tests (course 2016-2017) for a list of the schools assigned to each commission), as well as by quantitative criteria (word count, number of lemmas, lexical density, number of sentences, etc.). Lexical density is calculated by dividing the number of lemmas by the number of words and always ranges between 0 and 1. However, since the search engine cannot process decimals, the figures have been multiplied by 100, so the range becomes 0 to 100 (for example, a lexical density of 0.54 appears as 54).
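The lexical density arithmetic can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of CORTEGAL. The sketch also assumes that the scaled figure is rounded to the nearest integer, which the description above does not specify.

```python
def lexical_density(num_lemmas: int, num_words: int) -> float:
    """Return the number of lemmas divided by the number of words (0 to 1)."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return num_lemmas / num_words


def search_value(density: float) -> int:
    """Scale a 0-1 density to the 0-100 integer range used in the search form.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    return round(density * 100)


# A text with 200 words and 108 lemmas has a lexical density of 0.54,
# which would be entered in the search form as 54.
density = lexical_density(108, 200)
print(density, search_value(density))
```

So, to find texts with a lexical density of at least 0.5, you would enter 50 as the lower bound rather than 0.5.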
Finally, more complex searches can be performed through the CQL query box, which also supports wildcards and logical operators. For more information on the search system, consult the Help section.
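As a rough illustration of what such queries can look like, the following Python sketch assembles CQL strings. It assumes CWB-style CQL syntax, in which each token is written in square brackets, attribute values are regular expressions, and conditions are joined with `&`. The attribute names `word`, `lemma`, and `pos`, the example values, and the helper `token` are hypothetical; the attributes actually available in CORTEGAL are documented in the Help section.

```python
def token(**conditions: str) -> str:
    """Build one CQL token expression such as [lemma="ser" & pos="V.*"].

    Each keyword argument is an attribute name (hypothetical here) paired
    with a regular expression that the attribute value must match.
    """
    parts = [f'{attr}="{pattern}"' for attr, pattern in conditions.items()]
    return "[" + " & ".join(parts) + "]"


# Wildcard: any word form starting with "cas" (".*" matches any characters).
print(token(word="cas.*"))

# Logical AND: a given lemma restricted to tags beginning with "V".
print(token(lemma="ser", pos="V.*"))

# Sequence: two token expressions separated by a space match adjacent tokens.
print(token(pos="D.*") + " " + token(pos="N.*"))
```

The printed strings are what you would paste into the CQL query box after replacing the attribute names with the ones the corpus actually uses.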