Corpus Search
Special characters: ~u = ũ, ~e = ẽ, ~i = ĩ
Type a query in CQP (Corpus Query Processor) format in the text box above to search the corpus.
Which fields can be searched depends on the corpus, but they typically include:
- word: the written version of the word
- nform: the normalized/standardized version of the word
- lemma: the lemma
- pos: the Part-of-Speech tag
The advanced search page offers several corpus-specific search options.
The CQP query syntax describes each word you are looking for as a set of properties between square brackets, as in for instance:
[lemma="casa"] [pos="A.*"]
which finds any form of the word casa followed by an adjective.
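When queries are generated by a script rather than typed by hand, the bracket notation can be assembled from field/pattern pairs. The sketch below is only an illustration: build_cqp is a hypothetical helper, not part of this site or of CQP, and it assumes that the patterns contain no double quotes. Several conditions on the same word are joined with &, which is standard CQP.

```python
def build_cqp(tokens):
    """Build a CQP query string from a list of {field: pattern} dicts.

    Hypothetical helper: each dict describes one word in the sequence.
    Assumes patterns contain no double quotes (no escaping is done).
    """
    parts = []
    for constraints in tokens:
        # Conditions on the same word are combined with "&".
        conditions = " & ".join(
            f'{field}="{pattern}"' for field, pattern in constraints.items()
        )
        parts.append(f"[{conditions}]")
    # Consecutive words are separated by a space.
    return " ".join(parts)


query = build_cqp([{"lemma": "casa"}, {"pos": "A.*"}])
print(query)  # [lemma="casa"] [pos="A.*"]
```

The resulting string can be pasted into the search box as is.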
Within each query, the values are full regular expressions, which are matched against the whole word. For example, to find words that start with a character other than q followed by the diphthong "au", you can use:
[word="[^q]au.*"]
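CQP matches a regular expression against the entire word, so the pattern above only finds "au" in second position; to find it anywhere in the word, the pattern needs a leading .* as well. The sketch below imitates this behaviour with Python's re.fullmatch on a small, made-up word list. It assumes that Python's regular expressions behave like CQP's for these simple patterns.

```python
import re

# Made-up sample words, for illustration only.
words = ["causa", "pauta", "restaurar", "aula", "casa"]

# CQP anchors the pattern to the whole word, which re.fullmatch imitates.
second_position = [w for w in words if re.fullmatch(r"[^q]au.*", w)]
anywhere = [w for w in words if re.fullmatch(r".*[^q]au.*", w)]

print(second_position)  # ['causa', 'pauta']
print(anywhere)         # ['causa', 'pauta', 'restaurar']
```

The equivalent CQP query for the second pattern would be [word=".*[^q]au.*"].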