From the contact between Literature, Corpus Linguistics and Natural Language Processing

the case of the anagrammatics of Guimarães Rosa




Corpus Linguistics, Natural Language Processing, Guimarães Rosa


Cooperation between Corpus Linguistics and Natural Language Processing (NLP) has yielded important results, such as the ability to process large amounts of linguistic data and to develop language technologies. The relationship between these areas and Literary Studies, however, has received less attention, which leaves room for this study. Its objective is to carry out an exploratory analysis of the poems attributed to the anagrammatics of João Guimarães Rosa in Ave, Palavra (1970). To do so, we combined approaches from Corpus Linguistics and NLP with the works of Rossi (2007), Brito (2012) and Vital (2021) on the Rosian oeuvre. Using computational processing, we extracted the following data from the corpus: a) the number of words; b) the type-token ratio; c) the number of stanzas; d) the most frequent words for each anagrammatic poet. The data were displayed as charts and word clouds. The results show quantitative and qualitative differences between the poets and, together with observations on the epigraphs of each author, reinforce the complexity involved in the metapoeticity of the anagrammatic masks.
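The four measures listed in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `poem_stats`, the tokenization rule (runs of letters, lowercased, with internal hyphens kept) and the assumption that stanzas are separated by blank lines are all hypothetical choices made here for illustration, and the sample text is a placeholder rather than a poem from the corpus.

```python
import re
from collections import Counter


def poem_stats(text, top_n=5):
    """Compute simple corpus measures for one poem (illustrative sketch)."""
    # Assumption: stanzas are separated by one or more blank lines.
    stanzas = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]

    # Assumption: a token is a run of letters (accented letters included),
    # optionally joined by internal hyphens; case is ignored.
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    types = set(tokens)

    # Type-token ratio: distinct word forms divided by running words.
    ttr = len(types) / len(tokens) if tokens else 0.0

    return {
        "n_words": len(tokens),
        "n_types": len(types),
        "ttr": ttr,
        "n_stanzas": len(stanzas),
        "most_frequent": Counter(tokens).most_common(top_n),
    }


# Placeholder text, not taken from Ave, Palavra.
sample = "the river runs\nthe river sings\n\nthe night falls"
stats = poem_stats(sample, top_n=2)
print(stats)
```

Because the type-token ratio is sensitive to text length, comparing it across poets with very different amounts of text would call for some form of normalization, which this sketch does not attempt.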


Átila Augusto Soares Vital, Universidade Federal de Minas Gerais, Faculdade de Letras, Belo Horizonte, MG, Brasil

Undergraduate student in the Bachelor's program in Theoretical and Descriptive Linguistics at UFMG. Member of the Laboratório de Estudos Empíricos e Experimentais da Linguagem (LEEL) at the Faculdade de Letras (FALE/UFMG). Works on the information structure of spontaneous speech using Corpus Linguistics and Computational Linguistics.


