Semantic change and word embeddings

case studies on the diachrony of Portuguese

Authors

  • Lucas Fonseca Lage, Universität des Saarlandes
  • Evandro Landulfo Teixeira Paradela Cunha, Universidade Federal de Minas Gerais

DOI:

https://doi.org/10.17851/2237-2083.30.4.2043-2086

Keywords:

Computational Linguistics, Diachronic Studies, Natural Language Processing, Linguistic Change, Word Embeddings

Abstract

According to Givón (2001), the lexicon is a repository of concepts that are relatively stable over time, socially shared and well encoded. These concepts are organized in a network in which similar concepts are grouped close to one another. On a similar note, the lexicographer Georges Matoré proposes associative relationships between words and defines the concepts of notional field and testimonial words, which are organizational elements of the lexicon. Computational techniques such as word embeddings, which represent words as vectors in a vector space, make it possible to analyze groupings of words based on their semantic features. This paper explores the viability of such methods for the study of semantic change. Occurrences of the word forms “deus”, “homem”, “mulher”, “pai”, “mae” and “terra” were analyzed in the Tycho Brahe corpus of Portuguese. Word embeddings were trained with the Skip-gram algorithm, and a visualization of the semantic feature network was created for each word in three different time slices. The generated visualizations provide evidence of the semantic organization of the lexicon and of its reorganization over time.
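As a rough illustration of the kind of comparison described above, the following minimal sketch ranks the nearest neighbors of a target word by cosine similarity within each time slice. It is not the authors' pipeline: the two-dimensional vectors, the slice names and the neighbor labels are invented placeholders, and in practice the vectors would come from a Skip-gram model trained separately on each slice of the corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(target, vectors, k=2):
    """Return the k words most similar to `target` within one time slice."""
    scores = [
        (word, cosine(vectors[target], vec))
        for word, vec in vectors.items()
        if word != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scores[:k]]


# Hypothetical toy vectors, NOT taken from the Tycho Brahe corpus.
# Each slice stands for an embedding space trained on one period.
slices = {
    "slice_1": {
        "terra": [1.0, 0.0],
        "neighbor_a": [0.9, 0.1],
        "neighbor_b": [0.1, 0.9],
        "neighbor_c": [0.5, 0.5],
    },
    "slice_2": {
        "terra": [0.0, 1.0],
        "neighbor_a": [0.9, 0.1],
        "neighbor_b": [0.1, 0.9],
        "neighbor_c": [0.5, 0.5],
    },
}

for name, vectors in slices.items():
    print(name, nearest_neighbors("terra", vectors))
```

Comparing neighbor lists within each slice, rather than comparing raw vectors across slices, sidesteps the fact that independently trained embedding spaces are not directly aligned with one another.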

Published

2024-10-06

How to Cite

Semantic change and word embeddings: case studies on the diachrony of Portuguese. Revista de Estudos da Linguagem, [S. l.], v. 30, n. 4, p. 2043–2086, 2024. DOI: 10.17851/2237-2083.30.4.2043-2086. Available at: https://periodicos.ufmg.br/index.php/relin/article/view/54705. Accessed: 26 Dec. 2024.