Sujeito oculto às claras: uma abordagem descritivo-computacional

Cláudia Freitas; Elvis de Souza

doi:10.17851/2237-2083.29.2.1033-1058

Omitted subjects revealed: a quantitative-descriptive approach

Authors

Cláudia Freitas, Pontifícia Universidade Católica do Rio de Janeiro
Elvis de Souza, Pontifícia Universidade Católica do Rio de Janeiro

DOI:

https://doi.org/10.17851/2237-2083.29.2.1033-1058

Keywords:

linguistic description, omitted subject, syntactic dependencies, computational linguistics, machine learning, corpus linguistics

Abstract

In this paper, we present descriptive and computational studies of omitted subjects. First, we carry out a quantitative descriptive study based on three corpora covering journalistic, literary and encyclopedic genres. Specifically, we quantify the sentences with omitted subjects in each corpus: they account for 24%, 41% and 46% of the sentences, respectively. Second, using rule-based strategies, we reconstitute those subjects and reinsert them into the corpora in order to evaluate how much subject omission affects the automatic learning of syntactic dependencies. The results indicate that formal subject reconstitution can improve the learning of syntactic dependencies by up to 2% according to the CLAS metric, which highlights the relevance of linguistic modeling to the automatic learning process.
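
The counting step described in the abstract can be illustrated with a minimal sketch. It assumes the corpora are annotated in the CoNLL-U format with Universal Dependencies labels, and it uses a deliberately crude heuristic: a sentence is flagged when some finite verb has no `nsubj` or `csubj` dependent. The function names (`parse_conllu`, `has_omitted_subject`, `omitted_subject_rate`) and the toy sentences are hypothetical, and the heuristic is not the authors' actual criterion; for instance, it would also flag impersonal verbs and ignores periphrastic verb forms.

```python
# Assumption: subjects are marked with the UD relations nsubj/csubj (and subtypes).
SUBJECT_RELATIONS = {"nsubj", "csubj"}


def parse_conllu(text):
    """Return a list of sentences; each sentence is a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges ("1-2") and empty nodes ("1.1")
        current.append({
            "id": int(cols[0]),
            "form": cols[1],
            "upos": cols[3],
            "feats": cols[5],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    if current:
        sentences.append(current)
    return sentences


def has_omitted_subject(sentence):
    """True if some finite VERB token has no subject dependent (crude heuristic)."""
    heads_with_subject = {
        tok["head"] for tok in sentence
        if tok["deprel"].split(":")[0] in SUBJECT_RELATIONS
    }
    return any(
        tok["upos"] == "VERB"
        and "VerbForm=Fin" in tok["feats"].split("|")
        and tok["id"] not in heads_with_subject
        for tok in sentence
    )


def omitted_subject_rate(sentences):
    """Fraction of sentences flagged as containing an omitted subject."""
    if not sentences:
        return 0.0
    return sum(has_omitted_subject(s) for s in sentences) / len(sentences)


def row(idx, form, upos, feats, head, deprel):
    """Build one 10-column CoNLL-U line (unused columns set to '_')."""
    return "\t".join(
        [str(idx), form, "_", upos, "_", feats, str(head), deprel, "_", "_"]
    )


# Toy data, invented for illustration only.
SAMPLE = "\n".join([
    "# text = Ela chegou.",
    row(1, "Ela", "PRON", "_", 2, "nsubj"),
    row(2, "chegou", "VERB", "VerbForm=Fin", 0, "root"),
    row(3, ".", "PUNCT", "_", 2, "punct"),
    "",
    "# text = Chegou cedo.",
    row(1, "Chegou", "VERB", "VerbForm=Fin", 0, "root"),
    row(2, "cedo", "ADV", "_", 1, "advmod"),
    row(3, ".", "PUNCT", "_", 1, "punct"),
])

sentences = parse_conllu(SAMPLE)
print(omitted_subject_rate(sentences))  # prints 0.5
```

On the two toy sentences, only "Chegou cedo." lacks an overt subject, so the rate is one in two. Applying the same function to a full treebank would give a per-corpus proportion comparable in kind, though not necessarily in value, to the percentages reported above.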

Published

2024-10-06

Issue

Vol. 29 No. 2 (2021): Linguística de Corpus: Conquistas e Desafios

Section

Corpus Linguistics: Achievements and Challenges

How to Cite

FREITAS, Cláudia; SOUZA, Elvis de. Omitted subjects revealed: a quantitative-descriptive approach. Revista de Estudos da Linguagem, [S. l.], v. 29, n. 2, p. 1033–1058, 2024. DOI: 10.17851/2237-2083.29.2.1033-1058. Available at: https://periodicos.ufmg.br/index.php/relin/article/view/54371. Accessed: 25 Jul. 2026.
