University of Macau Portuguese learner corpus and teaching of Portuguese L2

Authors

DOI:

https://doi.org/10.1590/1983-3652.2024.47754

Keywords:

Learner corpus, Chinese learners of Portuguese L2, Quantitative and qualitative analysis, Pedagogical applications

Abstract

This article presents a corpus of Chinese learners of Portuguese L2 with PoS and lemma annotations, highlighting its potential for quantitative and qualitative analysis in identifying linguistic patterns among learners, thus contributing to the teaching of Portuguese L2. The corpus, the University of Macau Portuguese Learners Corpus (UMPLC), contains 933 compositions produced by 122 students of Portuguese at the University of Macau over three consecutive years of study. PoS and lemma annotation was performed using Stanza, an automatic annotator developed by Qi et al. (2020), and the results were manually reviewed to ensure annotation consistency. The PoS and lemma information makes it possible to investigate, both quantitatively and qualitatively, lexical phenomena in the corpus and their diachronic change. Two studies were conducted using a contrastive approach, comparing the learners' Portuguese in the corpus with native Portuguese. Non-native linguistic characteristics were identified, allowing Portuguese L2 teachers to focus on areas requiring corrective work.
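
As a rough illustration of the contrastive approach described in the abstract, the following minimal sketch compares lemma frequencies in a learner sample and a native sample. It is not the authors' procedure: the function names, the per-1,000-token normalisation, and the toy sentences are all assumptions made for the example, and the (token, lemma, UPOS) triples only imitate the kind of output a tagger such as Stanza could provide.

```python
from collections import Counter

def relative_frequencies(annotated_tokens, pos_filter=None):
    """Per-lemma frequency per 1,000 tokens (hypothetical helper).

    annotated_tokens: list of (token, lemma, upos) triples.
    pos_filter: if given, only lemmas with this UPOS tag are counted,
    but frequencies are still normalised by the total token count.
    """
    total = len(annotated_tokens)
    if total == 0:
        return {}
    counts = Counter(
        lemma.lower()
        for _, lemma, upos in annotated_tokens
        if pos_filter is None or upos == pos_filter
    )
    return {lemma: 1000 * n / total for lemma, n in counts.items()}

def overuse_ratio(learner_freq, native_freq, lemma):
    """Learner/native frequency ratio; None if the lemma is absent
    from the native sample. Values above 1 suggest learner overuse."""
    native = native_freq.get(lemma)
    if not native:
        return None
    return learner_freq.get(lemma, 0.0) / native

# Toy data invented for illustration only (not taken from UMPLC).
learner = [
    ("Eu", "eu", "PRON"), ("gosto", "gostar", "VERB"), ("de", "de", "ADP"),
    ("estudar", "estudar", "VERB"), ("e", "e", "CCONJ"),
    ("gosto", "gostar", "VERB"), ("de", "de", "ADP"), ("ler", "ler", "VERB"),
]
native = [
    ("Gosto", "gostar", "VERB"), ("de", "de", "ADP"),
    ("estudar", "estudar", "VERB"), ("e", "e", "CCONJ"),
    ("adoro", "adorar", "VERB"), ("ler", "ler", "VERB"),
    ("romances", "romance", "NOUN"), ("policiais", "policial", "ADJ"),
]

learner_verbs = relative_frequencies(learner, pos_filter="VERB")
native_verbs = relative_frequencies(native, pos_filter="VERB")
ratio = overuse_ratio(learner_verbs, native_verbs, "gostar")
```

In this toy sample the learner text repeats "gostar" where the native text varies its verbs, so the ratio for that lemma is above 1. A real study would need far larger samples and a significance measure before drawing any conclusion.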

References

COBB, Tom. Analyzing Late Interlanguage with Learner Corpora: Québec Replications of Three European Studies. The Canadian Modern Language Review, v. 59, n. 3, p. 393–424, 2003. DOI: 10.3138/cmlr.59.3.393. Available at: https://doi.org/10.3138/cmlr.59.3.393.

DAVIES, Mark; PRETO-BAY, Ana Maria. A frequency dictionary of Portuguese. [S. l.]: Routledge, 2008.

GARSIDE, Roger; LEECH, Geoffrey; MCENERY, Tony. Corpus annotation: linguistic information from computer text corpora. [S. l.]: Routledge, 1997.

GRANGER, Sylviane. The computer learner corpus: a versatile new source of data for SLA research. In: GRANGER, Sylviane (ed.). Learner English on Computer. [S. l.]: Longman, 1998. p. 3–18.

GRANGER, Sylviane. A bird’s-eye view of learner corpus research. In: GRANGER, Sylviane; HUNG, Joseph; PETCH-TYSON, Stephanie (ed.). Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. [S. l.]: Benjamins, 2002. p. 3–33.

GRANGER, Sylviane. Computer Learner Corpus Research: Current Status and Future Prospects. Applied Corpus Linguistics, Brill, p. 123–145, 2004.

GRANGER, Sylviane; GILQUIN, Gaëtanelle; MEUNIER, Fanny. Introduction: learner corpus research – past, present and future. In: The Cambridge Handbook of Learner Corpus Research. Edited by: Sylviane Granger, Gaëtanelle Gilquin and Fanny Meunier. [S. l.]: Cambridge University Press, 2015. p. 1–6. (Cambridge Handbooks in Language and Linguistics). DOI: 10.1017/CBO9781139649414.001.

GRANGER, Sylviane; TRIBBLE, Christopher. Learner corpus data in the foreign language classroom: form-focused instruction and data-driven learning. In: GRANGER, Sylviane (ed.). [S. l.]: Addison Wesley Longman, 1998. p. 199–209.

GROSSO, Maria José; ZHANG, Jing; GASPAR, Catarina; TEIXEIRA, Madalena. Referencial Ensino de Português Língua Estrangeira na China. [S. l.]: Centro Científico e Cultural de Macau ff Universidade de Macau, 2021.

KREYER, Rolf. ‘Multilinguality’ in learner corpora: The case of the MILE. In: NURMI, Arja; RÜTTEN, Tanja; PAHTA, Päivi (ed.). Challenging the Myth of Monolingual Corpora. [S. l.]: Brill, 2017. p. 200–219.

KÜBLER, Sandra; ZINSMEISTER, Heike. Corpus linguistics and linguistically annotated corpora. [S. l.]: Bloomsbury Publishing, 2015.

MARNEFFE, Marie-Catherine de; MANNING, Christopher D.; NIVRE, Joakim; ZEMAN, Daniel. Universal Dependencies. Computational Linguistics, MIT Press, Cambridge, MA, v. 47, n. 2, p. 255–308, Jun. 2021. DOI: 10.1162/coli_a_00402. Available at: https://aclanthology.org/2021.cl-2.11.

MCENERY, Tony; HARDIE, Andrew. Corpus linguistics: Method, theory and practice. [S. l.]: Cambridge University Press, 2011.

NESSELHAUF, Nadja. Learner corpora and their potential for language teaching. How to use corpora in language teaching, v. 12, p. 125–156, 2004.

PAIVA, Valeria de; REAL, Livy. Universal POS tagging for Portuguese: Issues and Opportunities. Proceedings of LexSem+ Logics 2016, p. 25, 2016.

QI, Peng; ZHANG, Yuhao; ZHANG, Yuhui; BOLTON, Jason; MANNING, Christopher D. Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. In: PROCEEDINGS of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. [S. l.: s. n.], 2020. Available at: https://nlp.stanford.edu/pubs/qi2020stanza.pdf.

RADEMAKER, Alexandre; CHALUB, Fabricio; REAL, Livy; FREITAS, Cláudia; BICK, Eckhard; DE PAIVA, Valeria. Universal dependencies for Portuguese. In: PROCEEDINGS of the fourth international conference on dependency linguistics (Depling 2017). [S. l.: s. n.], 2017. p. 197–206.

RUNDELL, Michael. The corpus of the future, and the future of the corpus. In: TALK at a special conference on New Trends in Reference Science at Exeter, UK (a printed hand out). [S. l.: s. n.], 1996.

SANTOS, Isabel Almeida; PEREIRA, Isabel; MARTINS, Cristina; LOPES, Ana Cristina Macário; CARAPINHA, Conceição; SILVA, António. Corpus oral de PL2: um novo recurso para a investigação e ensino. Revista da Associação Portuguesa de Linguística, n. 1, p. 740–760, 2016.

SELINKER, Larry. Rediscovering interlanguage. [S. l.]: Addison Wesley Longman, 1992.

TURTON, Nigel D; HEATON, John Brian. Longman dictionary of common errors. [S. l.]: Longman, 1996.

WOLFE-QUINTERO, Kathryn Elizabeth; INAGAKI, Shunji; KIM, Hae-Young. Second language development in writing: Measures of fluency, accuracy, & complexity. [S. l.]: Second Language Teaching and Curriculum Center of University of Hawai’i, 1998.

YAN, Qiaorong. O desenvolvimento do ensino de Português na China: história, situação atual e novas tendências. In: YAN, Qiaorong; FLEIDE, Daniel Albuquerque (ed.). O ensino do Português na China: parâmetros e perspectivas. [S. l.]: Edufrn, 2019. p. 24–52.

YANG, Huizhong. An Introduction to Corpus Linguistics. [S. l.]: Shanghai Foreign Language Education Press, 2001.

Published

2023-12-20

How to Cite

ZHANG, J.; YOU, M. University of Macau Portuguese learner corpus and teaching of Portuguese L2. Texto Livre, Belo Horizonte-MG, v. 17, p. e47754, 2023. DOI: 10.1590/1983-3652.2024.47754. Available at: https://periodicos.ufmg.br/index.php/textolivre/article/view/47754. Accessed: 23 Nov. 2024.

Issue

Section

Dossier 2024: Linguistic and cultural education mediated by digital technologies