Holistic corpus-based dialectology
Keywords:
corpus-based dialectology, holistic approach, corpus-based dialectometry, feature aggregates, multivariate analysis, visualization techniques
Abstract
This article sketches future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability and present a methodology suited to it: corpus-based dialectometry. Specifically, we argue that, to reap the full benefits of corpus methodology, researchers should (i) abandon their exclusive focus on individual linguistic features in favor of studying feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) support the interpretation of empirical results with state-of-the-art visualization techniques. To exemplify this line of analysis, we present a case study that explores aggregate frequency variability in 57 morphosyntactic features across 34 dialects of Great Britain.
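To make the aggregate perspective concrete, the following is a minimal sketch of how a dialect-by-feature frequency table could be turned into a distance matrix and then grouped by cluster analysis. It is not the article's procedure: the frequency values are synthetic random numbers, and the choice of Euclidean distance and average-linkage clustering, along with the names `freqs`, `distance_matrix`, and `average_linkage`, are assumptions made purely for illustration. Only the dimensions (34 dialects, 57 features) come from the abstract.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(rows):
    """Pairwise distances between all rows (one row per dialect)."""
    n = len(rows)
    return [[euclidean(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def average_linkage(dist, k):
    """Naive agglomerative clustering: merge the two clusters with the
    smallest mean pairwise distance until only k clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                mean = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or mean < best[0]:
                    best = (mean, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Synthetic stand-in data (assumption): 34 dialects x 57 feature frequencies.
rng = random.Random(0)
N_DIALECTS, N_FEATURES = 34, 57
freqs = [[rng.random() for _ in range(N_FEATURES)] for _ in range(N_DIALECTS)]

dist = distance_matrix(freqs)       # aggregate over all features at once
groups = average_linkage(dist, 3)   # three clusters, an arbitrary choice
```

Each dialect is compared on all 57 features at once, which is the sense in which the analysis is aggregate rather than feature-by-feature. In practice the resulting distance matrix would also feed multidimensional scaling and map-based visualization, typically via a statistics library rather than hand-written loops.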
Copyright (c) 2012 Revista Brasileira de Linguística Aplicada

This volume is published under the Creative Commons Attribution 4.0 International license.
Authors of articles published in RBLA retain the copyright to their work and license it under Creative Commons Attribution 4.0 (CC BY 4.0), which allows the articles to be reused and distributed without restriction, provided the original work is properly cited.


