Corpora and Cognitive Linguistics
Keywords:
corpora, cognitive linguistics, metaphor, polysemy, synonymy, prototypes, constructional analysis, statistics, R statistical program
Abstract
Corpora are a natural source of data for cognitive linguists, since corpora, more than any other source of data, reflect "usage" – a notion which is often claimed to be of critical importance to the field of cognitive linguistics. Corpora are relevant to all the main topics of interest in cognitive linguistics: metaphor, polysemy, synonymy, prototypes, and constructional analysis. I consider each of these topics in turn and offer suggestions about which methods of analysis can be profitably used with available corpora to explore these topics further. In addition, I consider how the design and content of currently used corpora need to be rethought if corpora are to provide all the types of usage data that cognitive linguists require.
Copyright (c) 2012 Revista Brasileira de Linguística Aplicada

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors of articles published by RBLA retain the copyright to their work, licensing it under the Creative Commons BY Attribution 4.0 license, which allows articles to be reused and distributed without restriction, provided that the original work is properly cited.


