Spoken Corpora and Pragmatics

Authors

  • Massimo Moneglia University of Florence

Keywords:

oral corpora, pragmatics, annotation, sampling, speech act types, prosody, Language into Act Theory

Abstract

The goal of this paper is to present arguments in favour of two points related to the study of oral corpora and pragmatics: a) at the level of annotation, corpora must ensure the parsing of the speech flow into utterances on the basis of prosodic cues and provide easy access to the acoustic source; b) at the level of sampling, corpora must ensure the maximum representation of context variation, rather than speaker variation. We will present the reasons which support a very basic prosodic annotation of speech (prosodic boundaries) as a means to obtain relevant data from the speech flow. Starting from our present knowledge about the distribution of speech act types in spoken corpora, we will present the reasons why building corpora in accordance with a context variation strategy should expand our knowledge of pragmatics. Additionally, we will claim that prosody is the necessary interface between locutive and illocutive acts and we will show that a deeper prosodic analysis is necessary to grasp unknown speech act types from language usage. Finally, we will briefly sketch the main assumptions of the Language into Act Theory (CRESTI, 2000), which is dedicated to the link between prosody and pragmatics and helps make explicit core aspects of pragmatic knowledge.

References

ABEILLE, A. Treebanks: Building and Using Parsed Corpora. Dordrecht: Kluwer Academic, 2003.

AMIR, N.; SILBER-VAROD, V.; IZRE'EL, S. Characteristics of intonation unit boundaries in spontaneous spoken Hebrew: Perception and acoustic correlates. SProSIG, p. 677-680, 2004.

ANDERSON, A.; BADER, M.; BARD, E.; BOYLE, E.; DOHERTY, G.; GARROD, S.; ISARD, S.; KOWTKO, J.; MCALLISTER, J.; MILLER, J.; SOTILLO, C.; THOMPSON, H.; WEINERT, R. The HCRC map task corpus. Language and Speech, v. 34, p. 351-366, 1991.

AUSTIN, J. L. How to Do Things with Words. Oxford: Oxford University Press, 1962.

BALLY, C. Linguistique Générale et Linguistique Française. Berne: Francke Verlag, 1950.

BAZZANELLA, C. I segnali discorsivi. In: RENZI, L.; SALVI, G.; CARDINALETTI, A. (Ed.). Grande Grammatica Italiana di Consultazione. Bologna: Il Mulino, 1995.

BAZZANELLA, C.; BOSCO, C.; GILI FIVELA, B.; MIECZNIKOWSKI, J.; TINI BRUNOZZI, F. Polifunzionalità dei segnali discorsivi, sviluppo conversazionale e ruolo dei tratti fonetici e fonologici. In: PETTORINO, M.; GIANNINI, A.; VALLONE, M.; SAVY, R. (Ed.). La comunicazione parlata. Napoli: Liguori, 2008. v. II.

BERRUTO, G. Sociolinguistica dell'Italiano Contemporaneo. Roma: La Nuova Italia Scientifica, 1987.

BIBER, D. Variation Across Speech and Writing. Cambridge: Cambridge University Press, 1988.

BIBER, D.; JOHANSSON, S.; LEECH, G.; CONRAD, S.; FINEGAN, E. The Longman Grammar of Spoken and Written English. London / New York: Longman, 1999.

BLANCHE-BENVENISTE, C. Approches de la Langue Parlée en Français. Paris: Ophrys, 1997.

BLANCHE-BENVENISTE, C.; BILGER, M.; ROUGET, Ch.; VAN DEN EYNDE, K.; MERTENS, P. Le Français Parlé: Études Grammaticales. Paris: Éditions du C.N.R.

BLANCHE-BENVENISTE, C. Le recouvrement de la syntaxe et de la macrosyntaxe. In: SCARANO, A. (Ed.). Macro-syntaxe et Pragmatique. L'analyse Linguistique de l'Oral. Roma: Bulzoni, 2003.

BNC http://www.natcorp.ox.ac.uk/

BRAZIL, D. A Grammar of Speech. Oxford: Oxford University Press, 1995.

BOLINGER, D. L. (Ed.). Intonation: Selected readings. Harmondsworth: Penguin, 1972.

BUHMANN, J.; CASPERS, J.; VAN HEUVEN, V.; HOEKSTRA, H.; MARTENS, J-P.; SWERTS, M. Annotation of prominent words, prosodic boundaries and segmental lengthening by non-expert transcribers in the spoken Dutch corpus. In: Proceedings of LREC 2002. Paris: ELRA, 2002. p. 779-785.

CARLETTA, J.; ISARD, A.; ISARD, S.; KOWTKO, J.; DOHERTY-SNEDDON, G.; ANDERSON, A. HCRC dialogue structure coding manual. HCRC/TR 82. Human Communication Research Centre, University of Edinburgh, 1996.

CARLETTA, J.; ISARD, A.; ISARD, S.; KOWTKO, J.; DOHERTY-SNEDDON, G.; ANDERSON, A. The reliability of a dialogue structure coding scheme. Computational Linguistics, v. 23, n. 1, p. 13-31, 1997.

CHAFE, W. Cognitive constraints on information flow. In: TOMLIN, R. (Ed.). Coherence and grounding in discourse. Amsterdam: John Benjamins, 1987.

CHAFE, W. Prosodic and functional units of language. In: EDWARDS, Jane A.; LAMPERT, Martin D. (Ed.). Talking data: Transcription and coding methods for language research. Hillsdale, NJ: Lawrence Erlbaum Associates, 1993.

CHAFE, W. Discourse, consciousness, and time: The flow and displacement of conscious experience in speaking and writing. Chicago / London: The University of Chicago Press, 1994.

CHOMSKY, N. Deep Structure, Surface Structure and Semantic Interpretation. In: STEINBERG, D.; JACOBOVITS, L. (Ed.). Semantics: an Interdisciplinary Reader. Cambridge: Cambridge University Press, 1971.

CRESTI, E. L'articolazione dell'informazione nel parlato. In: AA.VV. Gli Italiani Parlati: Sondaggi sopra la Lingua d'oggi. Firenze: Accademia della Crusca, 1987.

CRESTI, E. Information and intonational patterning in Italian. In: FERGUSON, B.; GEZUNDHAJT, H.; MARTIN, P. (Ed.). Accent, Intonation, et Modèles Phonologiques. Toronto: Editions Mélodie, 1994.

CRESTI, E. Corpus di Italiano Parlato. Firenze: Accademia della Crusca, 2000. v. I-II, CD-ROM.

CRESTI, E.; FIRENZUOLI, V. Illocution and intonational contours in Italian. Revue Française de Linguistique Appliquée, v. IV, n. 2, p. 77-98, 2001.

CRESTI, E.; MONEGLIA, M. C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages. Amsterdam: Benjamins, 2005.

CRESTI, E.; MONEGLIA, M. Informational Patterning Theory and the Corpus based description of Spoken language. The compositionality issue in the Topic Comment pattern. In: MONEGLIA, M.; PANUNZI, A. (Ed.). Bootstrapping Information from Corpora in a Cross Linguistic Perspective. Firenze: FUP, 2010.

CRESTI, E.; MONEGLIA, M.; TUCCI, I. Annotation de l'entretien avec Anita Musso selon la Théorie de la langue en acte. In: LEFEUVRE, F.; MOLINE, E. (Ed.). Unités syntaxiques et Unités prosodiques, Langue Française, 2011.

CRYSTAL, D.; QUIRK, R. Systems of Prosodic and Paralinguistic Features in English. The Hague: Mouton, 1964.

CRYSTAL, D. The English Tone of Voice. London: Edward Arnold, 1975.

DANEŠ, F. Sentence intonation from a functional point of view. Word, v. 16, p. 34-55, 1960.

DE MAURO, T.; MANCINI, F.; VEDOVELLI, M.; VOGHERA, M. Lessico di Frequenza dell'Italiano Parlato. Milano: ETAS, 1993.

DUBOIS, J. W.; CHAFE, W.; MEYER, C.; THOMPSON, S. A. Santa Barbara Corpus of Spoken American English, Part 1. Linguistic Data Consortium, 2000.

FAVA, E. Tipi di atti e tipi di frase. In: RENZI, L.; SALVI, G.; CARDINALETTI, A. (Ed.). Grande Grammatica Italiana di Consultazione. Bologna: Il Mulino, 1995.

FIRENZUOLI, V. Le Forme Intonative di Valore Illocutivo dell'Italiano Parlato: Analisi Sperimentale di un Corpus di Parlato Spontaneo (LABLITA). 2003. PhD (Thesis) Università di Firenze, Firenze.

FISCHER, K. (Ed.). Approaches to discourse particles. Studies in Pragmatics 1. Bingley, UK: Emerald Group Publishing, 2006.

FRASER, B. Towards a Theory of Discourse Markers. In: FISCHER, K. (Ed.). Approaches to discourse particles. Studies in Pragmatics 1. Bingley, UK: Emerald Group Publishing, 2006.

FROSALI, F. Il lessico degli ausili dialogici. In: CRESTI, E. (Ed.). Prospettive nello studio del lessico italiano (Atti del IX Congresso SILFI), Firenze: FUP, 2006.

GADET, F. Variabilité, variation, variété. Journal of French Language Studies, v. 1, p. 75-98, 1996.

GRICE, H. Logic and Conversation. In: COLE, P.; MORGAN, J. L. (Ed.). Syntax and Semantics, 3: Speech Acts. New York: Academic Press, 1975.

HALLIDAY, M.A.K. System and Function in Language: Selected Papers. London: Oxford University Press, 1976.

't HART, J.; COLLIER, R.; COHEN, A. A Perceptual Study of Intonation. An Experimental Approach to Speech Melody. Cambridge: Cambridge University Press, 1990.

HOCKETT, C. F. A Course in Modern Linguistics. New York: The Macmillan Company, 1958.

IMDI http://www.mpi.nl/IMDI/documents/Proposals/IMDI_MetaData_3.0.4.pdf

IZRE'EL, S. Intonation Units and the Structure of Spontaneous Spoken Language: A View from Hebrew. In: Proceedings of the IDP05 on Discourse-Prosody Interfaces, 2005.

IZRE'EL, S.; HARY, B.; RAHAV, G. Designing CoSIH: The corpus of spoken Israeli Hebrew. International Journal of Corpus Linguistics, v. 6, p. 171-197, 2001.

JACKENDOFF, R. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.

JURAFSKY, D.; SHRIBERG, E.; BIASCA, D. Switchboard SWBD-DAMSL Shallow-Discourse-Function-Annotation Coder's Manual, Draft 13. Technical Report TR 97-02. Institute for Cognitive Science, University of Colorado at Boulder, 1997.

KARCEVSKY, S. Sur la phonologie de la phrase. In: Travaux du Cercle linguistique de Prague IV, p. 188-228, 1931.

KEMPSON, R. Semantic Theory. Cambridge: Cambridge University Press, 1977.

LABOV, W. The Social Stratification of English in New York City. Washington D.C.: Center for Applied Linguistics, 1966.

LADD, D. R. The Structure of Intonational Meaning. Bloomington: Indiana University Press, 1980.

LAMBRECHT, K. Information structure and sentence form. Cambridge: Cambridge University Press, 1994.

LEHISTE, I. The phonetic structure of paragraphs. In: COHEN, A.; NOOTEBOOM, S. (Ed.). Structure and Process in Speech Perception. Berlin: Springer-Verlag, 1975.

MATHESIUS, V. La linguistica funzionale. In: SORNICOLA, R.; SVOBODA, A. (Ed.). (1991). Il campo di tensione. La sintassi della scuola di Praga. Napoli: Liguori, 1929.

MILLER, J.; WEINERT, R. Spontaneous Spoken Language. Oxford: Clarendon Press, 1998.

MONEGLIA, M. The C-ORAL-ROM resource. In: CRESTI, E.; MONEGLIA, M. C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages. Amsterdam: Benjamins, 2005.

MONEGLIA, M. Units of Analysis of Spontaneous Speech and Speech Variation in a Cross-linguistic Perspective. In: KAWAGUCHI, Y.; ZAIMA, S.; TAKAGAKI, T. (Ed.). Spoken Language Corpus and Linguistics Informatics. Amsterdam: John Benjamins, 2006.

MONEGLIA, M.; FABBRI, M.; QUAZZA, S.; PANIZZA, A.; DANIELI, M.; GARRIDO, J. M.; SWERTS, M. Evaluation of consensus on the annotation of terminal and non-terminal prosodic breaks in the C-ORAL-ROM corpus. In: CRESTI, E.; MONEGLIA, M. (Ed.). C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages. Amsterdam: John Benjamins, 2005.

MONEGLIA, M.; RASO, T.; MALVESSI-MITTMANN, M.; MELLO, H. Challenging the perceptual relevance of prosodic breaks in multilingual spontaneous speech corpora: C-ORAL-BRASIL / C-ORAL-ROM. In: Speech Prosody 2010, W1.09, Satellite workshop on Prosodic Prominence: Perceptual, Automatic Identification. Chicago. Available at: <http://aune.lpl.univ-aix.fr/~sprosig/sp2010/papers/102010.pdf>

MORENO FERNANDEZ, F. Corpus of spoken Spanish language: The representativeness issue. In: KAWAGUCHI et al. (Ed.). Usage-Based Linguistics Informatics. Amsterdam: John Benjamins, 2005.

NAKATANI, C.; GROSZ, B.; HIRSCHBERG, J. Discourse structure in spoken language: studies on speech corpora. In: Proc. AAAI-95 Spring Symposium on Empirical Methods in Discourse Interpretation and Generation, 1995.

QUIRK, R.; GREENBAUM, S.; LEECH, G.; SVARTVIK, J. A Comprehensive Grammar of the English Language. London / New York: Longman, 1985.

RHAPSODIE Project http://rhapsodie.ilpga.fr/wiki/Chaine_de_traitement

RASO, T.; MELLO, H. Allocutives as discourse markers: a comparative corpus-based study for Italian, Spanish, European Portuguese and Brazilian Portuguese. Proceedings of the 12th International Pragmatics Conference. Manchester, 3-8 July 2011. Forthcoming.

ROSSI, M. L'intonation et l'organisation de l'énoncé. Phonetica, v. 42, p. 135-153, 1985.

ROSSI, M. L'intonation, le Système du Français: Description et Modélisation. Paris: Ophrys, 1999.

SCARANO, A. (Ed.). Macro-syntaxe et Pragmatique. L'analyse Linguistique de l'Oral. Roma: Bulzoni, 2003.

SCHIFFRIN, D. Discourse Markers. Cambridge: Cambridge University Press, 1987.

SCHOURUP, L. Discourse markers. Lingua, v. 107, p. 227-265, 1999.

SEARLE, J. Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press, 1969.

SEARLE, J. Intentionality. An essay in the Philosophy of the Mind. Cambridge: CUP, 1983.

SEARLE, J. Indirect speech acts. In: COLE, P.; MORGAN, J. L. (Ed.). Syntax and Semantics, 3: Speech Acts. New York: Academic Press, 1975.

SHRIBERG, E.; BATES, R.; STOLCKE, A.; TAYLOR, P.; JURAFSKY, D.; RIES, K.; COCCARO, N.; MARTIN, R.; METEER, M.; VAN ESS-DYKEMA, C. Can prosody aid the automatic classification of dialog acts in conversational speech? Language and Speech, v. 41, n. 3-4, p. 443-492, 1998. Special issue on Prosody and Conversation.

SINCLAIR, J. M.; COULTHARD, R. M. Towards an Analysis of Discourse: The English Used by Teachers and Pupils. London: Oxford UP, 1975.

SORNICOLA, R.; SVOBODA, A. Il campo di tensione. Napoli: Liguori, 1989.

STIRLING, L.; FLETCHER, J.; MUSHIN, I.; WALES, R. Representational issues in annotation: Using the Australian map task corpus to relate prosody and discourse structure. Speech Communication, v. 33, p. 113-134, 2001.

SWERTS, M. Prosodic features at discourse boundaries of different strength. J. Acoust. Soc. Amer. v. 101, p. 514-521, 1997.

SWERTS, M.; GELUYKENS, R. The prosody of information units in spontaneous monologues. Phonetica, v. 50, p. 189-196, 1993.

WEIL, H. 1844. De l'ordre des mots dans les langues anciennes comparées aux langues modernes. In: The order of words in the ancient languages compared with that of the modern languages translation, by C.W. SUPER. Amsterdam: Benjamins, 1978.

WINPITCH-PRO http://www.winpitch.com/

YUKI, K.; ABE, K.; LIN, C. Development and assessment of TUFS Dialogue Module-Multilingual and Functional Syllabus. In: KAWAGUCHI et al. Usage-Based Linguistics Informatics. Amsterdam: John Benjamins, 2005.

Published

Feb-Wed-2012

Section

Thematic Dossier – Corpus Studies: Future Directions