Cross-linguistic and language-internal variation in text and speech: focus on the joint analysis of multiple characteristics
When | 09.02.2011, 09:00 to 11.02.2011, 18:00 |
---|---|
Where | Rektorat, Fahnenbergplatz, Senatssaal / FRIAS Seminarraum, Albertstr. 19 |
Name | Dr. Gesa von Essen |
Participants | by registration |
Organized by Benedikt Szmrecsanyi & Bernhard Wälchli
This workshop seeks to bring together typologists, dialectologists and dialectometricians, register analysts, and quantitative linguists to discuss approaches to cross-linguistic and language-internal diversity that
- are based on the study of corpora of texts or speech of different languages, different dialects or different registers (conversation, narratives including retold stories, newspaper prose, parallel texts, etc.) -- not on reference grammar material, questionnaire data, dialect atlases, or elicitation;
- are concerned with the joint (or: aggregate) analysis of multiple characteristics or features. These multiple characteristics may be frequency counts and/or distributional data with low levels of data reduction, but not binary features or discrete features with few types;
- marshal some sort of quantitative analysis technique to see the wood for the trees. Such techniques may involve data mining in the broadest sense, dimension reduction techniques, taxonomy, index calculation, diagrammatic visualization methods (e.g. network diagrams), projections to geography, and so on.
The nature and number of characteristics are not limited in any way (the more the merrier). Functional and formal perspectives on phonetics, phonology, morphology, syntax, and the lexicon are all welcome, provided the features investigated can be extracted from texts or speech with minimal commitments to particular theories of grammar. For relevant case studies in this spirit, see Szmrecsanyi (to appear) or Wälchli (2009).
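As a rough illustration of what joint (aggregate) analysis of multiple frequency-based characteristics can look like, here is a minimal sketch. It is not taken from any of the contributions or from the cited case studies: the toy corpora, the feature list, the per-1,000-token normalization, and the choice of Euclidean distance are all assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Toy "corpora": hypothetical token lists standing in for real texts or transcripts.
CORPORA = {
    "variety_a": "the cat that i saw was not on the mat".split(),
    "variety_b": "the cat which i saw was not on the mat".split(),
    "variety_c": "cat i saw weren't on mat".split(),
}

# Hypothetical features: surface word forms whose text frequency is counted.
FEATURES = ["the", "that", "which", "not"]


def feature_vector(tokens, features, per=1000):
    """Return the frequency of each feature, normalized per `per` tokens."""
    counts = Counter(tokens)
    return [counts[f] * per / len(tokens) for f in features]


def euclidean(u, v):
    """Euclidean distance between two equally long frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(corpora, features):
    """Aggregate all features into one pairwise distance per pair of corpora."""
    vectors = {name: feature_vector(toks, features) for name, toks in corpora.items()}
    return {
        (a, b): euclidean(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


DISTANCES = distance_matrix(CORPORA, FEATURES)

for pair, dist in DISTANCES.items():
    print(pair, round(dist, 1))
```

The resulting pairwise distances are the kind of aggregate object that could then be fed into clustering, dimension reduction, or a projection to geography; any such follow-up step is left out of this sketch.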
The workshop is intended as a platform to discuss appropriate analysis techniques and issues concerning the corpus-cum-aggregation endeavor, as well as its prospects. Owing to the interdisciplinary scope of the workshop, we welcome contributions (i) which have an interdisciplinary focus themselves, and (ii) which emphasize methodological aspects rather than detailed discussion of results. The approaches presented should be applied to a particular set of corpora, and the abstract should spell out the methodology used.
WEDNESDAY FEB 9
(venue: Rektorat, Fahnenbergplatz, Senatssaal)
9:30-9:45 Benedikt Szmrecsanyi (FRIAS) & Bernhard Wälchli (University of Bern)
Welcome
9:45-10:45 Michael Cysouw (LMU Munich)
"Historical reconstruction through parallel corpora"
10:45-11:00 break
11:00-12:00 William Kretzschmar (University of Georgia)
"Complex Systems in Aggregated Variation Analyses"
12:00-13:00 sandwich lunch (FRIAS lounge)
13:00-13:45 Sascha Diwersy (University of Cologne), Stefan Evert (University of Osnabrück) & Stella Neumann (TU Darmstadt/RWTH Aachen)
"A corpus-driven approach to language variation"
13:45-14:30 Bernhard Wälchli (University of Bern)
"Typological features as indices of automatically extracted multiple lexical characteristics, or, an approximation to spectral analysis of morphological complexity in parallel texts"
14:30-14:45 break
14:45-15:45 Wilbert Heeringa & Frans Hinskens (Meertens Institute)
"Dutch dialect change in lexis, morphology and sound components"
15:45-16:15 coffee break (FRIAS lounge)
16:15-17:00 Maria Koptjevskaja-Tamm (Stockholm University) & Magnus Sahlgren (Stockholm University / Swedish Institute of Computer Science)
"Temperature in the Word Space: sense exploration of temperature expressions using word-space modelling"
17:00-17:45 Benedikt Szmrecsanyi (FRIAS)
"Holistic corpus-based dialectology"
THURSDAY FEB 10
(venue: FRIAS, Albertstr. 19, Hörsaal)
10:45-11:30 Ruprecht von Waldenfels (University of Bern)
"Tapping into intra-family variation using a Slavic parallel corpus"
11:30-12:15 Thomas Mayer (University of Konstanz)
"Automatically extracting place features from the distribution of consonants in corpora"
12:15-14:15 Lunch buffet (FRIAS lounge)
14:15-15:15 Karen Corrigan (Newcastle University)
"Data-Mining the DECTE Corpus: Phonological and Morphological Variability in Tyneside English"
15:15-16:00 Annemarie Verkerk (Max Planck Institute for Psycholinguistics)
"Where Alice fell into: Motion events in a parallel corpus"
16:00-16:30 coffee break (FRIAS lounge)
16:30-17:30 Balthasar Bickel (University of Leipzig)
"On the role of language and other genealogical units in explaining typological distributions: a case study on referential density"
FRIDAY FEB 11
(venue: Rektorat, Fahnenbergplatz, Senatssaal)
9:00-10:00 Dirk Geeraerts & Tom Ruette (University of Leuven)
"Lexical Sociolectometry"
10:00-10:45 Douglas Biber (Northern Arizona University)
"Using multi-dimensional analysis to investigate cross-linguistic patterns of register variation"
10:45-11:00 break
11:00-12:00 Peter Grzybek (University of Graz)
"Homogeneity and heterogeneity within language(s): Relevance for intra-lingual and cross-linguistic typologies"
12:00-14:00 lunch break
14:00-15:00 Jack Grieve (University of Leuven)
"A comparison of statistical methods for the aggregation of regional linguistic variation"
15:00-16:00 Östen Dahl (Stockholm University)
"The perfect map: investigating the cross-linguistic distribution of TAME categories in a parallel corpus"
16:00-16:30 coffee break (FRIAS lounge)
16:30-17:30 General discussion
References
Szmrecsanyi, Benedikt (to appear). "Aggregate data analysis in variationist linguistics". Available online at http://www.benszm.net/omnibuslit/Szmrecsanyi_Bamberg_webversion.pdf
Wälchli, Bernhard (2009). "Data reduction typology and the bimodal distribution bias". Linguistic Typology 13.1: 77-94.