Data and Methodology
We first describe the data and methodology of our investigation: the corpus, the dependent and independent variables, and the statistical models used to analyse the data.
The Panel Corpus
Twenty native speakers of Swabian were recorded in the communities of Stuttgart and Schwäbisch Gmünd in 1982 and again in 2017. Stuttgart is a large urban metropolis with over one million inhabitants, and Schwäbisch Gmünd is a typical mid-sized, semi-rural German town of around 60,000 inhabitants. The comparison of these two localities thus provides the opportunity to investigate sound change from both an urban and a semi-rural perspective. The data were collected following a Labovian-style, semi-structured sociolinguistic interview (Labov 1984) covering topics about the speakers’ childhood, hobbies, friends, and family, knowledge of Swabian customs and icons, and participation in local cultural activities. Interviews were conducted in the speakers’ homes, typically over coffee and cake, with the goal of creating a casual interview situation. Interviewers in 1982 and 2017 were matched for social characteristics (e.g., age, gender, education) to make the interview situation as similar as possible across the two recording periods.
The speakers comprise three women and four men from Stuttgart and six women and seven men from Schwäbisch Gmünd. The majority are of similar socioeconomic status (middle class) and in the same age group (18-25 years old in 1982 and 53-60 years old in 2017). Four speakers were in their early 50s in 1982 and hence in their late 80s in 2017.
In 1982, the 20 interviews comprised 17.9 hours (1075 minutes), 18,430 words (tokens), and 3158 types (unique words). In 2017, the 20 interviews comprised 24.2 hours (1451 minutes), 21,553 tokens, and 3877 types. The number of tokens per speaker or word was not capped; hence, the dataset is not balanced for phonological context or word type.
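The distinction between tokens and types, and the conversion of minutes to hours, can be illustrated with a minimal Python sketch. It is not the procedure used to produce the figures above: the function name `corpus_stats`, the whitespace tokenisation, the case-folding of word forms, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter


def corpus_stats(tokens, minutes):
    """Summarise a transcript: running words, unique words, duration in hours.

    Assumption: types are counted over lower-cased word forms.
    """
    counts = Counter(token.lower() for token in tokens)
    return {
        "tokens": sum(counts.values()),   # every running word
        "types": len(counts),             # unique word forms
        "hours": round(minutes / 60, 1),  # duration rounded to one decimal
    }


# Hypothetical toy transcript, split on whitespace (not corpus data).
sample = "i han des net gwisst i han".split()
stats = corpus_stats(sample, minutes=1075)
print(stats)
```

In the toy sentence, the repeated forms count twice as tokens but only once as types, which is why the type count of a corpus is always at most its token count.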
Transcriptions were completed in ELAN (Wittenburg et al. 2006) by native German speakers, students at the University of Tübingen, and extracted into PRAAT 4.0 (Boersma and Weenink 2015), with signals digitised at a sampling rate of 4.4 kHz and a low-pass filter at 2.2 kHz. The audio files were aligned with the orthographic transcription using the Hidden-Markov-Model-based Forced Aligner (Rapp 1995), and the segment boundaries of each item of interest were manually corrected. Word types with [ai] at the onset were excluded, as onset positions in German are frequently articulated with creaky voice, an allophone of glottal stops, which renders the extraction of vowel formants impossible (Pompino-Marschall and Zygis 2010). Since our aim is to evaluate the loss of phonetic distinction between [ai] and [ai] in contemporary Swabian, tokens of [ai] were also excluded.