Introduction

This introduction is aimed at outlining the basics of GAM(M) modeling and applying them to modeling intonational tunes in American English. It draws heavily on previous tutorials and introductions, particularly Sóskuthy (2017). I assume some basic familiarity with linear models. Familiarity with R is a plus for reading this document, as some code snippets are included, but it is not strictly necessary.

How to use this tutorial

If you want to just read this document you will get an introduction to GAM(M)s and an example of their application in intonation research.

If you want to get more hands on, you can access the corresponding data and RMarkdown document to experiment with the code, and fit your own models with the data. You can find this here.

This is intended to be a mostly conceptual introduction -

  • For more in-depth descriptions of GAM(M)s see, e.g. Sóskuthy (2021) and Wood (2017).

  • The document also includes notes for further reading where relevant.

The data

The data we will use comes from an imitative speech paradigm in which speakers listened to model utterances such as “Her name is Marilyn”, with f0 resynthesized to create various intonational tunes. Speakers reproduced what they heard on a new sentence such as “They honored Melanie”, effectively transposing the tune.

We are interested in the intonational tune in the final stress-initial word, like “Marilyn”, which was modeled on 8 different resynthesized f0 contours: all combinations of H and L mono-tonal pitch accents, phrase accents, and boundary tones in MAE ToBI (Beckman & Ayers, 1997). We call this the “Nuclear Tune”. The model tunes, with 5 schematic target heights, are shown below, as well as the data we’ll use in this tutorial, which comes from 5 speakers.

We measured f0 at 30 time-normalized samples across each trisyllabic stress-initial word, and we are interested in examining how f0 varies as a function of which tune participants are imitating. The contours above show average f0 contours for each of 5 speakers per tune, plus the overall mean in each panel.

Note the data includes multiple repetitions of each tune (up to 18 per speaker), and we excluded data which contained likely f0 tracking errors. We’ll model f0 measured as speaker-mean-centered ERB.

  • There are a total of 663 productions (with 30 f0 samples per production)
  • The data frame itself is in “long” format, as in:
## # A tibble: 20,070 × 8
##    speaker  trial_num tune    erb    erbc normtime prop_dur speaker_trial
##    <chr>        <int> <chr> <dbl>   <dbl>    <int>    <dbl> <chr>        
##  1 speaker1         1 HLH    4.16 -0.579         1   0.0333 speaker1_1   
##  2 speaker1         1 HLH    4.05 -0.689         2   0.0667 speaker1_1   
##  3 speaker1         1 HLH    3.92 -0.819         3   0.1    speaker1_1   
##  4 speaker1         1 HLH    3.86 -0.870         4   0.133  speaker1_1   
##  5 speaker1         1 HLH    4.03 -0.705         5   0.167  speaker1_1   
##  6 speaker1         1 HLH    4.31 -0.420         6   0.2    speaker1_1   
##  7 speaker1         1 HLH    4.55 -0.190         7   0.233  speaker1_1   
##  8 speaker1         1 HLH    4.66 -0.0760        8   0.267  speaker1_1   
##  9 speaker1         1 HLH    4.79  0.0515        9   0.3    speaker1_1   
## 10 speaker1         1 HLH    5.02  0.285        10   0.333  speaker1_1   
## # … with 20,060 more rows
## # ℹ Use `print(n = ...)` to see more rows
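
For reference, the speaker-mean-centered value (erbc) is just the raw ERB value (erb) minus each speaker’s mean. A minimal sketch of that computation, not necessarily the exact preprocessing used here, would be:

library(dplyr)
data <- data %>%
  group_by(speaker) %>%
  mutate(erbc = erb - mean(erb)) %>% # center f0 within each speaker
  ungroup()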

Issues with modeling intonation

Let’s say we want to understand how f0 changes over the course of (normalized) time. Let’s focus on two tunes, HLH and LHH, to illustrate. To keep things simple let’s just model the mean contour for each tune. This data has the same structure as the full data set, but is much simplified.

## # A tibble: 60 × 3
##    tune  prop_dur erbc_mean
##    <chr>    <dbl>     <dbl>
##  1 HLH     0.0333   -0.368 
##  2 HLH     0.0667   -0.322 
##  3 HLH     0.1      -0.283 
##  4 HLH     0.133    -0.219 
##  5 HLH     0.167    -0.107 
##  6 HLH     0.2       0.0611
##  7 HLH     0.233     0.269 
##  8 HLH     0.267     0.510 
##  9 HLH     0.3       0.748 
## 10 HLH     0.333     0.928 
## # … with 50 more rows
## # ℹ Use `print(n = ...)` to see more rows

We could capture the relationship between f0 and time with a linear equation as in: \[y_{f0} = \alpha + \beta_{1} x_{normtime}\] If we fit a linear regression to each mean trajectory, we get the following:

HLH_lm<-lm(erbc_mean~prop_dur,data=data_means[data_means$tune=="HLH",])
LHH_lm<-lm(erbc_mean~prop_dur,data=data_means[data_means$tune=="LHH",])

This is not a great way to model the data for several reasons:

  • Conceptually, the non-linearities of the f0 movement are simply not captured by a linear model
    • Especially for the HLH tune, modeling it as simply declining in f0 is missing a lot of what is going on
  • More practically, the way a linear model is fit to trajectories such as these is also problematic, particularly in how it impacts the model residuals, plotted below (along with auto-correlation functions):

As we can see, the residuals from each fit show systematic patterning (we can imagine that with more trajectories the same issue would hold). This is bad news for making statistical inferences, since residuals should not be correlated with one another in linear regression.
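
If you want to check this yourself, the residual autocorrelation can be inspected directly from the lm fits, for example:

plot(resid(HLH_lm), type = "b") # residuals show a systematic, non-random pattern
acf(resid(HLH_lm))              # autocorrelation function of the residuals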

GAM(M)s offer an attractive alternative for capturing non-linear, and particularly time-series data (though we also need to consider autocorrelation in GAM(M)s, discussed below).

  • There are of course alternatives to GAM(M)s that also work for non-linear data, for example fitting various polynomial functions, but these also have shortcomings.

  • Further reading: Clark (2019) and Winter & Wieling (2016) for motivating GAM(M)s and the shortcomings of other alternatives.

GAM(M)s have been applied to different sorts of linguistic research including eye movement data (Zahner et al., 2019), formants in a vowel (Stuart-Smith et al., 2015), and diachronic change (Fruehwald, 2016).

GAM basics

Let’s start by fitting a GAM to the simple HLH trajectory, using the package mgcv. Note that this is not a GAMM, since we don’t have random effects.

library(mgcv)
gam_tp_10<-bam(erbc_mean~s(prop_dur,bs="tp",k=10),
          data=data_means[data_means$tune=="HLH",])

When we plot the GAM fit we get:
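
A quick base-R sketch that produces a similar plot (the figures in this document are made with ggplot2; hlh is just an illustrative name):

hlh <- data_means[data_means$tune == "HLH",]
plot(hlh$prop_dur, hlh$erbc_mean,
     xlab = "prop. word duration", ylab = "centered ERB") # the mean HLH data points
lines(hlh$prop_dur, fitted(gam_tp_10), lwd = 2)           # the GAM's fitted values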

Looks pretty good! Much better than a linear approximation, but what is actually being fit?…

The GAM is predicting speaker-centered ERB (erbc_mean) as a function of a smooth term for normalized time (prop_dur): s(prop_dur,bs="tp",k=10). Two components of this smooth term deserve attention: bs="tp" and k=10.

1. Basis functions

Let’s first focus on the former: bs stands for basis function, which defines a set of functions that are fit by the GAM and summed to create a potentially more complex curve. We’ll walk through this below.

Basis functions come in many different varieties. In this case bs="tp" specifies “thin plate regression splines”, a particular type of spline which in this case is made up of functions of varying complexity and shape.

Let’s take a look: we can extract this information from a GAM(M) fit as follows (see the RMarkdown document for the full code for plotting from this object).

model_matrix_tp_10 <- as.data.frame(predict(gam_tp_10, type = "lpmatrix"))
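
A minimal sketch of one way to plot the basis functions from this object (the figure in this document is built with ggplot2; this base-R version gives the same picture):

matplot(data_means$prop_dur[data_means$tune == "HLH"],
        as.matrix(model_matrix_tp_10)[, -1], # drop the intercept column, keep the 9 basis functions
        type = "l", lty = 1,
        xlab = "prop. word duration", ylab = "basis function value")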

So, how do we get from these various functions to the model fit? The mathematical representation of GAMs is really flexible; it’s often just written as:

\[y_{f0} = \alpha + f(x_{normtime}) + \epsilon\]

Where \(f(x)\) is just “some function of x” (and \(\epsilon\) is a term for error).

What \(f(x)\) is depends on the number and type of basis functions, like those we see above. A GAM is fit by taking each of these basis functions of x, multiplying each function by a coefficient \(\beta\) which is computed by the model, and summing the result, as in:

\[ f(x) = \sum_{i=1}^{d} \gamma_{i}(x) \beta_{i} \] Where \(\gamma_{i}\) is a basis function, and \(d\) is the total number of these functions.

To make this concrete, let’s walk through an example. With the model we just fit, we have 9 basis functions, so the fitted model has 9 coefficients, in addition to the intercept.

gam_tp_10$coefficients
##   (Intercept) s(prop_dur).1 s(prop_dur).2 s(prop_dur).3 s(prop_dur).4 
##   -0.14063349   -0.98701085    1.39983695   -0.79643149   -0.94618014 
## s(prop_dur).5 s(prop_dur).6 s(prop_dur).7 s(prop_dur).8 s(prop_dur).9 
##    0.30137856   -0.08695606    0.19906742    0.09453152    0.09203939

By multiplying each coefficient by its respective basis function we arrive at the following:

These plots come from a data frame in which I extracted the basis functions and intercept from the model, reshaped them into long format (see the RMarkdown for that code), and then multiplied them by the coefficients; for example, for basis function “a”:

tibble(model_matrix_tp_10)
## # A tibble: 300 × 5
##    index   basis prop_dur  coeff coeff_basis
##    <chr>   <dbl>    <dbl>  <dbl>       <dbl>
##  1 a     -0.661    0.0333 -0.987      0.652 
##  2 a     -0.552    0.0667 -0.987      0.545 
##  3 a     -0.443    0.1    -0.987      0.438 
##  4 a     -0.336    0.133  -0.987      0.332 
##  5 a     -0.231    0.167  -0.987      0.228 
##  6 a     -0.130    0.2    -0.987      0.129 
##  7 a     -0.0340   0.233  -0.987      0.0336
##  8 a      0.0560   0.267  -0.987     -0.0553
##  9 a      0.139    0.3    -0.987     -0.137 
## 10 a      0.212    0.333  -0.987     -0.209 
## # … with 290 more rows
## # ℹ Use `print(n = ...)` to see more rows

So, let’s just sum up the basis functions and create a data frame with that summed fit

  • This will be the sum of all the lines (basis functions) in panel C above and the intercept
model_matrix_tp_10 %>% group_by(prop_dur) %>% # group by time
  mutate(fit = sum(coeff_basis)) %>% select(fit, prop_dur) %>% # sum the coefficient-weighted basis functions
  slice(1) -> reconstructed_fit_tp # keep just one observation per time stamp

print(as_tibble(reconstructed_fit_tp))
## # A tibble: 30 × 2
##        fit prop_dur
##      <dbl>    <dbl>
##  1 -0.360    0.0333
##  2 -0.332    0.0667
##  3 -0.289    0.1   
##  4 -0.217    0.133 
##  5 -0.101    0.167 
##  6  0.0638   0.2   
##  7  0.272    0.233 
##  8  0.506    0.267 
##  9  0.738    0.3   
## 10  0.931    0.333 
## # … with 20 more rows
## # ℹ Use `print(n = ...)` to see more rows
ggplot(data_means[data_means$tune=="HLH",], aes(x=prop_dur, y=erbc_mean)) +
  stat_summary(fun=mean, geom="point", color="black", pch=1, size=5) +
  geom_line(data=reconstructed_fit_tp, aes(y=fit, x=prop_dur), color="orchid2", size=2) + # the fit we just made from the basis functions
  ylim(-1.5, 1.5) +
  xlab("prop. word duration") + ylab("centered ERB") + theme(legend.position = "none")

We can see that by summing the lines in panel C and intercept, we have thus arrived at the model’s fit to the data.
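
For the record, the same reconstruction can be done in one step with matrix multiplication, since the fitted values are just the basis matrix times the coefficient vector (manual_fit is an illustrative name):

manual_fit <- predict(gam_tp_10, type = "lpmatrix") %*% coef(gam_tp_10) # basis matrix times coefficients
all.equal(as.numeric(manual_fit), as.numeric(fitted(gam_tp_10)))        # should be TRUE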

A GAM can be fit with many types of basis functions. For comparison, let’s fit a GAM with bs="cr", for “cubic regression” splines:

gam_cr_10<-bam(erbc_mean~s(prop_dur,bs="cr",k=10),
          data=data_means[data_means$tune=="HLH",])

Like before, let’s take a look at the basis functions before and after they are multiplied by the model coefficients:

The cubic regression splines look really different as compared to the thin plate regression splines, but at the end of the day they produce essentially the same fit as shown below:

In my experience, the basis functions one can employ in mgcv don’t drastically change the fit or inferences about the data. The default in the package is bs="tp".
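
If you want to check this numerically for the two models above, you can compare their fitted values directly (a quick sketch):

# largest absolute difference between the thin plate and cubic regression fits
max(abs(fitted(gam_tp_10) - fitted(gam_cr_10)))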

2. Knots

Now that we’ve covered the basis functions, let’s turn to the other parameter we set when fitting the models, namely k=10. k stands for “knots”, which determines the number of basis functions we employ (9 in the examples above). Now let’s look at k=6.

For cubic regression splines, knots define piece-wise cubic functions that contribute to the overall spline, and are more intuitively visible as places where splines converge. For thin plate splines, things are more complicated, and for our purposes we can think of k as just determining the number of basis functions \(d\), where \(d = k-1\). So when specifying k=10 above, we used 9 basis functions.

  • Further reading: Wood (2003) for a math-heavy introduction to thin plate regression splines and discussion of knot construction.
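
To see the \(d = k-1\) relationship directly, you can fit a model with a different k and count its smooth coefficients (a sketch; gam_tp_6 is an illustrative name):

gam_tp_6 <- bam(erbc_mean ~ s(prop_dur, bs = "tp", k = 6),
                data = data_means[data_means$tune == "HLH",])
length(coef(gam_tp_6)) - 1 # 5 smooth coefficients, i.e. k - 1 basis functions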

Let’s look at two types of basis functions, with different numbers of knots, fit to the HLH trajectory. The labels for each row indicate both the basis function (cr/tp) and the knot number.

As is clear from above, the more knots (and basis functions), the “wigglier” a curve can be, allowing us to capture more complicated trajectory shapes.

Another important determinant of the GAM’s fit is the smoothing parameter, which is estimated so as to balance over- and under-fitting of the data. In mgcv this parameter is estimated directly from the data, so you as the modeler don’t have to set it.

  • Further reading: Sóskuthy (2017), Clark (2019) and Wood (2017) for more information on smoothing parameter estimation.

The number of knots imposes an upper limit on the wiggliness of a trajectory, but because of the smoothing parameter, the actual fit will not necessarily become wigglier once an adequate number of knots has been exceeded.

We can see this for tp fits below: note that going from 10 to 30 knots doesn’t really change anything.
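
To reproduce that comparison yourself, fit the same model with k = 30 and compare the estimated degrees of freedom of the two smooths (a sketch; gam_tp_30 is an illustrative name):

gam_tp_30 <- bam(erbc_mean ~ s(prop_dur, bs = "tp", k = 30),
                 data = data_means[data_means$tune == "HLH",])
summary(gam_tp_10)$s.table # edf etc. for the k = 10 fit
summary(gam_tp_30)$s.table # edf should stay similar despite the extra knots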

In practice, setting a very high number of knots is not necessarily a good idea because it is computationally intensive, and can sometimes lead to over-fitting.

There is no hard and fast rule for choosing the right number of knots, and the number you select should be informed by how “wiggly” you expect your data to be.

In my experience 10 knots is a good amount for modeling trajectories of the sort we see in intonational tunes over a multi-syllable word. The package mgcv comes with useful ways to check that the number of knots you are setting is appropriate.

  • This can be done with the gam.check() and k.check() functions in mgcv (a minimal example follows after this list).

  • Further reading: see Sóskuthy (2017) for tips on choosing an appropriate number of knots for your data.
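
For example, a minimal call on the first model we fit above:

gam.check(gam_tp_10) # prints a basis-dimension (k) check for each smooth and draws residual diagnostic plots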

3. Scaling things up

In the preceding toy examples, we fit a single trajectory, that is, one single time series measurement of f0. However, everything we just saw applies when we scale up to modeling, say, all of the LHH contours in the data set, as in:

gam_tp_10_LHH<-bam(erbc~s(prop_dur,bs="tp",k=10),
          data=data[data$tune=="LHH",])

Let’s look at the fit in the same way as with the single-trajectory models: the same principles we examined for a single trajectory apply to the fit to all of the trajectories.

Comparing tunes, random effects, and error models

So far we’ve just fit to a single tune’s trajectory, allowing us to characterize its shape. However, usually, we want to characterize how tunes differ from one another.

The remainder of this tutorial will walk through this, comparing just two tunes: HHH and HHL, the “high rising” tunes.

data %>% filter(tune=="HHH"|tune=="HHL") -> HR

Here you can see all of the data with means for each tune

1. Comparing tune shapes

To begin let’s fit a simple model that includes each tune. You’ll see that the model code is a bit different.

Note that in this model we are fitting separate smooths to each trajectory, and then inspecting how these smooths vary from one another.

  • An alternative, not employed here, is to explicitly fit a trajectory that captures the difference between smooths. The model fit is the same; however, when the difference is fit explicitly, the model coefficients become more readily interpretable (we’ll see below that they are pretty opaque otherwise).

  • See Sóskuthy (2017) for discussion of this, and see the RMarkdown code for an implementation (not shown in the html document); a rough sketch follows this list.
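
For reference, a rough sketch of that alternative parameterization (roughly following Sóskuthy, 2017; the exact implementation is in the RMarkdown document, and gam_HR_diff is just an illustrative name). The ordered factor makes the second smooth model the HHL-minus-HHH difference:

HR$tune.ordered <- as.ordered(HR$tune)          # HHH becomes the reference level
contrasts(HR$tune.ordered) <- "contr.treatment"
gam_HR_diff <- bam(erbc ~ tune.ordered +                                # constant height difference
                     s(prop_dur, bs = "tp", k = 10) +                   # reference smooth (HHH)
                     s(prop_dur, by = tune.ordered, bs = "tp", k = 10), # difference smooth (HHL minus HHH)
                   data = HR)

For now, though, we stick with separate smooths per tune: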

HR$tune<-as.factor(HR$tune) # need to make tune a factor first
gam_HR<- bam(erbc ~ tune +
             s(prop_dur, by=tune, bs="tp",k=10),
             data=HR)

You can see that “tune” shows up in two places, once outside the smooth term s(prop_dur, by=tune, bs="tp",k=10) and once inside it. The first instance of tune is a parametric term, analogous to how we would fit a linear model. This captures the overall difference in height between the trajectories.

The second, s(prop_dur, by=tune, bs="tp",k=10), is a smooth term, fitting separate smooths for each tune.

Let’s look at how each of these comes out in the model summary:

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## erbc ~ tune + s(prop_dur, by = tune, bs = "tp", k = 10)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.73999    0.01020   72.55   <2e-16 ***
## tuneHHL      0.17458    0.01447   12.07   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                       edf Ref.df     F p-value    
## s(prop_dur):tuneHHH 6.127  7.284 925.8  <2e-16 ***
## s(prop_dur):tuneHHL 5.971  7.130 595.8  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =   0.69   Deviance explained = 69.1%
## fREML = 3782.7  Scale est. = 0.2622    n = 5010

The Parametric coefficients portion of the model summary will look familiar if you’ve seen linear model outputs in R. We have an estimated intercept (for the reference level HHH), and a parametric term for HHL compared to the reference level.

The model estimates for the smooth terms are not particularly useful for interpreting the model; they tell us:

  • “estimated degrees of freedom” (edf) and reference degrees of freedom (Ref.df), which vary based on the number of basis functions and the smoothing parameter; see Sóskuthy (2017) for more on this.

  • F and p values, which test the null hypothesis that a smooth term = 0 (i.e. a straight line at 0). We can see that the model is quite sure that we can reject this null hypothesis.

    • Note that if we had fit a trajectory modeling the difference between smooths, these coefficients would be much more informative.

A more intuitive way of inspecting GAM(M)s is to visualize the smooths. This can be done easily using the package itsadug; the output is shown below.
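
The call is roughly as follows (a sketch; the exact plotting options behind the figure may differ):

library(itsadug)
plot_smooth(gam_HR, view = "prop_dur", plot_all = "tune") # one smooth with CIs per tune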

Because each fitted smooth also comes with confidence intervals, we can make some principled comparisons between trajectories.

However before we do this, we are missing one crucial component: random effects.

2. Adding Random effects

There are various ways of specifying random effect structures in GAMMs. Here I will present just one, which is discussed in Sóskuthy (2021): so-called random smooths, which are constructed to be analogous to by-speaker random intercepts and by-speaker random slopes for tune in a linear model.

We’ll add two lines to our model code:

  s(prop_dur,speaker,bs="fs",m=1,k=10)+
  s(prop_dur,speaker, by = tune.ordered, bs="fs",m=1,k=10) 

You can see that we’ve added a reference smooth over time for each speaker, with the basis function "fs", which stands for “factor smooth interaction” and is necessary for specifying random smooths. We also have a new setting in each line, m=1, which adjusts the smoothing penalty applied to these smooths (from the default m=2), as recommended in a couple of places.

  • Note that tune must be recoded as an ordered variable (done in the code below).

  • Further reading: see Sóskuthy (2017) for in-depth discussion of different ways to fit random effects in GAMMs.

  • Further reading: see Sóskuthy (2017) and Baayen et al. (2018) for more on smoothing penalties.

The full model now looks like:

HR$speaker<-as.factor(HR$speaker)
HR$tune.ordered<-as.ordered(HR$tune) # tune needs to become an ordered factor for this to work
contrasts(HR$tune.ordered)<- "contr.treatment" # contrast code - this is just how these types of smooths are coded in mgcv
gam_HR_random<- bam(erbc ~ tune +
             s(prop_dur, by=tune, bs="tp",k=10)+
             s(prop_dur,speaker,bs="fs",m=1,k=10)+
             s(prop_dur,speaker, by = tune.ordered, bs="fs",m=1,k=10),
             data=HR)

Here we can see that the addition of random smooths has added more uncertainty around the estimates.

  • Further reading: Sóskuthy (2017) and Sóskuthy (2021) for descriptions of random effects in GAMMs and random effects more generally.

3. Adding an AR1 error model

Like linear models, GAMMs can suffer from auto-correlated residuals.

To see if the residuals are indeed correlated, we can inspect them using another function from itsadug:

acf_resid(gam_HR_random)

There is indeed considerable auto-correlation. One recommended solution for dealing with this is to use a so-called AR1 error model.

An AR1 model is a specific type of autoregressive error model, that is, a model that takes correlation between errors into account when estimating parameters, and it is a recommended way of dealing with auto-correlation. The 1 in AR1 means that we only assume correlation between immediately adjacent points in the time series. AR1 models are used a lot in GAMM modeling, partly because this is the type of error model that is integrated with mgcv.
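
Concretely, an AR1 process models each residual as a scaled copy of the previous residual plus fresh, uncorrelated noise:

\[\epsilon_{t} = \rho \, \epsilon_{t-1} + \eta_{t}\]

Where \(\rho\) is the autocorrelation parameter (the rho we will estimate and pass to the model below), and \(\eta_{t}\) is uncorrelated noise.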

  • Further reading: see Sóskuthy (2017) and Baayen et al. (2018) for more on autoregressive error models, and discussion of the usefulness of AR1 models for GAMMs.

Setting up an AR1 model entails a couple of things:

  • Ordering rows in the data frame to correspond to the time series
  • Telling the model at what point in time each trajectory begins; this can be set as a TRUE/FALSE variable that is TRUE at the start of each trajectory.
  • Providing an estimate of the correlation between errors; this can be extracted from the previous, non-AR1 version of the model we ran.

The code for implementing this AR1 model is shown below.

HR$start.event <- HR$normtime==1  # set event starts
rho1<-start_value_rho(gam_HR_random) # estimate correlation from model

gam_HR_random_AR1<- bam(erbc ~ tune +
             s(prop_dur, by=tune, bs="tp",k=10)+
             s(prop_dur,speaker,bs="fs",m=1,k=10)+
             s(prop_dur,speaker, by = tune.ordered, bs="fs",m=1,k=10),
             rho=rho1, AR.start = HR$start.event,
             data=HR)

By plotting the residuals like before we can see that the AR1 model does a much better job of dealing with auto-correlated residuals.

acf_resid(gam_HR_random_AR1)

Here we can plot the model fit, and see that it’s quite similar to the non-AR1 variant, though we can feel better knowing we’ve removed correlation among the residuals.

Visual inspection for significance

There are various ways to do significance testing in GAMMs, and the appropriate one may vary depending on your question. These include:

  • Inspecting coefficients that model differences explicitly (not possible with the way we fit our model)
  • Model comparison
  • Visual inspection of fits, difference smooths and CI.

Given that here we are interested in characterizing how the shapes of the contours differ from one another, a straightforward test of significance can come from visual inspection of smooth differences: so-called difference smooths.

  • Note that these can be computed by the model even if we didn’t specifically hard-code a difference smooth term (the benefit of doing that would just be to have more readily interpretable coefficients).

  • Further reading: Sóskuthy (2017) and Sóskuthy (2021) for other significance tests in GAMM models.

This visualization shows the estimated difference between two fits, and the model’s confidence in that estimate. When the difference and CI exclude 0, we can say this represents a significant difference between the two smooths at that location in (normalized) time. These plots can be generated easily with itsadug.
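
A sketch of the call (using the AR1 model from above; rm.ranef = TRUE excludes the random smooths when computing the difference):

plot_diff(gam_HR_random_AR1, view = "prop_dur",
          comp = list(tune = c("HHH", "HHL")), rm.ranef = TRUE)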

As a final note, it’s easy enough to extract model fits and difference smooths for plotting elsewhere, with, for example, ggplot2 (see the RMarkdown version of this document for code), so we could re-plot the fit and difference smooth like so:

Thinking about the results

In this particular example, the results tell us something interesting about these two tunes (for these 5 speakers). We can see that they differ in the scaling of their final f0, where HHH is higher than HHL (in ToBI this would be H*H-H% versus H*H-L%). However we can also see that they vary in the shape of the f0 trajectory in the first portion of the word, around the pitch accent, with HHH showing a “scooped” rise, and HHL with a “domed” rise. This might have something to do with how these contour shapes displace so-called “Tonal Center of Gravity”; see e.g. Barnes et al. (2012) and Barnes et al. (2021).

In this sense, the GAMM has let us say something interesting about how speakers produce a distinction between these two tunes in capturing (non-linear) differences in contour shape.

Further exercises

Here are some things you can do to extend what was covered above, using the RMarkdown code.

  • Fit a new model comparing two different tunes, say LHL and LLH, or two of your choosing, and characterize how they are different from one another.

  • Compare autocorrelation from this new model that you make to one which includes an AR1 error model.

  • Fit a new model with all of the tunes, and then examine pairwise comparisons of interest.

  • Do all of the above but using “difference smooths” hard-coded into the model - see the un-run code following the first high-rising tunes model above, and refer to Sóskuthy (2017) as needed.

References

Baayen, R. H., Rij, J. van, Cat, C. de, & Wood, S. (2018). Autocorrelated errors in experimental data in the language sciences: Some solutions offered by generalized additive mixed models. In Mixed-Effects Regression Models in Linguistics (pp. 49–69). Springer.
Barnes, J., Brugos, A., Veilleux, N., & Shattuck-Hufnagel, S. (2021). On (and Off) Ramps in Intonational Phonology: Rises, Falls, and the Tonal Center of Gravity. Journal of Phonetics, 85, 101020.
Barnes, J., Veilleux, N., Brugos, A., & Shattuck-Hufnagel, S. (2012). Tonal center of gravity: A global approach to tonal implementation in a level-based intonational phonology. Laboratory Phonology, 3(2), 337–383.
Beckman, M. E., & Ayers, G. (1997). Guidelines for ToBI labelling. The OSU Research Foundation, 3.
Clark, M. (2019). Generalized additive models. https://m-clark.github.io/generalized-additive-models/
Fruehwald, J. (2016). The early influence of phonology on a phonetic change. Language, 92(2), 376–410.
Sóskuthy, M. (2017). Generalised additive mixed models for dynamic analysis in linguistics: A practical introduction. arXiv Preprint arXiv:1703.05339.
Sóskuthy, M. (2021). Evaluating generalised additive mixed modelling strategies for dynamic speech analysis. Journal of Phonetics, 84, 101017.
Stuart-Smith, J., Lennon, R., Macdonald, R., Robertson, D., Sóskuthy, M., José, B., & Evers, L. (2015). A dynamic acoustic view of real-time change in word-final liquids in spontaneous Glaswegian. Proceedings of the 18th International Congress of Phonetic Sciences.
Winter, B., & Wieling, M. (2016). How to analyze linguistic change using mixed models, growth curve analysis and generalized additive modeling. Journal of Language Evolution, 1(1), 7–18.
Wood, S. N. (2003). Thin plate regression splines. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(1), 95–114.
Wood, S. N. (2017). Generalized Additive Models: an Introduction with R. Chapman; Hall/CRC.
Zahner, K., Kutscheid, S., & Braun, B. (2019). Alignment of f0 peak in different pitch accent types affects perception of metrical stress. Journal of Phonetics, 74, 75–95.