# Chapter 7 Mixed-effects modeling

Many language (acquisition) studies are based on samples of two random
factors: a sample of participants (subjects) and a sample of language
items (words, sentences, texts). The two random factors are *crossed*,
i.e., each item is presented to each participant — often only once, so
that a subject does not respond to the same item repeatedly in multiple
conditions. The analysis methods shown above
(`aov`

, `lm`

,
`glm`

) all fail to acknowledge this particular
structure in the random part of the design. They include a single random
factor (named `Residual`

) that aggregates all
random effects.

A better method is to use *mixed-effects modeling*,
which may be done in R by using the `lmer`

command. Key
advantages of this method are (a) it allows multiple random
factors, crossed and/or nested, (b) it does not require homogeneity of
variance, (c) it is robust against missing data. Hence mixed-effects
modeling is quickly gaining in popularity
(Quené and Van den Bergh 2004, 2008; Baayen 2008; Hox, Moerbeek, and Van de Schoot 2018).

For mixed-effects modeling, you need to install two add-on packages to
R , named `lme4`

and `languageR`

(Baayen, Davidson, and Bates 2008). For more
information on packages, see Chapter 8 below.
After activation of these packages, we
can simply perform a mixed-effects analysis. First, we read in an
example dataset (Quené and Van den Bergh 2008) in long data layout:

These fictitious responses were provided by 24 subjects, for 36 items, in 3 conditions, with rotation of items over conditions. This rotation may be inspected for a small subset of the data frame:

```
## , , cond = 1
##
## item
## subj 1 2 3 4 5 6
## 1 1 1 1 1 1 1
## 2 0 0 0 0 0 0
## 3 0 0 0 0 0 0
##
## , , cond = 2
##
## item
## subj 1 2 3 4 5 6
## 1 0 0 0 0 0 0
## 2 1 1 1 1 1 1
## 3 0 0 0 0 0 0
##
## , , cond = 3
##
## item
## subj 1 2 3 4 5 6
## 1 0 0 0 0 0 0
## 2 0 0 0 0 0 0
## 3 1 1 1 1 1 1
```

Next, we need to specify that `cond`

is a
categorical factor, and not a continuous predictor. In addition, we
specify the levels of the factor, we specify its contrasts, and indicate
that the second level is the baseline or reference level.

After these preliminary steps we can estimate an appropriate mixed-effects model in a single command. The estimated model is also stored as an object, and a summary is displayed.

```
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: resp ~ 1 + cond + (1 | subj) + (1 | item)
## Data: x24
##
## AIC BIC logLik deviance df.resid
## 2046.8 2075.4 -1017.4 2034.8 858
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1641 -0.6394 -0.0077 0.6416 2.7157
##
## Random effects:
## Groups Name Variance Std.Dev.
## item (Intercept) 0.2579 0.5078
## subj (Intercept) 0.2891 0.5377
## Residual 0.5103 0.7144
## Number of obs: 864, groups: item, 36; subj, 24
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.04569 0.14485 0.315
## cond.A 0.17037 0.05953 2.862
## cond.C -0.23696 0.05953 -3.980
##
## Correlation of Fixed Effects:
## (Intr) cond.A
## cond.A -0.205
## cond.C -0.205 0.500
```

The output correctly shows that there are two unrelated random effects, plus unexplained residual variance. Each response is now modeled as a unique combination of the intercept (mean of baseline condition B), item effect, subject effect, condition effect, and residual. The average response in the baseline condition B is \(0.046\) units. Responses in condition A are 0.170 units higher than baseline, and in condition C they are \(-0.237\) units higher than baseline, i.e. \(0.237\) units lower.

For reasons not discussed here,^{10} the significance levels of the
fixed effects are not reported in the output of
`lmer`

. There are several solutions to obtain
these significance levels.

- The most conservative option (Hox, Moerbeek, and Van de Schoot 2018) is to use the critical \(t\) value
associated with the random effect that has the fewest levels (here
`subj`

), corrected for the number of fixed coefficients including the intercept (here \(3\)). If a fixed effect is significant by this very conservative criterion, then it will also be significant by any other criterion that is less conservative and more liberal.

`## [1] 2.079614`

Comparison of the fixed effects with this critical \(t^*\) = 2.08 shows that both conditions A and C differ significantly from the baseline condition B.

- A second option is to estimate 95% confidence intervals for all coefficients in the resulting mixed-effects model. This can be time-consuming, as the mixed-effects model will be re-fit many times on slightly varying datasets.

`## Computing bootstrap confidence intervals ...`

```
##
## 6 warning(s): Model failed to converge with max|grad| = 0.00267345 (tol = 0.002, component 1) (and others)
```

```
## 2.5 % 97.5 %
## .sig01 0.35950709 0.6747204
## .sig02 0.36333157 0.6844652
## .sigma 0.68054619 0.7474401
## (Intercept) -0.23533348 0.3634542
## cond.A 0.05695928 0.3035280
## cond.C -0.36262503 -0.1261797
```

As the interval for `cond.A`

is entirely
positive, we may conclude with 95% confidence that condition A yields
higher scores than the baseline condition B, and mutatis mutandis that
condition C yields lower scores than condition B.

### References

Baayen, R. H. 2008. *Analyzing Linguistic Data: A Practical Introduction to Statistics Using R*. Cambridge University Press.

Baayen, R. H., D. J. Davidson, and Douglas M. Bates. 2008. “Mixed-Effects Modeling with Crossed Random Effects for Subjects and Items.” *Journal of Memory and Language* 59 (4): 390–412.

Hox, Joop J., Mirjam Moerbeek, and Rens Van de Schoot. 2018. *Multilevel Analysis: Techniques and Applications*. 3rd ed. Routledge.

Quené, Hugo, and Huub Van den Bergh. 2004. “On Multi-Level Modeling of Data from Repeated Measures Designs: A Tutorial.” *Speech Communication* 43 (1–2): 103–21.

Quené, Hugo, and Huub Van den Bergh. 2008. “Examples of Mixed-Effects Modeling with Crossed Random Effects and with Binomial Data.” *Journal of Memory and Language* 59 (4): 413–25.