Chapter 5 Testing hypotheses
5.1 formula
When testing hypotheses, and building regression models, we need to specify the relations between variables. This is done in R by means of a formula, which is needed in many statistical functions. In general, such a formula consists of a response variable, followed by the tilde symbol ~, followed by a list of independent variables and/or factors (Wilkinson and Rogers 1973). In this list, the colon : indicates an interaction effect (instead of the sequence operator), and the asterisk * is shorthand for main effects plus interactions (instead of the multiplication operator). By default, the intercept ~1 is included in the formula, unless suppressed explicitly (-1). We have already encountered such a formula in the boxplot example above.
y ~ x1 + x2 # only main effects
y ~ x1 * x2 # shorthand for x1 + x2 + (x1:x2)
Consult the help files (e.g. ?formula) for further information on how to specify complex models.
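To see how R expands the operators described above, we can inspect a formula with terms(). The sketch below is only an illustration: y, x1 and x2 are placeholder names, and no data are needed because the formula is examined symbolically.

```r
# Inspect how R expands a model formula; y, x1 and x2 are placeholder names.
f_star  <- y ~ x1 * x2           # main effects plus interaction
f_plus  <- y ~ x1 + x2 + x1:x2   # the same model, written out in full
f_noint <- y ~ x1 + x2 - 1       # intercept suppressed with -1

# Term labels after expansion of the * shorthand
labels_star <- attr(terms(f_star), "term.labels")
labels_plus <- attr(terms(f_plus), "term.labels")
print(labels_star)

# 1 means the intercept is included, 0 means it is suppressed
has_intercept <- c(star  = attr(terms(f_star),  "intercept"),
                   noint = attr(terms(f_noint), "intercept"))
print(has_intercept)
```

Both ways of writing the model yield the same term labels, which is why the shorthand with * is usually preferred.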
5.2 \(t\) test
There are three ways to use the \(t\) test.
In a one-sample \(t\) test, the sample mean is compared against an expected mean, which is specified with the argument mu:
t.test( x1, mu=0.80 )
##
## One Sample t-test
##
## data: x1
## t = 1.8691, df = 99, p-value = 0.06456
## alternative hypothesis: true mean is not equal to 0.8
## 95 percent confidence interval:
## 0.7878051 1.2082871
## sample estimates:
## mean of x
## 0.9980461
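The \(t\) statistic of a one-sample test is the difference between the sample mean and mu, divided by the standard error of the mean, with \(n-1\) degrees of freedom. As a minimal sketch, the code below recomputes this by hand and compares it with t.test. The data are simulated here as a stand-in (an assumption; they are not the x1 used above), so the numbers will differ from the output shown.

```r
# Recompute a one-sample t test by hand, on simulated stand-in data.
set.seed(42)                          # arbitrary seed, for reproducibility only
x  <- rnorm(100, mean = 1, sd = 1)    # hypothetical sample, not the x1 above
mu <- 0.80                            # expected mean under H0
n  <- length(x)

# t = (sample mean - mu) / standard error of the mean
t_manual <- (mean(x) - mu) / (sd(x) / sqrt(n))
# two-sided p value from the t distribution with n-1 degrees of freedom
p_manual <- 2 * pt(abs(t_manual), df = n - 1, lower.tail = FALSE)

res <- t.test(x, mu = mu)
print(c(manual = t_manual, t.test = unname(res$statistic)))
print(c(manual = p_manual, t.test = res$p.value))
```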
In a two-sample test with independent observations, we compare the same dependent variable in two groups of sampling units; these groups are defined by an independent variable.
t.test( y[x1<median(x1)], y[x1>median(x1)] ) # groups by median split of x1
##
## Welch Two Sample t-test
##
## data: y[x1 < median(x1)] and y[x1 > median(x1)]
## t = -3.3044, df = 94.766, p-value = 0.001345
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.0476023 -0.2612324
## sample estimates:
## mean of x mean of y
## 2.749787 3.404204
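This could also be achieved by specifying the dependent and independent variables in a formula. The groups are then ordered FALSE, TRUE, so the signs of \(t\) and of the confidence interval are reversed, but the test is otherwise identical: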
t.test( y ~ (x1<median(x1)) ) # equivalent
##
## Welch Two Sample t-test
##
## data: y by x1 < median(x1)
## t = 3.3044, df = 94.766, p-value = 0.001345
## alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
## 95 percent confidence interval:
## 0.2612324 1.0476023
## sample estimates:
## mean in group FALSE mean in group TRUE
## 3.404204 2.749787
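The fractional degrees of freedom in the output above arise because t.test by default performs the Welch test, which does not assume equal variances in the two groups and approximates the degrees of freedom with the Welch-Satterthwaite formula. The sketch below illustrates this with simulated groups (hypothetical values, not the y and x1 used above) and contrasts it with the pooled-variance test requested by var.equal=TRUE.

```r
# Welch degrees of freedom by hand, on two simulated groups.
set.seed(42)                             # arbitrary seed
g1 <- rnorm(50, mean = 2.7, sd = 1.0)    # hypothetical group 1
g2 <- rnorm(50, mean = 3.4, sd = 1.2)    # hypothetical group 2

# squared standard errors of the two group means
v1 <- var(g1) / length(g1)
v2 <- var(g2) / length(g2)

# Welch-Satterthwaite approximation of the degrees of freedom
df_welch <- (v1 + v2)^2 /
  (v1^2 / (length(g1) - 1) + v2^2 / (length(g2) - 1))

res_welch  <- t.test(g1, g2)                    # default: Welch test
res_pooled <- t.test(g1, g2, var.equal = TRUE)  # classical pooled-variance test

print(c(manual = df_welch, welch = unname(res_welch$parameter),
        pooled = unname(res_pooled$parameter)))
```

With equal variances assumed, the degrees of freedom are simply the total number of observations minus two.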
In a two-sample test with paired observations, we compare the same construct, but observed under two conditions, which were “paired” within the same sampling units. The two observations are typically stored in two different variables.
t.test( x1, x2, paired=TRUE )
##
## Paired t-test
##
## data: x1 and x2
## t = -14.581, df = 99, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.1074340 -0.8421383
## sample estimates:
## mean of the differences
## -0.9747861
The small \(p\)-value reported here, p-value < 2.2e-16, is scientific notation for \(p < 2.2 \times 10^{-16}\).
Note that the number of sampling units (e.g. participants) and of observations varies in these three \(t\) tests, yielding different degrees of freedom.
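A paired \(t\) test is equivalent to a one-sample \(t\) test on the within-unit differences, tested against zero; this is why its degrees of freedom equal the number of sampling units minus one. The following sketch checks this equivalence on simulated data (the variables cond1 and cond2 are hypothetical and not the x1 and x2 used above).

```r
# A paired t test equals a one-sample t test on the differences.
set.seed(42)                                         # arbitrary seed
cond1 <- rnorm(100, mean = 1, sd = 1)                # hypothetical condition 1
cond2 <- cond1 + rnorm(100, mean = 1, sd = 0.7)      # same units, condition 2

res_paired <- t.test(cond1, cond2, paired = TRUE)
res_diff   <- t.test(cond1 - cond2, mu = 0)          # one-sample test on differences

print(c(paired = unname(res_paired$statistic),
        diff   = unname(res_diff$statistic)))
print(c(paired = unname(res_paired$parameter),
        diff   = unname(res_diff$parameter)))
```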
5.3 chisq.test
First, let us create two categorical variables, derived from a speaker’s age (in years) and average phraselength (in syllables), for 80 speakers in the Corpus of Spoken Dutch (talkers data set; Quené 2014). Categorical variables are created here with the cut function, to create breaks=2 categories of age (young and old) and of phraselength (short and long).
require(hqmisc)
data(talkers)
age.cat <- cut( talkers$age, breaks=2 )
phraselength.cat <- cut( talkers$nsyl, breaks=2 )
The hypothesis under study is that older speakers tend to produce
shorter phrases. This hypothesis may be tested with a \(\chi^2\) (chi
square) test.
table( age.cat, phraselength.cat ) # show 2x2 table
## phraselength.cat
## age.cat (4.44,9] (9,13.6]
## (21,40] 28 12
## (40,59] 32 8
chisq.test( age.cat, phraselength.cat )
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: age.cat and phraselength.cat
## X-squared = 0.6, df = 1, p-value = 0.4386
The data in the table show that, as hypothesized, the odds of older talkers producing short phrases (\(32/8\) or \(4.0:1\)) are indeed higher than the odds of younger talkers producing short phrases (\(28/12\) or \(2.3:1\)). The effect is far from significant, however, and \(H_0\) is not rejected.
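The odds and the test statistic can also be computed directly from the four cell counts in the table above, without loading the talkers data. This is a minimal sketch: the labels young, old, short and long are our own names for the intervals produced by cut.

```r
# Rebuild the 2x2 table from the counts reported above.
counts <- matrix(c(28, 32, 12, 8), nrow = 2,
                 dimnames = list(age = c("young", "old"),
                                 phraselength = c("short", "long")))

# odds of producing short phrases, per age group
odds <- counts[, "short"] / counts[, "long"]
odds_ratio <- odds[["old"]] / odds[["young"]]
print(odds)
print(odds_ratio)

# chi-square test on the table of counts
res        <- chisq.test(counts)                   # with Yates' correction (default for 2x2)
res_nocorr <- chisq.test(counts, correct = FALSE)  # without continuity correction
print(res)
```

The first call reproduces the test reported above; the second call shows how much the continuity correction lowers the test statistic for this table.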
5.4 aov
This function performs a between-subjects analysis of variance with only fixed factors (Johnson 2008). (For more complex analyses of variance with repeated measures, see Johnson 2008; for mixed effects models, see Chapter 7 and references cited there.)
In the example below we create a response variable aa which is not normally distributed (check with hist, qqnorm, etc.).
a1 <- rpois(20,lambda=2)
a2 <- rpois(20,lambda=4)
a3 <- rpois(20,lambda=6)
aa <- c(a1,a2,a3)
x1 <- as.factor(rep(1:3,each=20)) # x1 corresponds with the three different poisson distributions within aa
x2 <- as.factor(rep( rep(1:2,each=10), 3)) # no effect expected
Thus the dependent variable aa intentionally differs between the levels of x1, but there should be no effect of the independent variable x2 nor of the interaction between the two independent variables (\(F<1\) expected for both effects). The model is estimated and summarized in a single composite command.
summary( model1.aov <- aov(aa~x1*x2) )
## Df Sum Sq Mean Sq F value Pr(>F)
## x1 2 109.73 54.87 15.512 4.75e-06 ***
## x2 1 0.27 0.27 0.075 0.785
## x1:x2 2 6.93 3.47 0.980 0.382
## Residuals 54 191.00 3.54
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
When reporting any of the hypothesis tests in this section, you should always report the effect size too (Quené 2010).
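For an analysis of variance, one common effect size is eta squared (\(\eta^2\)), the proportion of the total sum of squares that is associated with an effect. The sketch below computes it from the summary table of an aov model. It is offered only as one possible measure, and we do not claim that it is the measure recommended in the reference cited above. The data are simulated again with an arbitrary seed, so the values differ from the table shown earlier.

```r
# Eta squared from an aov summary table, on simulated data as in the example above.
set.seed(1)                                      # arbitrary seed
aa <- c(rpois(20, lambda = 2), rpois(20, lambda = 4), rpois(20, lambda = 6))
x1 <- as.factor(rep(1:3, each = 20))             # corresponds with the three distributions
x2 <- as.factor(rep(rep(1:2, each = 10), 3))     # no effect expected

fit <- aov(aa ~ x1 * x2)
tab <- summary(fit)[[1]]                         # ANOVA table as a data frame

ss <- tab[["Sum Sq"]]                            # sums of squares per row
names(ss) <- trimws(rownames(tab))               # row names are padded with spaces

# eta squared = SS of effect / total SS
eta_sq <- ss[c("x1", "x2", "x1:x2")] / sum(ss)
print(round(eta_sq, 3))
```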