In the within-subject differences and factorial differences worksheets, we looked at Bayesian ANOVA methods. Throughout Research Methods in R we focus on Bayesian, rather than traditional (“p value”) methods, because Bayes Factors are more useful and easier to understand. However, psychologists have been using traditional ANOVA methods since the 1970s, and it was not until the 2010s that we started using Bayesian methods. So, there’s a lot of older work out there that uses a different technique. For this reason, it is probably still worth knowing how to do a traditional ANOVA. This is what we’ll cover in this worksheet.
The following assumes you have completed the within-subject differences and factorial differences worksheets. Make sure you are in your R project that contains the data downloaded from the git repository - see the preprocessing worksheet for details. Create a new R script called afex.R and enter all comments and commands from this worksheet into it.
To do traditional ANOVA in R, you will need to load the afex package. Enter this comment and command into your script, and run them:
# Load 'afex' package, for NHST ANOVA
library(afex)
You’ll also need to load the data, preprocess it, and set factors, as we did for Bayesian ANOVA. Here’s how (enter these comments and commands into your script, and run them):
# Load tidyverse
library(tidyverse)
# Load data
words <- read_csv("wordnaming2.csv")
# Produce subject-level summary
wordsum <-
words %>%
group_by(subj, medit) %>%
summarise(rt = mean(rt))
# Set factors
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)
To do a traditional, one between-subject factor ANOVA, use the following command.
Enter this comment and command into your script, and run it:
# One b/subj factor ANOVA
aov_car(formula = rt ~ medit + Error(subj), data = wordsum)
Contrasts set to contr.sum for the following variables: medit
Anova Table (Type 3 tests)
Response: rt
Effect df MSE F ges p.value
1 medit 2, 417 3812.97 11.11 *** .051 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
The command aov_car is similar to the command anovaBF, but for traditional ANOVA. In most ways, this command works like anovaBF. For example, the part formula = rt ~ medit says we want to look at the effect of the medit (meditation) variable on the rt (reaction time) variable. The next bit, + Error(subj), is also very similar to anovaBF - it tells aov_car which column contains the participant IDs (in this case, subj). The final part, data = wordsum, tells the command where to find the data (just as with anovaBF).
Although there’s quite a lot in this output, the main thing to focus on is the number underneath p.value. In this case, the number is ‘<.001’. So, the p value in this case is less than .001, written as p < .001. Recall that p values are widely misinterpreted by psychologists, but we have a convention that if the p value is less than .05, then people will believe there is a difference. In this case, the p value is less than .05, and so people will believe there is a difference.
If you are reporting the results of a traditional ANOVA in a journal article, you are generally expected to report the F value, along with the degrees of freedom (df), as well as the p value. So, in this case, you would write:
The two groups differed significantly, F(2, 417) = 11.11, p < .001.
The F ratio and degrees of freedom are not particularly meaningful or useful information for the reader to have, but nonetheless journals normally require them when reporting traditional ANOVA. For further explanation, see the more on ANOVA worksheet.
One useful piece of information aov_car provides is ges - which is .05 in this example. ges stands for “generalized eta-squared”. This is a measure of effect size, somewhat like Cohen’s d, but the scale is different. ges is much like a correlation coefficient, and ranges between 0 and 1. A large effect size for ges is around .26, a medium-sized effect is around .13, and a small effect is around .02 (further details here). It can be useful to report generalized eta-squared, and this is reported as \(\eta_{g}^{2}\). So one could write:
The two groups differed significantly, F(2, 417) = 11.11, p < .001, \(\eta_{g}^{2}\) = .05.
Note that some authors will instead report a related measure called partial eta-squared (\(\eta_{p}^{2}\)). This is not the same thing and, in most circumstances, it is better to report generalized eta-squared. This point is discussed further here.
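If you want to pick the ges value out of the output programmatically, rather than reading it off the printed table, you can store the result of aov_car and look inside it. This is a sketch, assuming the wordsum data frame and afex from earlier in this worksheet; the name wordres is our own choice:

```r
# Store the ANOVA result instead of just printing it
wordres <- aov_car(formula = rt ~ medit + Error(subj), data = wordsum)
# The anova_table component is a data frame; ges is one of its columns
wordres$anova_table$ges
```

This can be handy if, for example, you want to round the value yourself or include it in a table you are building in R.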
First, preprocess the data for this example.
Enter these comments and commands into your script, and run them:
# Load data
words <- read_csv("wordnaming2.csv")
# Select control condition, remove neutral condition
wordctrl <- words %>% filter(medit == "control")
wordctrlCI <- wordctrl %>% filter(congru != "neutral")
# Create subject-level summary
wordctrlCIsum <- wordctrlCI %>%
group_by(subj, congru) %>%
summarise(rt = mean(rt))
# Make factors
wordctrlCIsum$congru <- factor(wordctrlCIsum$congru)
wordctrlCIsum$subj <- factor(wordctrlCIsum$subj)
Now, this is the command to run a traditional within-subjects ANOVA (Enter this comment and command into your script, and run it):
# One-factor, w/subj ANOVA
aov_car(formula = rt ~ Error(subj/congru), data = wordctrlCIsum)
Anova Table (Type 3 tests)
Response: rt
Effect df MSE F ges p.value
1 congru 1, 139 4949.44 44.78 *** .109 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
As before, aov_car works similarly to anovaBF - for example, we specify a formula and the data set (data). It differs mainly in how the right-hand side of the formula (everything after the ~) is laid out. To do a within-subjects test in aov_car we write Error(x/y), where x is the column of the data frame that contains the subject IDs (subj in this case), and y is the column of the data frame that contains the within-subjects condition (congru in this case).
First, preprocess the data for this example.
Enter these commands into your script, and run them:
# Create subject-level summary
wordctrlsum <- wordctrl %>%
group_by(subj, congru) %>%
summarise(rt = mean(rt))
# Create factors
wordctrlsum$congru <- factor(wordctrlsum$congru)
wordctrlsum$subj <- factor(wordctrlsum$subj)
The command to run a traditional ANOVA with more than two levels is exactly the same as it is with two levels. For example, the within-subjects version is (Enter this command into your script, and run it):
# One-factor, w/subj, more than two levels
aov_car(formula = rt ~ Error(subj/congru), data = wordctrlsum)
Anova Table (Type 3 tests)
Response: rt
Effect df MSE F ges p.value
1 congru 1.05, 145.44 4842.14 43.75 *** .085 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Sphericity correction method: GG
The output is read exactly the same way as before. In this example, you would write:
Congruency significantly affected reaction time, F(1.05, 145.44) = 43.75, p < .001, \(\eta_{g}^{2}\) = .09.
There is one new aspect here, the statement at the end:
Sphericity correction method: GG
You’ll see something like this in traditional ANOVA outputs when a within-subjects factor has more than two levels. It turns out that traditional ANOVA methods generally get the p value wrong in these cases, because they make assumptions about the data which are normally untrue. The incorrect assumption they make is that the data is “spherical” – if you’re curious what that means, click here.
Fortunately, there are ways of correcting this error. These are called sphericity corrections (“sphericity” is the property of being spherical). The two main methods are Greenhouse-Geisser (GG) correction, and Huynh-Feldt (HF) correction. aov_car picks the most appropriate one, makes the correction, and tells you it’s done so by including that statement Sphericity correction method: GG at the end. Another way you can tell a sphericity correction has been applied is that the degrees of freedom (df) are normally not whole numbers (1.05 and 145.44 in this case).
You would normally report towards the beginning of your results that you used such a correction. For example:
“Greenhouse-Geisser corrections for non-sphericity were applied where appropriate.”
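If you want to choose the correction yourself, rather than letting aov_car pick one, afex provides an anova_table argument for this. A sketch, assuming the wordctrlsum data frame from this worksheet:

```r
# Ask for the Huynh-Feldt correction explicitly
# ("GG" = Greenhouse-Geisser, "HF" = Huynh-Feldt, "none" = no correction)
aov_car(formula = rt ~ Error(subj/congru), data = wordctrlsum,
        anova_table = list(correction = "HF"))
```

If you do this, report the correction you chose, as in the example sentence above.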
You can do pairwise comparisons in traditional ANOVA in the same way you do them with anovaBF. In other words, filter the data to include just the two conditions you want to compare, and run the appropriate test. There are also other ways to do this, but we don’t cover them in these intermediate-level worksheets.
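For example, to compare just the congruent and incongruent conditions of the three-level wordctrlsum data from above, you could write something like this (the name wordpair is our own choice):

```r
# Keep just two of the three congruency conditions
wordpair <- wordctrlsum %>% filter(congru != "neutral")
# Re-make the factor so the unused "neutral" level is dropped
wordpair$congru <- factor(wordpair$congru)
# Run the two-level within-subjects ANOVA as before
aov_car(formula = rt ~ Error(subj/congru), data = wordpair)
```

Re-making the factor matters: filtering a data frame does not remove unused factor levels, and aov_car will complain about a level with no data.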
First, preprocess the data for this example.
Enter these commands into your script, and run them:
# Create subject-level summary
wordsum <-
words %>%
group_by(subj, medit, congru) %>%
summarise(rt = mean(rt))
# Create factors
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)
wordsum$congru <- factor(wordsum$congru)
A traditional ANOVA with one within-subjects factor and one between-subjects factor is conducted like this (Enter this command into your script, and run it):
# One w/subj, one b/subj, ANOVA
aov_car(formula = rt ~ medit + Error(subj/congru), data = wordsum)
Contrasts set to contr.sum for the following variables: medit
Anova Table (Type 3 tests)
Response: rt
Effect df MSE F ges p.value
1 medit 2, 417 11759.86 11.15 *** .035 <.001
2 congru 1.05, 436.24 5093.68 48.44 *** .035 <.001
3 medit:congru 2.09, 436.24 5093.68 13.39 *** .020 <.001
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Sphericity correction method: GG
The aov_car command for this factorial ANOVA just combines the formula we used for a single between-subjects factor, rt ~ medit + Error(subj), with the formula we used for a single within-subjects factor, rt ~ Error(subj/congru), to give the full formula rt ~ medit + Error(subj/congru). Note that the + sign is used to combine a within-subjects component with a between-subjects component, rather than the * sign we used in anovaBF. This is just because the two commands were written by different people; it doesn’t have any deeper significance.
The first two lines of the output are the main effect for medit and the main effect for congru, respectively. The third line, medit:congru, is the interaction between these two factors. Everything else has the same meaning as in previous outputs. You’ll notice that the main effect F values are not quite the same as they were in the earlier examples on this worksheet. As we saw in our Bayesian ANOVA, different analyses can give different answers.
First, preprocess the data for this example.
Enter these commands into your script, and run them:
# Create subject-level summary
wordsum <- words %>% group_by(subj, sex, medit) %>% summarise(rt = mean(rt))
# Create factors
wordsum$sex <- factor(wordsum$sex)
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)
The command for a two between-subject factor traditional ANOVA is given below. This is much like previous commands; the only thing to note is that * is used to combine between-subjects factors in aov_car. The output can be interpreted in the same way as before.
Enter this command into your script, and run it:
# Two b/subj factors, ANOVA
aov_car(formula = rt ~ medit*sex + Error(subj), data = wordsum)
Contrasts set to contr.sum for the following variables: medit, sex
Anova Table (Type 3 tests)
Response: rt
Effect df MSE F ges p.value
1 medit 2, 414 3781.43 11.20 *** .051 <.001
2 sex 1, 414 3781.43 5.64 * .013 .018
3 medit:sex 2, 414 3781.43 0.42 .002 .657
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
First, preprocess the data for this example.
Enter these commands into your script, and run them:
# Create subject-level summary
wordsum <- words %>% group_by(subj, congru, block) %>% summarise(rt = mean(rt))
# Create factors
wordsum$congru <- factor(wordsum$congru)
wordsum$subj <- factor(wordsum$subj)
wordsum$block <- factor(wordsum$block)
The command for a two within-subject factor traditional ANOVA is given below. This is much like previous commands; the only thing to note is that * is used to combine within-subjects factors in aov_car. The output can be interpreted in the same way as before.
Enter this command into your script, and run it:
# Two w/subj factors, ANOVA
aov_car(formula = rt ~ Error(subj/congru*block), data = wordsum)
Anova Table (Type 3 tests)
Response: rt
Effect df MSE F ges p.value
1 congru 1.04, 437.22 16222.93 45.80 *** .031 <.001
2 block 1.68, 702.77 1608.56 2.43 + <.001 .099
3 congru:block 3.76, 1576.44 421.40 1.34 <.001 .255
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Sphericity correction method: GG
Unlike Bayesian ANOVA, traditional ANOVA techniques can handle designs with more than two factors quite efficiently.
First, preprocess the data for this example. Enter these commands into your script, and run them:
# Create subject-level summary
wordsum <- words %>% group_by(subj, sex, medit, congru, block) %>% summarise(rt = mean(rt))
# Create factors
wordsum$sex <- factor(wordsum$sex)
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)
wordsum$congru <- factor(wordsum$congru)
wordsum$block <- factor(wordsum$block)
Now, we can run an ANOVA with all four factors. Enter this command into your script, and run it:
# Four-factor ANOVA
aov_car(formula = rt ~ sex*medit + Error(subj/congru*block), data = wordsum)
Contrasts set to contr.sum for the following variables: sex, medit
Anova Table (Type 3 tests)
Response: rt
Effect df MSE F ges p.value
1 sex 1, 414 34897.39 5.79 * .009 .017
2 medit 2, 414 34897.39 11.43 *** .034 <.001
3 sex:medit 2, 414 34897.39 0.39 .001 .680
4 congru 1.05, 433.57 15058.29 49.17 *** .033 <.001
5 sex:congru 1.05, 433.57 15058.29 6.89 ** .005 .008
6 medit:congru 2.09, 433.57 15058.29 13.83 *** .019 <.001
7 sex:medit:congru 2.09, 433.57 15058.29 0.61 <.001 .549
8 block 1.67, 692.52 1625.05 2.41 <.001 .100
9 sex:block 1.67, 692.52 1625.05 0.23 <.001 .757
10 medit:block 3.35, 692.52 1625.05 0.52 <.001 .686
11 sex:medit:block 3.35, 692.52 1625.05 0.29 <.001 .852
12 congru:block 3.76, 1557.63 419.60 1.35 <.001 .253
13 sex:congru:block 3.76, 1557.63 419.60 1.45 <.001 .217
14 medit:congru:block 7.52, 1557.63 419.60 1.80 + <.001 .078
15 sex:medit:congru:block 7.52, 1557.63 419.60 0.87 <.001 .532
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Sphericity correction method: GG
We get some very extensive output. This is because, when we have more than two factors in an analysis, it’s possible that there will be higher-order interactions in our data (e.g. sex:medit:congru), and there are a large number of these higher-order interactions to consider. Generally speaking, people find higher-order interactions very hard to understand or relate to their hypotheses. Trying to make sense of them is beyond this intermediate-level worksheet.
This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.