Before starting this exercise, you should have completed all the Absolute Beginners’ workshop exercises. If not, take a look at those exercises before continuing. Each section below also indicates which of the earlier worksheets are relevant.
Relevant worksheet: Intro to RStudio
In this exercise, you’ll be analysing some data that has already been collected. To get this data into R, follow these steps:
Set up an RStudio project for this analysis, and create a script file within that project.
Plymouth University students: Create/open your project named psyc415; within that create a script file called lineup.R. Enter all commands into that script and run them from there.
Upload this CSV file into your RStudio project folder. Here’s a reminder of how to upload CSV files.
Load the tidyverse package, and then load your data into R.
# Police lineup experiment
# Load tidyverse
library(tidyverse)
# Load data into 'lup'
lup <- read_csv("lineup.csv")
Look at the data by clicking on it in the Environment tab in RStudio.
Each row is one participant in this simulated police line-up experiment. Each participant views a video of a simulated crime, then has to pick the criminal from one of four photographs of different people. The criminal in the video does not appear in any of those four photos, but the participants have not yet been told that. After they make their decision, some participants are told they picked the correct person; the rest are not told anything. Each participant then goes on to answer a series of questions (Q2-Q9, below).
Will being told they made the right choice change people’s answers to these questions?
Column | Description | Values |
---|---|---|
Sub | Subject number | a number |
Cond | Did the subject receive feedback on their decision? | “Feedback”, “No Feedback” |
Q1 | The photograph chosen by the participant | A, B, C, or D |
Q2 | “Would you be willing to testify in court?” | “Testify”, “Not Testify” |
Q3 | “How was your view of the scene?” | 0 - 100, higher numbers = better view |
Q4 | “How long did you see the thief’s face? (in seconds)” | a number |
Q5 | “When you chose the photograph, how confident were you?” | 0 - 100, higher numbers = more confident |
Q6 | “Did the thief shove the victim?” | Yes, No |
Q7 | “How confident were you in your answer?” (about the shove) | 0 - 100, higher numbers = more confident |
Q8 | “Do you think the thief may be violent?” | Yes, No |
Q9 | “How confident were you in your answer?” (about the thief’s violence) | 0 - 100, higher numbers = more confident |
Relevant worksheet: Relationships, Evidence
Will witnesses be more likely to testify in court if they are told they are right? We can look at this question with the data set you just loaded. Looking at the lup data frame, the column Cond tells us whether each participant was given feedback or not. The Q2 column tells us whether they said they would be willing to testify in court or not. Both of these variables have unordered (“nominal”) data, so the appropriate form of analysis here is a contingency table. As we covered in the Relationships worksheet, we produce a contingency table using the table command:
# Create contingency table of 'Cond' by 'Q2'
cont <- table(lup$Cond, lup$Q2)
# Display contingency table
cont
              Not Testify Testify
  Feedback             43      45
  No Feedback          66      17
Often, it’s easier to see what’s going on in a contingency table if we draw a mosaic plot:
# Display mosaic plot
mosaicplot(cont)
It looks like, with feedback, people are about 50:50 on whether they would testify. Without feedback, a large majority would not testify.
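To put numbers on what the mosaic plot shows, one option is to convert each row of the contingency table into proportions. The sketch below rebuilds the table from the counts printed above, so it runs without lineup.csv, and uses base R's prop.table. The names cont_counts and row_props are illustrative and are not part of the worksheet.

```r
# Rebuild the contingency table from the counts printed above
cont_counts <- matrix(
  c(43, 45,
    66, 17),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Feedback", "No Feedback"),
                  c("Not Testify", "Testify"))
)

# Proportion of each response within each condition (each row sums to 1)
row_props <- prop.table(cont_counts, margin = 1)
round(row_props, 2)
```

With the real data loaded, prop.table(cont, margin = 1) should give the same proportions: about 51% of the Feedback group said they would testify, against about 20% of the No Feedback group.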
Is this a real effect, or could it just be down to chance? As we covered in the Relationships worksheet, the best way to look at this is with a Bayesian test. We use the cont contingency table we generated above:
# Load BayesFactor package
library(BayesFactor, quietly = TRUE)
# Calculate Bayes Factor for the contingency table 'cont'
contingencyTableBF(cont, sampleType = "indepMulti", fixedMargin = "rows" )
Bayes factor analysis
--------------
[1] Non-indep. (a=1) : 1201.46 ±0%
Against denominator:
Null, independence, a = 1
---
Bayes factor type: BFcontingencyTable, independent multinomial
We’ve set fixedMargin = "rows" because the rows of the contingency table represent the groups created by the experimenter (Feedback vs. No Feedback).
The Bayes Factor here is about 1200, so it’s over a thousand times more likely that there is a real difference than that there isn’t.
Now do the same analyses as above, but on question 6, “Did the thief shove the victim?”. To do this, change the command cont <- table(lup$Cond, lup$Q2) so that you get a contingency table for question 6. You can then re-run the commands above to get the answers.
Enter the Bayes Factor for question 6 into PsycEL.
Using the convention that there is a difference if BF > 3, there isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3, we’re unsure, select difference, no difference, or unsure, on PsycEL.
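The convention above can also be written as a small helper function. This is only a sketch: classify_bf is a hypothetical name, not a function from the BayesFactor package, and the second example value is made up purely to show the "unsure" case.

```r
# Hypothetical helper: apply the BF > 3 / BF < 0.33 convention
classify_bf <- function(bf) {
  if (bf > 3) {
    "difference"
  } else if (bf < 0.33) {
    "no difference"
  } else {
    "unsure"
  }
}

# Bayes Factor reported above for question 2
classify_bf(1201.46)

# A made-up value between 0.33 and 3
classify_bf(1.5)
```

You would replace the numbers with whatever Bayes Factor your own analysis reports.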
Relevant worksheet: Group Differences, Evidence
Did participants think their view was better if they were told they made the correct decision? In this case, we have one ordered variable (Q3, their rating of their view on a 0-100 scale), and one unordered variable (Cond - whether they got feedback or not).
We start by looking to see how the mean scores on Question 3 differ for those who were and weren’t given feedback. As we saw in the Group Differences worksheet, we use the group_by, summarise, and mean commands to do this:
# Group by 'Cond', take mean for Q3.
lup %>% group_by(Cond) %>% summarise(mean(Q3))
# A tibble: 2 × 2
  Cond        `mean(Q3)`
  <chr>            <dbl>
1 Feedback          45.6
2 No Feedback       41.2
As before, you can safely ignore the “ungrouping” message that you receive.
It looks like there’s a small difference, with the ratings of their view slightly higher in the feedback condition – but how does this between-group difference compare to the within-group variability? As we covered in the Group Differences worksheet, this is most easily looked at with a scaled density plot:
# Display density plot of 'Q3', by 'Cond'
lup %>% ggplot(aes(Q3, colour = factor(Cond))) + geom_density(aes(y = after_stat(scaled)))
This graph tells a somewhat different story to the means. The two groups almost completely overlap, with the main difference being that the No Feedback participants mostly give scores close to 50, while the Feedback participants give a broader range of scores.
At this point, the most pressing question is probably whether the difference observed in the mean scores is likely to be real, or whether it’s more likely down to chance. As we saw in the Evidence worksheet, the best way to look at this is with a Bayesian t-test:
# Calculate Bayesian t-test for effect of 'Cond' on 'Q3'
ttestBF(formula = Q3 ~ Cond, data = data.frame(lup))
Bayes factor analysis
--------------
[1] Alt., r=0.707 : 0.3230296 ±0.04%
Against denominator:
Null, mu1-mu2 = 0
---
Bayes factor type: BFindepSample, JZS
The Bayes Factor in this case is about 1/3, meaning it’s about three times as likely there isn’t a difference as there is.
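A Bayes Factor below 1 is often easier to read after taking its reciprocal, which expresses the evidence for no difference relative to the evidence for a difference. A minimal sketch, using the value printed in the output above (bf_alt and bf_null are illustrative names):

```r
# Bayes Factor for a difference, copied from the t-test output above
bf_alt <- 0.3230296

# Reciprocal: evidence for no difference, relative to a difference
bf_null <- 1 / bf_alt
round(bf_null, 2)
```

This gives roughly 3.1, which is where the "about three times as likely" reading comes from.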
Did participants who were told they were right think they saw the thief’s face for longer? This was addressed by Question 4 (column Q4 in data frame lup). By changing Q3 to Q4 in the commands above, you can answer this question.
Enter the mean viewing time for each condition, and the Bayes Factor for the difference, into PsycEL.
Using the convention that there is a difference if BF > 3, there isn’t a difference if BF < 0.33, and if it’s between 0.33 and 3, we’re unsure, select difference, no difference, or unsure, on PsycEL.
This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.