Exercises – Reaction Time Data#
This section is based on a former assignment for this course. With GitHub Copilot, the assignment has become too easy to be used as an assignment. Instead, we will use it as an exercise to practice using GitHub Copilot. At the same time, we’re introducing some data that you might encounter in neuroscience research: behavioral reaction times (RTs), and errors. While these are not direct measurements of neural activity, they reflect neural processes, and are often important to analyze in order to properly interpret the results of a neuroscience experiment – one often cannot understand what the brain is doing, if one does not understand what the associated behavior is.
In the online textbook, this lesson is populated with code generated by Copilot, based on prompts that more or less match the assignment instructions. This is because in the textbook, we want to show you how Copilot works, and how it can be used to generate code. However, in the exercises you can download, you will get a notebook without code or prompts. This is because we want you to write the code yourself, ideally without peeking at the solutions in the textbook (so don’t read below this cell if you want to do the exercise!).
It is highly recommended that if you are working through the lesson yourself, you deactivate Copilot. If you really get stuck, you can always reactivate Copilot to get a hint, but then deactivate it again to write the next bit of code yourself. But you will learn a lot more about code by trying to write it yourself.
Reaction Time Data#
The cell below contains reaction times (RT; in seconds) from some trials in a behavioral experiment. The RTs reflect the amount of time between when a stimulus was presented, and when a human participant responded by making a button press. Execute the cell (shift-enter) and move on to the next cell.
rt = [0.394252808, 0.442094359, 0.534764366, 0.565906723, 0.570404592,
0.486154719, 0.518792127, 0.844916827, 0.495622859, 0.476159436,
0.612854746, 0.529661203, 0.389157455, 1.517088266, 0.573962432,
0.714152493, 0.409225638, 0.435308188, 0.509801957, 0.544626271,
0.437877745, 0.333356848, 0.401773569, 0.479840688
]
Question 1#
What type of data is rt (in terms of Python data types)? Use a Python command to generate the answer.
# what type of data is rt
type(rt)
list
Question 2#
What type of data is the first value in rt (in terms of Python data types)? Use a Python command to generate the answer.
# what type of data is the first element of rt
type(rt[0])
float
Question 3#
How many trials were in this experiment? (Hint: how many entries are there in rt?) Use Python code to generate the answer.
# how many trials were in this experiment
# how many entries are in rt
len(rt)
24
Question 4#
Print the first 9 values in rt
# print the first 9 elements of rt
rt[0:9]
[0.394252808,
0.442094359,
0.534764366,
0.565906723,
0.570404592,
0.486154719,
0.518792127,
0.844916827,
0.495622859]
Question 5#
Print the last 6 values in rt
# print the last 6 elements of rt
rt[-6:]
[0.509801957, 0.544626271, 0.437877745, 0.333356848, 0.401773569, 0.479840688]
Question 6#
Print the values of the fifteenth through twentieth data points in rt (including the twentieth value)
# print the values of the 15th through 20th data points in rt
rt[14:20]
[0.573962432, 0.714152493, 0.409225638, 0.435308188, 0.509801957, 0.544626271]
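The slices in Questions 4 to 6 all follow the same rule: Python counts positions from zero, and a slice includes its start index but stops just before its end index. Here is a minimal sketch of that rule using a short made-up list (`values` is a hypothetical name, not part of the exercise data):

```python
# Hypothetical toy list; each value equals its 1-based position for readability
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

first_three = values[0:3]      # positions 1-3 -> indices 0, 1, 2
last_two = values[-2:]         # a negative start counts back from the end
fourth_to_sixth = values[3:6]  # positions 4-6 -> start at 4 - 1, stop at 6

print(first_three)
print(last_two)
print(fourth_to_sixth)
```

The same arithmetic gives rt[14:20] for the fifteenth through twentieth data points: start at 15 - 1, and stop at 20.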
Question 7#
What is the slowest reaction time in rt?
# what is the slowest reaction time in the dataset
max(rt)
1.517088266
Check that last line of code that Copilot generated. Is it correct? Is there anything nonintuitive about it?
Question 8#
What is the fastest reaction time in rt?
# what is the fastest reaction time in the dataset
min(rt)
0.333356848
Question 9#
You can find the index of a specific value in a list using the .index() method. Do this to find which data point (index) in rt has the value of 0.409225638.
# find which data point in rt has the value 0.409225638
rt.index(0.409225638)
16
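Two details of .index() are worth keeping in mind: it returns the position of the first match only, and it raises a ValueError if the value is not in the list at all. The sketch below illustrates both with a made-up list (`samples`, `target`, and `position` are hypothetical names used only for this example):

```python
# Hypothetical toy list with a repeated value
samples = [0.5, 0.7, 0.5, 0.9]

# .index() returns the position of the first match only
first_match = samples.index(0.5)
print(first_match)

# Searching for a value that is not in the list raises ValueError,
# so one option is to check membership first
target = 0.6
if target in samples:
    position = samples.index(target)
else:
    position = None
print(position)
```

Because .index() looks for an exact match, it only finds a float if you give it exactly the value stored in the list.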
Accuracy Data#
In behavioral experiments it’s common to analyze both reaction times and error rates. The list below contains a value for each trial indicating whether the subject made an error (True) or not (False).
Note that it might be more intuitive if this were coded as True for correct responses, and False for errors, but the variable is recording errors, not accuracy. It’s always important in data science to make sure you understand what your data represent!
# Just run this cell; don't change anything in it
err = [False, False, True, False, False, False, False, False, True, False,
False, True, False, False, False, False, True, True, True, False,
]
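If you ever need the data coded the other way around (True for correct responses), you can derive it from the error list rather than typing it in again. This is a small sketch only: `err_demo` and `correct_demo` are made-up names, and it uses a list comprehension, which you may not have seen yet.

```python
# Hypothetical toy error list: True means the participant made an error
err_demo = [False, True, False, False]

# Flip every value so that True means a correct response
correct_demo = [not e for e in err_demo]

print(correct_demo)
```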
Question 10#
What Python data type are the values of err (not the type of err itself)? Use code to generate your answer, showing the type of the first entry in the err list.
# what Python data type are the values of err
type(err[0])
bool
Question 11#
How many data points do we have in err?
# HOW MANY DATA POINTS ARE IN ERR
len(err)
20
Question 12#
These data are from the same experiment/participant as the RT data, but you’ll note we have fewer data points in err. Let’s say this is because of some sort of technical error during data recording, but we know (never mind how - this is just for the assignment!) what the missing data should be. Specifically, the first data point is missing, but we know the participant made an error on that trial; and the last three data points are missing, and we know the participant got all of those trials correct.
Write five lines of code, as follows:
Insert a value at the beginning of err (without changing any of the existing values) to reflect the participant’s error on the first trial.
Insert three values (using one line of code) at the end of err, indicating correct answers on the last three trials.
Print out err with these changes made.
Print out the length of err.
Confirm that the length of err is now the same as the length of rt.
Note: the err list is re-defined at the start of the cell below. Don’t change this, and insert the additional lines of code that you need below it. This way, each time you run the cell you “reset” err to its original values. This is useful because if your code doesn’t do what you want the first time, it may have modified err in ways you didn’t want it to.
err = [False, False, True, False, False, False, False, False, True, False,
False, True, False, False, False, False, True, True, True, False,
]
# insert a value at the beginning of err that indicates an error
err.insert(0, True)
# insert three values at the end of err that indicate correct responses
err.extend([False, False, False])
# print err
print(err)
# print the length of err
print(len(err))
# Confirm that the length of `err` is now the same as the length of `rt`
len(err) == len(rt)
[True, False, False, True, False, False, False, False, False, True, False, False, True, False, False, False, False, True, True, True, False, False, False, False]
24
True
print(err)
[True, False, False, True, False, False, False, False, False, True, False, False, True, False, False, False, False, True, True, True, False, False, False, False]
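Lists offer several methods for adding values, and they differ in where the values go and how many are added at once. The sketch below contrasts them on made-up lists (`flags` and `nested` are hypothetical names). Note in particular that .append() with a list argument adds that whole list as a single nested element, which is why .extend() is the better fit for adding three values in one line.

```python
flags = [False, True]

flags.insert(0, True)          # add one value at index 0; the rest shift right
flags.append(False)            # add one value at the end
flags.extend([False, False])   # add each value from another list at the end
print(flags)
print(len(flags))

nested = [False, True]
nested.append([False, False])  # adds ONE element, which is itself a list
print(nested)
print(len(nested))
```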
Question 13#
How many errors did the participant make? Use code, and specifically a list method, to generate the answer. This method was not covered in the lesson, so you may need to use the help command to figure out which method is appropriate.
# how many errors did the participant make
err.count(True)
7
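Once you know the number of errors, the error rate follows by dividing by the number of trials. The sketch below uses a made-up list (`err_demo`) and hypothetical variable names. It also shows sum() as an alternative way to count, which works because Python treats True as 1 and False as 0, although the question specifically asks for a list method.

```python
# Hypothetical toy error list: True means an error on that trial
err_demo = [True, False, False, True, False]

n_errors = err_demo.count(True)        # list method: count matching values
n_errors_alt = sum(err_demo)           # works because True == 1 and False == 0
error_rate = n_errors / len(err_demo)  # proportion of trials with an error

print(n_errors, n_errors_alt)
print(error_rate)
```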
Data Cleaning#
It is not uncommon in behavioral studies (or other research) to have outliers — one or a few values that are exceptionally different from the majority of values. These can be problematic for statistical analysis, and may also not reflect the behavior we’re trying to measure. For example, an RT may be exceptionally long because the participant sneezed prior to pressing the button.
Above, you should have identified the longest RT in these data, which is about 0.67 s longer than any other RT. We would like to remove it from the data. When we do this, we should also remove the corresponding trial’s data from the error data.
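To put a number on how far the slowest RT is from the rest, you can compare the two largest values. This sketch repeats the RT values from the top of the lesson under a different name (`rt_all`) so that it runs by itself and does not touch rt; `rt_sorted` and `gap` are also names introduced just for this illustration.

```python
# Same values as rt at the top of the lesson, copied under a separate name
rt_all = [0.394252808, 0.442094359, 0.534764366, 0.565906723, 0.570404592,
          0.486154719, 0.518792127, 0.844916827, 0.495622859, 0.476159436,
          0.612854746, 0.529661203, 0.389157455, 1.517088266, 0.573962432,
          0.714152493, 0.409225638, 0.435308188, 0.509801957, 0.544626271,
          0.437877745, 0.333356848, 0.401773569, 0.479840688
          ]

rt_sorted = sorted(rt_all)            # new sorted list; rt_all keeps its order
gap = rt_sorted[-1] - rt_sorted[-2]   # slowest minus second slowest

print(round(gap, 2))
```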
Question 14#
Write code that does the following:
finds the position (index) of the slowest RT in the data
removes that slowest RT value from rt
removes the data from err that corresponds to the trial you removed in RT (i.e., has the same index)
prints the slowest RT remaining, rounded to two decimal places (after removing the outlier)
prints the lengths of rt and err using a single print command, with accompanying text to make it clear which value is the length of rt and which is the length of err
Note that this can be accomplished in 5 lines of code. However, if you’re trying to figure it out yourself, without the help of an AI assistant, you might start by figuring out how to do the task without worrying how many lines of code it takes. Once you have it working, then figure out how to shorten your code if you can (think about nesting Python commands).
# find the position of the slowest RT value in rt
rt.index(max(rt))
# remove the slowest RT value from rt
rt.remove(max(rt))
# remove the corresponding error value from err
err.pop(rt.index(max(rt)))
# print the slowest RT value in rt
print(max(rt))
# print the length of rt and err using a single print statement, with accompanying text that makes it clear which value is which
print("The length of rt is", len(rt), "and the length of err is", len(err))
0.844916827
The length of rt is 23 and the length of err is 23
Checking Copilot’s Work#
When I typed in prompts for Copilot based on the instructions above, I got the following code:
# find the position of the slowest RT value in rt
rt.index(max(rt))
# remove the slowest RT value from rt
rt.remove(max(rt))
# remove the corresponding error value from err
err.pop(rt.index(max(rt)))
# print the slowest RT value in rt
print(max(rt))
# print the length of rt and err using a single print statement, with accompanying text that makes it clear which value is which
print("The length of rt is", len(rt), "and the length of err is", len(err))
All of that code runs without error, and each line does what its prompt asked for. However, there is a logical error in the code. Can you figure out what it is?
The problem is that the code removes the slowest RT value from rt, and then finds the index of the slowest RT value in rt. But the slowest RT value is no longer in rt! So the index we get for err is the index of the second-slowest RT in the original data. (Amusingly, even though Copilot generated the erroneous code, when I started typing this explanation of the error, it suggested almost the correct explanation! This actually presents some interesting possibilities for how AI assistants might be used to check their own code, which we will explore in a future lesson.)
This kind of error is pernicious, in the sense that it would be very easy to make the error, and not detect it. The code correctly ends up with equal-length lists for rt and err, and each line of code seems to be doing what it’s supposed to. It’s only by really stepping through the code, and thinking about what each line is doing, that we can detect the error.
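One way to make this kind of error visible is to run the same steps on a tiny dataset where the right answer is obvious. The sketch below uses made-up lists (`rt_demo` and `err_demo`, both hypothetical) in which the only error occurred on the second-slowest trial, and applies the same order of operations as the Copilot-generated code.

```python
rt_demo = [0.5, 0.9, 0.4, 1.5]           # the outlier is the last trial
err_demo = [False, True, False, False]   # the only error was on the 0.9 s trial

# Same order of operations as the Copilot-generated code
rt_demo.remove(max(rt_demo))                # drops 1.5
wrong_index = rt_demo.index(max(rt_demo))   # max is now 0.9, at index 1
err_demo.pop(wrong_index)                   # removes the error for the 0.9 s trial

print(rt_demo)
print(err_demo)
```

The 0.9 s trial is still in rt_demo, but its error has vanished from err_demo. The two lists have the same length, so checking lengths alone would not reveal the problem.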
To fix this, we need to find the index of the slowest RT value before we remove it from rt. Think about how you might do this (conceptually first, then in terms of code). Then write the code to do it, and/or use Copilot prompts to help you.
Again, we define rt and err at the start of the cell below, so that you can run the cell multiple times to test your code.
rt = [0.394252808, 0.442094359, 0.534764366, 0.565906723, 0.570404592,
0.486154719, 0.518792127, 0.844916827, 0.495622859, 0.476159436,
0.612854746, 0.529661203, 0.389157455, 1.517088266, 0.573962432,
0.714152493, 0.409225638, 0.435308188, 0.509801957, 0.544626271,
0.437877745, 0.333356848, 0.401773569, 0.479840688
]
err = [True, False, False, True, False, False, False, False, False, True,
False, False, True, False, False, False, False, True, True, True,
False, False, False, False
]
# find the index of the slowest value in rt, and save it as a variable
slowest_rt_index = rt.index(max(rt))
# remove the slowest value from rt
rt.remove(max(rt))
# remove the corresponding error value from err
err.pop(slowest_rt_index)
# print the length of rt and err using a single print statement, with accompanying text that makes it clear which value is which
print("The length of rt is", len(rt), "and the length of err is", len(err))
The length of rt is 23 and the length of err is 23
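Here is a variation on the fix, offered as a sketch rather than the only solution. Because the index is already saved, .pop() can remove the value at that position from both lists, which keeps the two removals symmetrical. It also adds the rounding step that Question 14 asked for. The short lists and variable names are made up for illustration.

```python
rt_demo = [0.512, 0.8449, 0.401, 1.517]   # hypothetical RTs; the last one is the outlier
err_demo = [False, True, False, False]    # hypothetical errors for the same trials

# Find the index BEFORE removing anything
slowest_index = rt_demo.index(max(rt_demo))

# Remove the same position from both lists; .pop() returns the removed value
removed_rt = rt_demo.pop(slowest_index)
removed_err = err_demo.pop(slowest_index)

# Slowest remaining RT, rounded to two decimal places
slowest_remaining = round(max(rt_demo), 2)
print(slowest_remaining)
print("The length of rt_demo is", len(rt_demo), "and the length of err_demo is", len(err_demo))
```

Keeping the removed values in variables is optional, but it makes it easy to check afterwards which trial was dropped.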
Question 15#
Print out all the values of RT, sorted from fastest to slowest. Do not modify the original order of RT values in doing this.
# print all the values in rt, sorted from fastest to slowest
print(sorted(rt))
[0.333356848, 0.389157455, 0.394252808, 0.401773569, 0.409225638, 0.435308188, 0.437877745, 0.442094359, 0.476159436, 0.479840688, 0.486154719, 0.495622859, 0.509801957, 0.518792127, 0.529661203, 0.534764366, 0.544626271, 0.565906723, 0.570404592, 0.573962432, 0.612854746, 0.714152493, 0.844916827]
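The requirement not to modify the original order is what separates the sorted() function from the .sort() list method. The sketch below shows the difference on a made-up list (`times` and `ordered` are hypothetical names):

```python
times = [0.6, 0.3, 0.9]

ordered = sorted(times)   # returns a new sorted list
print(ordered)
print(times)              # the original order is untouched

times.sort()              # sorts the list in place and returns None
print(times)
```

If you still need the trial order, for example to keep rt lined up with err, sorted() is the safer choice.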
Summary#
You should now have a good sense of how to work with lists in Python, including how to access specific values, how to add values, how to remove values, and how to find the length of a list
You should be beginning to understand how to use Python to answer questions about your data, such as how many trials there were, or how many errors were made
You should also have some understanding of how to use Python to clean your data, such as by removing extreme values
You should also have a sense of how to use Python to check your work, such as by printing out the values of a list to make sure they are sorted correctly
You should be developing your ability to read code (such as that generated by Copilot) and understand what it does, and how it does it
You should be developing your ability to critically evaluate code, identify errors in it, and fix those errors