Cleaning RT Data
KEY:
ds = dataset you are currently using.
DV = dependent variable of interest
IV = independent variable of interest
Subject = name of subject number column
XYXY = dummy name for a variable, matrix, or data frame into which you are moving information.
Topics:
*Create a column that contains RTs for accurate trials only
*Remove trials on the basis of some criteria
*Replace trial scores that are more than 2.5 standard deviations from a subject's mean score
Create a column that contains RTs for accurate trials only
This command creates a column named "Trim" that contains the values of a preexisting column named "RT", but only for accurate trials.
It uses another preexisting column named "ACC" (1 = correct response, 0 = incorrect response) to decide which trials get copied into "Trim". Change RT and ACC to match the names in your dataset.
"[ds$ACC==1]" tells R "only do this in rows where the ACC column contains a 1". Rows where ACC is 0 are left as NA in "Trim".
_____________________________
ds$Trim <- NA   # start with an all-NA column so incorrect trials stay missing
ds$Trim[ds$ACC==1] <- ds$RT[ds$ACC==1]
_____________________________
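To see what the command does, here is a minimal sketch on a made-up four-trial dataset. The data values are invented for illustration; only the column names RT and ACC come from the text above.

```r
# Toy data (invented values); RT and ACC mirror the column names used above.
ds <- data.frame(
  Subject = c(1, 1, 2, 2),
  RT      = c(450, 520, 610, 380),
  ACC     = c(1, 0, 1, 1)
)

# Start with an all-NA column, then copy RT only for accurate trials.
ds$Trim <- NA
ds$Trim[ds$ACC == 1] <- ds$RT[ds$ACC == 1]

print(ds)

# With na.rm = TRUE the NA rows are skipped, so this is the mean RT
# for accurate trials only.
AccurateMean <- mean(ds$Trim, na.rm = TRUE)
AccurateMean
```

The second trial (ACC = 0) ends up as NA in Trim, so any later summary needs na.rm = TRUE or an equivalent way of skipping missing values.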
Remove trials on the basis of some criteria
You may want to get rid of trials on which impossible responses were made (in this case, RT < 100) and trials on which no response was made.
Step 1: Create a column that will eventually contain a "1" for any row that needs removal. "0" means the row is safe.
_____________________________
ds$drop <- 0
_____________________________
Step 2: Change "0" to "1" for any row that meets the removal criteria.
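This command assumes there are columns named RT (reaction time) and RESP (response made), and that a missing response is stored as NA. Adjust these names in accordance with your dataset.
Use is.na(), not is.null(): is.null() checks whether the whole column exists and returns a single FALSE, so it would never flag an individual row. If missing responses are stored as empty strings instead of NA, test ds$RESP == "" rather than is.na(ds$RESP).
_____________________________
ds$drop[ds$RT < 100 | is.na(ds$RESP)] <- 1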
_____________________________
Notes:
* "|" means "or": a row is marked if either condition is true. Use "&" ("and") if a row should be marked only when both conditions are true.
* [ ] after a column name effectively says "only mark as 1 when one of these conditions is met".
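The difference between "|" and "&" is easiest to see side by side. The sketch below uses invented data and assumes that missing responses are stored as NA, which is.na() checks row by row. The column names drop_or and drop_and are hypothetical and exist only for this comparison.

```r
# Toy data (invented values); missing responses are assumed to be NA.
ds <- data.frame(
  RT   = c(85, 430, 95, 510),
  RESP = c("a", NA, NA, "b")
)

# "|" flags a row when EITHER condition is true.
ds$drop_or <- 0
ds$drop_or[ds$RT < 100 | is.na(ds$RESP)] <- 1

# "&" flags a row only when BOTH conditions are true.
ds$drop_and <- 0
ds$drop_and[ds$RT < 100 & is.na(ds$RESP)] <- 1

print(ds)
```

Rows 1 to 3 are flagged under "|" (too fast, no response, or both), while only row 3 is flagged under "&".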
Step 3: Count the number of cases that were tagged for removal.
You will need the function "count" from the "plyr" package.
_____________________________
library(plyr)
RemovalCount <- count(ds$drop)
_____________________________
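If you want removed items to be expressed as a percentage of all trials, type the following. It assumes that both 0 and 1 occur in drop, so that RemovalCount has two rows (row 1 = kept, row 2 = removed).
_____________________________
RemovalCount[2,2] / (RemovalCount[1,2] + RemovalCount[2,2]) * 100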
_____________________________
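As an alternative sketch that does not rely on plyr, the percentage can be computed directly from the drop column in base R. The data below are invented, and PercentRemoved is a hypothetical name.

```r
# Toy drop column (invented values): 2 of 8 trials are flagged.
ds <- data.frame(drop = c(0, 0, 1, 0, 0, 1, 0, 0))

# The mean of a TRUE/FALSE vector is the proportion of TRUEs,
# so multiplying by 100 gives a percentage.
PercentRemoved <- mean(ds$drop == 1) * 100
PercentRemoved
```

One reason to consider this form: it still returns 0 when no trial was flagged, a case in which the count table would have only one row.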
Step 4: Make a new dataset that only contains rows where drop = 0
_____________________________
XYXY <- ds[ds$drop != 1,]
_____________________________
Notes:
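* If you get an error "Error in `[.data.frame`(ds, ds$drop != 1) : undefined columns selected", then you probably forgot the ","
* [ds$drop != 1,] says "keep only the rows where drop is not equal to 1". The condition before the comma selects rows; leaving the space after the comma empty keeps all columns.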
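A small sketch of the row selection on invented data. The subset() call is offered as an equivalent base R spelling rather than as part of the original recipe, and XYXY2 is a hypothetical name.

```r
# Toy data (invented values): the first trial is flagged for removal.
ds <- data.frame(
  RT   = c(85, 430, 510),
  drop = c(1, 0, 0)
)

# Rows go before the comma, columns after it; an empty column slot
# keeps every column.
XYXY <- ds[ds$drop != 1, ]

# An equivalent spelling with subset().
XYXY2 <- subset(ds, drop != 1)

nrow(XYXY)
```

The original ds is left untouched, so the removed trials can still be inspected later.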
Replace trial scores that are more than 2.5 standard deviations from a subject's mean score
Step 1. Generate values
First you need to generate mean and standard deviation scores for each participant (here called "SubjMean" and "SubjSD").
This can be accomplished via the "ave" function, which computes a statistic within each subject and repeats it on every row belonging to that subject. Note that "ds$RT" refers to your preexisting column of RT values. Adjust the name as needed.
Then you create columns that represent the upper and lower bounds for replacement (here called "Upper" and "Lower").
_____________________________
ds$SubjMean <- ave(ds$RT, ds$Subject, FUN=mean)
ds$SubjSD <- ave(ds$RT, ds$Subject, FUN=sd)
ds$Upper <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower <- ds$SubjMean - (2.5 * ds$SubjSD)
_____________________________
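A worked sketch on invented data, with two subjects and three trials each, shows what the four new columns look like.

```r
# Toy data (invented values): two subjects, three trials each.
ds <- data.frame(
  Subject = c(1, 1, 1, 2, 2, 2),
  RT      = c(400, 500, 600, 300, 350, 400)
)

# ave() returns one value per row: each subject's own mean or SD,
# repeated on every row belonging to that subject.
ds$SubjMean <- ave(ds$RT, ds$Subject, FUN = mean)
ds$SubjSD   <- ave(ds$RT, ds$Subject, FUN = sd)

# Bounds at 2.5 standard deviations above and below each subject's mean.
ds$Upper <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower <- ds$SubjMean - (2.5 * ds$SubjSD)

print(ds)
```

Subject 1 has a mean of 500 and an SD of 100, so the bounds are 250 and 750. Subject 2 has a mean of 350 and an SD of 50, so the bounds are 225 and 475. If the RT column contains NA values, mean and sd return NA for that subject; passing something like FUN = function(x) mean(x, na.rm = TRUE) is one way around that.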
Step 2. Replace Scores
Make a new column called "Trim" (or whatever you want) and copy your preexisting column of RT scores (ds$RT) into it.
Next, replace any value in "Trim" that is greater than the upper bound with the upper bound, and any value that is less than the lower bound with the lower bound.
_____________________________
ds$Trim <- ds$RT
ds$Trim[ds$Trim > ds$Upper] <- ds$Upper[ds$Trim > ds$Upper]
ds$Trim[ds$Trim < ds$Lower] <- ds$Lower[ds$Trim < ds$Lower]
_____________________________
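The same replacement can also be written in one line with the base R functions pmin and pmax, which compare two vectors element by element. This is an alternative sketch on invented data, not the method used above, though it should give the same result.

```r
# Toy data (invented values) with the bounds already computed.
ds <- data.frame(
  RT    = c(200, 500, 900),
  Lower = c(250, 250, 250),
  Upper = c(750, 750, 750)
)

# pmax() raises anything below Lower up to Lower;
# pmin() then lowers anything above Upper down to Upper.
ds$Trim <- pmin(pmax(ds$RT, ds$Lower), ds$Upper)

print(ds)
```

Here 200 is pulled up to 250, 900 is pulled down to 750, and 500 is left alone.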
Step 3. Count the number of replaced trials.
Make a column named "RTcount" (or whatever) and set it to zero.
Set RTcount to 1 any time your new Trim column does not match your preexisting RT column.
_____________________________
ds$RTcount <- 0
ds$RTcount[ds$RT != ds$Trim] <- 1
library(plyr)
count(ds$RTcount)
_____________________________
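If you also want to know how many trials were replaced for each subject, a base R sketch along these lines may help. The data are invented, and TotalReplaced and PerSubject are hypothetical names.

```r
# Toy data (invented values): Trim differs from RT on two of subject 1's trials.
ds <- data.frame(
  Subject = c(1, 1, 1, 2, 2, 2),
  RT      = c(200, 500, 900, 300, 350, 400),
  Trim    = c(250, 500, 750, 300, 350, 400)
)

# 1 when the trial was replaced, 0 otherwise.
ds$RTcount <- 0
ds$RTcount[ds$RT != ds$Trim] <- 1

# Total number of replaced trials across the whole dataset.
TotalReplaced <- sum(ds$RTcount)

# Number of replaced trials for each subject.
PerSubject <- tapply(ds$RTcount, ds$Subject, sum)

TotalReplaced
PerSubject
```

A per-subject count makes it easier to spot a participant who accounts for most of the replaced trials.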