Cleaning RT Data
ds = dataset you are currently using.
DV = dependent variable of interest
IV = independent variable of interest
Subject = name of subject number column
XYXY = dummy name for a variable, matrix, or data frame into which you are moving information.
Create a column that contains RTs for accurate trials only
You are creating a column named "Trim" that will contain the contents of a preexisting column named "RT".
This command uses another preexisting column named "ACC" (1 = correct response, 0 = incorrect response) to decide which trials get copied into "Trim". Change ACC to match the column name in your dataset. Rows for inaccurate trials are left as NA in "Trim".
"[ds$ACC==1]" tells R "only do this in cases where the ACC column contains a 1".
ds$Trim[ds$ACC==1] <- ds$RT[ds$ACC==1]
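The command above can be tried on a small made-up dataset. This is a minimal sketch: the values in ds are invented for illustration, and creating "Trim" as an all-NA column first is an added precaution (not part of the original command) so that inaccurate trials are explicitly missing.

```r
# Toy data, invented for illustration only
ds <- data.frame(
  Subject = c(1, 1, 1, 2, 2),
  RT      = c(450, 520, 610, 380, 700),
  ACC     = c(1, 0, 1, 1, 0)
)

# Start with an all-missing column (added precaution, an assumption of this sketch)
ds$Trim <- NA_real_

# Copy RT into Trim only on accurate trials
ds$Trim[ds$ACC == 1] <- ds$RT[ds$ACC == 1]

print(ds)

# Inaccurate trials are NA in Trim, so summaries need na.rm = TRUE
mean_accurate_rt <- mean(ds$Trim, na.rm = TRUE)
```

Because inaccurate trials are NA, functions such as mean() and sd() need na.rm = TRUE when applied to "Trim".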
Remove trials on the basis of some criteria
You may want to get rid of trials on which impossible responses were made (in this case, RT < 100) and trials on which no response was made.
Step 1: Create a column that will eventually contain a "1" for any row that needs removal. "0" means the row is safe.
ds$drop <- 0
Step 2: Change "0" to "1" for any row that meets the removal criteria.
This command assumes there are columns named RT (reaction time) and RESP (response made). Adjust these names in accordance with your dataset.
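ds$drop[ds$RT < 100 | is.na(ds$RESP)] <- 1
* "|" means "or": a row is flagged if either condition is true. Use "&" ("and") if a row should be flagged only when both conditions are true.
* [ ] after a column name effectively says "only mark as 1 when one of these conditions is met".
* is.na() checks each row for a missing value (NA). If your dataset codes "no response" some other way (for example, as an empty string), test for that value instead.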
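Steps 1 and 2 can be sketched on made-up data as follows. The values are invented, and the sketch assumes that a missing response is stored as NA. It tests for NA with is.na(), which checks each row; is.null() would return a single FALSE for any column that exists and so would flag nothing.

```r
# Toy data, invented for illustration only
ds <- data.frame(
  RT   = c(450, 80, 610, 300, 95),
  RESP = c("a", "b", NA, "a", "a")
)

# Step 1: every row starts out as "safe"
ds$drop <- 0

# Step 2: flag rows with RT under 100 OR a missing response
ds$drop[ds$RT < 100 | is.na(ds$RESP)] <- 1

print(ds)

# is.null() asks whether the whole column is absent, not whether a row is missing
column_is_null <- is.null(ds$RESP)
```

Rows 2 and 5 are flagged for RT under 100, and row 3 is flagged for the missing response.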
Step 3: Count the number of cases that were tagged for removal.
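You will need the function "count" from the "plyr" package. The result has one row for each value found in drop, with the number of rows for that value in the second column.
library(plyr)
RemovalCount <- count(ds$drop)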
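If you want removed items to be expressed as a percentage, type the following. It assumes that drop contains both 0s and 1s, so that row 1 of RemovalCount holds the count of 0s and row 2 holds the count of 1s.
RemovalCount[2,2] / (RemovalCount[1,2] + RemovalCount[2,2]) * 100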
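As an alternative sketch, the same count and percentage can be obtained in base R without the plyr package. The data are invented, and the names n_removed and pct_removed are hypothetical. This version also works when no rows (or all rows) were flagged, a case in which the count table has only one row.

```r
# Toy data, invented for illustration only: 2 of 8 trials flagged
ds <- data.frame(drop = c(0, 1, 0, 0, 1, 0, 0, 0))

# Number of flagged trials
n_removed <- sum(ds$drop == 1)

# Percentage of flagged trials: the mean of a TRUE/FALSE vector is a proportion
pct_removed <- mean(ds$drop == 1) * 100

n_removed
pct_removed
```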
Step 4: Make a new dataset that only contains rows where drop = 0
XYXY <- ds[ds$drop != 1,]
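* If you get an error like "Error in `[.data.frame`(ds, ds$drop != 1) : undefined columns selected", then you probably forgot the "," before the closing bracket.
* [ds$drop != 1,] says "keep the rows where drop is not equal to 1". Leaving the space after the comma empty keeps all columns.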
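A minimal sketch of Step 4 on made-up data, with a quick check that the number of removed rows matches the number flagged. The values are invented for illustration.

```r
# Toy data, invented for illustration only
ds <- data.frame(
  RT   = c(450, 80, 610, 300, 95),
  drop = c(0, 1, 0, 0, 1)
)

# Keep rows where drop is not 1; the empty slot after the comma keeps all columns
XYXY <- ds[ds$drop != 1, ]

print(XYXY)

# Sanity check: rows removed should equal rows flagged
rows_removed <- nrow(ds) - nrow(XYXY)
rows_flagged <- sum(ds$drop == 1)
```

Note that XYXY keeps the original row names (1, 3, 4), which can help when tracing rows back to ds.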
Replace trial scores that are more than 2.5 standard deviations from a subject's mean score
Step 1. Generate values
First you need to generate mean and standard deviation scores for each participant (here called "SubjMean" and "SubjSD").
This can be accomplished via the "ave" command. Note that "ds$RT" refers to your preexisting column of RT values. Adjust the name as needed.
Then you create columns that represent the upper and lower bounds for replacement (here called "Upper" and "Lower").
ds$SubjMean <- ave(ds$RT, ds$Subject, FUN=mean)
ds$SubjSD <- ave(ds$RT, ds$Subject, FUN=sd)
ds$Upper <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower <- ds$SubjMean - (2.5 * ds$SubjSD)
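To see what "ave" produces, here is a small worked sketch with invented values. Each row receives the mean and standard deviation of its own subject, so the bounds differ between subjects.

```r
# Toy data, invented for illustration only
ds <- data.frame(
  Subject = c(1, 1, 1, 2, 2, 2),
  RT      = c(400, 500, 600, 300, 500, 700)
)

# ave() returns one value per row, computed within each subject
ds$SubjMean <- ave(ds$RT, ds$Subject, FUN = mean)
ds$SubjSD   <- ave(ds$RT, ds$Subject, FUN = sd)

# Bounds at 2.5 standard deviations around each subject's mean
ds$Upper <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower <- ds$SubjMean - (2.5 * ds$SubjSD)

print(ds)
```

Here both subjects have a mean of 500, but subject 1 has an SD of 100 (bounds 250 to 750) and subject 2 has an SD of 200 (bounds 0 to 1000). Note that "FUN" must be written out by name, because other unnamed arguments to ave() are treated as grouping variables.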
Step 2. Replace Scores
Make a new column called "Trim" (or whatever you want) that starts as a copy of your preexisting column of RT scores (ds$RT).
Next, replace any value of "Trim" that is greater than the upper bound with the upper bound, and any value that is less than the lower bound with the lower bound.
ds$Trim <- ds$RT
ds$Trim[ds$Trim > ds$Upper] <- ds$Upper[ds$Trim > ds$Upper]
ds$Trim[ds$Trim < ds$Lower] <- ds$Lower[ds$Trim < ds$Lower]
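The replacement can also be written in one line with the base R functions pmax() and pmin(). This is an alternative sketch, not the method above, and it gives the same result as long as the bounds are not missing. The data are invented: nine trials of 500 and one slow trial of 1500 for a single subject.

```r
# Toy data, invented for illustration only
ds <- data.frame(
  Subject = rep(1, 10),
  RT      = c(rep(500, 9), 1500)
)

# Per-subject bounds, computed as in Step 1
ds$SubjMean <- ave(ds$RT, ds$Subject, FUN = mean)
ds$SubjSD   <- ave(ds$RT, ds$Subject, FUN = sd)
ds$Upper    <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower    <- ds$SubjMean - (2.5 * ds$SubjSD)

# pmax() raises values below Lower up to Lower;
# pmin() then lowers values above Upper down to Upper
ds$Trim <- pmin(pmax(ds$RT, ds$Lower), ds$Upper)

print(ds[, c("RT", "Lower", "Upper", "Trim")])
```

In this example only the 1500 trial lies beyond the upper bound, so it is the only value that changes.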
Step 3. Get count of number of replaced trials.
Make a column named "RTcount" (or whatever) and set it to zero.
Set RTcount to 1 any time your new "Trim" column does not match your preexisting RT column.
ds$RTcount <- 0
ds$RTcount[ds$RT != ds$Trim] <- 1
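Once the rows are flagged, the counts themselves can be obtained by summing the column. The sketch below uses invented RT and Trim values, and the names total_replaced, per_subject and pct_replaced are hypothetical.

```r
# Toy data, invented for illustration only; Trim differs from RT on two rows
ds <- data.frame(
  Subject = c(1, 1, 1, 2, 2, 2),
  RT      = c(500, 1500, 480, 400, 90, 450),
  Trim    = c(500, 1390, 480, 400, 150, 450)
)

# Flag replaced trials, as in the commands above
ds$RTcount <- 0
ds$RTcount[ds$RT != ds$Trim] <- 1

# Total number of replaced trials
total_replaced <- sum(ds$RTcount)

# Number of replaced trials for each subject
per_subject <- tapply(ds$RTcount, ds$Subject, sum)

# Replaced trials as a percentage of all trials
pct_replaced <- mean(ds$RTcount) * 100

total_replaced
per_subject
pct_replaced
```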