echo=F
and eval=F
in selected R code chunks to suppress displaying tex output in the Rmd output. To have these code chunks produce .tex table output files, set eval=T
.local_data=FALSE
and (2) set your Dataverse API token in the line Sys.setenv("DATAVERSE_KEY" = "ENTER-YOUR-API-TOKEN-HERE")
. # Set seed
set.seed(20150229)
# Load data locally or from dataverse
# -- if TRUE: Data must be saved in the local root directory (specified in above code chunk)
# -- if FALSE: Data will load from Dataverse; you must have a Dataverse API Token
local_data <- TRUE
if(!local_data){
Sys.setenv("DATAVERSE_KEY" = "ENTER-YOUR-API-TOKEN-HERE") ##### Put your Dataverse API Token here.
Sys.setenv("DATAVERSE_SERVER" = "dataverse.harvard.edu")
}
source("code_0_read_data.R")
# Load helper functions and prepare data
source("code_1a_prep_data.R")
source("code_1b_helper_functions.R")
We start by showing how to do a simple manual replication of core results. The remainder of this code does this same thing in a more automated way across various combinations.
There are three treatment levels:
table(dat$TA)
0 1 2
279 174 200
These correspond to
The key comparisons are thus:
For simplicity the code uses TA1 as a dummy for m, and TA2 as a dummy for p
There are also three groups: Blacks (b), Hispanics (h), Whites (w). These yield three intergroup comparisons:
There are four outcomes that have been defined as between group differences (i.e., net discrimination measures). The stubs for these are:
outvars <- c("nmeet_", "index.", "ncb_", "noff_")
Let’s examine one set of outcome variables, the “ncb” (net discrimination in callbacks) variables:
head(dplyr::select(dat, starts_with("ncb")))
ncb_bh ncb_wb ncb_wh
1 0 -1 -1
2 0 0 0
3 0 0 0
4 0 0 0
5 0 1 1
6 1 -1 0
table(dat$ncb_wh)
-1 0 1
55 520 78
We see most of the time there is no difference betwen white and Hispanic testers on callbacks, but it is 50% more likely that white testers gets callbacks and Hispanic testers do not than vice versa.
The punitive-control comparison is done:
manual_model <- lm(ncb_wh ~ TA2 + as.factor(block), data=filter(dat, TA %in% c(0,2)), weights = ipw20)
stargazer(manual_model, omit = "block", type = "html")
Dependent variable: | |
ncb_wh | |
TA2 | -0.066 |
(0.041) | |
Constant | 0.068 |
(0.083) | |
Observations | 479 |
R2 | 0.032 |
Adjusted R2 | -0.004 |
Residual Std. Error | 0.633 (df = 461) |
F Statistic | 0.888 (df = 17; 461) |
Note: | p<0.1; p<0.05; p<0.01 |
This might be compared with:
difference_in_means(ncb_wh ~ TA2, blocks = block, data=filter(dat, (block !=5) & (TA %in% c(0,2))))
Design: Blocked
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
TA2 -0.06417621 0.04483847 -1.431276 0.1530515 -0.1522971 0.0239447 446
Note that we exclude the block 5 for lack of units to calculate withn block variance:
table(dat$TA2, dat$block)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
0 25 7 25 5 2 16 87 31 82 38 10 15 36 9 43 15 7
1 17 2 18 6 0 9 29 11 23 14 3 4 14 8 25 14 3
lm_robust(ncb_wh ~ TA2, fixed_effects = ~ block, data=filter(dat, TA %in% c(0,2)), weights = ipw20)
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
TA2 -0.06603413 0.04450348 -1.483797 0.1385462 -0.153489 0.02142069 461
with(data=filter(dat, (block !=5) & (TA %in% c(0,2))), table(ncb_wh, TA2))
TA2
ncb_wh 0 1
-1 18 20
0 225 159
1 35 21
Note that the main differences between the parametric and non parametric approach is that the non parametric approach allows between block heterogeneity in average effects; in addition rather than using the assignment propensities it implictly uses information on the empirical shares in each condition in each block.
Note when examining different models that the following generally need to be adjusted:
TA %in% c(0,2)
)The core replication code automates these multiple adjustments, but any individual analysis can be checked manually, as above.
Posterior simulation using approach from Gelman and Hill (arm
package):
out_2 <- sim(manual_model, 3000)
hist(out_2@coef[,2], main = paste(
"Effects of Punitive v Control on\nWhite v Hispanic Discrimination,
mean = ", round(mean(out_2@coef[,2]), 3),
", share neg =", round(mean(out_2@coef[,2] < 0), 3)
),
xlim = c(-.2, .1))
# ---------------------------------------------------------------------- #
# DEFINE COLNAMES FOR OUTPUT TABLES
col.labels <- c("Outcome", "Estimate","SE","t","P")
# mc: monitoring - control
# pc: punitive - control
# pm: punitive - monitoring
# TA1: monitoring
# TA2: punitive
outvars <- c("nmeet_", "index.", "ncb_", "noff_")
reg_and_summ <- function(outvar, compare){
depvar <- paste0(outvar, compare)
model.mc <- paste(depvar," ~ TA1 + as.factor(block)",sep="")
model.pc <- paste(depvar," ~ TA2 + as.factor(block)",sep="")
model.pm <- paste(depvar," ~ TA2 + as.factor(block)",sep="")
fit.mc <- lm(formula=model.mc, data=dat[dat$TA %in% c(0,1),], weights = ipw10)
fit.pc <- lm(formula=model.pc, data=dat[dat$TA %in% c(0,2),], weights = ipw20)
fit.pm <- lm(formula=model.pm, data=dat[dat$TA %in% c(1,2),], weights = ipw21)
itt.mc <- summary(fit.mc)$coefficients[2,]
itt.pc <- summary(fit.pc)$coefficients[2,]
itt.pm <- summary(fit.pm)$coefficients[2,]
if(compare == "wb" | compare == "wh" ){
itt.mc[4] <- pt(coef(summary(fit.mc))[,3], summary(fit.mc)$df[2], lower=TRUE)[2] #one sided p
itt.pc[4] <- pt(coef(summary(fit.pc))[,3], summary(fit.pc)$df[2], lower=TRUE)[2] #one sided p
}
list(mc = itt.mc, pc = itt.pc, pm = itt.pm,
fit.mc = fit.mc, fit.pc = fit.pc, fit.pm = fit.pm )
}
# Parametric Differences in Differences Results - White / Black Comparisons
itt.wb <- lapply(outvars, reg_and_summ, compare = "wb" )
fit.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.mc)
fit.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pc)
fit.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pm)
# Parametric Differences in Differences Results - White / Hispanic Comparisons
itt.wh <- lapply(outvars, reg_and_summ, compare = "wh" )
fit.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.mc)
fit.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pc)
fit.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pm)
# Parametric Differences in Differences Results - Black / Hispanic Comparisons
itt.bh <- lapply(outvars, reg_and_summ, compare = "bh" )
fit.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.mc)
fit.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pc)
fit.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pm)
# WB MC
stargazer(fit.wb.mc[[1]], fit.wb.mc[[2]], fit.wb.mc[[3]], fit.wb.mc[[4]],
omit = "block", type = "html",
p = lapply(fit.wb.mc, function(fit)
pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ))
Dependent variable: | ||||
nmeet_wb | index.wb | ncb_wb | noff_wb | |
(1) | (2) | (3) | (4) | |
TA1 | 0.051 | -0.015 | -0.002 | -0.003 |
(0.038) | (0.052) | (0.045) | (0.036) | |
Constant | -0.103 | 0.020 | 0.121 | -0.039 |
(0.081) | (0.104) | (0.097) | (0.078) | |
Observations | 453 | 331 | 453 | 453 |
R2 | 0.065 | 0.058 | 0.035 | 0.041 |
Adjusted R2 | 0.029 | 0.007 | -0.003 | 0.004 |
Residual Std. Error | 0.560 (df = 435) | 0.641 (df = 313) | 0.673 (df = 435) | 0.538 (df = 435) |
F Statistic | 1.791** (df = 17; 435) | 1.129 (df = 17; 313) | 0.929 (df = 17; 435) | 1.099 (df = 17; 435) |
Note: | p<0.1; p<0.05; p<0.01 |
# WB PC
stargazer(fit.wb.pc[[1]], fit.wb.pc[[2]], fit.wb.pc[[3]], fit.wb.pc[[4]],
omit = "block", type = "html" )
Dependent variable: | ||||
nmeet_wb | index.wb | ncb_wb | noff_wb | |
(1) | (2) | (3) | (4) | |
TA2 | 0.039 | 0.007 | 0.019 | 0.018 |
(0.038) | (0.053) | (0.042) | (0.035) | |
Constant | -0.054 | 0.043 | 0.054 | -0.010 |
(0.077) | (0.103) | (0.086) | (0.071) | |
Observations | 479 | 354 | 479 | 479 |
R2 | 0.042 | 0.012 | 0.043 | 0.055 |
Adjusted R2 | 0.007 | -0.038 | 0.007 | 0.020 |
Residual Std. Error | 0.585 (df = 461) | 0.686 (df = 336) | 0.649 (df = 461) | 0.535 (df = 461) |
F Statistic | 1.196 (df = 17; 461) | 0.234 (df = 17; 336) | 1.209 (df = 17; 461) | 1.578* (df = 17; 461) |
Note: | p<0.1; p<0.05; p<0.01 |
# WB PM
stargazer(fit.wb.pm[[1]], fit.wb.pm[[2]], fit.wb.pm[[3]], fit.wb.pm[[4]],
omit = "block", type = "html",
p = lapply(fit.wb.pm, function(fit)
pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ))
Dependent variable: | ||||
nmeet_wb | index.wb | ncb_wb | noff_wb | |
(1) | (2) | (3) | (4) | |
TA2 | -0.005 | 0.018 | 0.033 | 0.026 |
(0.045) | (0.058) | (0.049) | (0.037) | |
Constant | 0.039 | 0.189 | 0.087 | 0.020 |
(0.085) | (0.102) | (0.093) | (0.070) | |
Observations | 374 | 263 | 374 | 374 |
R2 | 0.062 | 0.052 | 0.015 | 0.054 |
Adjusted R2 | 0.017 | -0.009 | -0.032 | 0.009 |
Residual Std. Error | 0.605 (df = 356) | 0.638 (df = 246) | 0.656 (df = 356) | 0.499 (df = 356) |
F Statistic | 1.385 (df = 17; 356) | 0.850 (df = 16; 246) | 0.324 (df = 17; 356) | 1.205 (df = 17; 356) |
Note: | p<0.1; p<0.05; p<0.01 |
# WH MC
stargazer(fit.wh.mc[[1]], fit.wh.mc[[2]], fit.wh.mc[[3]], fit.wh.mc[[4]],
omit = "block", type = "html",
p = lapply(fit.wh.mc, function(fit)
pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ) )
Dependent variable: | ||||
nmeet_wh | index.wh | ncb_wh | noff_wh | |
(1) | (2) | (3) | (4) | |
TA1 | -0.024 | -0.061 | -0.036 | -0.017 |
(0.041) | (0.057) | (0.043) | (0.034) | |
Constant | -0.110 | 0.072 | 0.176 | 0.047 |
(0.087) | (0.113) | (0.093) | (0.073) | |
Observations | 453 | 318 | 453 | 453 |
R2 | 0.093 | 0.031 | 0.039 | 0.054 |
Adjusted R2 | 0.058 | -0.024 | 0.001 | 0.017 |
Residual Std. Error | 0.600 (df = 435) | 0.693 (df = 300) | 0.641 (df = 435) | 0.506 (df = 435) |
F Statistic | 2.638*** (df = 17; 435) | 0.571 (df = 17; 300) | 1.040 (df = 17; 435) | 1.464 (df = 17; 435) |
Note: | p<0.1; p<0.05; p<0.01 |
# WH PC
stargazer(fit.wh.pc[[1]], fit.wh.pc[[2]], fit.wh.pc[[3]], fit.wh.pc[[4]],
omit = "block", type = "html",
p = lapply(fit.wh.pc, function(fit)
pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ))
Dependent variable: | ||||
nmeet_wh | index.wh | ncb_wh | noff_wh | |
(1) | (2) | (3) | (4) | |
TA2 | -0.003 | 0.048 | -0.066* | -0.021 |
(0.039) | (0.055) | (0.041) | (0.034) | |
Constant | 0.002 | 0.085 | 0.068 | 0.012 |
(0.078) | (0.107) | (0.083) | (0.069) | |
Observations | 479 | 346 | 479 | 479 |
R2 | 0.048 | 0.068 | 0.032 | 0.059 |
Adjusted R2 | 0.013 | 0.020 | -0.004 | 0.024 |
Residual Std. Error | 0.594 (df = 461) | 0.714 (df = 328) | 0.633 (df = 461) | 0.524 (df = 461) |
F Statistic | 1.367 (df = 17; 461) | 1.405 (df = 17; 328) | 0.888 (df = 17; 461) | 1.702** (df = 17; 461) |
Note: | p<0.1; p<0.05; p<0.01 |
# WH PM
stargazer(fit.wh.pm[[1]], fit.wh.pm[[2]], fit.wh.pm[[3]], fit.wh.pm[[4]],
omit = "block", type = "html")
Dependent variable: | ||||
nmeet_wh | index.wh | ncb_wh | noff_wh | |
(1) | (2) | (3) | (4) | |
TA2 | 0.017 | 0.107* | -0.019 | 0.002 |
(0.045) | (0.063) | (0.049) | (0.038) | |
Constant | 0.025 | 0.127 | -0.024 | -0.037 |
(0.084) | (0.112) | (0.093) | (0.072) | |
Observations | 374 | 268 | 374 | 374 |
R2 | 0.047 | 0.089 | 0.027 | 0.049 |
Adjusted R2 | 0.001 | 0.031 | -0.020 | 0.004 |
Residual Std. Error | 0.598 (df = 356) | 0.703 (df = 251) | 0.661 (df = 356) | 0.511 (df = 356) |
F Statistic | 1.029 (df = 17; 356) | 1.539* (df = 16; 251) | 0.578 (df = 17; 356) | 1.077 (df = 17; 356) |
Note: | p<0.1; p<0.05; p<0.01 |
# BH MC
stargazer(fit.bh.mc[[1]], fit.bh.mc[[2]], fit.bh.mc[[3]], fit.bh.mc[[4]],
omit = "block", type = "html")
Dependent variable: | ||||
nmeet_bh | index.bh | ncb_bh | noff_bh | |
(1) | (2) | (3) | (4) | |
TA1 | -0.075* | -0.089* | -0.035 | -0.014 |
(0.042) | (0.052) | (0.042) | (0.031) | |
Constant | -0.007 | 0.057 | 0.055 | 0.086 |
(0.089) | (0.098) | (0.090) | (0.066) | |
Observations | 453 | 316 | 453 | 453 |
R2 | 0.088 | 0.058 | 0.037 | 0.012 |
Adjusted R2 | 0.053 | 0.004 | -0.0002 | -0.027 |
Residual Std. Error | 0.614 (df = 435) | 0.636 (df = 298) | 0.624 (df = 435) | 0.455 (df = 435) |
F Statistic | 2.477*** (df = 17; 435) | 1.075 (df = 17; 298) | 0.995 (df = 17; 435) | 0.309 (df = 17; 435) |
Note: | p<0.1; p<0.05; p<0.01 |
# BH PC
stargazer(fit.bh.pc[[1]], fit.bh.pc[[2]], fit.bh.pc[[3]], fit.bh.pc[[4]],
omit = "block", type = "html")
Dependent variable: | ||||
nmeet_bh | index.bh | ncb_bh | noff_bh | |
(1) | (2) | (3) | (4) | |
TA2 | -0.042 | 0.039 | -0.085** | -0.039 |
(0.041) | (0.049) | (0.041) | (0.033) | |
Constant | 0.056 | 0.052 | 0.015 | 0.021 |
(0.082) | (0.090) | (0.082) | (0.068) | |
Observations | 479 | 340 | 479 | 479 |
R2 | 0.051 | 0.069 | 0.039 | 0.023 |
Adjusted R2 | 0.016 | 0.020 | 0.004 | -0.013 |
Residual Std. Error | 0.624 (df = 461) | 0.625 (df = 322) | 0.623 (df = 461) | 0.512 (df = 461) |
F Statistic | 1.452 (df = 17; 461) | 1.400 (df = 17; 322) | 1.109 (df = 17; 461) | 0.634 (df = 17; 461) |
Note: | p<0.1; p<0.05; p<0.01 |
# BH PM
stargazer(fit.bh.pm[[1]], fit.bh.pm[[2]], fit.bh.pm[[3]], fit.bh.pm[[4]],
omit = "block", type = "html")
Dependent variable: | ||||
nmeet_bh | index.bh | ncb_bh | noff_bh | |
(1) | (2) | (3) | (4) | |
TA2 | 0.023 | 0.139** | -0.052 | -0.024 |
(0.051) | (0.057) | (0.047) | (0.036) | |
Constant | -0.014 | -0.077 | -0.111 | -0.057 |
(0.096) | (0.100) | (0.088) | (0.067) | |
Observations | 374 | 250 | 374 | 374 |
R2 | 0.067 | 0.118 | 0.036 | 0.036 |
Adjusted R2 | 0.023 | 0.057 | -0.010 | -0.010 |
Residual Std. Error | 0.683 (df = 356) | 0.622 (df = 233) | 0.626 (df = 356) | 0.477 (df = 356) |
F Statistic | 1.511* (df = 17; 356) | 1.943** (df = 16; 233) | 0.793 (df = 17; 356) | 0.785 (df = 17; 356) |
Note: | p<0.1; p<0.05; p<0.01 |
# Parametric Differences in Differences Results -White / Black Comparisons
itt.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$mc)
itt.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pc)
itt.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pm)
# Parametric Differences in Differences Results - White / Hispanic Comparisons
itt.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$mc)
itt.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pc)
itt.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pm)
# Parametric Differences in Differences Results - Black / Hispanic Comparisons
itt.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$mc)
itt.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pc)
itt.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pm)
#----- CIs -----#
df.w_ <- sapply(list(df.wb.mc = fit.wb.mc,
df.wb.pc = fit.wb.pc,
df.wb.pm = fit.wb.pm,
df.wh.mc = fit.wh.mc,
df.wh.pc = fit.wh.pc,
df.wh.pm = fit.wh.pm,
df.bh.mc = fit.bh.mc,
df.bh.pc = fit.bh.pc,
df.bh.pm = fit.bh.pm),
function(fit){
unlist(lapply(fit, function(x) summary(x)$df[2]))[2:4]})
df.2g.10 <- c(df.w_[,c("df.wb.mc", "df.wh.mc", "df.bh.mc")])
df.2g.20 <- c(df.w_[,c("df.wb.pc", "df.wh.pc", "df.bh.pc")])
df.2g.21 <- c(df.w_[,c("df.wb.pm", "df.wh.pm", "df.bh.pm")])
cv.2g.10 <- qt(p=0.025, df=df.2g.10, lower.tail=TRUE)
cv.2g.20 <- qt(p=0.025, df=df.2g.20, lower.tail=TRUE)
cv.2g.21 <- qt(p=0.025, df=df.2g.21, lower.tail=TRUE)
cv.2g <- c(cv.2g.10, cv.2g.20, cv.2g.21)
# Stack output
out2 <- rbind(itt.wb.mc, itt.wh.mc, itt.bh.mc,
itt.wb.pc, itt.wh.pc, itt.bh.pc,
itt.wb.pm, itt.wh.pm, itt.bh.pm)
out2 <- out2[-seq(1,33,4),]
# Round and format results
for(i in 2:ncol(out2)) out2[,i] <- round(as.numeric(out2[,i]),3)
out2[,"P"] <- paste("(", out2[,"P"],")",sep="")
itt.2g.ci <- as.numeric(out2[,2]) +
abs(cv.2g)*cbind(lower.ci = -as.numeric(out2[,3]), upper.ci = as.numeric(out2[,3]))
itt.est <- cbind(out2, itt.2g.ci)
outlist <-
lapply(c(cb.reg = "callback" ,
off.reg = "offer",
meindex.reg = "Index"),
function(keyword){
# Grab relevant outcome row
itt <- itt.est[grepl(keyword, itt.est[,1], fixed=TRUE),]
with(as.data.frame(itt),
list(coef=matrix(Estimate, nrow=3, ncol=3, byrow=FALSE),
se=matrix(SE, nrow=3, ncol=3, byrow=FALSE),
lower=matrix(lower.ci, nrow=3, ncol=3, byrow=FALSE),
upper=matrix(upper.ci, nrow=3, ncol=3, byrow=FALSE))) })
cb.reg <- outlist$cb.reg
off.reg <- outlist$off.reg
meindex.reg <- outlist$meindex.reg
# Create and rename variables
dat$cb_w <- dat$cb_C
dat$cb_h <- dat$cb_B
dat$cb_b <- dat$cb_A
dat$off_w <- dat$off_C
dat$off_h <- dat$off_B
dat$off_b <- dat$off_A
dat$meindex_w <- dat$meindex.impute_C
dat$meindex_h <- dat$meindex.impute_B
dat$meindex_b <- dat$meindex.impute_A
dat$meindex_wb <- dat$meindex.impute_C - dat$meindex.impute_A
dat$meindex_wh <- dat$meindex.impute_C - dat$meindex.impute_A
dat$meindex_bh <- dat$meindex.impute_A - dat$meindex.impute_B
# quad[1,1] --------------------------------------------
# Function to create each single row in data.frame
f_11 <- function(Y, TA = dat$TA, W = dat$ipw, a) {
N <- sum(TA==a)
out <- data.frame(estimate = weighted.mean(Y[TA==a], W[TA==a]))
out$std.error <- sqrt(wtd.var(Y[TA==a], W[TA==a])/N)
out$df <- N - 1
error <- abs(qt(p=.025, df= out$df, lower.tail=TRUE))
out$conf.low <- out$estimate - error*out$std.error
out$conf.high <- out$estimate + error*out$std.error
out
}
# function to call f_a11() for each level
quad_11 <- function(Y){
outcomes <- paste0(Y, c("_w", "_b", "_h"))
Ys <- dat[, outcomes]
rbind(
w_c = f_11(Ys[,1], a = 0),
b_c = f_11(Ys[,2], a = 0),
h_c = f_11(Ys[,3], a = 0),
w_m = f_11(Ys[,1], a = 1),
b_m = f_11(Ys[,2], a = 1),
h_m = f_11(Ys[,3], a = 1),
w_p = f_11(Ys[,1], a = 2),
b_p = f_11(Ys[,2], a = 2),
h_p = f_11(Ys[,3], a = 2))
}
# quad[1,2] --------------------------------------------
f_12 <- function(Y, a, b, TA = dat$TA, W = dat$ipw) {
X <- (TA==a)[TA==a | TA==b]
lm12 <- lm(Y[TA==a | TA==b] ~ X, weight = W[TA==a | TA==b])
summ <- summary(lm12)
out <- tidy(summ)
out$df <- summ$df[2]
error <- abs(qt(p=.025, df= out$df, lower.tail=TRUE))
out$conf.low <- out$estimate - error*out$std.error
out$conf.high <- out$estimate + error*out$std.error
as.data.frame(out[out$term == "XTRUE", ] )
}
# function to call f_a11() for each level
quad_12 <- function(Y){
outcomes <- paste0(Y, c("_w", "_b", "_h"))
Ys <- dat[, outcomes]
rbind(
w_mc = f_12(Ys[,1], a = 1, b = 0),
b_mc = f_12(Ys[,2], a = 1, b = 0),
h_mc = f_12(Ys[,3], a = 1, b = 0),
w_pc = f_12(Ys[,1], a = 2, b = 0),
b_pc = f_12(Ys[,2], a = 2, b = 0),
h_pc = f_12(Ys[,3], a = 2, b = 0),
w_pm = f_12(Ys[,1], a = 2, b = 1),
b_pm = f_12(Ys[,2], a = 2, b = 1),
h_pm = f_12(Ys[,3], a = 2, b = 1))
}
# quad[2,1] --------------------------------------------
f_21 <- function(Y1, Y2, a, TA = dat$TA, W = dat$ipw) {
out <- data.frame(estimate = weighted.mean((Y1-Y2)[TA==a], W[TA==a]))
ttest <- wtd.t.test((Y1-Y2)[TA==a], weight=W[TA==a])
out$std.error <- ttest$additional[4]
out$p <- ttest$coefficient[3]
out$df <- ttest$coefficient[2]
error <- abs(qt(p=.025, df= out$df, lower.tail=TRUE))
out$conf.low <- out$estimate - error*out$std.error
out$conf.high <- out$estimate + error*out$std.error
as.data.frame(out)
}
quad_21 <- function(Y){
outcomes <- paste0(Y, c("_w", "_b", "_h"))
Ys <- dat[, outcomes]
rbind(
wb_c = f_21(Ys[,1], Ys[,2], a = 0),
wh_c = f_21(Ys[,1], Ys[,3], a = 0),
bh_c = f_21(Ys[,2], Ys[,3], a = 0),
wb_m = f_21(Ys[,1], Ys[,2], a = 1),
wh_m = f_21(Ys[,1], Ys[,3], a = 1),
bh_m = f_21(Ys[,2], Ys[,3], a = 1),
wb_p = f_21(Ys[,1], Ys[,2], a = 2),
wh_p = f_21(Ys[,1], Ys[,3], a = 2),
bh_p = f_21(Ys[,2], Ys[,3], a = 2))
}
# quad[2,2] --------------------------------------------
# Non parametric Differences in Differences Results
# It excludes block 5 which has too little variance to estimate a variance term.
f_22 <- function(Y, TA = "TA", C1, C2, exclude, data = dat){
dim22 <- difference_in_means(formula(paste(Y, "~" ,TA)),
blocks = block,
data=filter(data, block !=5),
condition1 = C1,
condition2 = C2,
subset = TA!= exclude)
tidy(dim22)
}
quad_22 <- function(Y){
Ys <- paste0(Y, c("_wb", "_wh", "_bh"))
rbind(
wb_mc = f_22(Ys[1], C1 = 0, C2= 1, exclude = 2),
wh_mc = f_22(Ys[2], C1 = 0, C2= 1, exclude = 2),
bh_mc = f_22(Ys[3], C1 = 0, C2= 1, exclude = 2),
wb_pc = f_22(Ys[1], C1 = 0, C2= 2, exclude = 1),
wh_pc = f_22(Ys[2], C1 = 0, C2= 2, exclude = 1),
bh_pc = f_22(Ys[3], C1 = 0, C2= 2, exclude = 1),
wb_pm = f_22(Ys[1], C1 = 1, C2= 2, exclude = 0),
wh_pm = f_22(Ys[2], C1 = 1, C2= 2, exclude = 0),
bh_pm = f_22(Ys[3], C1 = 1, C2= 2, exclude = 0))
}
# Produce tables
cb_11 <- quad_11("cb")
cb_12 <- quad_12("cb")
cb_21 <- quad_21("cb")
cb_22 <- quad_22("ncb")
off_11 <- quad_11("off")
off_12 <- quad_12("off")
off_21 <- quad_21("off")
off_22 <- quad_22("noff")
off_11 <- quad_11("off")
off_12 <- quad_12("off")
off_21 <- quad_21("off")
off_22 <- quad_22("noff")
index_11 <- quad_11("meindex")
index_12 <- quad_12("meindex")
index_21 <- quad_21("meindex" )
index_22 <- quad_22("meindex")
# HELPER FUNCTIONS
#================#
# Plot function used in Figure 1
#================#
# Take matrices CB, OFF, MEINDEX
# Take matrices CB.REG, OFF.REG, MEINDEX.REG
# plot with a single set of three or two threes
plot.helper <- function(coefs1=1:3, sds1=3:5, lower1, upper1, coefs2=NA, sds2=NA, lower2, upper2, double=FALSE, xlim=NULL ){
coefplot(coefs1, sds1, CI=2, lower.conf.bounds=lower1, upper.conf.bounds=upper1, pch.pts=21, varnames="", main = "", h.axis = FALSE, xlim = xlim, cex.pts=3, lwd=3)
if(double) coefplot(coefs2, sds2, CI=2, lower.conf.bounds=lower2, upper.conf.bounds=upper2, add=TRUE, pch.pts=19, cex.pts=3, lwd=3)
axis(side=1, line=2, cex.axis=1.5, padj=1)
}
itt.plot <- function(y11,y12, y21, y22_np, y_22_p, xlim){
# Raw means ci
# upper left quadrant
for(i in 1:3) {
trt = list(control = c("w_c", "b_c","h_c"), # i = 1
monitoring = c("w_m", "b_m", "h_m"), # i = 2
punitive = c("w_p", "b_p", "h_p")) # i = 3
plot.helper(coefs1 = rev(y11[trt[[i]], "estimate" ]),
sds1 = rev(y11[trt[[i]], "std.error"]),
lower1 = rev(y11[trt[[i]], "conf.low" ]),
upper1 = rev(y11[trt[[i]], "conf.high"]),
xlim = xlim[[1]])}
# D-I-M between treatments conditional on each group - upper right quadrant
for(i in 1:3) {
trt = list(mc = c("w_mc", "b_mc","h_mc"), # i = 1
pc = c("w_pc", "b_pc", "h_pc"), # i = 2
pm = c("w_pm", "b_pm", "h_pm")) # i = 3
plot.helper(coefs1 = rev(y12[trt[[i]], "estimate" ]),
sds1 = rev(y12[trt[[i]], "std.error"]),
lower1 = rev(y12[trt[[i]], "conf.low" ]),
upper1 = rev(y12[trt[[i]], "conf.high"]),
xlim = xlim[[2]])}
for(i in 1:3) {
trt = list(c = c("wb_c", "wh_c", "bh_c"), # i = 1
m = c("wb_m", "wh_m", "bh_m"), # i = 2
p = c("wb_p", "wh_p", "bh_p")) # i = 3
plot.helper(coefs1 = rev(y21[trt[[i]], "estimate" ]),
sds1 = rev(y21[trt[[i]], "std.error"]),
lower1 = rev(y21[trt[[i]], "conf.low" ]),
upper1 = rev(y21[trt[[i]], "conf.high"]),
xlim = xlim[[3]])
if ( i == 1 ) { # highlight control group
abline(h=3, col=rgb(0,0,0,.1), lwd=40)
abline(h=2, col=rgb(0,0,0,.1), lwd=40)}}
# D-I-M between groups conditional on treatment assigment - lower left quadrant
# lower right quadrant
for(i in 1:3) {
trt = list(c = c("wb_mc", "wh_mc","bh_mc"), # i = 1
m = c("wb_pc", "wh_pc", "bh_pc"), # i = 2
p = c("wb_pm", "wh_pm", "bh_pm")) # i = 3
plot.helper(coefs1 = rev(y22_np[trt[[i]], "estimate" ]),
sds1 = rev(y22_np[trt[[i]], "std.error"]),
lower1 = rev(y22_np[trt[[i]], "conf.low" ]),
upper1 = rev(y22_np[trt[[i]], "conf.high"]),
coefs2=rev(y_22_p$coef[1:3,(i )]),
sds2=rev(y_22_p$se[1:3,(i)]),
lower2=rev(y_22_p$lower[1:3,(i)]),
upper2=rev(y_22_p$upper[1:3,(i)]),
double=TRUE, xlim=xlim[[4]])
if ( i == 5 ){ # highlight P-C
abline(h=3, col=rgb(0,0,0,.1), lwd=40)
abline(h=2, col=rgb(0,0,0,.1), lwd=40) }}
}
## MAKE PDF FIGURES FOR PAPER AND PNG FIGURES FOR RMD FILE
## CALLBACKS
# Make PNG version for Rmd output
png("out_figure1a_6x6_cb.png", width=png_width, height=png_height)
par(oma=c(5,0,0,0), mar=c(5,0,0,0))
layout(matrix(layout_vec, nrow=5, ncol=9, byrow=TRUE), widths=width_spec, heights=height_spec)
# plots
xlims <- list(c(0, .25), c(-.15, .15), c(-.15, .15), c(-.2, .2))
itt.plot(y11 = cb_11, y12 = cb_12, y21 = cb_21, y22_np = cb_22, y_22_p = cb.reg, xlims)
# dividers
frame(); abline(h=.5, lwd=2)
frame(); abline(v=.5, lwd=2)
# row labels
frame(); text(.6, .85, "White", cex=rowsub.cex); text(.6, .5, "Black", cex=rowsub.cex); text(.6, .15, "Hisp.", cex=rowsub.cex)
frame(); text(.6, .85, "W-B", cex=rowsub.cex); text(.6, .5, "W-H", cex=rowsub.cex); text(.6, .15, "B-H", cex=rowsub.cex)
# col labels
for(i in 1:6) {
frame(); text(.5, .5, col_headings[i], cex=colhead.cex)
}
# meta col labels
frame(); text(.5, .5, "Treatment Group Means", cex=colhead.cex)
frame(); text(.5, .5, "Differences between Treatment Groups", cex=colhead.cex)
# meta row labels
frame(); text(.5, .5, "Tester Race\nGroup Means", cex=colhead.cex, srt=90)
frame(); text(.5, .5, "Intergroup\nDifferences", cex=colhead.cex, srt=90)
dev.off()
quartz_off_screen
2
## OFFERS
# Make PNG version for Rmd output
png("out_figure1b_6x6_off.png", width=png_width, height=png_height)
par(oma=c(5,0,0,0), mar=c(5,0,0,0))
layout(matrix(layout_vec, nrow=5, ncol=9, byrow=TRUE), widths=width_spec, heights=height_spec)
# plots
xlims <- list(c(0, .15), c(-.1, .1), c(-.1, .1), c(-.2, .2))
itt.plot(y11 = off_11, y12 = off_12, y21 = off_21, y22_np = off_22, y_22_p = off.reg, xlims)
# dividers
frame(); abline(h=.5, lwd=2)
frame(); abline(v=.5, lwd=2)
# row labels
frame(); text(.6, .85, "White", cex=rowsub.cex); text(.6, .5, "Black", cex=rowsub.cex); text(.6, .15, "Hisp.", cex=rowsub.cex)
frame(); text(.6, .85, "W-B", cex=rowsub.cex); text(.6, .5, "W-H", cex=rowsub.cex); text(.6, .15, "B-H", cex=rowsub.cex)
# col labels
for(i in 1:6) {
frame(); text(.5, .5, col_headings[i], cex=colhead.cex)
}
# meta col labels
frame(); text(.5, .5, "Treatment Group Means", cex=colhead.cex)
frame(); text(.5, .5, "Differences between Treatment Groups", cex=colhead.cex)
# meta row labels
frame(); text(.5, .5, "Tester Race\nGroup Means", cex=colhead.cex, srt=90)
frame(); text(.5, .5, "Intergroup\nDifferences", cex=colhead.cex, srt=90)
dev.off()
quartz_off_screen
2
## INDEX
# Make PNG version for Rmd output
png("out_figurea3_6x6_index.png", width=2400, height=1100)
par(oma=c(5,0,0,0), mar=c(5,0,0,0))
layout(matrix(layout_vec, nrow=5, ncol=9, byrow=TRUE), widths=width_spec, heights=height_spec)
# plots
xlims <- list(c(-.3, .3), c(-.35, .35), c(-.35, .35), c(-.35, .35))
itt.plot(y11 = index_11, y12 = index_12, y21 = index_21, y22_np = index_22,
y_22_p =meindex.reg, xlims)
# dividers
frame(); abline(h=.5, lwd=2)
frame(); abline(v=.5, lwd=2)
# row labels
frame(); text(.6, .85, "White", cex=rowsub.cex); text(.6, .5, "Black", cex=rowsub.cex); text(.6, .15, "Hisp.", cex=rowsub.cex)
frame(); text(.6, .85, "W-B", cex=rowsub.cex); text(.6, .5, "W-H", cex=rowsub.cex); text(.6, .15, "B-H", cex=rowsub.cex)
# col labels
for(i in 1:6) {
frame(); text(.5, .5, col_headings[i], cex=colhead.cex)
}
# meta col labels
frame(); text(.5, .5, "Treatment Group Means", cex=colhead.cex)
frame(); text(.5, .5, "Differences between Treatment Groups", cex=colhead.cex)
# meta row labels
frame(); text(.5, .5, "Tester Race\nGroup Means", cex=colhead.cex, srt=90)
frame(); text(.5, .5, "Intergroup\nDifferences", cex=colhead.cex, srt=90)
dev.off()
quartz_off_screen
2