Table of Contents

1 Housekeeping

  # Set seed
  set.seed(20150229)

  # Load data locally or from dataverse
  # -- if TRUE: Data must be saved in the local root directory (specified in above code chunk)
  # -- if FALSE: Data will load from Dataverse; you must have a Dataverse API Token
  local_data <- TRUE
  if(!local_data){
    Sys.setenv("DATAVERSE_KEY" = "ENTER-YOUR-API-TOKEN-HERE")  ##### Put your Dataverse API Token here.
    Sys.setenv("DATAVERSE_SERVER" = "dataverse.harvard.edu")
  }
  source("code_0_read_data.R")

  # Load helper functions and prepare data
  source("code_1a_prep_data.R") 
  source("code_1b_helper_functions.R") 

2 Manual walkthrough

We start by showing how to do a simple manual replication of core results. The remainder of this code does this same thing in a more automated way across various combinations.

2.1 Treatment Levels

There are three treatment levels:

table(dat$TA)

  0   1   2 
279 174 200 

These correspond to

  1. control (c)
  2. monitoring (m) and
  3. punitive (p).

The key comparisons are thus:

  • mc (1 v 0)
  • pc (2 v 0)
  • pm (2 v 1)

For simplicity the code uses TA1 as a dummy for m, and TA2 as a dummy for p

2.2 Racial groups

There are also three groups: Blacks (b), Hispanics (h), Whites (w). These yield three intergroup comparisons:

  • wb (white - Black)
  • wh (white - Hispanic)
  • bh (Black - Hispanic)

2.3 Outcomes

There are four outcomes that have been defined as between group differences (i.e., net discrimination measures). The stubs for these are:

outvars <- c("nmeet_", "index.", "ncb_", "noff_")

Let’s examine one set of outcome variables, the “ncb” (net discrimination in callbacks) variables:

head(dplyr::select(dat, starts_with("ncb")))
  ncb_bh ncb_wb ncb_wh
1      0     -1     -1
2      0      0      0
3      0      0      0
4      0      0      0
5      0      1      1
6      1     -1      0
table(dat$ncb_wh)

 -1   0   1 
 55 520  78 

We see most of the time there is no difference betwen white and Hispanic testers on callbacks, but it is 50% more likely that white testers gets callbacks and Hispanic testers do not than vice versa.

2.4 Parametric Analysis

The punitive-control comparison is done:

manual_model <- lm(ncb_wh ~ TA2 + as.factor(block), data=filter(dat, TA %in% c(0,2)), weights = ipw20)
stargazer(manual_model,  omit = "block", type = "html")
Dependent variable:
ncb_wh
TA2 -0.066
(0.041)
Constant 0.068
(0.083)
Observations 479
R2 0.032
Adjusted R2 -0.004
Residual Std. Error 0.633 (df = 461)
F Statistic 0.888 (df = 17; 461)
Note: p<0.1; p<0.05; p<0.01

This might be compared with:

  1. The “non-parametric” analysis which uses blocked differences in means
difference_in_means(ncb_wh ~ TA2, blocks = block, data=filter(dat, (block !=5) & (TA %in% c(0,2))))
Design:  Blocked 
       Estimate Std. Error   t value  Pr(>|t|)   CI Lower  CI Upper  DF
TA2 -0.06417621 0.04483847 -1.431276 0.1530515 -0.1522971 0.0239447 446

Note that we exclude the block 5 for lack of units to calculate withn block variance:

table(dat$TA2, dat$block)
   
     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0 25  7 25  5  2 16 87 31 82 38 10 15 36  9 43 15  7
  1 17  2 18  6  0  9 29 11 23 14  3  4 14  8 25 14  3
  1. A version of lm that uses robust standard errors
lm_robust(ncb_wh ~ TA2, fixed_effects =  ~ block, data=filter(dat, TA %in% c(0,2)), weights = ipw20)
       Estimate Std. Error   t value  Pr(>|t|)  CI Lower   CI Upper  DF
TA2 -0.06603413 0.04450348 -1.483797 0.1385462 -0.153489 0.02142069 461
  1. A raw cross tab (no weights)
with(data=filter(dat, (block !=5) & (TA %in% c(0,2))), table(ncb_wh, TA2))
      TA2
ncb_wh   0   1
    -1  18  20
    0  225 159
    1   35  21

Note that the main differences between the parametric and non parametric approach is that the non parametric approach allows between block heterogeneity in average effects; in addition rather than using the assignment propensities it implictly uses information on the empirical shares in each condition in each block.

Note when examining different models that the following generally need to be adjusted:

  • The outcome
  • The treatment (TA2 here since we are interested in punitive)
  • The data (to condition on the right comparisons, here TA %in% c(0,2))
  • The weights (here ipw20, since we are interested in comparing TA=2 and TA=0)

The core replication code automates these multiple adjustments, but any individual analysis can be checked manually, as above.

2.5 Posteriors

Posterior simulation using approach from Gelman and Hill (arm package):

out_2 <- sim(manual_model, 3000)

hist(out_2@coef[,2], main = paste(
      "Effects of Punitive v Control on\nWhite v Hispanic Discrimination, 
       mean = ", round(mean(out_2@coef[,2]), 3),
      ", share neg =", round(mean(out_2@coef[,2] < 0), 3)
      ),
     xlim = c(-.2, .1))

3 Tables and Figures in Main Text

3.1 Parametric Results on Discrimination Levels and Treatment Effects:

3.1.1 Code for Analyses

# ---------------------------------------------------------------------- #
# DEFINE COLNAMES FOR OUTPUT TABLES

col.labels <- c("Outcome", "Estimate","SE","t","P")
# mc: monitoring - control
# pc: punitive - control
# pm: punitive - monitoring
# TA1: monitoring
# TA2: punitive

outvars <- c("nmeet_", "index.", "ncb_", "noff_")

reg_and_summ <- function(outvar, compare){
  
  depvar <- paste0(outvar, compare)
  model.mc <- paste(depvar," ~ TA1 + as.factor(block)",sep="")
  model.pc <- paste(depvar," ~ TA2 + as.factor(block)",sep="")
  model.pm <- paste(depvar," ~ TA2 + as.factor(block)",sep="")
 
  
  fit.mc <- lm(formula=model.mc, data=dat[dat$TA %in% c(0,1),], weights = ipw10)
  fit.pc <- lm(formula=model.pc, data=dat[dat$TA %in% c(0,2),], weights = ipw20)
  fit.pm <- lm(formula=model.pm, data=dat[dat$TA %in% c(1,2),], weights = ipw21)

  
  itt.mc <- summary(fit.mc)$coefficients[2,]
  itt.pc <- summary(fit.pc)$coefficients[2,]
  itt.pm <- summary(fit.pm)$coefficients[2,]
    
  if(compare == "wb" | compare == "wh"  ){
  itt.mc[4] <- pt(coef(summary(fit.mc))[,3], summary(fit.mc)$df[2], lower=TRUE)[2] #one sided p
  itt.pc[4] <- pt(coef(summary(fit.pc))[,3], summary(fit.pc)$df[2], lower=TRUE)[2] #one sided p
  } 
  
    list(mc     = itt.mc, pc     = itt.pc, pm     = itt.pm,
         fit.mc = fit.mc, fit.pc = fit.pc, fit.pm = fit.pm )
  
}

# Parametric Differences in Differences Results - White / Black Comparisons
itt.wb <- lapply(outvars, reg_and_summ,  compare = "wb" )
fit.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.mc)
fit.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pc)
fit.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pm)

# Parametric Differences in Differences Results - White / Hispanic Comparisons
itt.wh <- lapply(outvars, reg_and_summ,  compare = "wh" )
fit.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.mc)
fit.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pc)
fit.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pm)

# Parametric Differences in Differences Results - Black / Hispanic Comparisons
itt.bh <- lapply(outvars, reg_and_summ,  compare = "bh" )
fit.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.mc)
fit.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pc)
fit.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pm)

3.1.2 Simple print out of parametric results

# WB MC
stargazer(fit.wb.mc[[1]], fit.wb.mc[[2]], fit.wb.mc[[3]], fit.wb.mc[[4]], 
          omit = "block", type = "html", 
          p = lapply(fit.wb.mc, function(fit)   
            pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ))
Dependent variable:
nmeet_wb index.wb ncb_wb noff_wb
(1) (2) (3) (4)
TA1 0.051 -0.015 -0.002 -0.003
(0.038) (0.052) (0.045) (0.036)
Constant -0.103 0.020 0.121 -0.039
(0.081) (0.104) (0.097) (0.078)
Observations 453 331 453 453
R2 0.065 0.058 0.035 0.041
Adjusted R2 0.029 0.007 -0.003 0.004
Residual Std. Error 0.560 (df = 435) 0.641 (df = 313) 0.673 (df = 435) 0.538 (df = 435)
F Statistic 1.791** (df = 17; 435) 1.129 (df = 17; 313) 0.929 (df = 17; 435) 1.099 (df = 17; 435)
Note: p<0.1; p<0.05; p<0.01
# WB PC
stargazer(fit.wb.pc[[1]], fit.wb.pc[[2]], fit.wb.pc[[3]], fit.wb.pc[[4]], 
          omit = "block", type = "html" )
Dependent variable:
nmeet_wb index.wb ncb_wb noff_wb
(1) (2) (3) (4)
TA2 0.039 0.007 0.019 0.018
(0.038) (0.053) (0.042) (0.035)
Constant -0.054 0.043 0.054 -0.010
(0.077) (0.103) (0.086) (0.071)
Observations 479 354 479 479
R2 0.042 0.012 0.043 0.055
Adjusted R2 0.007 -0.038 0.007 0.020
Residual Std. Error 0.585 (df = 461) 0.686 (df = 336) 0.649 (df = 461) 0.535 (df = 461)
F Statistic 1.196 (df = 17; 461) 0.234 (df = 17; 336) 1.209 (df = 17; 461) 1.578* (df = 17; 461)
Note: p<0.1; p<0.05; p<0.01
# WB PM
stargazer(fit.wb.pm[[1]], fit.wb.pm[[2]], fit.wb.pm[[3]], fit.wb.pm[[4]], 
          omit = "block", type = "html",
          p = lapply(fit.wb.pm, function(fit) 
            pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ))
Dependent variable:
nmeet_wb index.wb ncb_wb noff_wb
(1) (2) (3) (4)
TA2 -0.005 0.018 0.033 0.026
(0.045) (0.058) (0.049) (0.037)
Constant 0.039 0.189 0.087 0.020
(0.085) (0.102) (0.093) (0.070)
Observations 374 263 374 374
R2 0.062 0.052 0.015 0.054
Adjusted R2 0.017 -0.009 -0.032 0.009
Residual Std. Error 0.605 (df = 356) 0.638 (df = 246) 0.656 (df = 356) 0.499 (df = 356)
F Statistic 1.385 (df = 17; 356) 0.850 (df = 16; 246) 0.324 (df = 17; 356) 1.205 (df = 17; 356)
Note: p<0.1; p<0.05; p<0.01
# WH MC
stargazer(fit.wh.mc[[1]], fit.wh.mc[[2]], fit.wh.mc[[3]], fit.wh.mc[[4]],
          omit = "block", type = "html",
          p = lapply(fit.wh.mc, function(fit) 
            pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ) )
Dependent variable:
nmeet_wh index.wh ncb_wh noff_wh
(1) (2) (3) (4)
TA1 -0.024 -0.061 -0.036 -0.017
(0.041) (0.057) (0.043) (0.034)
Constant -0.110 0.072 0.176 0.047
(0.087) (0.113) (0.093) (0.073)
Observations 453 318 453 453
R2 0.093 0.031 0.039 0.054
Adjusted R2 0.058 -0.024 0.001 0.017
Residual Std. Error 0.600 (df = 435) 0.693 (df = 300) 0.641 (df = 435) 0.506 (df = 435)
F Statistic 2.638*** (df = 17; 435) 0.571 (df = 17; 300) 1.040 (df = 17; 435) 1.464 (df = 17; 435)
Note: p<0.1; p<0.05; p<0.01
# WH PC
stargazer(fit.wh.pc[[1]], fit.wh.pc[[2]], fit.wh.pc[[3]], fit.wh.pc[[4]], 
          omit = "block", type = "html",
          p = lapply(fit.wh.pc, function(fit) 
            pt(coef(summary(fit))[,3], summary(fit)$df[2], lower=TRUE) ))
Dependent variable:
nmeet_wh index.wh ncb_wh noff_wh
(1) (2) (3) (4)
TA2 -0.003 0.048 -0.066* -0.021
(0.039) (0.055) (0.041) (0.034)
Constant 0.002 0.085 0.068 0.012
(0.078) (0.107) (0.083) (0.069)
Observations 479 346 479 479
R2 0.048 0.068 0.032 0.059
Adjusted R2 0.013 0.020 -0.004 0.024
Residual Std. Error 0.594 (df = 461) 0.714 (df = 328) 0.633 (df = 461) 0.524 (df = 461)
F Statistic 1.367 (df = 17; 461) 1.405 (df = 17; 328) 0.888 (df = 17; 461) 1.702** (df = 17; 461)
Note: p<0.1; p<0.05; p<0.01
# WH PM
stargazer(fit.wh.pm[[1]], fit.wh.pm[[2]], fit.wh.pm[[3]], fit.wh.pm[[4]], 
          omit = "block", type = "html")
Dependent variable:
nmeet_wh index.wh ncb_wh noff_wh
(1) (2) (3) (4)
TA2 0.017 0.107* -0.019 0.002
(0.045) (0.063) (0.049) (0.038)
Constant 0.025 0.127 -0.024 -0.037
(0.084) (0.112) (0.093) (0.072)
Observations 374 268 374 374
R2 0.047 0.089 0.027 0.049
Adjusted R2 0.001 0.031 -0.020 0.004
Residual Std. Error 0.598 (df = 356) 0.703 (df = 251) 0.661 (df = 356) 0.511 (df = 356)
F Statistic 1.029 (df = 17; 356) 1.539* (df = 16; 251) 0.578 (df = 17; 356) 1.077 (df = 17; 356)
Note: p<0.1; p<0.05; p<0.01
# BH MC
stargazer(fit.bh.mc[[1]], fit.bh.mc[[2]], fit.bh.mc[[3]], fit.bh.mc[[4]], 
          omit = "block", type = "html")
Dependent variable:
nmeet_bh index.bh ncb_bh noff_bh
(1) (2) (3) (4)
TA1 -0.075* -0.089* -0.035 -0.014
(0.042) (0.052) (0.042) (0.031)
Constant -0.007 0.057 0.055 0.086
(0.089) (0.098) (0.090) (0.066)
Observations 453 316 453 453
R2 0.088 0.058 0.037 0.012
Adjusted R2 0.053 0.004 -0.0002 -0.027
Residual Std. Error 0.614 (df = 435) 0.636 (df = 298) 0.624 (df = 435) 0.455 (df = 435)
F Statistic 2.477*** (df = 17; 435) 1.075 (df = 17; 298) 0.995 (df = 17; 435) 0.309 (df = 17; 435)
Note: p<0.1; p<0.05; p<0.01
# BH PC
stargazer(fit.bh.pc[[1]], fit.bh.pc[[2]], fit.bh.pc[[3]], fit.bh.pc[[4]],
          omit = "block", type = "html")
Dependent variable:
nmeet_bh index.bh ncb_bh noff_bh
(1) (2) (3) (4)
TA2 -0.042 0.039 -0.085** -0.039
(0.041) (0.049) (0.041) (0.033)
Constant 0.056 0.052 0.015 0.021
(0.082) (0.090) (0.082) (0.068)
Observations 479 340 479 479
R2 0.051 0.069 0.039 0.023
Adjusted R2 0.016 0.020 0.004 -0.013
Residual Std. Error 0.624 (df = 461) 0.625 (df = 322) 0.623 (df = 461) 0.512 (df = 461)
F Statistic 1.452 (df = 17; 461) 1.400 (df = 17; 322) 1.109 (df = 17; 461) 0.634 (df = 17; 461)
Note: p<0.1; p<0.05; p<0.01
# BH PM
stargazer(fit.bh.pm[[1]], fit.bh.pm[[2]], fit.bh.pm[[3]], fit.bh.pm[[4]],
          omit = "block", type = "html")
Dependent variable:
nmeet_bh index.bh ncb_bh noff_bh
(1) (2) (3) (4)
TA2 0.023 0.139** -0.052 -0.024
(0.051) (0.057) (0.047) (0.036)
Constant -0.014 -0.077 -0.111 -0.057
(0.096) (0.100) (0.088) (0.067)
Observations 374 250 374 374
R2 0.067 0.118 0.036 0.036
Adjusted R2 0.023 0.057 -0.010 -0.010
Residual Std. Error 0.683 (df = 356) 0.622 (df = 233) 0.626 (df = 356) 0.477 (df = 356)
F Statistic 1.511* (df = 17; 356) 1.943** (df = 16; 233) 0.793 (df = 17; 356) 0.785 (df = 17; 356)
Note: p<0.1; p<0.05; p<0.01

3.1.3 Storing parametric results

# Parametric Differences in Differences Results -White / Black Comparisons
itt.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$mc)
itt.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pc)
itt.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pm)

# Parametric Differences in Differences Results - White / Hispanic Comparisons
itt.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$mc)
itt.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pc)
itt.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pm)

# Parametric Differences in Differences Results - Black / Hispanic Comparisons
itt.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$mc)
itt.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pc)
itt.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pm)


#----- CIs -----#
df.w_ <- sapply(list(df.wb.mc = fit.wb.mc, 
                     df.wb.pc = fit.wb.pc, 
                     df.wb.pm = fit.wb.pm,
                     df.wh.mc = fit.wh.mc,
                     df.wh.pc = fit.wh.pc, 
                     df.wh.pm = fit.wh.pm,
                     df.bh.mc = fit.bh.mc,
                     df.bh.pc = fit.bh.pc,
                     df.bh.pm = fit.bh.pm),
                function(fit){
                        unlist(lapply(fit, function(x) summary(x)$df[2]))[2:4]})

df.2g.10 <-  c(df.w_[,c("df.wb.mc", "df.wh.mc", "df.bh.mc")])
df.2g.20 <-  c(df.w_[,c("df.wb.pc", "df.wh.pc", "df.bh.pc")])
df.2g.21 <-  c(df.w_[,c("df.wb.pm", "df.wh.pm", "df.bh.pm")])

cv.2g.10 <- qt(p=0.025, df=df.2g.10, lower.tail=TRUE)
cv.2g.20 <- qt(p=0.025, df=df.2g.20, lower.tail=TRUE)
cv.2g.21 <- qt(p=0.025, df=df.2g.21, lower.tail=TRUE)

cv.2g <- c(cv.2g.10, cv.2g.20, cv.2g.21)
# Stack output 
out2 <- rbind(itt.wb.mc, itt.wh.mc, itt.bh.mc,
              itt.wb.pc, itt.wh.pc, itt.bh.pc,
              itt.wb.pm, itt.wh.pm, itt.bh.pm)

out2 <- out2[-seq(1,33,4),]


# Round and format results
for(i in 2:ncol(out2)) out2[,i] <- round(as.numeric(out2[,i]),3)
 out2[,"P"] <- paste("(", out2[,"P"],")",sep="")

itt.2g.ci <- as.numeric(out2[,2]) +
  abs(cv.2g)*cbind(lower.ci = -as.numeric(out2[,3]), upper.ci = as.numeric(out2[,3])) 

itt.est <- cbind(out2, itt.2g.ci)
outlist <-
  lapply(c(cb.reg = "callback" ,
          off.reg = "offer",
          meindex.reg = "Index"),  
         function(keyword){
             # Grab relevant outcome row 
            itt <-  itt.est[grepl(keyword, itt.est[,1], fixed=TRUE),]
            with(as.data.frame(itt),
              list(coef=matrix(Estimate, nrow=3, ncol=3, byrow=FALSE),
              se=matrix(SE, nrow=3, ncol=3, byrow=FALSE),
              lower=matrix(lower.ci, nrow=3, ncol=3, byrow=FALSE),
              upper=matrix(upper.ci, nrow=3, ncol=3, byrow=FALSE))) })
       
cb.reg      <-   outlist$cb.reg 
off.reg     <-   outlist$off.reg 
meindex.reg <-   outlist$meindex.reg 

3.2 Nonparametric and other Figure 1 Estimates

3.2.1 Prep

# Create and rename variables
dat$cb_w <- dat$cb_C
dat$cb_h <- dat$cb_B
dat$cb_b <- dat$cb_A

dat$off_w <- dat$off_C
dat$off_h <- dat$off_B
dat$off_b <- dat$off_A

dat$meindex_w <- dat$meindex.impute_C 
dat$meindex_h <- dat$meindex.impute_B
dat$meindex_b <- dat$meindex.impute_A 

dat$meindex_wb <- dat$meindex.impute_C - dat$meindex.impute_A
dat$meindex_wh <- dat$meindex.impute_C - dat$meindex.impute_A
dat$meindex_bh <- dat$meindex.impute_A - dat$meindex.impute_B

3.2.2 Quadrant 1,1

# quad[1,1] --------------------------------------------
 # Function to create each single row in data.frame
f_11 <- function(Y, TA = dat$TA, W = dat$ipw, a) {
  N <- sum(TA==a)
  out <- data.frame(estimate =  weighted.mean(Y[TA==a], W[TA==a]))
  out$std.error <- sqrt(wtd.var(Y[TA==a], W[TA==a])/N)
  out$df <- N - 1
  error <-  abs(qt(p=.025, df= out$df, lower.tail=TRUE))
  out$conf.low  <- out$estimate - error*out$std.error
  out$conf.high <- out$estimate + error*out$std.error
  out
}
 # function to call f_a11() for each level
quad_11 <- function(Y){
     outcomes <- paste0(Y, c("_w",  "_b", "_h"))
     Ys <- dat[, outcomes]
    rbind(
      w_c = f_11(Ys[,1], a = 0),
      b_c = f_11(Ys[,2], a = 0),
      h_c = f_11(Ys[,3], a = 0),
      w_m = f_11(Ys[,1], a = 1),
      b_m = f_11(Ys[,2], a = 1),
      h_m = f_11(Ys[,3], a = 1),
      w_p = f_11(Ys[,1], a = 2),
      b_p = f_11(Ys[,2], a = 2),
      h_p = f_11(Ys[,3], a = 2))
    }

3.2.3 Quadrant 1,2

# quad[1,2] --------------------------------------------
f_12 <- function(Y, a, b, TA = dat$TA, W = dat$ipw) {
  X <- (TA==a)[TA==a | TA==b]
  lm12 <- lm(Y[TA==a | TA==b]  ~ X, weight = W[TA==a | TA==b])
  summ <- summary(lm12)
  out <- tidy(summ) 
  out$df <- summ$df[2]
  error <-  abs(qt(p=.025, df= out$df, lower.tail=TRUE)) 
  out$conf.low  <- out$estimate - error*out$std.error
  out$conf.high <- out$estimate + error*out$std.error
  as.data.frame(out[out$term == "XTRUE", ] )
}
  
 # function to call f_a11() for each level
quad_12 <- function(Y){
     outcomes <- paste0(Y, c("_w",  "_b", "_h"))
     Ys <- dat[, outcomes]
    rbind(
      w_mc = f_12(Ys[,1], a = 1, b = 0),
      b_mc = f_12(Ys[,2], a = 1, b = 0),
      h_mc = f_12(Ys[,3], a = 1, b = 0),
      w_pc = f_12(Ys[,1], a = 2, b = 0),
      b_pc = f_12(Ys[,2], a = 2, b = 0),
      h_pc = f_12(Ys[,3], a = 2, b = 0),
      w_pm = f_12(Ys[,1], a = 2, b = 1),
      b_pm = f_12(Ys[,2], a = 2, b = 1),
      h_pm = f_12(Ys[,3], a = 2, b = 1))
    }

3.2.4 Quadrant 2,1

# quad[2,1] --------------------------------------------
f_21 <- function(Y1, Y2, a,  TA = dat$TA, W = dat$ipw) {
  out <- data.frame(estimate =  weighted.mean((Y1-Y2)[TA==a], W[TA==a]))
  ttest <- wtd.t.test((Y1-Y2)[TA==a], weight=W[TA==a])
  out$std.error <- ttest$additional[4]
  out$p  <- ttest$coefficient[3]
  out$df <- ttest$coefficient[2]
  error <-  abs(qt(p=.025, df= out$df, lower.tail=TRUE))
  out$conf.low  <- out$estimate - error*out$std.error
  out$conf.high <- out$estimate + error*out$std.error
  as.data.frame(out)
}

quad_21 <- function(Y){
     outcomes <- paste0(Y, c("_w",  "_b", "_h"))
     Ys <- dat[, outcomes]
    rbind(
      wb_c = f_21(Ys[,1], Ys[,2],  a = 0),
      wh_c = f_21(Ys[,1], Ys[,3],  a = 0),
      bh_c = f_21(Ys[,2], Ys[,3],  a = 0),
      wb_m = f_21(Ys[,1], Ys[,2],  a = 1),
      wh_m = f_21(Ys[,1], Ys[,3],  a = 1),
      bh_m = f_21(Ys[,2], Ys[,3],  a = 1),
      wb_p = f_21(Ys[,1], Ys[,2],  a = 2),
      wh_p = f_21(Ys[,1], Ys[,3],  a = 2),
      bh_p = f_21(Ys[,2], Ys[,3],  a = 2))
    }

3.2.5 Quadrant 2,2

# quad[2,2] --------------------------------------------
 # Non parametric Differences in Differences Results
# It excludes block 5 which has too little variance to estimate a variance term.
f_22 <- function(Y, TA = "TA",  C1, C2, exclude, data = dat){
  dim22 <- difference_in_means(formula(paste(Y,  "~" ,TA)), 
                           blocks = block, 
                           data=filter(data, block !=5), 
                           condition1 = C1,
                           condition2 = C2,
                           subset = TA!= exclude)
  tidy(dim22)
}

quad_22 <- function(Y){
    Ys <- paste0(Y, c("_wb",  "_wh", "_bh"))
    rbind(
      wb_mc = f_22(Ys[1], C1 = 0, C2= 1, exclude = 2),
      wh_mc = f_22(Ys[2], C1 = 0, C2= 1, exclude = 2),
      bh_mc = f_22(Ys[3], C1 = 0, C2= 1, exclude = 2),
      wb_pc = f_22(Ys[1], C1 = 0, C2= 2, exclude = 1), 
      wh_pc = f_22(Ys[2], C1 = 0, C2= 2, exclude = 1),
      bh_pc = f_22(Ys[3], C1 = 0, C2= 2, exclude = 1),
      wb_pm = f_22(Ys[1], C1 = 1, C2= 2, exclude = 0),
      wh_pm = f_22(Ys[2], C1 = 1, C2= 2, exclude = 0),
      bh_pm = f_22(Ys[3], C1 = 1, C2= 2, exclude = 0))
    }

# Produce tables
cb_11  <- quad_11("cb")
cb_12  <- quad_12("cb")
cb_21  <- quad_21("cb")
cb_22  <- quad_22("ncb")
off_11 <- quad_11("off")  
off_12 <- quad_12("off")   
off_21 <- quad_21("off") 
off_22  <- quad_22("noff")
off_11 <- quad_11("off")  
off_12 <- quad_12("off")   
off_21 <- quad_21("off") 
off_22  <- quad_22("noff")
index_11 <- quad_11("meindex")  
index_12 <- quad_12("meindex")   
index_21 <- quad_21("meindex" ) 
index_22  <- quad_22("meindex")

3.3 Produce Figure 1

# HELPER FUNCTIONS

#================#
# Plot function used in Figure 1
#================#

# Take matrices CB, OFF, MEINDEX
# Take matrices CB.REG, OFF.REG, MEINDEX.REG

# plot with a single set of three or two threes
plot.helper <- function(coefs1=1:3, sds1=3:5, lower1, upper1, coefs2=NA, sds2=NA, lower2, upper2, double=FALSE, xlim=NULL ){
  coefplot(coefs1, sds1, CI=2, lower.conf.bounds=lower1, upper.conf.bounds=upper1, pch.pts=21, varnames="", main = "", h.axis = FALSE, xlim = xlim, cex.pts=3, lwd=3)
  if(double) coefplot(coefs2, sds2, CI=2, lower.conf.bounds=lower2, upper.conf.bounds=upper2, add=TRUE, pch.pts=19, cex.pts=3, lwd=3)
  axis(side=1, line=2, cex.axis=1.5, padj=1)
}


itt.plot <- function(y11,y12, y21, y22_np, y_22_p, xlim){
  # Raw means ci
  # upper left quadrant
  for(i in 1:3) {
    trt = list(control    = c("w_c",  "b_c","h_c"),    #  i = 1 
               monitoring = c("w_m", "b_m", "h_m"), # i = 2
               punitive   = c("w_p", "b_p", "h_p"))   # i = 3
  
    plot.helper(coefs1 = rev(y11[trt[[i]], "estimate" ]),
                sds1   = rev(y11[trt[[i]], "std.error"]),
                lower1 = rev(y11[trt[[i]], "conf.low" ]), 
                upper1 = rev(y11[trt[[i]], "conf.high"]),
                xlim   = xlim[[1]])}

  # D-I-M between treatments conditional on each group - upper right quadrant
    for(i in 1:3) {
    trt = list(mc    = c("w_mc", "b_mc","h_mc"),  #  i = 1 
               pc = c("w_pc", "b_pc", "h_pc"), # i = 2
               pm   = c("w_pm", "b_pm", "h_pm")) # i = 3
  
    plot.helper(coefs1 = rev(y12[trt[[i]], "estimate" ]),
                sds1   = rev(y12[trt[[i]], "std.error"]),
                lower1 = rev(y12[trt[[i]], "conf.low" ]), 
                upper1 = rev(y12[trt[[i]], "conf.high"]),
                xlim   = xlim[[2]])}
    

     for(i in 1:3) {
    trt = list(c = c("wb_c", "wh_c", "bh_c"),  #  i = 1 
               m = c("wb_m", "wh_m", "bh_m"), # i = 2
               p = c("wb_p", "wh_p", "bh_p")) # i = 3
  
    plot.helper(coefs1 = rev(y21[trt[[i]], "estimate" ]),
                sds1   = rev(y21[trt[[i]], "std.error"]),
                lower1 = rev(y21[trt[[i]], "conf.low" ]), 
                upper1 = rev(y21[trt[[i]], "conf.high"]),
                xlim   = xlim[[3]])
    
        if ( i == 1 ) {  # highlight control group
      abline(h=3, col=rgb(0,0,0,.1), lwd=40)
      abline(h=2, col=rgb(0,0,0,.1), lwd=40)}}
   # D-I-M between groups conditional on treatment assigment - lower left quadrant

    
  # lower right quadrant
    for(i in 1:3) {
     trt = list(c = c("wb_mc", "wh_mc","bh_mc"),  #  i = 1 
                m = c("wb_pc", "wh_pc", "bh_pc"), # i = 2
                p = c("wb_pm", "wh_pm", "bh_pm")) # i = 3
    plot.helper(coefs1 = rev(y22_np[trt[[i]], "estimate" ]),
                sds1   = rev(y22_np[trt[[i]], "std.error"]),
                lower1 = rev(y22_np[trt[[i]], "conf.low" ]), 
                upper1 = rev(y22_np[trt[[i]], "conf.high"]),
                coefs2=rev(y_22_p$coef[1:3,(i )]), 
                sds2=rev(y_22_p$se[1:3,(i)]), 
                lower2=rev(y_22_p$lower[1:3,(i)]), 
                upper2=rev(y_22_p$upper[1:3,(i)]),
                double=TRUE, xlim=xlim[[4]])
    if ( i == 5 ){  # highlight P-C
      abline(h=3, col=rgb(0,0,0,.1), lwd=40)
      abline(h=2, col=rgb(0,0,0,.1), lwd=40)  }}
    
}

## MAKE PDF FIGURES FOR PAPER AND PNG FIGURES FOR RMD FILE

## CALLBACKS
# Make PNG version for Rmd output
png("out_figure1a_6x6_cb.png", width=png_width, height=png_height)
par(oma=c(5,0,0,0), mar=c(5,0,0,0))
layout(matrix(layout_vec, nrow=5, ncol=9, byrow=TRUE), widths=width_spec, heights=height_spec)
# plots
xlims <- list(c(0, .25), c(-.15, .15), c(-.15, .15), c(-.2, .2))
itt.plot(y11 = cb_11, y12 = cb_12, y21 = cb_21, y22_np  = cb_22, y_22_p = cb.reg, xlims)
# dividers
frame(); abline(h=.5, lwd=2)
frame(); abline(v=.5, lwd=2)
# row labels
frame(); text(.6, .85, "White", cex=rowsub.cex); text(.6, .5, "Black", cex=rowsub.cex); text(.6, .15, "Hisp.", cex=rowsub.cex)
frame(); text(.6, .85, "W-B", cex=rowsub.cex); text(.6, .5, "W-H", cex=rowsub.cex); text(.6, .15, "B-H", cex=rowsub.cex)
# col labels
for(i in 1:6) {
  frame(); text(.5, .5, col_headings[i], cex=colhead.cex)
}
# meta col labels
frame(); text(.5, .5, "Treatment Group Means", cex=colhead.cex)
frame(); text(.5, .5, "Differences between Treatment Groups", cex=colhead.cex)
# meta row labels
frame(); text(.5, .5, "Tester Race\nGroup Means", cex=colhead.cex, srt=90)
frame(); text(.5, .5, "Intergroup\nDifferences", cex=colhead.cex, srt=90)
dev.off()
quartz_off_screen 
                2 
## OFFERS
# Make PNG version for Rmd output
png("out_figure1b_6x6_off.png", width=png_width, height=png_height)
par(oma=c(5,0,0,0), mar=c(5,0,0,0))
layout(matrix(layout_vec, nrow=5, ncol=9, byrow=TRUE), widths=width_spec, heights=height_spec)
# plots
xlims <- list(c(0, .15), c(-.1, .1), c(-.1, .1), c(-.2, .2))
itt.plot(y11 = off_11, y12 = off_12, y21 = off_21, y22_np  = off_22, y_22_p = off.reg, xlims)
# dividers
frame(); abline(h=.5, lwd=2)
frame(); abline(v=.5, lwd=2)
# row labels
frame(); text(.6, .85, "White", cex=rowsub.cex); text(.6, .5, "Black", cex=rowsub.cex); text(.6, .15, "Hisp.", cex=rowsub.cex)
frame(); text(.6, .85, "W-B", cex=rowsub.cex); text(.6, .5, "W-H", cex=rowsub.cex); text(.6, .15, "B-H", cex=rowsub.cex)
# col labels
for(i in 1:6) {
  frame(); text(.5, .5, col_headings[i], cex=colhead.cex)
}
# meta col labels
frame(); text(.5, .5, "Treatment Group Means", cex=colhead.cex)
frame(); text(.5, .5, "Differences between Treatment Groups", cex=colhead.cex)
# meta row labels
frame(); text(.5, .5, "Tester Race\nGroup Means", cex=colhead.cex, srt=90)
frame(); text(.5, .5, "Intergroup\nDifferences", cex=colhead.cex, srt=90)
dev.off()
quartz_off_screen 
                2 
## INDEX

# Make PNG version for Rmd output
png("out_figurea3_6x6_index.png", width=2400, height=1100)
par(oma=c(5,0,0,0), mar=c(5,0,0,0))
layout(matrix(layout_vec, nrow=5, ncol=9, byrow=TRUE), widths=width_spec, heights=height_spec)
# plots
xlims <- list(c(-.3, .3), c(-.35, .35), c(-.35, .35), c(-.35, .35))
itt.plot(y11 = index_11, y12 = index_12, y21 = index_21, y22_np  = index_22, 
         y_22_p =meindex.reg, xlims)
# dividers
frame(); abline(h=.5, lwd=2)
frame(); abline(v=.5, lwd=2)
# row labels
frame(); text(.6, .85, "White", cex=rowsub.cex); text(.6, .5, "Black", cex=rowsub.cex); text(.6, .15, "Hisp.", cex=rowsub.cex)
frame(); text(.6, .85, "W-B", cex=rowsub.cex); text(.6, .5, "W-H", cex=rowsub.cex); text(.6, .15, "B-H", cex=rowsub.cex)
# col labels
for(i in 1:6) {
  frame(); text(.5, .5, col_headings[i], cex=colhead.cex)
}
# meta col labels
frame(); text(.5, .5, "Treatment Group Means", cex=colhead.cex)
frame(); text(.5, .5, "Differences between Treatment Groups", cex=colhead.cex)
# meta row labels
frame(); text(.5, .5, "Tester Race\nGroup Means", cex=colhead.cex, srt=90)
frame(); text(.5, .5, "Intergroup\nDifferences", cex=colhead.cex, srt=90)
dev.off()
quartz_off_screen 
                2 
Figure 1, Panel A: Outcome 1: Net Discrimination in Receiving Callbacks

Figure 1, Panel A: Outcome 1: Net Discrimination in Receiving Callbacks

Figure 1, Panel B: Outcome 2: Net Discrimination in Receiving Offers

Figure 1, Panel B: Outcome 2: Net Discrimination in Receiving Offers

3.4 Figure 2: Posterior Densities of Treatment Effects on Discrimination

##=============================================================================##
## Figure 2: Posterior Probabilities
##=============================================================================##

#### ======== FUNCTIONS =========== ####

## function to compute the bayesian analog of the lmfit
## using non-informative priors and Monte Carlo scheme
## based on N samples;
#  see http://www.r-bloggers.com/bayesian-linear-regression-analysis-without-tears-r/

bayes.graph <- function(depvar, treat, weight, block = NULL, main = "Callbacks", #comparison = "Hispanic", 
                       trt_text = "Punitive", draws = 10000, xloc = 0.15, yloc = 13, textadj = 1, ylab = "", print_title = FALSE,
                       vertical = TRUE, maxline = FALSE, addlegend = FALSE, addbox = TRUE, ylim = c(0,18)) {

  lmfit <- lm(depvar ~ treat + factor(block), weights = weight)
  
  bf <- data.frame(arm::sim(lmfit, draws)@coef)
  post_punitive <- bf$treat

  # graph
  dens <- density(post_punitive, bw = 0.01, from = -.17, to = .10) # to = .04 ; bw = 0.005

  left <- quantile(post_punitive, 0)
  x1 <- min(which(dens$x >= left))
  x2 <- max(which(dens$x <  0))

  pct_left <- round((diff(dens$x[1:2])*sum(dens$y[1:x2])) / (diff(dens$x[1:2])*sum(dens$y)), digits = 3) * 100 # % MOC < 0

  main_txt <- ""
  if(print_title) { main_txt <- paste0("Outcome: ", main, "\n(", trt_text, " vs. Control)") }
  plot(dens, xlab = "", ylab = ylab, main = main_txt, bty = "n", lwd = 2, xlim = c(-.15, .15), ylim = ylim, cex.main = 1.5, cex.lab = 1.2, cex.axis = 1, font = 1) #xlim = c(-.17, .10), ylim = c(0,17)
  with(dens, polygon(x=c(x[c(x1,x1:x2,x2)]), y= c(0, y[x1:x2], 0), col="gray"))
  if(maxline) abline(v = mean(post_punitive), col = "red", lty = 2, lwd = 3)
  text(x = xloc, y = yloc, paste0(pct_left, "%"), adj = c(textadj, NA), cex = 1.2)
  text(x = xloc, y = yloc-.8, "less than zero", adj = c(textadj, NA), cex = 1.2)
  if(vertical) abline(v=0)
  if(addbox) box()
}


#### ======== ANALYSES AND GRAPH RESULTS =========== ####

# CREATE PNG OUTPUT FOR RMD

op <- par()
par(op)
png("out_figure2_posteriors.png", width = 1350, height = 750)
par(mfrow = c(2,4), mar=c(2.5,5,5,0.5), family="Helvetica")

# Hispanic
# callback
  bayes.graph(depvar = dat$ncb_wh[dat$TA!=2], 
              treat  = (dat$TA1)[dat$TA!=2], 
              weight = dat$ipw10[dat$TA!=2], 
              block  = dat$block[dat$TA!=2],
              trt_text = "Monitoring",
              ylab = "Discrimination against Hispanics (vs. Whites)",
              print_title = TRUE)
  
  bayes.graph(depvar = dat$ncb_wh[dat$TA!=1],
              treat  = (dat$TA2)[dat$TA!=1], 
              weight = dat$ipw20[dat$TA!=1], 
              block  = dat$block[dat$TA!=1], 
              print_title = TRUE)

    # offer
  bayes.graph(depvar = dat$noff_wh[dat$TA!=2],
              treat  = (dat$TA==1)[dat$TA!=2], 
              weight = dat$ipw[dat$TA!=2], 
              block  = dat$block[dat$TA!=2], 
              main = "Offers", 
              trt_text = "Monitoring",
              print_title = TRUE)
  
  bayes.graph(depvar = dat$noff_wh[dat$TA!=1],
              treat  = (dat$TA==2)[dat$TA!=1], 
              weight = dat$ipw[dat$TA!=1],
              block  = dat$block[dat$TA!=1], 
              main = "Offers", 
              print_title = TRUE)

  # Black
  # callback
  bayes.graph(depvar = dat$ncb_wb[dat$TA!=2], 
              treat  = (dat$TA==1)[dat$TA!=2],
              weight = dat$ipw[dat$TA!=2], 
              block  = dat$block[dat$TA!=2], 
              trt_text = "Monitoring", 
              xloc = -0.15, 
              textadj = 0, 
              ylab = "Discrimination against Blacks (vs. Whites)")
  
  bayes.graph(depvar = dat$ncb_wb[dat$TA!=1], 
              treat  = (dat$TA==2)[dat$TA!=1], 
              weight = dat$ipw[dat$TA!=1],
              block = dat$block[dat$TA!=1],
              xloc = -0.15,
              textadj = 0)
  
  # offer
  bayes.graph(depvar = dat$noff_wb[dat$TA!=2], 
              treat  = (dat$TA==1)[dat$TA!=2], 
              weight = dat$ipw[dat$TA!=2],
              block  = dat$block[dat$TA!=2], 
              main = "Offers", 
              trt_text = "Monitoring",
              xloc = -0.15, 
              textadj = 0)
  
  bayes.graph(depvar = dat$noff_wb[dat$TA!=1], 
              treat  = (dat$TA==2)[dat$TA!=1],
              weight = dat$ipw[dat$TA!=1],
              block  = dat$block[dat$TA!=1], 
              main = "Offers",
              xloc = -0.15,
              textadj = 0)
  
dev.off()
quartz_off_screen 
                2 
##===========================================
# CREATE PDF OUTPUT FOR MANUSCRIPT OUTPUT

pdf("out_figure2_posteriors.pdf", width = 13.5, height = 7.5)
par(mfrow = c(2,4), mar=c(2.5,5,5,0.5), family="Helvetica") 
#TA: Treatment assignment -- {Control = 0, Monitoring = 1, Punitive = 2}

# Hispanic
# callback
bayes.graph(dat$ncb_wh[dat$TA!=2], (dat$TA==1)[dat$TA!=2], dat$ipw[dat$TA!=2], dat$block[dat$TA!=2], trt_text = "Monitoring", ylab = "Discrimination against Hispanics (vs. whites)", print_title = TRUE)
bayes.graph(dat$ncb_wh[dat$TA!=1], (dat$TA==2)[dat$TA!=1], dat$ipw[dat$TA!=1], dat$block[dat$TA!=1], print_title = TRUE)


# offer
bayes.graph(dat$noff_wh[dat$TA!=2], (dat$TA==1)[dat$TA!=2], dat$ipw[dat$TA!=2], dat$block[dat$TA!=2], main = "Offers", trt_text = "Monitoring", print_title = TRUE)
bayes.graph(dat$noff_wh[dat$TA!=1], (dat$TA==2)[dat$TA!=1], dat$ipw[dat$TA!=1], dat$block[dat$TA!=1], main = "Offers", print_title = TRUE)

# Black
# callback
bayes.graph(dat$ncb_wb[dat$TA!=2], (dat$TA==1)[dat$TA!=2], dat$ipw[dat$TA!=2], dat$block[dat$TA!=2], trt_text = "Monitoring", xloc = -0.15, textadj = 0, ylab = "Discrimination against Blacks (vs. whites)")
bayes.graph(dat$ncb_wb[dat$TA!=1], (dat$TA==2)[dat$TA!=1], dat$ipw[dat$TA!=1], dat$block[dat$TA!=1], xloc = -0.15, textadj = 0)

# offer
bayes.graph(dat$noff_wb[dat$TA!=2], (dat$TA==1)[dat$TA!=2], dat$ipw[dat$TA!=2], dat$block[dat$TA!=2], main = "Offers", trt_text = "Monitoring", xloc = -0.15, textadj = 0)
bayes.graph(dat$noff_wb[dat$TA!=1], (dat$TA==2)[dat$TA!=1], dat$ipw[dat$TA!=1], dat$block[dat$TA!=1], main = "Offers", xloc = -0.15, textadj = 0)

dev.off()
quartz_off_screen 
                2 
Figure 2: Posterior Densities of Treatment Effects on Discrimination against Hispanics (top) and African Americans (bottom) Relative to Whites

Figure 2: Posterior Densities of Treatment Effects on Discrimination against Hispanics (top) and African Americans (bottom) Relative to Whites

3.5 Randomization Check (reported in main text, footnote 14)

set.seed(1234567)
niter = 100

fmla <- " frame + partnered + rent + m.rent + numbr + sqft + m.sqft + 
          regime1 + regime2 + regime3 + nnumattr_bh + nnumattr_wh + 
          nnumattr_wb + nnumskep_bh + nnumskep_wh + nnumskep_wb + npctskep_bh + 
          npctskep_wh + npctskep_wb + nnumpos_bh + nnumpos_wh + nnumpos_wb + 
          nnumneu_bh + nnumneu_wh + nnumneu_wb + nnumneg_bh + nnumneg_wh + 
          nnumneg_wb + npctpos_bh + npctpos_wh + npctpos_wb + npctneu_bh + 
          npctneu_wh + npctneu_wb + npctneg_bh + npctneg_wh + npctneg_wb + 
          nanyskep_bh + nanyskep_wh + nanyskep_wb + nanyneg_bh + nanyneg_wh + 
          nanyneg_wb + team_gender + broker + callorder.wb + callorder.wh + 
          callorder.bh + incrank.wb.gt + incrank.wh.gt + incrank.bh.gt + 
          incrank.wb.eq + incrank.wh.eq + incrank.bh.eq + incrank.wb.lt + 
          incrank.wh.lt + incrank.bh.lt + boro.brx + boro.brk + boro.mnh + 
          boro.que + boro.stn + inc.w.hi + inc.b.hi + inc.h.hi + ll.female + 
          m.ll.female + primary_api + primary_blk + primary_hsp + primary_wht + 
          primary_age_18to34 + primary_age_35to44 + primary_age_45to64 + 
          primary_age_65over + primary_age_unknown + tid.A01 + tid.A10 + 
          tid.A11 + tid.A13 + tid.A02 + tid.A21 + tid.A22 + tid.A03 + 
          tid.A04 + tid.A05 + tid.A06 + tid.A07 + tid.A08 + tid.A09 + 
          tid.B01 + tid.B11 + tid.B12 + tid.B14 + tid.B16 + tid.B02 + 
          tid.B20 + tid.B23 + tid.B24 + tid.B25 + tid.B27 + tid.B03 + 
          tid.B04 + tid.B06 + tid.B07 + tid.B08 + tid.B09 + tid.C01 + 
          tid.C10 + tid.C12 + tid.C13 + tid.C14 + tid.C15 + tid.C02 + 
          tid.C27 + tid.C29 + tid.C03 + tid.C31 + tid.C33 + tid.C04 + 
          tid.C05 + tid.C06 + tid.C07 + tid.C08 + tid.C09 + as.factor(block)"

# trace=FALSE to suppress output about convergence
mnl_fit <-  multinom(formula = paste("TA ~ ", fmla), data = dat, weight = ipw, trace = FALSE) 
kparams = length(mnl_fit$coefnames)
mnl_loglik = -(mnl_fit$AIC - 2*kparams)/2   # log likelihood from actual model

# randomization inference
ri_mat = cbind(1:nrow(dat), dat$block, dat$TA)
blocks = names(table(dat$block))

# recalculate loglik stat for each shuffled vector of assignments
mnl_loglik_ri = rep(NA, niter)
fmla_ri = as.formula(paste0("z_ri ~ ", fmla))
for(j in 1:niter){
  shuffle_mat = list()
  for(i in 1:length(blocks)){
    block_z = ri_mat[ri_mat[,2] == blocks[i],3]
    shuffle_z = sample(block_z, length(block_z), replace=FALSE)
    shuffle_mat[[i]] = cbind(ri_mat[ri_mat[,2] == blocks[i],1:2], shuffle_z)
  }
  shuffle_mat = do.call(rbind, shuffle_mat)
  shuffle_mat = shuffle_mat[order(shuffle_mat[,1]),]
  dat$z_ri = shuffle_mat[,3]
  mnl_fit_ri = multinom(formula=fmla_ri, data=dat, weight=ipw, trace=FALSE)
  kparams = length(mnl_fit_ri$coefnames)
  mnl_loglik_ri[j] = -(mnl_fit_ri$AIC - 2*kparams)/2
}

# calculate p-value
mean(abs(mnl_loglik_ri) >= abs(mnl_loglik))
[1] 0.97

4 Appendix Tables and Figures

4.1 Table A1 (p. A-2): Distribution of Experimental Subjects by Randomization Block

blockLabs <- rep(c("Brooklyn","Bronx","Manhattan","Queens","Staten Island","Likely Discrimination Frame"), 3)
blockLabs <- blockLabs[-length(blockLabs)]
pcts <- round(prop.table(table(dat$block, dat$TA, useNA="ifany")), 3)
block_dist <- cbind(blockLabs, 
                    table(dat$block, dat$TA, useNA="ifany")[,1], pcts[,1],
                    table(dat$block, dat$TA, useNA="ifany")[,2], pcts[,2],
                    table(dat$block, dat$TA, useNA="ifany")[,3], pcts[,3])

block_dist <- rbind(c("Regime 1: 13 Apr 2012 - 9 Sep 2012", rep(NA,6)),
                    block_dist[1:6,],
                    c("Regime 2: 10 Sep 2012 - 7 May 7, 2013", rep(NA,6)),
                    block_dist[7:12,],
                    c("Regime 3: 8 May 2013 to 20 Dec 2013", rep(NA,6)),
                    block_dist[13:17,],
                    c("Total", table(dat$TA)[1], round(table(dat$TA)[1]/sum(table(dat$TA)),3), 
                      table(dat$TA)[2], round(table(dat$TA)[2]/sum(table(dat$TA)),3), 
                      table(dat$TA)[3], round(table(dat$TA)[3]/sum(table(dat$TA)),3)
                    ))

colnames(block_dist) <- c("Block","Control (N)","Control (Proportion)",
                  "Monitoring (N)","Monitoring (Proportion)",
                  "Punitive (N)","Punitive (Proportion)")

block_distcaption = "Distribution of Experimental Subjects by Randomization Block. Cells contain counts of the number and proportion of experimental subjects randomly assigned to each arm by randomization block. The 17 randomization blocks are defined by the ad's sampling stratum (New York City borough or the likely discrimination, or LD, oversample) and by treatment regime (defined as a distinct design and randomization procedure). There are three treatment regimes. Regime 1 was a 5-arm design (2 of which are not analyzed in this paper) with equal treatment assignment probabilities. Regime 2 was a 3-arm design where the probability of assignment to control was 0.5 and the probability of assignment to the monitoring or punitive conditions was 0.25. Regime 3 was a 3-arm design with equal treatment assignment probabilities. Proportions may not sum to 1 due to rounding."

# show table in Rmd output file
rownames(block_dist) <- NULL
kable(block_dist, caption="**Table A1**")
Table A1
Block Control (N) Control (Proportion) Monitoring (N) Monitoring (Proportion) Punitive (N) Punitive (Proportion)
Regime 1: 13 Apr 2012 - 9 Sep 2012
Brooklyn 14 0.021 11 0.017 17 0.026
Bronx 4 0.006 3 0.005 2 0.003
Manhattan 13 0.02 12 0.018 18 0.028
Queens 2 0.003 3 0.005 6 0.009
Staten Island 1 0.002 1 0.002 0 0
Likely Discrimination Frame 8 0.012 8 0.012 9 0.014
Regime 2: 10 Sep 2012 - 7 May 7, 2013
Brooklyn 63 0.096 24 0.037 29 0.044
Bronx 21 0.032 10 0.015 11 0.017
Manhattan 50 0.077 32 0.049 23 0.035
Queens 28 0.043 10 0.015 14 0.021
Staten Island 8 0.012 2 0.003 3 0.005
Likely Discrimination Frame 11 0.017 4 0.006 4 0.006
Regime 3: 8 May 2013 to 20 Dec 2013
Brooklyn 13 0.02 23 0.035 14 0.021
Bronx 6 0.009 3 0.005 8 0.012
Manhattan 25 0.038 18 0.028 25 0.038
Queens 8 0.012 7 0.011 14 0.021
Staten Island 4 0.006 3 0.005 3 0.005
Total 279 0.427 174 0.266 200 0.306

4.2 Table A2 (p. A-3): Incidence of Early Stage Discrimination

# A: BLACK , B: HISPANIC , C: WHITE

es <- c("contact", "sched",   "numattr", "numskep", "numpos", "numneu", 
        "numneg",  "pctskep", "pctpos",  "pctneu",  "pctneg",  "anyskep", "anyneg")

summm <- function(x, group1 = "C", group2 = "A") { 
  es1 <- paste0(x, "_", group1); es2 <- paste0(x, "_", group2) 
  
  # audit sample
  aud_maj_mean <- mean(aud[[es1]]) 
  aud_min_mean <- mean(aud[[es2]])
  aud_diff     <- as.numeric(aud_maj_mean) - as.numeric(aud_min_mean)
  aud_P        <- t.test(x=aud[[es1]], y=aud[[es2]], alternative="two.sided",
                         paired=TRUE, conf.level=0.95)$p.value
  aud_N        <- length(aud[[es1]])
  
  # experimental sample
  exp_maj_mean <- mean(aud[[es1]][aud$insamp=="1" & !is.na(aud$insamp) ])
  exp_min_mean <- mean(aud[[es2]][aud$insamp=="1" & !is.na(aud$insamp) ])
  exp_diff <-  exp_maj_mean - exp_min_mean
  exp_P <-  t.test(x=aud[[es1 ]][aud$insamp=="1" & !is.na(aud$insamp) ],
                   y=aud[[es2 ]][aud$insamp=="1" & !is.na(aud$insamp) ], alternative="two.sided", paired=TRUE, conf.level=0.95)$p.value
  exp_N <- length(aud[[es1]][aud$insamp=="1" & !is.na(aud$insamp) ]) 
  
  # control group
  ctrl_maj_mean <- mean(aud[[es1]][aud$insamp=="1"  & !is.na(aud$insamp) & aud$TA==0])
  ctrl_min_mean <- mean(aud[[es2]][aud$insamp=="1"  & !is.na(aud$insamp) & aud$TA==0])
  ctrl_diff <- ctrl_maj_mean -  ctrl_min_mean
  ctrl_P <- t.test(x=aud[[es1 ]][aud$insamp=="1" & !is.na(aud$insamp) & aud$TA==0], 
                   y=aud[[es2 ]][aud$insamp=="1"  & !is.na(aud$insamp) & aud$TA==0], alternative="two.sided", paired=TRUE, conf.level=0.95)$p.value
  ctrl_N <- length(aud[[es1]][aud$insamp=="1" & !is.na(aud$insamp)  & aud$TA==0]) 
  
  # return
  c(aud_maj_mean = aud_maj_mean, aud_min_mean = aud_min_mean,
    aud_diff = aud_diff, aud_P = aud_P, aud_N = aud_N,
    exp_maj_mean = exp_maj_mean, exp_min_mean = exp_min_mean, 
    exp_diff = exp_diff, exp_P = exp_P, exp_N = exp_N,
    ctrl_maj_mean = ctrl_maj_mean, ctrl_min_mean = ctrl_min_mean, 
    ctrl_diff = ctrl_diff, ctrl_P = ctrl_P, ctrl_N = ctrl_N)
}

wb <- cbind(es.wb.labs, t(sapply(es, summm, group1 = "C", group2 = "A")))
wh <- cbind(es.wh.labs, t(sapply(es, summm, group1 = "C", group2 = "B")))
bh <- cbind(es.bh.labs, t(sapply(es, summm, group1 = "A", group2 = "B")))
es_discrim_full_table <- rbind(wb, wh, bh)
table_a2 <- es_discrim_full_table[c(1,2,14,15,27,28),1:6]
kable(table_a2, caption="**Table A2**",
      col.names = c("Measure", "Majority Group Mean", "Minority Group Mean",
                    "Difference (Maj-Min)", "p-value", "[N]"),
      row.names = FALSE)
Table A2
Measure Majority Group Mean Minority Group Mean Difference (Maj-Min) p-value [N]
Any contact (White vs. Black) 0.512 0.524 -0.012 (0.184) [2711]
Scheduling appointment (White vs. Black) 0.348 0.361 -0.013 (0.035) [2711]
Any contact (White vs. Hispanic) 0.512 0.512 0 (0.968) [2711]
Scheduling appointment (White vs. Hispanic) 0.348 0.354 -0.006 (0.283) [2711]
Any contact (Black vs. Hispanic) 0.524 0.512 0.012 (0.189) [2711]
Scheduling appointment (Black vs. Hispanic) 0.361 0.354 0.007 (0.284) [2711]
#### F-test for the null hypothesis that race does not matter for making any contact or for scheduling appointment

# A: BLACK , B: HISPANIC , C: WHITE

aud.rs <- melt(aud[,c("cid", "contact_A", "contact_B", "contact_C", "sched_A", "sched_B", "sched_C")], id.vars="cid")

aud.rs$ttype <- sapply(strsplit(as.character(aud.rs$variable), "_", fixed=TRUE), function(x) x[2])
aud.rs$variable <- sapply(strsplit(as.character(aud.rs$variable), "_", fixed=TRUE), function(x) x[1])
aud.rs <- dcast(aud.rs, cid + ttype ~ variable, value.var="value")

fit.contact <- lm(contact ~ ttype, data = aud.rs)
fit.sched   <- lm(sched   ~ ttype, data = aud.rs)

print("F-test of the null that tester race predicts whether the tester makes any contact with the landlord")
[1] "F-test of the null that tester race predicts whether the tester makes any contact with the landlord"
print(summary(fit.contact))

Call:
lm(formula = contact ~ ttype, data = aud.rs)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.5242 -0.5124  0.4758  0.4876  0.4880 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.524161   0.009599  54.605   <2e-16 ***
ttypeB      -0.012173   0.013575  -0.897    0.370    
ttypeC      -0.011804   0.013575  -0.870    0.385    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4998 on 8130 degrees of freedom
Multiple R-squared:  0.000128,  Adjusted R-squared:  -0.000118 
F-statistic: 0.5203 on 2 and 8130 DF,  p-value: 0.5944
print("F-test of the null that tester race predicts whether the tester successfully schedules an appointment")
[1] "F-test of the null that tester race predicts whether the tester successfully schedules an appointment"
print(summary(fit.sched))

Call:
lm(formula = sched ~ ttype, data = aud.rs)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.3611 -0.3545 -0.3482  0.6389  0.6518 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.361121   0.009189  39.299   <2e-16 ***
ttypeB      -0.006640   0.012995  -0.511    0.609    
ttypeC      -0.012910   0.012995  -0.993    0.321    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4785 on 8130 degrees of freedom
Multiple R-squared:  0.0001214, Adjusted R-squared:  -0.0001246 
F-statistic: 0.4936 on 2 and 8130 DF,  p-value: 0.6104

4.3 Figure A2 (p. A-5): Cumulative Number of Cases Over Implementation Period

dat$casedate <- as.Date(dat$casedate, format="%Y-%m-%d")
casedate <- dat$casedate[order(dat$casedate)]

# generate .png
png("out_figurea2_cases_over_time.png", height=500, width=1200)
par(oma=c(2,0,3,0), xpd=TRUE)

plot(casedate, 1:length(casedate), xaxt="n", yaxt="n", type="l", ylim=c(0,700), xlab="", ylab="Cumulative Number of Cases", xaxs="i", yaxs="i")
axis.Date(side=1, at=seq(as.Date("2012-04-01"), as.Date("2014-01-01"), "month"), format="%m/%Y", las=2, cex.axis=0.9)
axis(side=2, at=seq(0,700,100), labels=seq(0,700,100), cex.axis=0.9, las=1)
polygon(x=c(rep(as.Date("2012-10-29"),2), rep(as.Date("2012-11-11"),2)),
        y=c(700,0,0,700),
        col=rgb(.8,.8,.8,.25),
        border=rgb(.8,.8,.8,.25))
text(x=as.Date("2012-11-11"), y=600, "Implementation Halted \nDue to Hurricane Sandy", cex=.9, pos=1)
mtext(side=1, "Date", outer=TRUE)

segments(x0=as.Date("2012-09-09"), y0=700, x1=as.Date("2012-09-09"), y1=0, lty=5, col="grey80")
segments(x0=as.Date("2013-05-07"), y0=700, x1=as.Date("2013-05-07"), y1=0, lty=5, col="grey80")

text(x=as.Date("2012-07-01"), y=840, "Regime 1: 5-Arm Design \nApril 13, 2012 to September 9, 2012 \nEqual Assignment Probabilities", cex=.9, pos=1)
text(x=as.Date("2013-01-15"), y=840, "Regime 2: 3-Arm Design \nSeptember 10, 2012 to May 7, 2013 \nPr(Control)=.5, Pr(Monitoring)=Pr(Punitive)=.25", cex=.9, pos=1)
text(x=as.Date("2013-09-01"), y=840, "Regime 3: 3-Arm Design \nMay 8, 2013 to December 20, 2013 \nEqual Assignment Probabilities", cex=.9, pos=1)

invisible(dev.off())
Figure A2

Figure A2

4.4 Figure A3 (p. A-6): Discrimination Levels and Treatment Effects using the Subjective Discrimination Index

Note: Analysis and figure construction code for Figure A3 is shown above in “1.1 Figure 1: Main Results on Discrimination Levels and Treatment Effects”

Figure A3: Net Discrimination in Subjective Discrimination Index

Figure A3: Net Discrimination in Subjective Discrimination Index

4.5 Figure A4 (p. A-17): Estimates of Tester Random Effects on Outcome Measures

ctx$weekday <- as.factor(weekdays(as.Date(ctx$sfa_B4_apptdate, format="%m/%d/%y")))

anova_fig <- data.frame(outcome = c("cb","off","qualpraise","sales","posedit","posbg","prof"),
                        ylab = c("Callbacks", "Offers", "Praise", "Perceived Sales Efforts", 
                                 "Positive Editorializing", "Positive Reactions", "Professionalism"),
                        title = c("Tester Random Effects on Callback Incidence", 
                                  "Tester Random Effects on Offer Incidence",
                                  "Tester Random Effects on Praise re: Qualifications to Rent", 
                                  "Tester Random Effects on Sales Efforts",
                                  "Tester Random Effects on Positive Editorializing", 
                                  "Tester Random Effects on Positive Reactions to Background",
                                  "Tester Random Effects on Professionalism"))

# MAKE GRAPHS

for(ii in 1:nrow(anova_fig)){
  model     <- lmer(paste0(anova_fig$outcome[ii], 
                       " ~ (1|tid) + (1|ttype) + callorder + weekday + partnered
                                   + numbr + anyskep + anyneg + (1|team_gender)"), data=ctx)
  model_off <- lmer(paste0(anova_fig$outcome[2], 
                           " ~ (1|tid) + (1|ttype) + callorder + weekday + partnered
                                        + numbr + anyskep + anyneg + (1|team_gender)"), data=ctx)
  
  ifelse(ii>=2, se <- se.ranef(model_off)[[1]][i,1], se <- se.ranef(model)[[1]][i,1])
  
  i <- 1:(length(ranef(model)[[1]][,1]))
  y <- ranef(model)[[1]][,1] # individual tester RE
  types <- c(rep(0,14), rep(1,19), rep(2,15))
  col   <- rainbow(3)[types+1]

  # output PNG file for Rmd output
  png(paste0("out_figurea4",letters[ii],"_anova_",anova_fig$outcome[ii],".png"), width=800)
  par(0,0,0,0)
  plot(i, y, col = col, ylim=c(-1.0,1),xlim=c(1,50),xlab="Testers", ylab=anova_fig$ylab[ii],main=anova_fig$title[ii],pch=20,bty="n")
  legend("bottom", c("Black","Hispanic","White"), pch=20, col=rainbow(3),horiz=T,bty="n")
  segments(i, y + 2*se, i, y - 2*se,  col = col)
  abline(a=0,b=0,lty=2)
  invisible(dev.off())
}
Figure A4, panel a: Post-Visit Callback

Figure A4, panel a: Post-Visit Callback

Figure A4, panel b: Post-Visit Offer

Figure A4, panel b: Post-Visit Offer

Figure A4, panel c: Positive Biographical Feedback

Figure A4, panel c: Positive Biographical Feedback

Figure A4, panel d: Positive Editorializing

Figure A4, panel d: Positive Editorializing

Figure A4, panel e: Praised Rental Qualifications

Figure A4, panel e: Praised Rental Qualifications

Figure A4, panel f: Sales Efforts

Figure A4, panel f: Sales Efforts

Figure A4, panel g: Professionalism

Figure A4, panel g: Professionalism

### The following analysis corresponds to the analyses described in footnote 22 in the article.
ctx$weekday <- as.factor(weekdays(as.Date(ctx$sfa_B4_apptdate, format="%m/%d/%y")))
dat_tid <- dat[,c("cid","TA",names(dat)[grepl("tid",names(dat))])]
dat_tid <- cbind(dat_tid[,c("cid","TA")], t(apply(dat_tid[,3:ncol(dat_tid)], 1, function(J) names(dat_tid[,3:ncol(dat_tid)])[which(J==1)])))

dat_tid_1 <- as.data.frame(dat_tid[,c(1,2,3)], stringsAsFactors=FALSE)
dat_tid_2 <- as.data.frame(dat_tid[,c(1,2,4)], stringsAsFactors=FALSE)
dat_tid_3 <- as.data.frame(dat_tid[,c(1,2,5)], stringsAsFactors=FALSE)

dat_tid_1[,3] <- as.character(dat_tid_1[,3])
dat_tid_2[,3] <- as.character(dat_tid_2[,3])
dat_tid_3[,3] <- as.character(dat_tid_3[,3])
names(dat_tid_1)[3] <- names(dat_tid_2)[3] <- names(dat_tid_3)[3] <- "tid"
dat_tid <- rbind(dat_tid_1, dat_tid_2, dat_tid_3)
dat_tid$tid <- gsub("tid.","",dat_tid$tid,fixed=TRUE)
dat_tid$ttype <- substr(dat_tid$tid, 1, 1)

dat_sub <- dat[,c("cid","TA","team_gender","callorder","partnered","numbr",
       paste0(rep(c("anyskep","anyneg","cb","off","prof","posedit","posbg","qualpraise","sales"), 3),
              rep(c("_A","_B","_C"), each=9))
       )]

dmelt <- melt(dat_sub, id.vars=c("cid", "TA", "team_gender", "callorder", "partnered", "numbr"))
dmelt$ttype <- ifelse(grepl("_A", dmelt$variable, fixed=TRUE), "A",
                      ifelse(grepl("_B", dmelt$variable, fixed=TRUE), "B", "C"))
dmelt$variable <- gsub("_A", "", dmelt$variable, fixed=TRUE)
dmelt$variable <- gsub("_B", "", dmelt$variable, fixed=TRUE)
dmelt$variable <- gsub("_C", "", dmelt$variable, fixed=TRUE)
dat_sub2 <- dcast(dmelt, cid + TA + team_gender + callorder + partnered + numbr + ttype ~ variable, value.var="value")
dat2 <- join(dat_tid, dat_sub2, by=c("cid","TA","ttype"), type="left", match="all")

ctx_sub <- ctx[,c("cid","weekday")]
ctx_sub <- ctx_sub[!duplicated(ctx_sub$cid),]
#ctx[ctx$cid %in% ctx_sub$cid[is.na(ctx_sub$weekday)] & !is.na(ctx$weekday),c("cid","weekday")]
ctx_sub$weekday <- ifelse(ctx_sub$cid == 11108, "Wednesday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(ctx_sub$cid == 12116,    "Friday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(ctx_sub$cid == 12608,    "Monday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(ctx_sub$cid == 13361,    "Monday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(ctx_sub$cid == 15734,    "Friday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(ctx_sub$cid == 19446,  "Thursday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(ctx_sub$cid == 26324,    "Friday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(ctx_sub$cid == 30165,    "Friday", ctx_sub$weekday)
ctx_sub$weekday <- ifelse(is.na(ctx_sub$weekday), "None", ctx_sub$weekday)

dat3 <- join(dat2, ctx_sub, by="cid", type="left", match="all")

model.callback.0 <- lmer(cb ~ (1|tid) + (1|ttype) + callorder + weekday + 
                         partnered + numbr + anyskep + anyneg + (1|team_gender), 
                       data=dat3, subset=(TA==0))

model.offer.0 <- lmer(off ~ (1|tid) + (1|ttype) + callorder + weekday + 
                      partnered + numbr + anyskep + anyneg + (1|team_gender), 
                    data=dat3, subset=(TA==0))

model.callback.all <- lmer(cb ~ (1|tid) + (1|ttype) + callorder + weekday + 
                           partnered + numbr + anyskep + anyneg + (1|team_gender), 
                         data=dat3)

model.offer.all <- lmer(off ~ (1|tid) + (1|ttype) + callorder + weekday + 
                        partnered + numbr + anyskep + anyneg + (1|team_gender), 
                      data=dat3)

re_out <- list(model.callback.0   , 
               model.offer.0      ,
               model.callback.all ,
               model.offer.all    )

re_table <- lapply(re_out, function(J){
  x <- as.data.frame(VarCorr(J))
  x[,4] <- round(x[,4], 4)
  x <- x[,c(1,4)]
  return(x)
})

re_table <- do.call(cbind, re_table)
re_table <- re_table[,-c(3,5,7)]
re_table[,1] <- c("Estimated Variance of Varying Tester Intercepts",
                  "Estimated Variance of Varying Tester Race Intercepts",
                  "Estimated Variance of Varying Tester Team Gender Intercepts",
                  "Estimated Residual Variance")
colnames(re_table) <- c("", "Control Group, Outcome: Callback", "Control Group, Outcome: Offer",
                        "Experimental Sample, Outcome: Callback", "Experimental Sample, Outcome: Offer")

kable(re_table)
Control Group, Outcome: Callback Control Group, Outcome: Offer Experimental Sample, Outcome: Callback Experimental Sample, Outcome: Offer
Estimated Variance of Varying Tester Intercepts 0.0034 0.0046 0.0035 0.0048
Estimated Variance of Varying Tester Race Intercepts 0.0002 0.0000 0.0003 0.0000
Estimated Variance of Varying Tester Team Gender Intercepts 0.0010 0.0000 0.0010 0.0000
Estimated Residual Variance 0.1408 0.0759 0.1352 0.0740

4.6 Table A4 (p. A-20): Selected Characteristics of Housing Units in the Audit and Experimental Samples

We summarize selected pre-treatment characteristics of housing units in the audit and experimental samples. To ensure the measures are pre-treatment quantities, we summarize characteristics of housing units that are posted on and scraped from the original Craigslist ads.

# Helper functions and code to calculate descriptive statistics
stratum_labels <- c("Total", "Brooklyn", "Bronx", "Manhattan", "Queens", "Staten Island", "Likely Discrimination Frame")


# descriptive_f: A function to compute descriptive statistics e.g. mean, sd, min, max..
calc_descriptives <- function(descriptive_f, df, x, label, digs = 2) {
  x <- c(descriptive_f(df[[x]], na.rm=TRUE),
              descriptive_f(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brk" ], na.rm=TRUE),
              descriptive_f(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brx" ], na.rm=TRUE),
              descriptive_f(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "mnh" ], na.rm=TRUE),
              descriptive_f(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "que" ], na.rm=TRUE),
              descriptive_f(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "stn" ], na.rm=TRUE),
              descriptive_f(df[[x]][ df[["frame"]] != "representative" ], na.rm=TRUE))
  
  out <- round(x, digs)
  
 if(identical(descriptive_f, sd))  out <- paste0("(", out, ")")
  out
}


calc_pct_ses <- function(df, x, label, digs=NULL) {
  
  x_prop <- c(mean(df[[x]], na.rm=TRUE),
              mean(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brk" ], na.rm=TRUE),
              mean(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brx" ], na.rm=TRUE),
              mean(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "mnh" ], na.rm=TRUE),
              mean(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "que" ], na.rm=TRUE),
              mean(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "stn" ], na.rm=TRUE),
              mean(df[[x]][ df[["frame"]] != "representative" ], na.rm=TRUE))
  
  x_sd <- c(sd(df[[x]], na.rm=TRUE),
            sd(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brk" ], na.rm=TRUE),
            sd(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brx" ], na.rm=TRUE),
            sd(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "mnh" ], na.rm=TRUE),
            sd(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "que" ], na.rm=TRUE),
            sd(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "stn" ], na.rm=TRUE),
            sd(df[[x]][ df[["frame"]] != "representative" ], na.rm=TRUE))
  
  x_n <- c(sum(!is.na(df[[x]])),
           sum(!is.na(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brk" ])),
           sum(!is.na(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "brx" ])),
           sum(!is.na(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "mnh" ])),
           sum(!is.na(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "que" ])),
           sum(!is.na(df[[x]][ df[["frame"]] == "representative" & df[["boro"]] == "stn" ])),
           sum(!is.na(df[[x]][ df[["frame"]] != "representative" ]))
           )
  
  x_se <- x_sd / sqrt(x_n)
  
  if(is.null(digs)) digs <- 2
  x_pct <- sprintf( paste0("%.",digs,"f") , round(x_prop * 100, digs) )
  x_se <- paste0("(", sprintf( paste0("%.",digs,"f") , round(x_se * 100, digs) ) ,")")
  
  x <- cbind(x_pct, x_se)
  colnames(x) <- paste0(label, c("_p", "_se"))
  rownames(x) <- NULL
  return(x)
}

#----------------------------------------#
# Panel A: Number of units

panel_a_aud_n <- c(nrow(aud),
                   table(aud$boro[aud$frame == "representative"]),
                   nrow(aud[aud$frame != "representative",]))

panel_a_aud_p <- paste0("(", sprintf("%.2f", round(panel_a_aud_n / nrow(aud) * 100, 2))  ,")")

panel_a_exp_n <- c(nrow(dat),
                   table(dat$boro[dat$frame == "representative"]),
                   nrow(dat[dat$frame != "representative",]))
panel_a_exp_p <- paste0("(", sprintf("%.2f", round(panel_a_exp_n / nrow(dat) * 100, 2))  ,")")

panel_a_trt_n <- c(nrow(dat[dat$TA != 0,]),
                   table(dat$boro[dat$frame == "representative" & dat$TA != 0]),
                   nrow(dat[dat$frame != "representative" & dat$TA != 0,]))
panel_a_trt_p <- paste0("(", sprintf("%.2f", round(panel_a_trt_n / nrow(dat[dat$TA != 0,]) * 100, 2)) ,")")

panel_a_con_n <- c(nrow(dat[dat$TA == 0,]),
                   table(dat$boro[dat$frame == "representative" & dat$TA == 0]),
                   nrow(dat[dat$frame != "representative" & dat$TA == 0,]))
panel_a_con_p <- paste0("(", sprintf("%.2f", round(panel_a_con_n / nrow(dat[dat$TA == 0,]) * 100, 2)) ,")")

tab_a4_panel_a <- cbind(stratum_labels,
                        panel_a_aud_n,
                        panel_a_aud_p,
                        panel_a_exp_n,
                        panel_a_exp_p,
                        panel_a_trt_n,
                        panel_a_trt_p,
                        panel_a_con_n,
                        panel_a_con_p)

rownames(tab_a4_panel_a) <- NULL
colnames(tab_a4_panel_a) <- gsub("panel_a_", "", colnames(tab_a4_panel_a), fixed=TRUE)

kable(tab_a4_panel_a, col.names=c("Stratum", "Audit Sample (N)", "Audit Sample (%)",
                                  "Exp Sample (N)", "Exp Sample (%)",
                                  "Any Treatment (N)", "Any Treatment (%)",
                                  "Control (N)", "Control (%)"),
      caption="**Table A4, Panel A. Number of Units (Scraped from Craigslist).**")
Table A4, Panel A. Number of Units (Scraped from Craigslist).
Stratum Audit Sample (N) Audit Sample (%) Exp Sample (N) Exp Sample (%) Any Treatment (N) Any Treatment (%) Control (N) Control (%)
Total 2711 (100.00) 653 (100.00) 374 (100.00) 279 (100.00)
Brooklyn 801 (29.55) 208 (31.85) 118 (31.55) 90 (32.26)
Bronx 337 (12.43) 68 (10.41) 37 (9.89) 31 (11.11)
Manhattan 668 (24.64) 216 (33.08) 128 (34.22) 88 (31.54)
Queens 495 (18.26) 92 (14.09) 54 (14.44) 38 (13.62)
Staten Island 254 (9.37) 25 (3.83) 12 (3.21) 13 (4.66)
Likely Discrimination Frame 156 (5.75) 44 (6.74) 25 (6.68) 19 (6.81)
#----------------------------------------#
# Panel B: Monthly asking price (mean)
tab_a4_panel_b <- cbind(stratum_labels,
                    calc_descriptives(mean, df = aud,               x = "rent", label = "aud"),
                    calc_descriptives(sd,   df = aud,               x = "rent", label = "aud"),
                    calc_descriptives(mean, df = dat,               x = "rent", label = "exp"),
                    calc_descriptives(sd,   df = dat,               x = "rent", label = "exp"),
                    calc_descriptives(mean, df = dat[dat$TA != 0,], x = "rent", label = "trt"),
                    calc_descriptives(sd,   df = dat[dat$TA != 0,], x = "rent", label = "trt"),
                    calc_descriptives(mean, df = dat[dat$TA == 0,], x = "rent", label = "con"),
                    calc_descriptives(sd,   df = dat[dat$TA == 0,], x = "rent", label = "con"))



kable(tab_a4_panel_b, col.names=c("Stratum", "Audit Sample (Mean)", "Audit Sample (SD)",
                                  "Exp Sample (Mean)", "Exp Sample (SD)",
                                  "Any Treatment (Mean)", "Any Treatment (SD)",
                                  "Control (Mean)", "Control (SD)"),
      caption="**Table A4, Panel B. Monthly Asking Rental Price ($, Scraped from Craigslist).**")
Table A4, Panel B. Monthly Asking Rental Price ($, Scraped from Craigslist).
Stratum Audit Sample (Mean) Audit Sample (SD) Exp Sample (Mean) Exp Sample (SD) Any Treatment (Mean) Any Treatment (SD) Control (Mean) Control (SD)
Total 2340 (4721.74) 2429.74 (1208.02) 2411.23 (1163.67) 2456.26 (1271.73)
Brooklyn 2190.57 (1049.32) 2319.49 (956.68) 2285.06 (854.67) 2370.2 (1096.14)
Bronx 2345.55 (13616.75) 1545.64 (688.14) 1506.96 (675.92) 1596.53 (719.2)
Manhattan 3251.67 (1600.49) 3162.99 (1447.25) 3133.9 (1404.18) 3206.12 (1520.28)
Queens 1718.19 (562.32) 1885.07 (557.62) 1846.75 (449.9) 1940.82 (693.15)
Staten Island 1383.49 (511.19) 1335.62 (596.31) 1368.89 (655.17) 1292.86 (559.34)
Likely Discrimination Frame 2479.07 (1451.93) 2321.16 (736.25) 2336.08 (778.38) 2301.53 (697.4)
#----------------------------------------#
# Panel C: Monthly asking price (median)

tab_a4_panel_c <- cbind(stratum_labels,
                    calc_descriptives(median, df = aud, x = "rent", label = "aud"),
                    "",
                    calc_descriptives(median, df = dat, x = "rent", label = "exp"),
                    "",
                    calc_descriptives(median, df = dat[dat$TA != 0,], x = "rent", label = "trt"),
                    "",
                    calc_descriptives(median, df = dat[dat$TA == 0,], x = "rent", label = "con"),
                     "")


kable(tab_a4_panel_c, col.names=c("Stratum", "Audit Sample (Median)", "",
                                  "Exp Sample (Median)", "",
                                  "Any Treatment (Median)", "",
                                  "Control (Median)", ""),
      caption="**Table A4, Panel C. Median Monthly Asking Rental Price ($, Scraped from Craigslist).**")
Table A4, Panel C. Median Monthly Asking Rental Price ($, Scraped from Craigslist).
Stratum Audit Sample (Median) Exp Sample (Median) Any Treatment (Median) Control (Median)
Total 1850 2197.5 2200 2025
Brooklyn 1983.5 2200 2299 2050
Bronx 1325 1400 1400 1400
Manhattan 2950 2897.5 2900 2872.5
Queens 1600 1850 1850 1850
Staten Island 1300 1100 1195 1100
Likely Discrimination Frame 2272.5 2249.5 2200 2300
#----------------------------------------#
# Panel D: Number of bedrooms

tab_a4_panel_d <- cbind(stratum_labels,
                    calc_descriptives(mean, df = aud,               x = "numbr", label = "aud"),
                    calc_descriptives(sd,   df = aud,               x = "numbr", label = "aud"),
                    calc_descriptives(mean, df = dat,               x = "numbr", label = "exp"),
                    calc_descriptives(sd,   df = dat,               x = "numbr", label = "exp"),
                    calc_descriptives(mean, df = dat[dat$TA != 0,], x = "numbr", label = "trt"),
                    calc_descriptives(sd,   df = dat[dat$TA != 0,], x = "numbr", label = "trt"),
                    calc_descriptives(mean, df = dat[dat$TA == 0,], x = "numbr", label = "con"),
                    calc_descriptives(sd,   df = dat[dat$TA == 0,], x = "numbr", label = "con"))

kable(tab_a4_panel_d, col.names=c("Stratum", "Audit Sample (Mean)", "Audit Sample (SD)",
                                  "Exp Sample (Mean)", "Exp Sample (SD)",
                                  "Any Treatment (Mean)", "Any Treatment (SD)",
                                  "Control (Mean)", "Control (SD)"),
      caption="**Table A4, Panel D. Number of Bedrooms (Scraped from Craigslist).**")
Table A4, Panel D. Number of Bedrooms (Scraped from Craigslist).
Stratum Audit Sample (Mean) Audit Sample (SD) Exp Sample (Mean) Exp Sample (SD) Any Treatment (Mean) Any Treatment (SD) Control (Mean) Control (SD)
Total 0.94 (1.22) 0.88 (1.19) 0.85 (1.17) 0.92 (1.21)
Brooklyn 1 (1.23) 1.01 (1.3) 0.97 (1.26) 1.07 (1.36)
Bronx 0.82 (1.2) 0.85 (1.16) 0.86 (1.23) 0.84 (1.1)
Manhattan 0.85 (1.15) 0.69 (1.04) 0.63 (0.98) 0.76 (1.13)
Queens 0.81 (1.15) 0.59 (0.9) 0.56 (0.9) 0.63 (0.91)
Staten Island 0.89 (1.29) 0.8 (1.5) 1.08 (1.83) 0.54 (1.13)
Likely Discrimination Frame 1.88 (1.23) 1.95 (1.01) 1.92 (1) 2 (1.05)
#----------------------------------------#
# Panel E: Square footage

tab_a4_panel_e <- cbind(stratum_labels,
                    calc_descriptives(mean, df = aud,               x = "sqft", label = "aud"),
                    calc_descriptives(sd,   df = aud,               x = "sqft", label = "aud"),
                    calc_descriptives(mean, df = dat,               x = "sqft", label = "exp"),
                    calc_descriptives(sd,   df = dat,               x = "sqft", label = "exp"),
                    calc_descriptives(mean, df = dat[dat$TA != 0,], x = "sqft", label = "trt"),
                    calc_descriptives(sd,   df = dat[dat$TA != 0,], x = "sqft", label = "trt"),
                    calc_descriptives(mean, df = dat[dat$TA == 0,], x = "sqft", label = "con"),
                    calc_descriptives(sd,   df = dat[dat$TA == 0,], x = "sqft", label = "con"))

kable(tab_a4_panel_e, col.names=c("Stratum", "Audit Sample (Mean)", "Audit Sample (SD)",
                                  "Exp Sample (Mean)", "Exp Sample (SD)",
                                  "Any Treatment (Mean)", "Any Treatment (SD)",
                                  "Control (Mean)", "Control (SD)"),
      caption="**Table A4, Panel E. Square Footage (Scraped from Craigslist).** Standard invisible(dev.off())iation is noted as NA when only one observation in a given stratum has non-missing square footage data. Square footage information is rarely posted on Craigslist rental ads in New York City.")
Table A4, Panel E. Square Footage (Scraped from Craigslist). Standard invisible(dev.off())iation is noted as NA when only one observation in a given stratum has non-missing square footage data. Square footage information is rarely posted on Craigslist rental ads in New York City.
Stratum Audit Sample (Mean) Audit Sample (SD) Exp Sample (Mean) Exp Sample (SD) Any Treatment (Mean) Any Treatment (SD) Control (Mean) Control (SD)
Total 1504.7 (8050.42) 1017.49 (448.09) 915 (449.36) 1116.68 (430.93)
Brooklyn 3016.84 (16292.51) 1125.94 (638.07) 1085.11 (661.49) 1171.88 (652.64)
Bronx 999.59 (352.42) 1328.75 (453.84) 1100 (NA) 1405 (523.52)
Manhattan 1029.82 (495.06) 926.79 (350.32) 777.43 (334.36) 1135.9 (262.42)
Queens 930.39 (774.61) 917.6 (260.29) 866.67 (288.68) 994 (291.33)
Staten Island 1203.25 (981.19) 1233.33 (305.51) 1300 (NA) 1200 (424.26)
Likely Discrimination Frame 1011.5 (568.51) 885 (272.45) 900 (141.42) 880 (315.91)
#----------------------------------------#
# Panel F: Listed by broker

tab_a4_panel_f <- cbind(stratum_labels,
                        calc_pct_ses(df = aud,                x = "broker", label = "aud"),
                        calc_pct_ses(df = dat,                x = "broker", label = "exp"),
                        calc_pct_ses(df = dat[dat$TA != 0,],  x = "broker", label = "trt"),
                        calc_pct_ses(df = dat[dat$TA == 0,],  x = "broker", label = "con"))

kable(tab_a4_panel_f, col.names=c("Stratum", "Audit Sample (%)", "Audit Sample (SE)",
                                  "Exp Sample (%)", "Exp Sample (SE)",
                                  "Any Treatment (%)", "Any Treatment (SE)",
                                  "Control (%)", "Control (SE)"),
      caption="**Table A4, Panel E. Listed by Broker (Scraped from Craigslist).**")
Table A4, Panel E. Listed by Broker (Scraped from Craigslist).
Stratum Audit Sample (%) Audit Sample (SE) Exp Sample (%) Exp Sample (SE) Any Treatment (%) Any Treatment (SE) Control (%) Control (SE)
Total 57.29 (0.95) 84.53 (1.42) 83.42 (1.93) 86.02 (2.08)
Brooklyn 56.43 (1.75) 85.58 (2.44) 86.44 (3.17) 84.44 (3.84)
Bronx 51.04 (2.73) 75.00 (5.29) 72.97 (7.40) 77.42 (7.63)
Manhattan 65.72 (1.84) 88.89 (2.14) 87.50 (2.93) 90.91 (3.08)
Queens 51.72 (2.25) 81.52 (4.07) 79.63 (5.53) 84.21 (5.99)
Staten Island 52.36 (3.14) 56.00 (10.13) 41.67 (14.86) 69.23 (13.32)
Likely Discrimination Frame 64.74 (3.84) 95.45 (3.18) 92.00 (5.54) 100.00 (0.00)
#----------------------------------------#
# Min/Max Advertised Monthly Rental Price 
# (Reported in online SI text, page A-19)

min_max_monthly_rent <- cbind(stratum_labels,
                          calc_descriptives(min, df = aud,                 x = "rent", label = "aud"),
                          calc_descriptives(max, df = aud,                 x = "rent", label = "aud"),
                          calc_descriptives(min, df = dat,                 x = "rent", label = "exp"),
                          calc_descriptives(max, df = dat,                 x = "rent", label = "exp"),
                          calc_descriptives(min, df = dat[dat$TA != 0,],   x = "rent", label = "trt"),
                          calc_descriptives(max, df = dat[dat$TA != 0,],   x = "rent", label = "trt"),
                          calc_descriptives(min, df = dat[dat$TA == 0,],   x = "rent", label = "con"),
                          calc_descriptives(max, df = dat[dat$TA == 0,],   x = "rent", label = "con"))

kable(min_max_monthly_rent, col.names=c("Stratum", "Audit Sample (Min)", "Audit Sample (Max)",
                                        "Exp Sample (Min)", "Exp Sample (Max)",
                                        "Any Treatment (Min)", "Any Treatment (Max)",
                                        "Control (Min)", "Control (Max)"),
      caption="**Mininum and Maximum Advertised Monthly Rental Price (Reported in online SI, page A-19)**")
Mininum and Maximum Advertised Monthly Rental Price (Reported in online SI, page A-19)
Stratum Audit Sample (Min) Audit Sample (Max) Exp Sample (Min) Exp Sample (Max) Any Treatment (Min) Any Treatment (Max) Control (Min) Control (Max)
Total 160 2e+05 160 9495 160 9495 850 8900
Brooklyn 175 14000 950 6400 1099 5300 950 6400
Bronx 160 2e+05 160 3999 160 3500 898 3999
Manhattan 400 13000 1095 9495 1095 9495 1400 8900
Queens 450 5000 835 4000 835 2750 1100 4000
Staten Island 600 4000 750 2800 750 2800 850 2300
Likely Discrimination Frame 825 15000 1100 4150 1100 4150 1250 3395

4.7 Table A5 (p. A-21): Distribution of Rental Units Across Boroughs, by Sample

# Note: Citywide 2011 numbers are from Table 5 in 
#  http://www.nyc.gov/html/hpd/downloads/pdf/HPD-2011-HVS-Selected-Findings-Tables.pdf
# For an archived version of this document available via the Internet Archive Wayback Machine, see
#  http://web.archive.org/web/20131102065647/http://www.nyc.gov/html/hpd/downloads/pdf/HPD-2011-HVS-Selected-Findings-Tables.pdf 
# A PDF copy of this document is also included in the replication archive

tab.city.n <- c(691178, 388022, 587313, 449108, 57013)
tab.city.pct <- c(31.81, 17.86, 27.03, 20.67, 2.62)

db.aud <- ddply(aud[aud$frame != "likely disc",], .(boro), summarise,
                num = length(cid),
                pct = round(length(cid)/nrow(aud[aud$frame != "likely disc",])*100, 2))

db.exp <- ddply(dat[dat$frame != "likely disc",], .(boro), summarise,
                num = length(cid),
                pct = round(length(cid)/nrow(dat[dat$frame != "likely disc",])*100, 2))

db.con <- ddply(dat[dat$frame != "likely disc" & dat$TA==0,], .(boro), summarise,
                num = length(cid),
                pct = round(length(cid)/nrow(dat[dat$frame != "likely disc" & dat$TA==0,])*100, 2))

colnames(db.aud) <- c("boro", "Audit Sample (N)", "Audit Sample (%)")
colnames(db.exp) <- c("boro", "Exp Sample (N)", "Exp Sample (%)")
colnames(db.con) <- c("boro", "Control Group (N)", "Control Group (%)")

tab.db <- join(db.aud, db.exp, by="boro", type="left", match="all")
tab.db <- join(tab.db, db.con, by="boro", type="left", match="all")

tab.db <- cbind(tab.db[,1],
                tab.city.n,
                tab.city.pct,
                tab.db[,2:ncol(tab.db)])

tab.db[,1] <- c("Brooklyn", "Bronx", "Manhattan", "Queens", "Staten Island")
colnames(tab.db)[1:3] <- c("Borough", "Citywide 2011 (N)", "Citywide 2011 (%)")

tab.db <- rbind(tab.db,
                c("Total", round(apply(tab.db[,2:ncol(tab.db)],2,sum), 0)))


kable(tab.db, caption="**Table A5**")
Table A5
Borough Citywide 2011 (N) Citywide 2011 (%) Audit Sample (N) Audit Sample (%) Exp Sample (N) Exp Sample (%) Control Group (N) Control Group (%)
Brooklyn 691178 31.81 801 31.35 208 34.15 90 34.62
Bronx 388022 17.86 337 13.19 68 11.17 31 11.92
Manhattan 587313 27.03 668 26.14 216 35.47 88 33.85
Queens 449108 20.67 495 19.37 92 15.11 38 14.62
Staten Island 57013 2.62 254 9.94 25 4.11 13 5
Total 2172634 100 2555 100 609 100 260 100

4.8 Figure A5 (p. A-22): Map of the Geographic Distribution of Housing Units Corresponding to Advertised Listings

We omit the replication data and code to produce Figure A5, the map of the geographic distribution of rental units in the audit and experimental samples, because these require non-anonymized data that contain personal identifying information about the subjects.

4.9 Table A6 (p. A-24): Baseline Incidence of Discrimination: In-Person and Post-Visit

outvars <- c("meet", "sales", "qualpraise", "posbg", "posedit", "prof", "cb", "off")

summm <- function(outvar, group1 = "C", group2 = "A"){
  outvar1 <- paste0(outvar, "_", group1)
  outvar2 <- paste0(outvar, "_", group2)
  ctrl_maj_mean <- mean(as.numeric(dat[[outvar1]][dat$TA==0]), na.rm=TRUE) # majority mean
  ctrl_min_mean <- mean(as.numeric(dat[[outvar2]][dat$TA==0]), na.rm=TRUE) # minority mean
  ctrl_diff <- ctrl_maj_mean  - ctrl_min_mean  # difference
  if(outvar %in% c("meet", "cb", "off")){
    ctrl_P <- t.test(x=as.numeric(dat[[outvar1]][dat$TA==0]),
                     y=as.numeric(dat[[outvar2]][dat$TA==0]),
                     alternative="two.sided",
                     paired=TRUE,
                     conf.level=0.95)$p.value # p value from t-test (two-sided)
  } else {
    ctrl_P <- t.test(x = as.numeric(dat[[outvar1]][dat$TA==0]),
                     y = as.numeric(dat[[outvar2]][dat$TA==0]),
                     alternative = "two.sided",
                     paired = TRUE,
                     conf.level = 0.95)$p.value # p value from t-test (two-sided)
  }
  
  
  ctrl_N <- ifelse(length(dat[[outvar1]][dat$TA==0 & !is.na(dat[[outvar1]])])==length(as.numeric(dat[[outvar2]][dat$TA==0 & !is.na(dat[[outvar2]])])),
                   length(dat[[outvar1]][dat$TA==0 & !is.na(dat[[outvar1]])]),
                   length(dat[[outvar1]][dat$TA==0 & !( is.na(dat[[outvar1]]) & is.na(dat[[outvar2]]) )])  )# sample size
  
  ## ANY TREATMENT (cols 7-11)
  any_maj_mean <- mean(as.numeric(dat[[outvar1]][dat$TA %in% c(1,2)]), na.rm=TRUE)  # majority mean
  any_min_mean <- mean(as.numeric(dat[[outvar2]][dat$TA %in% c(1,2)]), na.rm=TRUE)  # minority mean
  any_diff <- any_maj_mean - any_min_mean  # difference
  if(outvar %in% c("meet", "cb", "off")){
    any_P <- t.test(x=as.numeric(dat[[outvar1]][dat$TA %in% c(1,2)]),
                    y=as.numeric(dat[[outvar2]][dat$TA %in% c(1,2)]),
                    alternative="two.sided",
                    paired=TRUE,
                    conf.level=0.95)$p.value # p value from t-test (two-sided)
  } else {
    any_P <- t.test(x=as.numeric(dat[[outvar1]][dat$TA %in% c(1,2)]),
                    y=as.numeric(dat[[outvar2]][dat$TA %in% c(1,2)]),
                    alternative="two.sided",
                    paired=FALSE,
                    conf.level=0.95)$p.value # p value from t-test (two-sided)    
  }
  any_N <- ifelse(length(dat[[outvar1]][dat$TA%in%c(1,2) & !is.na(dat[[outvar1]])])==length(as.numeric(dat[[outvar2]][dat$TA%in%c(1,2) & !is.na(dat[[outvar2]])])),
                  length(dat[[outvar1]][dat$TA%in%c(1,2) & !is.na(dat[[outvar1]])]),
                  length(dat[[outvar1]][dat$TA%in%c(1,2) & !( is.na(dat[[outvar1]]) & is.na(dat[[outvar2]]) )]) )# sample size
  
  ## EXPERIMENT SAMPLE (cols 12-16)
  exp_maj_mean <- mean(as.numeric(dat[[outvar1]][dat$TA %in% c(0,1,2)]), na.rm=TRUE)  # majority mean
  exp_min_mean <- mean(as.numeric(dat[[outvar2]][dat$TA %in% c(0,1,2)]), na.rm=TRUE) # minority mean
  exp_diff <-exp_maj_mean  - exp_min_mean  # difference
  if(outvar %in% c("meet", "cb", "off")){
    exp_P <- t.test(x=as.numeric(dat[[outvar1]][dat$TA %in% c(0,1,2)]),
                    y=as.numeric(dat[[outvar2]][dat$TA %in% c(0,1,2)]),
                    alternative="two.sided",
                    paired=TRUE,
                    conf.level=0.95)$p.value # p value from t-test (two-sided)
  } else {
    exp_P <- t.test(x=as.numeric(dat[[outvar1]][dat$TA %in% c(0,1,2)]),
                    y=as.numeric(dat[[outvar2]][dat$TA %in% c(0,1,2)]),
                    alternative="two.sided",
                    paired=FALSE,
                    conf.level=0.95)$p.value # p value from t-test (two-sided)    
  }
  exp_N <-  ifelse(length(dat[[outvar1]][dat$TA%in%c(0,1,2) & !is.na(dat[[outvar1]])])==length(as.numeric(dat[[outvar2]][dat$TA%in%c(0,1,2) & !is.na(dat[[outvar2]])])),
                   length(dat[[outvar1]][dat$TA%in%c(0,1,2) & !is.na(dat[[outvar1]])]),
                   length(dat[[outvar1]][dat$TA%in%c(0,1,2) & !( is.na(dat[[outvar1]]) & is.na(dat[[outvar2]]))])) # sample size

     c(ctrl_maj_mean, ctrl_min_mean, ctrl_diff, ctrl_P, ctrl_N, 
       any_maj_mean,  any_min_mean,  any_diff,  any_P,  any_N,
       exp_maj_mean,  exp_min_mean,  exp_diff,  exp_P,  exp_N)
  
  }


wb <- cbind( c("Showed up to appointment (White vs. Black)",                         
               "Perceived sales efforts (White vs. Black)",                          
               "Received praise about rental qualifications (White vs. Black)",      
               "Positive reactions to testers' background (White vs. Black)",        
               "Positive editorializing (White vs. Black)",                          
               "Professionalism (White vs. Black)",                                  
               "Received post-visit callback (White vs. Black)",                     
               "Received post-visit offer for unit (White vs. Black)",               
               "Index measure of favorable in-person interactions (White vs. Black)"), 
            rbind(t(sapply(outvars, summm, group1 = "C", group2 = "A")), "") )

wh <- cbind(c("Showed up to appointment (White vs. Hispanic)",                         
               "Perceived sales efforts (White vs. Hispanic)",                          
               "Received praise about rental qualifications (White vs. Hispanic)",      
               "Positive reactions to testers' background (White vs. Hispanic)",        
               "Positive editorializing (White vs. Hispanic)",                          
               "Professionalism (White vs. Hispanic)",                                  
               "Received post-visit callback (White vs. Hispanic)",                     
               "Received post-visit offer for unit (White vs. Hispanic)",               
               "Index measure of favorable in-person interactions (White vs. Hispanic)"),
             rbind(t(sapply(outvars, summm, group1 = "C", group2 = "B")), ""))

bh <- cbind( c("Showed up to appointment (Black vs. Hispanic)",                         
               "Perceived sales efforts (Black vs. Hispanic)",                          
               "Received praise about rental qualifications (Black vs. Hispanic)",      
               "Positive reactions to testers' background (Black vs. Hispanic)",        
               "Positive editorializing (Black vs. Hispanic)",                          
               "Professionalism (Black vs. Hispanic)",                                  
               "Received post-visit callback (Black vs. Hispanic)",                     
               "Received post-visit offer for unit (Black vs. Hispanic)",               
               "Index measure of favorable in-person interactions (Black vs. Hispanic)"), 
              rbind(t(sapply(outvars, summm, group1 = "A", group2 = "B")), ""))
 

## CONTROL GROUP
wb[nrow(wb),4] <- mean(as.numeric(dat[[net.ind.mat[1]]][dat$TA==0]), na.rm=TRUE) # difference
wb[nrow(wb),6] <- length(as.numeric(dat[[net.ind.mat[1]]][dat$TA%in%c(0) & !is.na(dat[[net.ind.mat[1]]])])) # sample size
## ANY TREATMENT
wb[nrow(wb),9] <- mean(as.numeric(dat[[net.ind.mat[1]]][dat$TA%in%c(1,2)]), na.rm=TRUE) # difference
wb[nrow(wb),11] <- length(as.numeric(dat[[net.ind.mat[1]]][dat$TA%in%c(1,2) & !is.na(dat[[net.ind.mat[1]]])])) # sample size
## EXPERIMENT SAMPLE
wb[nrow(wb),14] <- mean(as.numeric(dat[[net.ind.mat[1]]][dat$TA%in%c(0,1,2)]), na.rm=TRUE) # difference
wb[nrow(wb),16] <- length(as.numeric(dat[[net.ind.mat[1]]][dat$TA%in%c(0,1,2) & !is.na(dat[[net.ind.mat[1]]])]))  # sample size     

## CONTROL GROUP
wh[nrow(wh),4] <- mean(as.numeric(dat[[net.ind.mat[2]]][dat$TA==0]), na.rm=TRUE) # difference
wh[nrow(wh),6] <- length(as.numeric(dat[[net.ind.mat[2]]][dat$TA%in%c(0) & !is.na(dat[[net.ind.mat[2]]])])) # sample size
## ANY TREATMENT
wh[nrow(wh),9] <- mean(as.numeric(dat[[net.ind.mat[2]]][dat$TA%in%c(1,2)]), na.rm=TRUE) # difference
wh[nrow(wh),11] <- length(as.numeric(dat[[net.ind.mat[2]]][dat$TA%in%c(1,2) & !is.na(dat[[net.ind.mat[2]]])])) # sample size
## EXPERIMENT SAMPLE
wh[nrow(wh),14] <- mean(as.numeric(dat[[net.ind.mat[2]]][dat$TA%in%c(0,1,2)]), na.rm=TRUE) # difference
wh[nrow(wh),16] <- length(as.numeric(dat[[net.ind.mat[2]]][dat$TA%in%c(0,1,2) & !is.na(dat[[net.ind.mat[2]]])]))  # sample size     

## CONTROL GROUP
bh[nrow(bh),4] <- mean(as.numeric(dat[[net.ind.mat[3]]][dat$TA==0]), na.rm=TRUE) # difference
bh[nrow(bh),6] <- length(as.numeric(dat[[net.ind.mat[3]]][dat$TA%in%c(0) & !is.na(dat[[net.ind.mat[3]]])])) # sample size
## ANY TREATMENT
bh[nrow(bh),9] <- mean(as.numeric(dat[[net.ind.mat[3]]][dat$TA%in%c(1,2)]), na.rm=TRUE) # difference
bh[nrow(bh),11] <- length(as.numeric(dat[[net.ind.mat[3]]][dat$TA%in%c(1,2) & !is.na(dat[[net.ind.mat[3]]])])) # sample size
## EXPERIMENT SAMPLE
bh[nrow(bh),14] <- mean(as.numeric(dat[[net.ind.mat[3]]][dat$TA%in%c(0,1,2)]), na.rm=TRUE) # difference
bh[nrow(bh),16] <- length(as.numeric(dat[[net.ind.mat[3]]][dat$TA%in%c(0,1,2) & !is.na(dat[[net.ind.mat[3]]])]))  # sample size     
out <- rbind(wb[c(1,9,2:8),],
             wh[c(1,9,2:8),],
             bh[c(1,9,2:8),])

out <- out[-c(2,11,20),] # get rid of index measure (it makes no sense to include it because it is standardized to the control group mean)


kable(out[,c(1:6)], caption="**Table A6, Panel I: Cases Assigned to Control Group**", col.names=c("Measure", "Majority Group Mean", "Minority Group Mean", "Difference (Maj-Min)", "p-value", "[N]"))
Table A6, Panel I: Cases Assigned to Control Group
Measure Majority Group Mean Minority Group Mean Difference (Maj-Min) p-value [N]
meet Showed up to appointment (White vs. Black) 0.824 0.839 -0.014 (0.538) [279]
sales Perceived sales efforts (White vs. Black) 0.417 0.509 -0.091 (0.019) [253]
qualpraise Received praise about rental qualifications (White vs. Black) 0.061 0.068 -0.008 (0.706) [253]
posbg Positive reactions to testers’ background (White vs. Black) 0.017 0.004 0.013 (0.18) [253]
posedit Positive editorializing (White vs. Black) 0.817 0.765 0.052 (0.123) [253]
prof Professionalism (White vs. Black) 0.522 0.483 0.039 (0.55) [253]
cb Received post-visit callback (White vs. Black) 0.215 0.168 0.047 (0.107) [279]
off Received post-visit offer for unit (White vs. Black) 0.118 0.09 0.029 (0.239) [279]
meet Showed up to appointment (White vs. Hispanic) 0.824 0.789 0.036 (0.174) [279]
sales Perceived sales efforts (White vs. Hispanic) 0.417 0.45 -0.033 (0.558) [252]
qualpraise Received praise about rental qualifications (White vs. Hispanic) 0.061 0.045 0.015 (0.395) [252]
posbg Positive reactions to testers’ background (White vs. Hispanic) 0.017 0.009 0.008 (0.416) [252]
posedit Positive editorializing (White vs. Hispanic) 0.817 0.786 0.031 (0.509) [252]
prof Professionalism (White vs. Hispanic) 0.522 0.591 -0.069 (0.124) [252]
cb Received post-visit callback (White vs. Hispanic) 0.215 0.154 0.061 (0.019) [279]
off Received post-visit offer for unit (White vs. Hispanic) 0.118 0.061 0.057 (0.011) [279]
meet Showed up to appointment (Black vs. Hispanic) 0.839 0.789 0.05 (0.043) [279]
sales Perceived sales efforts (Black vs. Hispanic) 0.509 0.45 0.059 (0.26) [251]
qualpraise Received praise about rental qualifications (Black vs. Hispanic) 0.068 0.045 0.023 (0.468) [251]
posbg Positive reactions to testers’ background (Black vs. Hispanic) 0.004 0.009 -0.005 (0.158) [251]
posedit Positive editorializing (Black vs. Hispanic) 0.765 0.786 -0.021 (1) [251]
prof Professionalism (Black vs. Hispanic) 0.483 0.591 -0.108 (0.056) [251]
cb Received post-visit callback (Black vs. Hispanic) 0.168 0.154 0.014 (0.587) [279]
off Received post-visit offer for unit (Black vs. Hispanic) 0.09 0.061 0.029 (0.17) [279]
kable(out[,c(1, 7:11)], caption="**Table A6, Panel II: Cases Assigned to Any Treatment Group**", col.names=c("Measure", "Majority Group Mean", "Minority Group Mean", "Difference (Maj-Min)", "p-value", "[N]"))
Table A6, Panel II: Cases Assigned to Any Treatment Group
Measure Majority Group Mean Minority Group Mean Difference (Maj-Min) p-value [N]
meet Showed up to appointment (White vs. Black) 0.818 0.781 0.037 (0.094) [374]
sales Perceived sales efforts (White vs. Black) 0.436 0.421 0.015 (0.715) [334]
qualpraise Received praise about rental qualifications (White vs. Black) 0.052 0.045 0.008 (0.652) [334]
posbg Positive reactions to testers’ background (White vs. Black) 0.007 0.003 0.003 (0.587) [334]
posedit Positive editorializing (White vs. Black) 0.797 0.743 0.054 (0.121) [334]
prof Professionalism (White vs. Black) 0.498 0.507 -0.008 (0.836) [334]
cb Received post-visit callback (White vs. Black) 0.187 0.131 0.056 (0.018) [374]
off Received post-visit offer for unit (White vs. Black) 0.094 0.08 0.013 (0.467) [374]
meet Showed up to appointment (White vs. Hispanic) 0.818 0.794 0.024 (0.272) [374]
sales Perceived sales efforts (White vs. Hispanic) 0.436 0.394 0.042 (0.295) [334]
qualpraise Received praise about rental qualifications (White vs. Hispanic) 0.052 0.081 -0.028 (0.164) [334]
posbg Positive reactions to testers’ background (White vs. Hispanic) 0.007 0.007 0 (0.979) [334]
posedit Positive editorializing (White vs. Hispanic) 0.797 0.838 -0.042 (0.186) [334]
prof Professionalism (White vs. Hispanic) 0.498 0.458 0.04 (0.321) [334]
cb Received post-visit callback (White vs. Hispanic) 0.187 0.171 0.016 (0.503) [374]
off Received post-visit offer for unit (White vs. Hispanic) 0.094 0.08 0.013 (0.476) [374]
meet Showed up to appointment (Black vs. Hispanic) 0.781 0.794 -0.013 (0.597) [374]
sales Perceived sales efforts (Black vs. Hispanic) 0.421 0.394 0.027 (0.501) [339]
qualpraise Received praise about rental qualifications (Black vs. Hispanic) 0.045 0.081 -0.036 (0.069) [339]
posbg Positive reactions to testers’ background (Black vs. Hispanic) 0.003 0.007 -0.003 (0.572) [339]
posedit Positive editorializing (Black vs. Hispanic) 0.743 0.838 -0.095 (0.004) [339]
prof Professionalism (Black vs. Hispanic) 0.507 0.458 0.049 (0.235) [339]
cb Received post-visit callback (Black vs. Hispanic) 0.131 0.171 -0.04 (0.079) [374]
off Received post-visit offer for unit (Black vs. Hispanic) 0.08 0.08 0 (1) [374]
kable(out[,c(1, 12:16)], caption="**Table A6, Panel III: All Cases in Experimental Sample**", col.names=c("Measure", "Majority Group Mean", "Minority Group Mean", "Difference (Maj-Min)", "p-value", "[N]"))
Table A6, Panel III: All Cases in Experimental Sample
Measure Majority Group Mean Minority Group Mean Difference (Maj-Min) p-value [N]
meet Showed up to appointment (White vs. Black) 0.821 0.806 0.015 (0.345) [653]
sales Perceived sales efforts (White vs. Black) 0.428 0.46 -0.032 (0.294) [587]
qualpraise Received praise about rental qualifications (White vs. Black) 0.056 0.055 0.001 (0.947) [587]
posbg Positive reactions to testers’ background (White vs. Black) 0.011 0.004 0.007 (0.161) [587]
posedit Positive editorializing (White vs. Black) 0.806 0.753 0.053 (0.038) [587]
prof Professionalism (White vs. Black) 0.508 0.496 0.012 (0.691) [587]
cb Received post-visit callback (White vs. Black) 0.199 0.147 0.052 (0.005) [653]
off Received post-visit offer for unit (White vs. Black) 0.104 0.084 0.02 (0.178) [653]
meet Showed up to appointment (White vs. Hispanic) 0.821 0.792 0.029 (0.084) [653]
sales Perceived sales efforts (White vs. Hispanic) 0.428 0.418 0.01 (0.737) [586]
qualpraise Received praise about rental qualifications (White vs. Hispanic) 0.056 0.066 -0.01 (0.512) [586]
posbg Positive reactions to testers’ background (White vs. Hispanic) 0.011 0.008 0.003 (0.56) [586]
posedit Positive editorializing (White vs. Hispanic) 0.806 0.816 -0.011 (0.66) [586]
prof Professionalism (White vs. Hispanic) 0.508 0.515 -0.006 (0.843) [586]
cb Received post-visit callback (White vs. Hispanic) 0.199 0.164 0.035 (0.046) [653]
off Received post-visit offer for unit (White vs. Hispanic) 0.104 0.072 0.032 (0.026) [653]
meet Showed up to appointment (Black vs. Hispanic) 0.806 0.792 0.014 (0.442) [653]
sales Perceived sales efforts (Black vs. Hispanic) 0.46 0.418 0.042 (0.169) [590]
qualpraise Received praise about rental qualifications (Black vs. Hispanic) 0.055 0.066 -0.011 (0.472) [590]
posbg Positive reactions to testers’ background (Black vs. Hispanic) 0.004 0.008 -0.004 (0.403) [590]
posedit Positive editorializing (Black vs. Hispanic) 0.753 0.816 -0.063 (0.013) [590]
prof Professionalism (Black vs. Hispanic) 0.496 0.515 -0.018 (0.555) [590]
cb Received post-visit callback (Black vs. Hispanic) 0.147 0.164 -0.017 (0.329) [653]
off Received post-visit offer for unit (Black vs. Hispanic) 0.084 0.072 0.012 (0.359) [653]

4.10 Table A7 (p. A-25): Estimated Effects of Messaging on Net Discrimination Levels

# ---------------------------------------------------------------------- #
# DEFINE COLNAMES FOR OUTPUT TABLES

col.labels <- c("Outcome", "Estimate","SE","t","P")

# ---------------------------------------------------------------------- #
# MODELS WITH ONLY BLOCK FIXED EFFECTS

# results from ITT est w/ block FE (no other covs)
outvars <- c("nmeet_", "index.", "ncb_","noff_")


reg_and_summ <- function(outvar, compare){
  depvar <- paste0(outvar, compare)
  model.mc <- paste(depvar, " ~ TA1 + as.factor(block)", sep = "")
  model.pc <- paste(depvar, " ~ TA2 + as.factor(block)", sep = "")
  model.pm <- paste(depvar, " ~ TA2 + as.factor(block)", sep = "")
  
  fit.mc <- lm(formula=model.mc, data=dat[dat$TA %in% c(0,1),], weights=ipw10)
  fit.pc <- lm(formula=model.pc, data=dat[dat$TA %in% c(0,2),], weights=ipw20)
  fit.pm <- lm(formula=model.pm, data=dat[dat$TA %in% c(1,2),], weights=ipw21)
  
  itt.mc <- summary(fit.mc)$coefficients[2,]
  itt.pc <- summary(fit.pc)$coefficients[2,]
  itt.pm <- summary(fit.pm)$coefficients[2,]
    
  if(compare == "wb" | compare == "wh"  ){
  itt.mc[4] <- pt(coef(summary(fit.mc))[,3], summary(fit.mc)$df[2], lower=TRUE)[2] #one sided p
  itt.pc[4] <- pt(coef(summary(fit.pc))[,3], summary(fit.pc)$df[2], lower=TRUE)[2] #one sided p
  } 
  
    list(mc     = itt.mc, pc     = itt.pc, pm     = itt.pm,
         fit.mc = fit.mc, fit.pc = fit.pc, fit.pm = fit.pm )
  
}
itt.wb <- lapply(outvars, reg_and_summ,  compare = "wb" )
itt.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$mc)
itt.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pc)
itt.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pm)
fit.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.mc)
fit.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pc)
fit.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pm)

itt.wh <- lapply(outvars, reg_and_summ,  compare = "wh" )
itt.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$mc)
itt.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pc)
itt.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pm)
fit.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.mc)
fit.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pc)
fit.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pm)

itt.bh <- lapply(outvars, reg_and_summ,  compare = "bh" )
itt.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$mc)
itt.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pc)
itt.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pm)
fit.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.mc)
fit.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pc)
fit.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pm)




#----- CIs -----#
df.w_ <- sapply(list(df.wb.mc = fit.wb.mc, 
                     df.wb.pc = fit.wb.pc, 
                     df.wb.pm = fit.wb.pm,
                     df.wh.mc = fit.wh.mc,
                     df.wh.pc = fit.wh.pc, 
                     df.wh.pm = fit.wh.pm,
                     df.bh.mc = fit.bh.mc,
                     df.bh.pc = fit.bh.pc,
                     df.bh.pm = fit.bh.pm),
                function(fit){
                        unlist(lapply(fit, function(x) summary(x)$df[2]))[2:4]})

df.2g.10 <-  c(df.w_[,c("df.wb.mc", "df.wh.mc", "df.bh.mc")])
df.2g.20 <-  c(df.w_[,c("df.wb.pc", "df.wh.pc", "df.bh.pc")])
df.2g.21 <-  c(df.w_[,c("df.wb.pm", "df.wh.pm", "df.bh.pm")])

cv.2g.10 <- qt(p=0.025, df=df.2g.10, lower.tail=TRUE)
cv.2g.20 <- qt(p=0.025, df=df.2g.20, lower.tail=TRUE)
cv.2g.21 <- qt(p=0.025, df=df.2g.21, lower.tail=TRUE)

cv.2g <- c(cv.2g.10, cv.2g.20, cv.2g.21)
itt.2g.ci <- as.numeric(out2[,2]) + abs(cv.2g)*cbind(-as.numeric(out2[,3]), as.numeric(out2[,3]))
# Stack, create output table
out2 <- rbind(itt.wb.mc, itt.wh.mc, itt.bh.mc,
              itt.wb.pc, itt.wh.pc, itt.bh.pc,
              itt.wb.pm, itt.wh.pm, itt.bh.pm)

out2 <- out2[-seq(1,33,4),]


# Round and format results
for(i in 2:ncol(out2)) out2[,i] <- round(as.numeric(out2[,i]),3)
for(i in 5) out2[,i] <- paste("(",out2[,i],")",sep="")

# Assemble results
itt.2g.tab <- cbind(out2, cbind(apply(itt.2g.ci, 1, function(x) paste("[", round(x[1],3) ,", ", round(x[2],3), "]", sep=""))))
colnames(itt.2g.tab) <- c("Outcome", "Estimate", "SE", "t", "p-value", "95% CI")


kable(itt.2g.tab[1:9,], caption="**Table A7, Panel I. Monitoring vs. Control**")
Table A7, Panel I. Monitoring vs. Control
Outcome Estimate SE t p-value 95% CI
Index measure of favorable in-person interactions (White vs. Black) -0.015 0.052 -0.285 (0.388) [-0.117, 0.087]
Received post-visit callback (White vs. Black) -0.002 0.045 -0.035 (0.486) [-0.09, 0.086]
Received post-visit offer for unit (White vs. Black) -0.003 0.036 -0.091 (0.464) [-0.074, 0.068]
Index measure of favorable in-person interactions (White vs. Hispanic) -0.061 0.057 -1.079 (0.141) [-0.173, 0.051]
Received post-visit callback (White vs. Hispanic) -0.036 0.043 -0.837 (0.201) [-0.121, 0.049]
Received post-visit offer for unit (White vs. Hispanic) -0.017 0.034 -0.496 (0.31) [-0.084, 0.05]
Index measure of favorable in-person interactions (Black vs. Hispanic) -0.089 0.052 -1.7 (0.09) [-0.191, 0.013]
Received post-visit callback (Black vs. Hispanic) -0.035 0.042 -0.822 (0.412) [-0.118, 0.048]
Received post-visit offer for unit (Black vs. Hispanic) -0.014 0.031 -0.444 (0.657) [-0.075, 0.047]
kable(itt.2g.tab[10:18,], caption="**Table A7, Panel II. Punitive vs. Control**")
Table A7, Panel II. Punitive vs. Control
Outcome Estimate SE t p-value 95% CI
Index measure of favorable in-person interactions (White vs. Black) 0.007 0.053 0.139 (0.555) [-0.097, 0.111]
Received post-visit callback (White vs. Black) 0.019 0.042 0.456 (0.676) [-0.064, 0.102]
Received post-visit offer for unit (White vs. Black) 0.018 0.035 0.515 (0.697) [-0.051, 0.087]
Index measure of favorable in-person interactions (White vs. Hispanic) 0.048 0.055 0.863 (0.806) [-0.06, 0.156]
Received post-visit callback (White vs. Hispanic) -0.066 0.041 -1.596 (0.056) [-0.147, 0.015]
Received post-visit offer for unit (White vs. Hispanic) -0.021 0.034 -0.618 (0.268) [-0.088, 0.046]
Index measure of favorable in-person interactions (Black vs. Hispanic) 0.039 0.049 0.791 (0.429) [-0.057, 0.135]
Received post-visit callback (Black vs. Hispanic) -0.085 0.041 -2.097 (0.037) [-0.166, -0.004]
Received post-visit offer for unit (Black vs. Hispanic) -0.039 0.033 -1.172 (0.242) [-0.104, 0.026]
kable(itt.2g.tab[19:27,], caption="**Table A7, Panel III. Punitive vs. Monitoring**")
Table A7, Panel III. Punitive vs. Monitoring
Outcome Estimate SE t p-value 95% CI
Index measure of favorable in-person interactions (White vs. Black) 0.018 0.058 0.319 (0.75) [-0.096, 0.132]
Received post-visit callback (White vs. Black) 0.033 0.049 0.665 (0.506) [-0.063, 0.129]
Received post-visit offer for unit (White vs. Black) 0.026 0.037 0.685 (0.494) [-0.047, 0.099]
Index measure of favorable in-person interactions (White vs. Hispanic) 0.107 0.063 1.701 (0.09) [-0.017, 0.231]
Received post-visit callback (White vs. Hispanic) -0.019 0.049 -0.391 (0.696) [-0.115, 0.077]
Received post-visit offer for unit (White vs. Hispanic) 0.002 0.038 0.047 (0.962) [-0.073, 0.077]
Index measure of favorable in-person interactions (Black vs. Hispanic) 0.139 0.057 2.431 (0.016) [0.027, 0.251]
Received post-visit callback (Black vs. Hispanic) -0.052 0.047 -1.11 (0.268) [-0.144, 0.04]
Received post-visit offer for unit (Black vs. Hispanic) -0.024 0.036 -0.667 (0.505) [-0.095, 0.047]

4.11 Table A8 (p. A-26): Unweighted ITT Estimates of Messaging on Net Discrimination in Receiving a Post-Visit Callback

# top left
cb1 <- ddply(dat, .(TA), summarise,
              mean.cb.w = mean(cb_C),
              mean.cb.b = mean(cb_A),
              mean.cb.h = mean(cb_B))
cb1 <- t(cb1[,2:4])
colnames(cb1) <- c("Control","Monitoring","Punitive")

# bottom left
cb2 <- ddply(dat, .(TA), summarise,
             mean.ncb.wb = mean(ncb_wb),
             mean.ncb.wh = mean(ncb_wh),
             mean.ncb.bh = mean(ncb_bh))
cb2 <- t(cb2[,2:4])

# top right
cb3 <- cbind(cb1[,2]-cb1[,1], cb1[,3]-cb1[,1], cb1[,3]-cb1[,2])

# bottom right
cb4 <- cbind(cb2[,2]-cb2[,1], cb2[,3]-cb2[,1], cb2[,3]-cb2[,2])

cb1 <- round(cb1,3)
cb2 <- round(cb2,3)
cb3 <- round(cb3,3)
cb4 <- round(cb4,3)

# p values
get_p <- function(var1, var2, T_r, T_c){
  ps <- c(t.test(dat[dat$TA==0, var1],
                 dat[dat$TA==0, var2], 
                 alternative="two.sided",
                 paired=TRUE,
                 conf.level=0.95)$p.value,
          t.test(dat[dat$TA==1, var1],
                 dat[dat$TA==1, var2],
                 alternative="two.sided",
                 paired=TRUE,
                 conf.level=0.95)$p.value,
          t.test(dat[dat$TA==2, var1],
                 dat[dat$TA==2, var2], 
                 alternative="two.sided",
                 paired=TRUE,
                 conf.level=0.95)$p.value)
  return(ps)
  }

p2 <- rbind(get_p("cb_C","cb_A"),
            get_p("cb_C","cb_B"),
            get_p("cb_A","cb_B"))

p2 <- matrix(paste("(",round(p2,3),")",sep=""), ncol=3, byrow=FALSE)

#p-value within race across treatments
get_p2 <- function(var){
  ps <- c(t.test(dat[dat$TA==1, var],
                 dat[dat$TA==0, var], 
                 alternative="two.sided",
                 conf.level=0.95)$p.value,
          t.test(dat[dat$TA==2, var],
                 dat[dat$TA==0, var],
                 alternative="two.sided",
                 conf.level=0.95)$p.value,
          t.test(dat[dat$TA==2, var],
                 dat[dat$TA==1, var], 
                 alternative="two.sided",
                 conf.level=0.95)$p.value)
  return(ps)
}

p3 <- rbind(get_p2("cb_C"), get_p2("cb_A"), get_p2("cb_B"))
p3 <- matrix(paste("(",round(p3,3),")",sep=""), ncol=3, byrow=FALSE)

f0 <- lm(ncb_wb ~ TA1 + TA2, data=dat)
f1 <- lm(ncb_wb ~ TA0 + TA2, data=dat)
p4.wb <- c(pt(q=summary(f0)$coefficients[,3], df=f0$df.residual, lower.tail=TRUE)[2:3],
           summary(f1)$coefficients[3,4])

f0 <- lm(ncb_wh ~ TA1 + TA2, data=dat)
f1 <- lm(ncb_wh ~ TA0 + TA2, data=dat)
p4.wh <- c(pt(q=summary(f0)$coefficients[,3], df=f0$df.residual, lower.tail=TRUE)[2:3],
           summary(f1)$coefficients[3,4])

f0 <- lm(ncb_bh ~ TA1 + TA2, data=dat)
f1 <- lm(ncb_bh ~ TA0 + TA2, data=dat)
p4.bh <- c(summary(f0)$coefficients[2:3,4],
           summary(f1)$coefficients[3,4])

p4 <- rbind(p4.wb, p4.wh, p4.bh)
p4 <- matrix(paste("(",round(p4,3),")",sep=""), ncol=3, byrow=FALSE)

p1 <- matrix(NA, ncol=3, nrow=3)

cb1 <- rbind(cb1[1,], p1[1,],
             cb1[2,], p1[2,],
             cb1[3,], p1[3,])
cb2 <- rbind(cb2[1,], p2[1,],
             cb2[2,], p2[2,],
             cb2[3,], p2[3,])
cb3 <- rbind(cb3[1,], p3[1,],
             cb3[2,], p3[2,],
             cb3[3,], p3[3,])
cb4 <- rbind(cb4[1,], p4[1,],
             cb4[2,], p4[2,],
             cb4[3,], p4[3,])

table_a8 <- cbind(c("White",NA,"Black",NA,"Hispanic",NA,"White vs. Black",NA,"White vs. Hispanic",NA,"Black vs. Hispanic",NA),
             rbind(cbind(cb1, cb3), cbind(cb2, cb4)))
table_a8 <- rbind(table_a8, 
                  c("Sample size", nrow(dat[dat$TA == 0,]), nrow(dat[dat$TA == 1,]), 
                       nrow(dat[dat$TA == 2,]), nrow(dat[dat$TA %in% c(0,1),]), 
                       nrow(dat[dat$TA %in% c(0,2),]), nrow(dat[dat$TA %in% c(1,2),])))

colnames(table_a8) <- c("","Control", "Monitoring", "Punitive", "Monitoring vs Control", "Punitive vs Control", "Punitive vs Monitoring")


kable(table_a8[1:6,], caption="**Table A8, Panel A. Percent Favorable**")
Table A8, Panel A. Percent Favorable
Control Monitoring Punitive Monitoring vs Control Punitive vs Control Punitive vs Monitoring
White 0.215 0.184 0.19 -0.031 -0.025 0.006
(0.418) (0.5) (0.881)
Black 0.168 0.144 0.12 -0.025 -0.048 -0.024
(0.478) (0.133) (0.502)
Hispanic 0.154 0.155 0.185 0.001 0.031 0.03
(0.976) (0.378) (0.444)
kable(table_a8[7:12,], caption="**Table A8, Panel B. Net Discrimination (% Majority Favorable - % Minority Favorable)**")
Table A8, Panel B. Net Discrimination (% Majority Favorable - % Minority Favorable)
Control Monitoring Punitive Monitoring vs Control Punitive vs Control Punitive vs Monitoring
White vs. Black 0.047 0.04 0.07 -0.006 0.023 0.03
(0.107) (0.264) (0.026) (0.444) (0.705) (0.539)
White vs. Hispanic 0.061 0.029 0.005 -0.032 -0.056 -0.024
(0.019) (0.425) (0.876) (0.23) (0.09) (0.611)
Black vs. Hispanic 0.014 -0.011 -0.065 -0.026 -0.079 -0.054
(0.587) (0.733) (0.037) (0.544) (0.052) (0.242)
kable(rbind(table_a8[13,]), caption="**Table A8. Sample Size (last row)**")
Table A8. Sample Size (last row)
Control Monitoring Punitive Monitoring vs Control Punitive vs Control Punitive vs Monitoring
Sample size 279 174 200 453 479 374

4.12 Table A9 (p. A-27): Unweighted ITT Estimates of Messaging on Net Discrimination in Receiving a Post-Visit Offer for the Unit

# top left
off1 <- ddply(dat, .(TA), summarise,
              mean.off.w = mean(off_C),
              mean.off.b = mean(off_A),
              mean.off.h = mean(off_B))
off1 <- t(off1[,2:4])
colnames(off1) <- c("Control","Monitoring","Punitive")

# bottom left
off2 <- ddply(dat, .(TA), summarise,
              mean.noff.wb = mean(noff_wb),
              mean.noff.wh = mean(noff_wh),
              mean.noff.bh = mean(noff_bh))
off2 <- t(off2[,2:4])

# top right
off3 <- cbind(off1[,2]-off1[,1], off1[,3]-off1[,1], off1[,3]-off1[,2])

# bottom right
off4 <- cbind(off2[,2]-off2[,1], off2[,3]-off2[,1], off2[,3]-off2[,2])

off1 <- round(off1,3)
off2 <- round(off2,3)
off3 <- round(off3,3)
off4 <- round(off4,3)

# p values

p2 <- rbind(get_p("off_C","off_A"),
            get_p("off_C","off_B"),
            get_p("off_A","off_B"))

p2 <- matrix(paste("(",round(p2,3),")",sep=""), ncol=3, byrow=FALSE)

p3 <- rbind(get_p2("off_C"),
            get_p2("off_A"),
            get_p2("off_B"))

p3 <- matrix(paste("(",round(p3,3),")",sep=""), ncol=3, byrow=FALSE)

f0 <- lm(noff_wb ~ TA1 + TA2, data=dat)
f1 <- lm(noff_wb ~ TA0 + TA2, data=dat)
p4.wb <- c(pt(q=summary(f0)$coefficients[,3], df=f0$df.residual, lower.tail=TRUE)[2:3],
           summary(f1)$coefficients[3,4])

f0 <- lm(noff_wh ~ TA1 + TA2, data=dat)
f1 <- lm(noff_wh ~ TA0 + TA2, data=dat)
p4.wh <- c(pt(q=summary(f0)$coefficients[,3], df=f0$df.residual, lower.tail=TRUE)[2:3],
           summary(f1)$coefficients[3,4])

f0 <- lm(noff_bh ~ TA1 + TA2, data=dat)
f1 <- lm(noff_bh ~ TA0 + TA2, data=dat)
p4.bh <- c(summary(f0)$coefficients[2:3,4],
           summary(f1)$coefficients[3,4])

p4 <- rbind(p4.wb, p4.wh, p4.bh)
p4 <- matrix(paste("(",round(p4,3),")",sep=""), ncol=3, byrow=FALSE)

p1 <- matrix(NA, ncol=3, nrow=3)

off1 <- rbind(off1[1,], p1[1,],
              off1[2,], p1[2,],
              off1[3,], p1[3,])
off2 <- rbind(off2[1,], p2[1,],
              off2[2,], p2[2,],
              off2[3,], p2[3,])
off3 <- rbind(off3[1,], p3[1,],
              off3[2,], p3[2,],
              off3[3,], p3[3,])
off4 <- rbind(off4[1,], p4[1,],
              off4[2,], p4[2,],
              off4[3,], p4[3,])

table_a9 <- cbind(c("White",NA,"Black",NA,"Hispanic",NA,"White vs. Black",NA,"White vs. Hispanic",NA,"Black vs. Hispanic",NA),
             rbind(cbind(off1, off3), cbind(off2, off4)))
table_a9 <- rbind(table_a9, 
                  c("Sample size", nrow(dat[dat$TA == 0,]), nrow(dat[dat$TA == 1,]), 
                       nrow(dat[dat$TA == 2,]), nrow(dat[dat$TA %in% c(0,1),]), 
                       nrow(dat[dat$TA %in% c(0,2),]), nrow(dat[dat$TA %in% c(1,2),])))
colnames(table_a9) <- c("","Control", "Monitoring", "Punitive", "Monitoring vs Control", "Punitive vs Control", "Punitive vs Monitoring")

kable(table_a9[1:6,], caption="**Table A9, Panel A. Percent Favorable**")
Table A9, Panel A. Percent Favorable
Control Monitoring Punitive Monitoring vs Control Punitive vs Control Punitive vs Monitoring
White 0.118 0.08 0.105 -0.038 -0.013 0.025
(0.183) (0.648) (0.414)
Black 0.09 0.08 0.08 -0.009 -0.01 0
(0.734) (0.709) (0.987)
Hispanic 0.061 0.063 0.095 0.002 0.034 0.032
(0.922) (0.178) (0.254)
kable(table_a9[7:12,], caption="**Table A9, Panel B. Net Discrimination (% Majority Favorable - % Minority Favorable)**")
Table A9, Panel B. Net Discrimination (% Majority Favorable - % Minority Favorable)
Control Monitoring Punitive Monitoring vs Control Punitive vs Control Punitive vs Monitoring
White vs. Black 0.029 0 0.025 -0.029 -0.004 0.025
(0.239) (1) (0.319) (0.216) (0.458) (0.523)
White vs. Hispanic 0.057 0.017 0.01 -0.04 -0.047 -0.007
(0.011) (0.514) (0.706) (0.13) (0.083) (0.85)
Black vs. Hispanic 0.029 0.017 -0.015 -0.011 -0.044 -0.032
(0.17) (0.44) (0.565) (0.729) (0.168) (0.362)
kable(rbind(table_a9[13,]), caption="**Table A9. Sample Size (last row)**")
Table A9. Sample Size (last row)
Control Monitoring Punitive Monitoring vs Control Punitive vs Control Punitive vs Monitoring
Sample size 279 174 200 453 479 374

4.13 Table A10 (p. A-28): Sensitivity Analysis: Estimated Effects of Messaging on Net Discrimination Levels (Three-Group Parametric Estimator)

  #one-sided p function
  get_plower <- function(x) pt(summary(x)$coefficients[,3], summary(x)$df[2], lower = TRUE)
  
  # select all outcome variables
  outpatt <- c("^index","ncb","noff")
  outvar <- unlist(lapply(outpatt, function(y) grep(y, names(dat), value = TRUE)))
  
  datm <- reshape2::melt(dat, id = names(dat)[!names(dat)%in% c(outvar)],
                         variable.name = "outcome", 
                         value.name = "outcome_val")

  # control as comparison

  # monitoring vs. control
  dat_out_c <- datm %>% 
    dplyr::group_by(outcome) %>%
    dplyr::do(fit = lm(outcome_val ~ TA1 + TA2 + as.factor(block), weights=ipw, data=.))


  dat_outcome_coef <- lapply(dat_out_c$fit, tidy)
  dat_outcome_coef <- do.call(rbind, dat_outcome_coef )
  dat_outcome_coef$outcome <- rep(dat_out_c$outcome, 19) # number of terms
  

  #one-sided p-value
  p_low <- lapply(dat_out_c$fit, get_plower) 
  p.lower <-  do.call(c,  p_low )
  p.lower <- cbind(  p.lower , term = names(p.lower))
  p.lower <- as.data.frame(p.lower)
  p.lower$outcome <- rep(dat_out_c$outcome, 19) # number of terms
 
  dat_coef <- left_join(dat_outcome_coef, p.lower, by = c("outcome","term"))

  # replace with one-sided p.value for all but B-H comparison
dat_coef$p.value[grep("_bh", dat_coef$outcome, invert = T)] <- as.numeric(paste0(dat_coef$p.lower[grep("_bh", dat_coef$outcome, invert = T)]))

  itt3g_mc <- dat_coef[dat_coef$term == "TA1",] %>% dplyr::select(-p.lower, -term)
  itt3g_mc$df <- unlist(lapply(dat_out_c$fit, function(x) summary(x)$df[2]))

  # punitive vs. control
  itt3g_pc <- dat_coef[dat_coef$term == "TA2",] %>% dplyr::select(-p.lower, -term)
  itt3g_pc$df <- unlist(lapply(dat_out_c$fit, function(x) summary(x)$df[2]))
  
  # monitoring as comparison

  # punitive vs monitoring
  dat_out_m <- datm %>%
    group_by(outcome) %>%
    do(fit = lm(outcome_val ~ TA0 + TA2 + as.factor(block), weights=ipw, data=.))

 

  dat_outcome_coef <- lapply(dat_out_m$fit, tidy)
  dat_outcome_coef <- do.call(rbind, dat_outcome_coef )
  dat_outcome_coef$outcome <- rep(dat_out_m$outcome, 19) # number of terms
  

  #one-sided p-value
  p_low <- lapply(dat_out_m$fit, get_plower) 
  p.lower <-  do.call(c,  p_low )
  p.lower <- cbind(  p.lower , term = names(p.lower))
  p.lower <- as.data.frame(p.lower)
  p.lower$outcome <- rep(dat_out_m$outcome, 19) # number of terms
 
  dat_coef <- left_join(dat_outcome_coef, p.lower, by = c("outcome","term"))


  itt3g_pm <- dat_coef[dat_coef$term == "TA2",] %>% dplyr::select(-term, -p.lower)
  itt3g_pm$df <- unlist(lapply(dat_out_m$fit, function(x) summary(x)$df[2]))

  r_ord <- c(7,2,5,8,3,6,9,1,4)

  out3 <- rbind(itt3g_mc[r_ord,], itt3g_pc[r_ord,], itt3g_pm[r_ord,])
  out3$outcome <- rep(c(outvars.wb.labs[-1], outvars.wh.labs[-1], outvars.bh.labs[-1]), 3)

itt.3g.ci <- as.numeric(out3$estimate) + abs(qt(p=0.025, df=out3$df, lower.tail=TRUE))*cbind(-as.numeric(out3$std.error), as.numeric(out3$std.error))
itt.3g.cir <- apply(itt.3g.ci, 1, function(x) paste0("[", round(x[1],3) ,", ", round(x[2],3), "]", sep=""))

# Join with CI
itt.3g.tab <- cbind(out3[,"outcome"], 
                    out3[,c("estimate", "std.error",  "statistic", "p.value")],
                    itt.3g.cir)

# Round and format results
 itt.3g.tab[,c("estimate", "std.error",  "statistic", "p.value")] <- 
 apply(itt.3g.tab[,c("estimate", "std.error",  "statistic", "p.value")],2, round,3 )
itt.3g.tab[,5] <- paste0("(",itt.3g.tab[,5],")",sep="")

colnames(itt.3g.tab) <- c("Outcome", "Estimate", "SE", "t", "p-value", "95% CI")

kable(itt.3g.tab[1:9,], caption="**Table A10, Panel I. Monitoring vs. Control**")
Table A10, Panel I. Monitoring vs. Control
Outcome Estimate SE t p-value 95% CI
Index measure of favorable in-person interactions (White vs. Black) -0.012 0.054 -0.225 (0.411) [-0.119, 0.094]
Received post-visit callback (White vs. Black) -0.009 0.046 -0.189 (0.425) [-0.099, 0.081]
Received post-visit offer for unit (White vs. Black) -0.007 0.036 -0.194 (0.423) [-0.079, 0.065]
Index measure of favorable in-person interactions (White vs. Hispanic) -0.059 0.059 -0.999 (0.159) [-0.175, 0.057]
Received post-visit callback (White vs. Hispanic) -0.044 0.045 -0.987 (0.324) [-0.133, 0.044]
Received post-visit offer for unit (White vs. Hispanic) -0.023 0.036 -0.654 (0.257) [-0.094, 0.047]
Index measure of favorable in-person interactions (Black vs. Hispanic) -0.092 0.053 -1.745 (0.082) [-0.197, 0.012]
Received post-visit callback (Black vs. Hispanic) -0.036 0.043 -0.822 (0.206) [-0.121, 0.05]
Received post-visit offer for unit (Black vs. Hispanic) -0.016 0.033 -0.489 (0.313) [-0.082, 0.049]
kable(itt.3g.tab[10:18,], caption="**Table A10, Panel II. Punitive vs. Control**")
Table A10, Panel II. Punitive vs. Control
Outcome Estimate SE t p-value 95% CI
10 Index measure of favorable in-person interactions (White vs. Black) 0.010 0.052 0.198 (0.579) [-0.092, 0.112]
11 Received post-visit callback (White vs. Black) 0.020 0.044 0.454 (0.65) [-0.067, 0.107]
12 Received post-visit offer for unit (White vs. Black) 0.016 0.035 0.465 (0.679) [-0.053, 0.086]
13 Index measure of favorable in-person interactions (White vs. Hispanic) 0.050 0.056 0.893 (0.373) [-0.06, 0.16]
14 Received post-visit callback (White vs. Hispanic) -0.068 0.043 -1.568 (0.059) [-0.153, 0.017]
15 Received post-visit offer for unit (White vs. Hispanic) -0.025 0.035 -0.720 (0.236) [-0.093, 0.043]
16 Index measure of favorable in-person interactions (Black vs. Hispanic) 0.039 0.050 0.781 (0.782) [-0.06, 0.138]
17 Received post-visit callback (Black vs. Hispanic) -0.088 0.042 -2.100 (0.018) [-0.171, -0.006]
18 Received post-visit offer for unit (Black vs. Hispanic) -0.041 0.032 -1.278 (0.101) [-0.105, 0.022]
kable(itt.3g.tab[19:27,], caption="**Table A10, Panel III. Punitive vs. Monitoring**")
Table A10, Panel III. Punitive vs. Monitoring
Outcome Estimate SE t p-value 95% CI
19 Index measure of favorable in-person interactions (White vs. Black) 0.022 0.055 0.412 (0.68) [-0.085, 0.13]
20 Received post-visit callback (White vs. Black) 0.029 0.046 0.631 (0.528) [-0.061, 0.118]
21 Received post-visit offer for unit (White vs. Black) 0.023 0.036 0.647 (0.518) [-0.048, 0.095]
22 Index measure of favorable in-person interactions (White vs. Hispanic) 0.109 0.058 1.879 (0.061) [-0.005, 0.223]
23 Received post-visit callback (White vs. Hispanic) -0.024 0.045 -0.532 (0.595) [-0.111, 0.064]
24 Received post-visit offer for unit (White vs. Hispanic) -0.001 0.036 -0.042 (0.966) [-0.071, 0.068]
25 Index measure of favorable in-person interactions (Black vs. Hispanic) 0.132 0.053 2.480 (0.014) [0.027, 0.236]
26 Received post-visit callback (Black vs. Hispanic) -0.053 0.043 -1.216 (0.224) [-0.137, 0.032]
27 Received post-visit offer for unit (Black vs. Hispanic) -0.025 0.033 -0.752 (0.452) [-0.09, 0.04]

4.14 Table A11 (p. A-29): Predicted Means, Differences, and Percent Differences from Two-Group Parametric Estimators

# TA: Treatment Assignment -- {Control = 0, Monitoring = 1, Punitive = 2}

ccdat <- data.frame(TA1=0, TA2=0, block=1:17)
mcdat <- data.frame(TA1=1, TA2=0, block=1:17)
pcdat <- data.frame(TA1=0, TA2=1, block=1:17)

cmdat <- data.frame(TA0=0, TA2=0, block=1:17)
pmdat <- data.frame(TA0=0, TA2=1, block=1:17)
cmdat.i2 <- data.frame(TA0=0, TA2=0, block=c(1:4,6:17))
pmdat.i2 <- data.frame(TA0=0, TA2=1, block=c(1:4,6:17))

#results from ITT est w/ block FE (no other covs)
# Results grabbed from main analysis (see first chunk of code)
pred.vals <- list()

for(i in 1:4){
  
  #--------- m-c ---------#
  # wb
  mc.wb <- c(mean(predict(fit.wb.mc[[i]], mcdat)), mean(predict(fit.wb.mc[[i]], ccdat)))
  # wh
  mc.wh <- c(mean(predict(fit.wh.mc[[i]], mcdat)), mean(predict(fit.wh.mc[[i]], ccdat)))
  # bh
  mc.bh <- c(mean(predict(fit.bh.mc[[i]], mcdat)), mean(predict(fit.bh.mc[[i]], ccdat)))
  
  #--------- p-c ---------#
  # wb
  pc.wb <- c(mean(predict(fit.wb.pc[[i]], pcdat)), mean(predict(fit.wb.pc[[i]], ccdat)))
  # wh
  pc.wh <- c(mean(predict(fit.wh.pc[[i]], pcdat)), mean(predict(fit.wh.pc[[i]], ccdat)))
  # bh
  pc.bh <- c(mean(predict(fit.bh.pc[[i]], pcdat)), mean(predict(fit.bh.pc[[i]], ccdat)))
  
  #--------- p-m ---------#
  
  if(i != 2){
    # wb
    pm.wb <- c(mean(predict(fit.wb.pm[[i]], pmdat)), mean(predict(fit.wb.pm[[i]], cmdat)))
    # w
    pm.wh <- c(mean(predict(fit.wh.pm[[i]], pmdat)), mean(predict(fit.wh.pm[[i]], cmdat)))
    # bh
    pm.bh <- c(mean(predict(fit.bh.pm[[i]], pmdat)), mean(predict(fit.bh.pm[[i]], cmdat)))
  } else {
    # wb
    pm.wb <- c(mean(predict(fit.wb.pm[[i]], pmdat.i2)), mean(predict(fit.wb.pm[[i]], cmdat.i2)))
    # w
    pm.wh <- c(mean(predict(fit.wh.pm[[i]], pmdat.i2)), mean(predict(fit.wh.pm[[i]], cmdat.i2)))
    # bh
    pm.bh <- c(mean(predict(fit.bh.pm[[i]], pmdat.i2)), mean(predict(fit.bh.pm[[i]], cmdat.i2)))
    
  }
  
  pred.vals[[i]] <- rbind(mc.wb, mc.wh, mc.bh,
                          pc.wb, pc.wh, pc.bh,
                          pm.wb, pm.wh, pm.bh)
  
}

names(pred.vals) <- c("meet","index","callback","offer")

pred.mat <- NULL

for(i in 1:nrow(pred.vals[[1]])){
  for(j in 2:length(pred.vals)){
    pred.mat <- rbind(pred.mat, pred.vals[[j]][i,])
  }
}

pred.out <- cbind(pred.mat, pred.mat[,1]-pred.mat[,2], (pred.mat[,1]-pred.mat[,2])/abs(pred.mat[,2])*100)
colnames(pred.out) <- c("Predicted Treatment Mean", "Predicted Control Mean", "Difference", "Percent Difference")

for(i in 1:ncol(pred.out)) pred.out[,i] <- round(pred.out[,i], digits=3)

pred.out <- cbind(Outcome=itt.2g.tab[,1], pred.out)

pred.out2 <- list()
seclabs.index <- 0

for(i in 1:length(start.rows)){
  if(i %in% c(1,4,7)){
    seclabs.index <- seclabs.index + 1
    pred.out2[[i]] <- rbind(c(seclabs[seclabs.index], rep(NA,4)),
                           c(sublabs[i], rep(NA,4)),
                           pred.out[start.rows[i]:stop.rows[i],])        
  } else {
    pred.out2[[i]] <- rbind(c(sublabs[i], rep(NA,4)), pred.out[start.rows[i]:stop.rows[i],])    
  }
}

pred.out2 <- do.call(rbind, pred.out2)

kable(pred.out2, caption="**Table A11**")
Table A11
Outcome Predicted Treatment Mean Predicted Control Mean Difference Percent Difference
I. Monitoring vs. Control
A. White vs. Black
Index measure of favorable in-person interactions (White vs. Black) 0.023 0.038 -0.015 -39.204
Received post-visit callback (White vs. Black) 0.006 0.007 -0.002 -21.372
Received post-visit offer for unit (White vs. Black) 0.005 0.008 -0.003 -40.863
B. White vs. Hispanic
Index measure of favorable in-person interactions (White vs. Hispanic) -0.095 -0.034 -0.061 -178.801
Received post-visit callback (White vs. Hispanic) 0.001 0.038 -0.036 -96.164
Received post-visit offer for unit (White vs. Hispanic) 0.023 0.04 -0.017 -42.45
C. Black vs. Hispanic
Index measure of favorable in-person interactions (Black vs. Hispanic) -0.162 -0.073 -0.089 -121.727
Received post-visit callback (Black vs. Hispanic) -0.004 0.03 -0.035 -114.689
Received post-visit offer for unit (Black vs. Hispanic) 0.018 0.032 -0.014 -42.853
II. Punitive vs. Control
A. White vs. Black
Index measure of favorable in-person interactions (White vs. Black) 0.007 0 0.007 7904.738
Received post-visit callback (White vs. Black) 0.012 -0.007 0.019 281.195
Received post-visit offer for unit (White vs. Black) 0.032 0.014 0.018 127.36
B. White vs. Hispanic
Index measure of favorable in-person interactions (White vs. Hispanic) 0.009 -0.039 0.048 122.284
Received post-visit callback (White vs. Hispanic) -0.005 0.061 -0.066 -108.818
Received post-visit offer for unit (White vs. Hispanic) 0.013 0.034 -0.021 -62.201
C. Black vs. Hispanic
Index measure of favorable in-person interactions (Black vs. Hispanic) -0.012 -0.051 0.039 76.051
Received post-visit callback (Black vs. Hispanic) -0.018 0.068 -0.085 -126.385
Received post-visit offer for unit (Black vs. Hispanic) -0.019 0.02 -0.039 -196.921
III. Punitive vs. Monitoring
A. White vs. Black
Index measure of favorable in-person interactions (White vs. Black) 0.019 0 0.018 6621.1
Received post-visit callback (White vs. Black) 0.055 0.022 0.033 147.447
Received post-visit offer for unit (White vs. Black) 0.018 -0.008 0.026 322.614
B. White vs. Hispanic
Index measure of favorable in-person interactions (White vs. Hispanic) 0.014 -0.092 0.107 115.589
Received post-visit callback (White vs. Hispanic) -0.02 0 -0.019 -4046.942
Received post-visit offer for unit (White vs. Hispanic) -0.002 -0.004 0.002 49.995
C. Black vs. Hispanic
Index measure of favorable in-person interactions (Black vs. Hispanic) -0.008 -0.147 0.139 94.696
Received post-visit callback (Black vs. Hispanic) -0.074 -0.023 -0.052 -229.774
Received post-visit offer for unit (Black vs. Hispanic) -0.019 0.004 -0.024 -551.477

4.15 Table A12 (p. A-31): Incidence of Early Stage Discrimination: Subjective Measures

table_a12 <- es_discrim_full_table[-c(1,2,14,15,27,28),]

table_a12_audit   <- es_discrim_full_table[-c(1,2,14,15,27,28),1:6]
table_a12_expsamp <- es_discrim_full_table[-c(1,2,14,15,27,28),c(1, 7:11)]
table_a12_control <- es_discrim_full_table[-c(1,2,14,15,27,28),c(1, 12:16)]

colnames(table_a12_audit)[2:6] <- colnames(table_a12_expsamp)[2:6] <- colnames(table_a12_control)[2:6] <- c("Majority Group Mean", "Minority Group Mean", "Difference (Maj-Min)", "p-value", "[N]")

kable(table_a12_audit, caption="**Table A12, Panel I. All Pursued Cases in Audit Sample**")
Table A12, Panel I. All Pursued Cases in Audit Sample
es.wb.labs Majority Group Mean Minority Group Mean Difference (Maj-Min) p-value [N]
numattr No. of attributes brought up (White vs. Black) 1.101 1.041 0.06 (0.136) [2711]
numskep No. attributes - skeptical response (White vs. Black) 0.053 0.027 0.026 (0.001) [2711]
numpos No. attributes - positive response (White vs. Black) 0.132 0.125 0.007 (0.637) [2711]
numneu No. attributes - neutral response (White vs. Black) 0.918 0.894 0.024 (0.501) [2711]
numneg No. attributes - negative response (White vs. Black) 0.051 0.022 0.029 (0) [2711]
pctskep Pct. of attributes - skeptical response (White vs. Black) 0.012 0.008 0.005 (0.021) [2711]
pctpos Pct. of attributes - positive response (White vs. Black) 0.033 0.033 0 (0.933) [2711]
pctneu Pct. of attributes - neutral response (White vs. Black) 0.35 0.328 0.022 (0.026) [2711]
pctneg Pct. of attributes - negative response (White vs. Black) 0.013 0.006 0.007 (0) [2711]
anyskep Responded skeptically for any attribute (White vs. Black) 0.032 0.016 0.017 (0) [2711]
anyneg Responded negatively for any attribute (White vs. Black) 0.035 0.016 0.019 (0) [2711]
numattr No. of attributes brought up (White vs. Hispanic) 1.101 1.07 0.031 (0.434) [2711]
numskep No. attributes - skeptical response (White vs. Hispanic) 0.053 0.065 -0.012 (0.241) [2711]
numpos No. attributes - positive response (White vs. Hispanic) 0.132 0.14 -0.008 (0.532) [2711]
numneu No. attributes - neutral response (White vs. Hispanic) 0.918 0.884 0.035 (0.32) [2711]
numneg No. attributes - negative response (White vs. Hispanic) 0.051 0.046 0.005 (0.543) [2711]
pctskep Pct. of attributes - skeptical response (White vs. Hispanic) 0.012 0.016 -0.004 (0.114) [2711]
pctpos Pct. of attributes - positive response (White vs. Hispanic) 0.033 0.037 -0.004 (0.27) [2711]
pctneu Pct. of attributes - neutral response (White vs. Hispanic) 0.35 0.34 0.01 (0.305) [2711]
pctneg Pct. of attributes - negative response (White vs. Hispanic) 0.013 0.013 0 (0.929) [2711]
anyskep Responded skeptically for any attribute (White vs. Hispanic) 0.032 0.036 -0.004 (0.435) [2711]
anyneg Responded negatively for any attribute (White vs. Hispanic) 0.035 0.03 0.004 (0.324) [2711]
numattr No. of attributes brought up (Black vs. Hispanic) 1.041 1.07 -0.029 (0.47) [2711]
numskep No. attributes - skeptical response (Black vs. Hispanic) 0.027 0.065 -0.038 (0) [2711]
numpos No. attributes - positive response (Black vs. Hispanic) 0.125 0.14 -0.015 (0.327) [2711]
numneu No. attributes - neutral response (Black vs. Hispanic) 0.894 0.884 0.01 (0.775) [2711]
numneg No. attributes - negative response (Black vs. Hispanic) 0.022 0.046 -0.024 (0) [2711]
pctskep Pct. of attributes - skeptical response (Black vs. Hispanic) 0.008 0.016 -0.008 (0) [2711]
pctpos Pct. of attributes - positive response (Black vs. Hispanic) 0.033 0.037 -0.004 (0.278) [2711]
pctneu Pct. of attributes - neutral response (Black vs. Hispanic) 0.328 0.34 -0.013 (0.202) [2711]
pctneg Pct. of attributes - negative response (Black vs. Hispanic) 0.006 0.013 -0.007 (0.001) [2711]
anyskep Responded skeptically for any attribute (Black vs. Hispanic) 0.016 0.036 -0.02 (0) [2711]
anyneg Responded negatively for any attribute (Black vs. Hispanic) 0.016 0.03 -0.014 (0) [2711]
kable(table_a12_expsamp, caption="**Table A12, Panel II. All Pursued Cases in Experimental Sample**")
Table A12, Panel II. All Pursued Cases in Experimental Sample
es.wb.labs Majority Group Mean Minority Group Mean Difference (Maj-Min) p-value [N]
numattr No. of attributes brought up (White vs. Black) 2.023 1.936 0.087 (0.334) [653]
numskep No. attributes - skeptical response (White vs. Black) 0.077 0.041 0.035 (0.073) [653]
numpos No. attributes - positive response (White vs. Black) 0.262 0.256 0.006 (0.876) [653]
numneu No. attributes - neutral response (White vs. Black) 1.703 1.668 0.035 (0.675) [653]
numneg No. attributes - negative response (White vs. Black) 0.058 0.012 0.046 (0.001) [653]
pctskep Pct. of attributes - skeptical response (White vs. Black) 0.018 0.015 0.003 (0.602) [653]
pctpos Pct. of attributes - positive response (White vs. Black) 0.07 0.07 0.001 (0.925) [653]
pctneu Pct. of attributes - neutral response (White vs. Black) 0.706 0.646 0.06 (0.011) [653]
pctneg Pct. of attributes - negative response (White vs. Black) 0.014 0.003 0.011 (0.002) [653]
anyskep Responded skeptically for any attribute (White vs. Black) 0.049 0.025 0.025 (0.018) [653]
anyneg Responded negatively for any attribute (White vs. Black) 0.04 0.009 0.031 (0) [653]
numattr No. of attributes brought up (White vs. Hispanic) 2.023 2.072 -0.049 (0.549) [653]
numskep No. attributes - skeptical response (White vs. Hispanic) 0.077 0.081 -0.005 (0.831) [653]
numpos No. attributes - positive response (White vs. Hispanic) 0.262 0.27 -0.008 (0.832) [653]
numneu No. attributes - neutral response (White vs. Hispanic) 1.703 1.75 -0.047 (0.524) [653]
numneg No. attributes - negative response (White vs. Hispanic) 0.058 0.052 0.006 (0.706) [653]
pctskep Pct. of attributes - skeptical response (White vs. Hispanic) 0.018 0.023 -0.005 (0.39) [653]
pctpos Pct. of attributes - positive response (White vs. Hispanic) 0.07 0.077 -0.007 (0.514) [653]
pctneu Pct. of attributes - neutral response (White vs. Hispanic) 0.706 0.714 -0.009 (0.656) [653]
pctneg Pct. of attributes - negative response (White vs. Hispanic) 0.014 0.015 -0.002 (0.742) [653]
anyskep Responded skeptically for any attribute (White vs. Hispanic) 0.049 0.051 -0.002 (0.898) [653]
anyneg Responded negatively for any attribute (White vs. Hispanic) 0.04 0.038 0.002 (0.876) [653]
numattr No. of attributes brought up (Black vs. Hispanic) 1.936 2.072 -0.136 (0.107) [653]
numskep No. attributes - skeptical response (Black vs. Hispanic) 0.041 0.081 -0.04 (0.045) [653]
numpos No. attributes - positive response (Black vs. Hispanic) 0.256 0.27 -0.014 (0.749) [653]
numneu No. attributes - neutral response (Black vs. Hispanic) 1.668 1.75 -0.083 (0.298) [653]
numneg No. attributes - negative response (Black vs. Hispanic) 0.012 0.052 -0.04 (0.002) [653]
pctskep Pct. of attributes - skeptical response (Black vs. Hispanic) 0.015 0.023 -0.008 (0.196) [653]
pctpos Pct. of attributes - positive response (Black vs. Hispanic) 0.07 0.077 -0.008 (0.498) [653]
pctneu Pct. of attributes - neutral response (Black vs. Hispanic) 0.646 0.714 -0.069 (0.003) [653]
pctneg Pct. of attributes - negative response (Black vs. Hispanic) 0.003 0.015 -0.012 (0.001) [653]
anyskep Responded skeptically for any attribute (Black vs. Hispanic) 0.025 0.051 -0.026 (0.011) [653]
anyneg Responded negatively for any attribute (Black vs. Hispanic) 0.009 0.038 -0.029 (0) [653]
kable(table_a12_control, caption="**Table A12, Panel III. All Pursued Cases in Control Group**")
Table A12, Panel III. All Pursued Cases in Control Group
es.wb.labs Majority Group Mean Minority Group Mean Difference (Maj-Min) p-value [N]
numattr No. of attributes brought up (White vs. Black) 2.151 1.961 0.19 (0.145) [279]
numskep No. attributes - skeptical response (White vs. Black) 0.097 0.039 0.057 (0.106) [279]
numpos No. attributes - positive response (White vs. Black) 0.269 0.258 0.011 (0.852) [279]
numneu No. attributes - neutral response (White vs. Black) 1.821 1.685 0.136 (0.259) [279]
numneg No. attributes - negative response (White vs. Black) 0.061 0.018 0.043 (0.064) [279]
pctskep Pct. of attributes - skeptical response (White vs. Black) 0.022 0.011 0.012 (0.176) [279]
pctpos Pct. of attributes - positive response (White vs. Black) 0.071 0.066 0.005 (0.743) [279]
pctneu Pct. of attributes - neutral response (White vs. Black) 0.745 0.666 0.078 (0.028) [279]
pctneg Pct. of attributes - negative response (White vs. Black) 0.016 0.003 0.013 (0.027) [279]
anyskep Responded skeptically for any attribute (White vs. Black) 0.054 0.018 0.036 (0.025) [279]
anyneg Responded negatively for any attribute (White vs. Black) 0.039 0.011 0.029 (0.032) [279]
numattr No. of attributes brought up (White vs. Hispanic) 2.151 1.989 0.161 (0.19) [279]
numskep No. attributes - skeptical response (White vs. Hispanic) 0.097 0.039 0.057 (0.074) [279]
numpos No. attributes - positive response (White vs. Hispanic) 0.269 0.24 0.029 (0.533) [279]
numneu No. attributes - neutral response (White vs. Hispanic) 1.821 1.728 0.093 (0.383) [279]
numneg No. attributes - negative response (White vs. Hispanic) 0.061 0.022 0.039 (0.07) [279]
pctskep Pct. of attributes - skeptical response (White vs. Hispanic) 0.022 0.012 0.01 (0.218) [279]
pctpos Pct. of attributes - positive response (White vs. Hispanic) 0.071 0.069 0.002 (0.898) [279]
pctneu Pct. of attributes - neutral response (White vs. Hispanic) 0.745 0.733 0.011 (0.699) [279]
pctneg Pct. of attributes - negative response (White vs. Hispanic) 0.016 0.007 0.009 (0.204) [279]
anyskep Responded skeptically for any attribute (White vs. Hispanic) 0.054 0.029 0.025 (0.145) [279]
anyneg Responded negatively for any attribute (White vs. Hispanic) 0.039 0.018 0.022 (0.109) [279]
numattr No. of attributes brought up (Black vs. Hispanic) 1.961 1.989 -0.029 (0.813) [279]
numskep No. attributes - skeptical response (Black vs. Hispanic) 0.039 0.039 0 (1) [279]
numpos No. attributes - positive response (Black vs. Hispanic) 0.258 0.24 0.018 (0.772) [279]
numneu No. attributes - neutral response (Black vs. Hispanic) 1.685 1.728 -0.043 (0.703) [279]
numneg No. attributes - negative response (Black vs. Hispanic) 0.018 0.022 -0.004 (0.782) [279]
pctskep Pct. of attributes - skeptical response (Black vs. Hispanic) 0.011 0.012 -0.001 (0.849) [279]
pctpos Pct. of attributes - positive response (Black vs. Hispanic) 0.066 0.069 -0.003 (0.831) [279]
pctneu Pct. of attributes - neutral response (Black vs. Hispanic) 0.666 0.733 -0.067 (0.036) [279]
pctneg Pct. of attributes - negative response (Black vs. Hispanic) 0.003 0.007 -0.005 (0.302) [279]
anyskep Responded skeptically for any attribute (Black vs. Hispanic) 0.018 0.029 -0.011 (0.367) [279]
anyneg Responded negatively for any attribute (Black vs. Hispanic) 0.011 0.018 -0.007 (0.415) [279]

4.16 Table A13 (p. A-32): Treatment Noncompliance Incidence

nc.counts <- table(dat$TA, dat$TD, useNA="ifany")
nc.props  <- table(dat$TA, dat$TD, useNA="ifany")/rowSums(table(dat$TA, dat$TD, useNA="ifany"))

table_a13 <- cbind(nc.counts[,1], nc.props[,1],
              nc.counts[,2], nc.props[,2],
              nc.counts[,3], nc.props[,3],
              rowSums(table(dat$TA, dat$TD, useNA="ifany")))

table_a13[,c(2,4,6)] <- round(table_a13[,c(2,4,6)], digits=2)
table_a13 <- cbind(c("Control","Monitoring","Punitive"), table_a13)
colnames(table_a13) <- c("Assigned Arm",
                        "Control (N)", "Control (Proportion)", 
                        "Monitoring (N)", "Monitoring (Proportion)",
                        "Punitive (N)", "Punitive (Proportion)", "Row Totals")

rownames(table_a13) <- NULL
kable(table_a13, caption="**Table A13**")
Table A13
Assigned Arm Control (N) Control (Proportion) Monitoring (N) Monitoring (Proportion) Punitive (N) Punitive (Proportion) Row Totals
Control 279 1 0 0 0 0 279
Monitoring 31 0.18 143 0.82 0 0 174
Punitive 38 0.19 17 0.08 145 0.72 200

4.17 Table A14 (p. A-34): Estimated Complier Average Causal Effects of Messages on Net Discrimination Levels

# DEFINE PAIRS OF TREATMENT-COMPARISON DIFFERENCES OF INTEREST
diffs <- rbind(c(1,0),
               c(2,0),
               c(2,1))
colnames(diffs) <- c("treatment","comparison")

# ---------------------------------------------------------------------- #
# Code treatment receipt two ways for P vs C comparison
# Alt Approach A: treat Z=punitive D=monitoring as fail-to-treat
# Alt Approach B: treat Z=punitive D=monitoring as effectively punitive messaging
dat$TD_a <- dat$TD_b <- dat$TD
dat$TD_a <- with(dat, ifelse(TA==2 & TD==1, 0, TD_a))
dat$TD_b <- with(dat, ifelse(TA==2 & TD==1, 2, TD_b))

# ---------------------------------------------------------------------- #
# CACE helper function
# inputs:
#  -- dat: data frame
#  -- Y: name of outcome variable
#  -- D: name of treatment receipt variable
#  -- trt_t: value of treatment arm (e.g., 2 for punitive)
#  -- trt_c: value of comparison arm (e.g., 0 for control)

cace.est <- function(dat, Y, D, trt_t, trt_c){
  sub <- dat[dat$TA %in% c(trt_t, trt_c), ]

  sub$Z <- ifelse(sub$TA==trt_t, 1, 0)
  sub$D <- ifelse(sub[[D]]==trt_t, 1, 0)

  sub$wtvar <- sub[[paste0("ipw",trt_t,trt_c)]]
  
  itt.model <- paste(Y, "~ Z + as.factor(block)")
  ittd.model <- "D ~ Z + as.factor(block)"
  iv.model <- paste(Y, "~ D + as.factor(block) | Z + as.factor(block)")
  
  itt.fit <- lm(formula=itt.model, data=sub, weights=wtvar)
  ittd.fit <- lm(formula=ittd.model, data=sub, weights=wtvar)
  iv.fit <- ivreg(formula=iv.model, data=sub, weights=wtvar)
  
  itt.est <- summary(itt.fit)$coef[2,1] # itt
  ittd.est <- summary(ittd.fit)$coef[2,1] # itt.d
  cace.est <- summary(iv.fit)$coef[2,1] # cace
  
  # p value: use one sided if M/C or P/C (j %in% 1:2), use two sided if P/M (j==3)
  if( trt_c == 0 ){
    cace.pval <- pt(coef(summary(iv.fit))[,3], summary(iv.fit)$df[2], lower=TRUE)[2]
  } else {
    cace.pval <- summary(iv.fit)$coef[2,4]
  }
  
  out <- c(Y, trt_t, trt_c, round(itt.est,3), round(ittd.est,3), round(cace.est,3), round(cace.pval,3))
  names(out) <- c("Y", "treatment", "comparison", "ITT", "ITT_D", "CACE", "p-value")
  return(out)
}

# M vs C - estimate CACE using IV
mc.cace.wb <- rbind(cace.est(dat=dat, Y=outvars.wb[2], D="TD", trt_t=1, trt_c=0),
                    cace.est(dat=dat, Y=outvars.wb[3], D="TD", trt_t=1, trt_c=0),
                    cace.est(dat=dat, Y=outvars.wb[4], D="TD", trt_t=1, trt_c=0))

mc.cace.wh <- rbind(cace.est(dat=dat, Y=outvars.wh[2], D="TD", trt_t=1, trt_c=0),
                    cace.est(dat=dat, Y=outvars.wh[3], D="TD", trt_t=1, trt_c=0),
                    cace.est(dat=dat, Y=outvars.wh[4], D="TD", trt_t=1, trt_c=0))

mc.cace.bh <- rbind(cace.est(dat=dat, Y=outvars.bh[2], D="TD", trt_t=1, trt_c=0),
                    cace.est(dat=dat, Y=outvars.bh[3], D="TD", trt_t=1, trt_c=0),
                    cace.est(dat=dat, Y=outvars.bh[4], D="TD", trt_t=1, trt_c=0))

# P vs C (v1) - treat Z=punitive D=monitoring as fail-to-treat
pc1.cace.wb <- rbind(cace.est(dat=dat, Y=outvars.wb[2], D="TD_a", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wb[3], D="TD_a", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wb[4], D="TD_a", trt_t=2, trt_c=0))

pc1.cace.wh <- rbind(cace.est(dat=dat, Y=outvars.wh[2], D="TD_a", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wh[3], D="TD_a", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wh[4], D="TD_a", trt_t=2, trt_c=0))

pc1.cace.bh <- rbind(cace.est(dat=dat, Y=outvars.bh[2], D="TD_a", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.bh[3], D="TD_a", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.bh[4], D="TD_a", trt_t=2, trt_c=0))

# P vs C (v2) - treat Z=punitive D=monitoring as effectively punitive messaging
pc2.cace.wb <- rbind(cace.est(dat=dat, Y=outvars.wb[2], D="TD_b", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wb[3], D="TD_b", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wb[4], D="TD_b", trt_t=2, trt_c=0))

pc2.cace.wh <- rbind(cace.est(dat=dat, Y=outvars.wh[2], D="TD_b", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wh[3], D="TD_b", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.wh[4], D="TD_b", trt_t=2, trt_c=0))

pc2.cace.bh <- rbind(cace.est(dat=dat, Y=outvars.bh[2], D="TD_b", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.bh[3], D="TD_b", trt_t=2, trt_c=0),
                     cace.est(dat=dat, Y=outvars.bh[4], D="TD_b", trt_t=2, trt_c=0))

# Stitch together results
stitch.cace <- function(wb, wh, bh, wb.labs, wh.labs, bh.labs){
  blanks <- rep(NA, 4)
  out <- rbind(blanks, wb[,4:7],
               blanks, wh[,4:7],
               blanks, bh[,4:7])
  out <- cbind(c("A. White vs. Black", wb.labs[2:4],
                 "B. White vs. Hispanic", wh.labs[2:4],
                 "C. Black vs. Hispanic", bh.labs[2:4]), out)
  out[,1] <- gsub("(White vs. Black)", "", out[,1], fixed=TRUE)
  out[,1] <- gsub("(White vs. Hispanic)", "", out[,1], fixed=TRUE)
  out[,1] <- gsub("(Black vs. Hispanic)", "", out[,1], fixed=TRUE)

  rownames(out) <- NULL
  colnames(out) <- c("Outcome", "ITT", "ITT_D", "CACE", "p-value")
  return(out)
}

cace.mc <- stitch.cace(wb=mc.cace.wb,
                       wh=mc.cace.wh,
                       bh=mc.cace.bh,
                       wb.labs=outvars.wb.labs,
                       wh.labs=outvars.wh.labs,
                       bh.labs=outvars.bh.labs)
cace.pc1 <- stitch.cace(wb=pc1.cace.wb,
                        wh=pc1.cace.wh,
                        bh=pc1.cace.bh,
                        wb.labs=outvars.wb.labs,
                        wh.labs=outvars.wh.labs,
                        bh.labs=outvars.bh.labs)
cace.pc2 <- stitch.cace(wb=pc2.cace.wb,
                        wh=pc2.cace.wh,
                        bh=pc2.cace.bh,
                        wb.labs=outvars.wb.labs,
                        wh.labs=outvars.wh.labs,
                        bh.labs=outvars.bh.labs)

# Assemble final CACE table (same info; two possible formats)
cace.table1 <- rbind(c("I. Monitoring vs. Control", rep(NA, 4)),
                     cace.mc,
                     c("II. Punitive vs. Control (Upper Bound)", rep(NA,4)),
                     cace.pc1,
                     c("III. Punitive vs. Control (Lower Bound)", rep(NA, 4)),
                     cace.pc2)
cace.pc <- list()
keep.rows <- c(1,5,9)
for(i in 1:nrow(cace.pc1)){
  if(i %in% keep.rows){
    cace.pc[[i]] <- cace.pc1[i,]
  } else {
    cace.pc[[i]] <- rbind(c(cace.pc1[i,1], rep(NA, 4)),
                          c("v1",cace.pc1[i,2:5]),
                          c("v2",cace.pc2[i,2:5]))
  }
}
cace.pc <- do.call(rbind, cace.pc)
cace.pc[,1] <- gsub("v1", "Upper bound", cace.pc[,1], fixed=TRUE)
cace.pc[,1] <- gsub("v2", "Lower bound", cace.pc[,1], fixed=TRUE)
cace.table2 <- rbind(c("I. Monitoring vs. Control", rep(NA, 4)),
                     cace.mc,
                     c("II. Punitive vs. Control (Upper and Lower Bounds)", rep(NA, 4)),
                     cace.pc)
kable(cace.table2, caption="**Table A14**")
Table A14
Outcome ITT ITT_D CACE p-value
I. Monitoring vs. Control
A. White vs. Black
Index measure of favorable in-person interactions -0.015 0.81 -0.018 0.388
Received post-visit callback -0.002 0.81 -0.002 0.486
Received post-visit offer for unit -0.003 0.81 -0.004 0.464
B. White vs. Hispanic
Index measure of favorable in-person interactions -0.061 0.81 -0.075 0.14
Received post-visit callback -0.036 0.81 -0.045 0.202
Received post-visit offer for unit -0.017 0.81 -0.021 0.31
C. Black vs. Hispanic
Index measure of favorable in-person interactions -0.089 0.81 -0.106 0.045
Received post-visit callback -0.035 0.81 -0.043 0.206
Received post-visit offer for unit -0.014 0.81 -0.017 0.328
II. Punitive vs. Control (Upper and Lower Bounds)
A. White vs. Black
Index measure of favorable in-person interactions
Upper bound 0.007 0.718 0.01 0.555
Lower bound 0.007 0.8 0.009 0.555
Received post-visit callback
Upper bound 0.019 0.718 0.027 0.675
Lower bound 0.019 0.8 0.024 0.676
Received post-visit offer for unit
Upper bound 0.018 0.718 0.025 0.696
Lower bound 0.018 0.8 0.023 0.696
B. White vs. Hispanic
Index measure of favorable in-person interactions
Upper bound 0.048 0.718 0.067 0.806
Lower bound 0.048 0.8 0.061 0.806
Received post-visit callback
Upper bound -0.066 0.718 -0.092 0.056
Lower bound -0.066 0.8 -0.083 0.056
Received post-visit offer for unit
Upper bound -0.021 0.718 -0.029 0.269
Lower bound -0.021 0.8 -0.026 0.269
C. Black vs. Hispanic
Index measure of favorable in-person interactions
Upper bound 0.039 0.718 0.053 0.785
Lower bound 0.039 0.8 0.047 0.785
Received post-visit callback
Upper bound -0.085 0.718 -0.119 0.02
Lower bound -0.085 0.8 -0.107 0.019
Received post-visit offer for unit
Upper bound -0.039 0.718 -0.055 0.123
Lower bound -0.039 0.8 -0.049 0.122
Ys <- c("index.wb", "ncb_wb", "noff_wb",
        "index.wh", "ncb_wh", "noff_wh",
        "index.bh", "ncb_bh", "noff_bh")

llrace <- c("primary_api", "primary_blk", "primary_hsp", "primary_wht")
llage <- c("primary_age_18to34", "primary_age_35to44", "primary_age_45to64", "primary_age_65over", "primary_age_unknown")
testerfe <- c("tid.A01", "tid.A10", "tid.A11", "tid.A13", "tid.A02", "tid.A21", "tid.A22", "tid.A03", "tid.A04", "tid.A05",
              "tid.A06", "tid.A07", "tid.A08", "tid.A09", "tid.B01", "tid.B11", "tid.B12", "tid.B14", "tid.B16", "tid.B02", 
              "tid.B20", "tid.B23", "tid.B24", "tid.B25", "tid.B27", "tid.B03", "tid.B04", "tid.B06", "tid.B07", "tid.B08", 
              "tid.B09", "tid.C01", "tid.C10", "tid.C12", "tid.C13", "tid.C14", "tid.C15", "tid.C02", "tid.C27", "tid.C29", 
              "tid.C03", "tid.C31", "tid.C33", "tid.C04", "tid.C05", "tid.C06", "tid.C07", "tid.C08", "tid.C09")
tafe <- testerfe[grepl(".A",testerfe,fixed=TRUE)]
tbfe <- testerfe[grepl(".B",testerfe,fixed=TRUE)]
tcfe <- testerfe[grepl(".C",testerfe,fixed=TRUE)]

Xs <- c("frame", "partnered", "rent + m.rent", "numbr", "sqft + m.sqft","regime1","regime2","regime3",
        "nnumattr_bh", "nnumattr_wh", "nnumattr_wb", "nnumskep_bh", "nnumskep_wh", "nnumskep_wb", 
        "npctskep_bh", "npctskep_wh", "npctskep_wb", "nnumpos_bh" , "nnumpos_wh" , "nnumpos_wb" , 
        "nnumneu_bh" , "nnumneu_wh" , "nnumneu_wb" , "nnumneg_bh" , "nnumneg_wh" , "nnumneg_wb" , 
        "npctpos_bh" , "npctpos_wh" , "npctpos_wb" , "npctneu_bh" , "npctneu_wh" , "npctneu_wb" , 
        "npctneg_bh" , "npctneg_wh" , "npctneg_wb" , "nanyskep_bh", "nanyskep_wh", "nanyskep_wb", 
        "nanyneg_bh" , "nanyneg_wh" , "nanyneg_wb" , "team_gender", "broker",
        "callorder.wb", "callorder.wh", "callorder.bh",
        "incrank.wb.gt", "incrank.wh.gt", "incrank.bh.gt",
        "incrank.wb.eq", "incrank.wh.eq", "incrank.bh.eq",
        "incrank.wb.lt", "incrank.wh.lt", "incrank.bh.lt",
        "boro.brx", "boro.brk", "boro.mnh", "boro.que", "boro.stn",
        "inc.w.hi","inc.b.hi","inc.h.hi","ll.female + m.ll.female",
        llrace, llage, testerfe)

# Define sets of matched pair vars to kick out, by outcome
wb.vars <- Xs[grepl(".wb",Xs,fixed=TRUE) | grepl("_wb",Xs,fixed=TRUE)]
wh.vars <- Xs[grepl(".wh",Xs,fixed=TRUE) | grepl("_wh",Xs,fixed=TRUE)]
bh.vars <- Xs[grepl(".bh",Xs,fixed=TRUE) | grepl("_bh",Xs,fixed=TRUE)]


# Change factor vars to numeric (needed for cross validation function)
dat$frame <- as.numeric(dat$frame) # 1=likely disc, 2=representative (recoded to 0)
dat$frame <- ifelse(dat$frame==2, 0, dat$frame)

dat$team_gender <- as.numeric(dat$team_gender) # 1=female, 2=male (recoded to 0)
dat$team_gender <- ifelse(dat$team_gender==2, 0, dat$team_gender)

# Subset data by treatment arm
sub0 <- dat[dat$TA==0,]
sub1 <- dat[dat$TA==1,]
sub2 <- dat[dat$TA==2,]

4.18 Table A15 (p. A-36): Covariate Adjusted ITT Estimates

4.18.1 Lasso Regression for Covariate Selection

We first use the lasso as a principled method to select pre-treatment covariates that are prognostic of outcomes by arm (which are then used to estimate covariate adjusted ITT effects). The code below conducts this covariate selection procedure and produces the output files misc_covselect_select0.csv, misc_covselect_select1.csv, and misc_covselect_select2.csv (which are included in the replication archive). Note: This code chunk takes awhile to run.

#------------------------------------------------#
# Variable selection using lasso
#------------------------------------------------#
set.seed(20150210)
nfolds <- 5
niter <- 1000  # number of times to shuffle folds

# ============== #
# Z=0 (control)
# ============== #

# initialize object to store results
# nested lists of 9 (one per outcome)
# one for model selected at lambda.min (cv0.min)
# another for model selected at lambda.1se (cv0.1se)
# one for a final set of covs we choose (cv0)
cv0.min <- cv0.1se <- list()
cv0 <- list()

# iterate over outcomes
  for(j in 1:length(Ys)){
  # cat("\n ******************* Outcome: ", Ys[j] ," ******************* \n", sep="")
  # pre-process data conditional on outcome of interest
  if(j %in% 1:3) {
    subX <- Xs[!(Xs %in% c(wh.vars, bh.vars))]  # if excluding hisp tester FE add: , tbfe
  } else if(j %in% 4:6) {
    subX <- Xs[!(Xs %in% c(wb.vars, bh.vars))]  # if excluding black tester FE add: , tafe
  } else if(j %in% 7:9) {
    subX <- Xs[!(Xs %in% c(wb.vars, wh.vars))]  # if excluding white tester FE add: , tcfe
  }
  temp <- sub0[,names(sub0) %in% c(Ys[j], subX, "ipw")]
  temp <- temp[apply(temp, 1, function(x) sum(is.na(x))) == 0,]  # kick out missing (required for cv.glmnet)
    
  # initialize object to store
  cv.out <- list()
  
  # initialize vector of fold assignments
  folds0 <- c(rep(1, round(nrow(temp)/nfolds, 0)),
              rep(2, round(nrow(temp)/nfolds, 0)),
              rep(3, round(nrow(temp)/nfolds, 0)),
              rep(4, round(nrow(temp)/nfolds, 0)),
              rep(5, nrow(temp) - 4*round(nrow(temp)/nfolds, 0) ))
    
  # iterate over shuffled fold assignments
  for(i in 1:niter){
    # cat(i, " ", sep="")
    # shuffle folds
    fold.assign <- folds0[sample(1:nrow(temp), nrow(temp), replace=FALSE)]
    # lasso
    cv.out[[i]] <- cv.glmnet(x=as.matrix(temp[,!(names(temp) %in% c(Ys[j], "ipw"))]),
                             y=as.vector(temp[[Ys[j]]]),
                             weights=temp$ipw,
                             type.measure="mse",
                             nfolds=5,
                             foldid=fold.assign,
                             alpha=1)
  }
  # Grab variables associated with model at lambda.1se and lambda.min
  out.1se  <- lapply(cv.out, function(x) grabvars(x, s="lambda.1se"))
  out.min  <- lapply(cv.out, function(x) grabvars(x, s="lambda.min"))
  vars.1se <- lapply(out.1se, function(x) as.character(x$vars[,1]))
  vars.min <- lapply(out.min, function(x) as.character(x$vars[,1]))
  
  # For each candidate model, fit Y ~ X for the treatment arm of interest
  fit.1se <- fit.min <- list()
  
  for(k in 1:length(vars.1se)){
    # one SE rule
    if ( length(vars.1se[[k]]) > 1 ) {
      X.selected <- vars.1se[[k]][vars.1se[[k]] != "(Intercept)"]
      X.selected <- paste(X.selected, collapse=" + ")
      model <- paste(Ys[j], "~", X.selected, sep=" ")
      fit <- lm(formula=model, data=sub0)
      adjusted.r2 <- summary(fit)$adj.r.squared
      f.stat <- summary(fit)$fstatistic[1]
      f.pvalue <- 1-pf(q=summary(fit)$fstatistic[1], df1=summary(fit)$fstatistic[2], df2=summary(fit)$fstatistic[3])
      fit.1se[[k]] <- c(X.selected, adjusted.r2, f.stat, f.pvalue)      
    } else {
      adjusted.r2 <- NA
      f.stat <- NA
      f.pvalue <- NA
      fit.1se[[k]] <- c(NA, adjusted.r2, f.stat, f.pvalue)
    }
    
    # min lambda
    if ( length(vars.min[[k]]) > 1 ) {
      X.selected <- vars.min[[k]][vars.min[[k]] != "(Intercept)"]
      X.selected <- paste(X.selected, collapse=" + ")
      model <- paste(Ys[j], "~", X.selected, sep=" ")
      fit <- lm(formula=model, data=sub0)
      adjusted.r2 <- summary(fit)$adj.r.squared
      f.stat <- summary(fit)$fstatistic[1]
      f.pvalue <- 1-pf(q=summary(fit)$fstatistic[1], df1=summary(fit)$fstatistic[2], df2=summary(fit)$fstatistic[3])
      fit.min[[k]] <- c(X.selected, adjusted.r2, f.stat, f.pvalue)      
    } else {
      adjusted.r2 <- NA
      f.stat <- NA
      f.pvalue <- NA
      fit.min[[k]] <- c(NA, adjusted.r2, f.stat, f.pvalue)      
    }
  }
  
  # collect all results
  fit.min <- do.call(rbind, fit.min)
  fit.1se <- do.call(rbind, fit.1se)
  fit.min <- as.data.frame(fit.min)
  fit.1se <- as.data.frame(fit.1se)
  names(fit.1se) <- names(fit.min) <- c("variables","adj.r2","f.stat","f.pvalue")
  fit.min[,1] <- as.character(fit.min[,1])
  fit.1se[,1] <- as.character(fit.1se[,1])
  for(v in 2:ncol(fit.min)) fit.min[,v] <- as.numeric(as.character(fit.min[,v]))
  for(v in 2:ncol(fit.1se)) fit.1se[,v] <- as.numeric(as.character(fit.1se[,v]))
  
  # choose variables that are (1) not missing, (2) with highest adj R2, and (3) with significant f stat p value
  choose.min <- fit.min[!is.na(fit.min$variables) & fit.min$adj.r2==max(fit.min$adj.r2, na.rm=TRUE) & fit.min$f.pvalue < 0.05,]
  choose.1se <- fit.1se[!is.na(fit.1se$variables) & fit.1se$adj.r2==max(fit.1se$adj.r2, na.rm=TRUE) & fit.1se$f.pvalue < 0.05,]

  # store result - lambda min
  if(nrow(choose.min)==0){
    cv0.min[[j]] <- rep(NA,4)
  } else {
    cv0.min[[j]] <- choose.min <- choose.min[!duplicated(choose.min),]
  }
  
  # store result - lambda 1se
  if(nrow(choose.1se)==0){
    cv0.1se[[j]] <- rep(NA,4)
  } else {
    cv0.1se[[j]] <- choose.1se <- choose.1se[!duplicated(choose.1se),]
  }
}

# ============== #
# Z=1 (monitoring)
# ============== # 

# initialize object to store results
# nested lists of 9 (one per outcome)
# one for model selected at lambda.min (cv1.min)
# another for model selected at lambda.1se (cv1.1se)
# one for a final set of covs we choose (cv1)
cv1.min <- cv1.1se <- list()
# iterate over outcomes
  for(j in 1:length(Ys)){
  # cat("\n ******************* Outcome: ", Ys[j] ," ******************* \n", sep="")
  # pre-process data conditional on outcome of interest
  if(j %in% 1:3) {
    subX <- Xs[!(Xs %in% c(wh.vars, bh.vars))]  # if excluding hisp tester FE add: , tbfe
  } else if(j %in% 4:6) {
    subX <- Xs[!(Xs %in% c(wb.vars, bh.vars))]  # if excluding black tester FE add: , tafe
  } else if(j %in% 7:9) {
    subX <- Xs[!(Xs %in% c(wb.vars, wh.vars))]  # if excluding white tester FE add: , tcfe
  }
  temp <- sub1[,names(sub1) %in% c(Ys[j], subX, "ipw")]
  temp <- temp[apply(temp, 1, function(x) sum(is.na(x))) == 0,]  # kick out missing (required for cv.glmnet)
  
  # initialize object to store
  cv.out <- list()
  # initialize vector of fold assignments
  folds0 <- c(rep(1, round(nrow(temp)/nfolds, 0)),
              rep(2, round(nrow(temp)/nfolds, 0)),
              rep(3, round(nrow(temp)/nfolds, 0)),
              rep(4, round(nrow(temp)/nfolds, 0)),
              rep(5, nrow(temp) - 4*round(nrow(temp)/nfolds, 0) ))
  # iterate over shuffled fold assignments
  for(i in 1:niter){
    # cat(i, " ", sep="")
    # shuffle folds
    fold.assign <- folds0[sample(1:nrow(temp), nrow(temp), replace=FALSE)]
    # lasso
    cv.out[[i]] <- cv.glmnet(x=as.matrix(temp[,!(names(temp) %in% c(Ys[j], "ipw"))]),
                             y=as.vector(temp[[Ys[j]]]),
                             weights=temp$ipw,
                             type.measure="mse",
                             nfolds=5,
                             foldid=fold.assign,
                             alpha=1)
  }
  # Grab variables associated with model at lambda.1se and lambda.min
  out.1se  <- lapply(cv.out, function(x) grabvars(x, s="lambda.1se"))
  out.min  <- lapply(cv.out, function(x) grabvars(x, s="lambda.min"))
  vars.1se <- lapply(out.1se, function(x) as.character(x$vars[,1]))
  vars.min <- lapply(out.min, function(x) as.character(x$vars[,1]))
  
  # For each candidate model, fit Y ~ X for the treatment arm of interest
  fit.1se <- fit.min <- list()
  for(k in 1:length(vars.1se)){
    # one SE rule
    if ( length(vars.1se[[k]]) > 1 ) {
      X.selected <- vars.1se[[k]][vars.1se[[k]] != "(Intercept)"]
      X.selected <- paste(X.selected, collapse=" + ")
      model <- paste(Ys[j], "~", X.selected, sep=" ")
      fit <- lm(formula=model, data=sub1)
      adjusted.r2 <- summary(fit)$adj.r.squared
      f.stat <- summary(fit)$fstatistic[1]
      f.pvalue <- 1-pf(q=summary(fit)$fstatistic[1], df1=summary(fit)$fstatistic[2], df2=summary(fit)$fstatistic[3])
      fit.1se[[k]] <- c(X.selected, adjusted.r2, f.stat, f.pvalue)      
    } else {
      adjusted.r2 <- NA
      f.stat <- NA
      f.pvalue <- NA
      fit.1se[[k]] <- c(NA, adjusted.r2, f.stat, f.pvalue)
    }
    # min lambda
    if ( length(vars.min[[k]]) > 1 ) {
      X.selected <- vars.min[[k]][vars.min[[k]] != "(Intercept)"]
      X.selected <- paste(X.selected, collapse=" + ")
      model <- paste(Ys[j], "~", X.selected, sep=" ")
      fit <- lm(formula=model, data=sub1)
      adjusted.r2 <- summary(fit)$adj.r.squared
      f.stat <- summary(fit)$fstatistic[1]
      f.pvalue <- 1-pf(q=summary(fit)$fstatistic[1], df1=summary(fit)$fstatistic[2], df2=summary(fit)$fstatistic[3])
      fit.min[[k]] <- c(X.selected, adjusted.r2, f.stat, f.pvalue)      
    } else {
      adjusted.r2 <- NA
      f.stat <- NA
      f.pvalue <- NA
      fit.min[[k]] <- c(NA, adjusted.r2, f.stat, f.pvalue)      
    }
  }
  
  # collect all results
  fit.min <- do.call(rbind, fit.min)
  fit.1se <- do.call(rbind, fit.1se)
  fit.min <- as.data.frame(fit.min)
  fit.1se <- as.data.frame(fit.1se)
  names(fit.1se) <- names(fit.min) <- c("variables","adj.r2","f.stat","f.pvalue")
  fit.min[,1] <- as.character(fit.min[,1])
  fit.1se[,1] <- as.character(fit.1se[,1])
  for(v in 2:ncol(fit.min)) fit.min[,v] <- as.numeric(as.character(fit.min[,v]))
  for(v in 2:ncol(fit.1se)) fit.1se[,v] <- as.numeric(as.character(fit.1se[,v]))
  
  # choose variables that are (1) not missing, (2) with highest adj R2, and (3) with significant f stat p value
  choose.min <- fit.min[!is.na(fit.min$variables) & fit.min$adj.r2==max(fit.min$adj.r2, na.rm=TRUE) & fit.min$f.pvalue < 0.05,]
  choose.1se <- fit.1se[!is.na(fit.1se$variables) & fit.1se$adj.r2==max(fit.1se$adj.r2, na.rm=TRUE) & fit.1se$f.pvalue < 0.05,]
  
  # store result - lambda min
  if(nrow(choose.min)==0){
    cv1.min[[j]] <- rep(NA,4)
  } else {
    cv1.min[[j]] <- choose.min <- choose.min[!duplicated(choose.min),]
  }
  
  # store result - lambda 1se
  if(nrow(choose.1se)==0){
    cv1.1se[[j]] <- rep(NA,4)
  } else {
    cv1.1se[[j]] <- choose.1se <- choose.1se[!duplicated(choose.1se),]
  }
  
  }

# ============== #
# Z=2 (punitive)
# ============== # 

# initialize object to store results
# nested lists of 9 (one per outcome)
# one for model selected at lambda.min (cv2.min)
# another for model selected at lambda.1se (cv2.1se)
# one for a final set of covs we choose (cv1)
cv2.min <- cv2.1se <- list()
# iterate over outcomes
  for(j in 1:length(Ys)){
  # cat("\n ******************* Outcome: ", Ys[j] ," ******************* \n", sep="")
  # pre-process data conditional on outcome of interest
  if(j %in% 1:3) {
    subX <- Xs[!(Xs %in% c(wh.vars, bh.vars))]  # if excluding hisp tester FE add: , tbfe
  } else if(j %in% 4:6) {
    subX <- Xs[!(Xs %in% c(wb.vars, bh.vars))]  # if excluding black tester FE add: , tafe
  } else if(j %in% 7:9) {
    subX <- Xs[!(Xs %in% c(wb.vars, wh.vars))]  # if excluding white tester FE add: , tcfe
  }
  temp <- sub2[,names(sub2) %in% c(Ys[j], subX, "ipw")]
  temp <- temp[apply(temp, 1, function(x) sum(is.na(x))) == 0,]  # kick out missing (required for cv.glmnet)
  # initialize object to store
  cv.out <- list()
  # initialize vector of fold assignments
  folds0 <- c(rep(1, round(nrow(temp)/nfolds, 0)),
              rep(2, round(nrow(temp)/nfolds, 0)),
              rep(3, round(nrow(temp)/nfolds, 0)),
              rep(4, round(nrow(temp)/nfolds, 0)),
              rep(5, nrow(temp) - 4*round(nrow(temp)/nfolds, 0) ))
  # iterate over shuffled fold assignments
  for(i in 1:niter){
    # cat(i, " ", sep="")
    # shuffle folds
    fold.assign <- folds0[sample(1:nrow(temp), nrow(temp), replace=FALSE)]
    # lasso
    cv.out[[i]] <- cv.glmnet(x=as.matrix(temp[,!(names(temp) %in% c(Ys[j], "ipw"))]),
                             y=as.vector(temp[[Ys[j]]]),
                             weights=temp$ipw,
                             type.measure="mse",
                             nfolds=5,
                             foldid=fold.assign,
                             alpha=1)
  }
  # Grab variables associated with model at lambda.1se and lambda.min
  out.1se <- lapply(cv.out, function(x) grabvars(x, s="lambda.1se"))
  out.min <- lapply(cv.out, function(x) grabvars(x, s="lambda.min"))
  vars.1se <- lapply(out.1se, function(x) as.character(x$vars[,1]))
  vars.min <- lapply(out.min, function(x) as.character(x$vars[,1]))
  # For each candidate model, fit Y ~ X for the treatment arm of interest
  fit.1se <- fit.min <- list()
  for(k in 1:length(vars.1se)){
    # one SE rule
    if ( length(vars.1se[[k]]) > 1 ) {
      X.selected <- vars.1se[[k]][vars.1se[[k]] != "(Intercept)"]
      X.selected <- paste(X.selected, collapse=" + ")
      model <- paste(Ys[j], "~", X.selected, sep=" ")
      fit <- lm(formula=model, data=sub2)
      adjusted.r2 <- summary(fit)$adj.r.squared
      f.stat <- summary(fit)$fstatistic[1]
      f.pvalue <- 1-pf(q=summary(fit)$fstatistic[1], df1=summary(fit)$fstatistic[2], df2=summary(fit)$fstatistic[3])
      fit.1se[[k]] <- c(X.selected, adjusted.r2, f.stat, f.pvalue)      
    } else {
      adjusted.r2 <- NA
      f.stat <- NA
      f.pvalue <- NA
      fit.1se[[k]] <- c(NA, adjusted.r2, f.stat, f.pvalue)
    }
    
    # min lambda
    if ( length(vars.min[[k]]) > 1 ) {
      X.selected <- vars.min[[k]][vars.min[[k]] != "(Intercept)"]
      X.selected <- paste(X.selected, collapse=" + ")
      model <- paste(Ys[j], "~", X.selected, sep=" ")
      fit <- lm(formula=model, data=sub2)
      adjusted.r2 <- summary(fit)$adj.r.squared
      f.stat <- summary(fit)$fstatistic[1]
      f.pvalue <- 1-pf(q=summary(fit)$fstatistic[1], df1=summary(fit)$fstatistic[2], df2=summary(fit)$fstatistic[3])
      fit.min[[k]] <- c(X.selected, adjusted.r2, f.stat, f.pvalue)      
    } else {
      adjusted.r2 <- NA
      f.stat <- NA
      f.pvalue <- NA
      fit.min[[k]] <- c(NA, adjusted.r2, f.stat, f.pvalue)      
    }
  }
  
  # collect all results
  fit.min <- do.call(rbind, fit.min)
  fit.1se <- do.call(rbind, fit.1se)
  fit.min <- as.data.frame(fit.min)
  fit.1se <- as.data.frame(fit.1se)
  names(fit.1se) <- names(fit.min) <- c("variables","adj.r2","f.stat","f.pvalue")
  fit.min[,1] <- as.character(fit.min[,1])
  fit.1se[,1] <- as.character(fit.1se[,1])
  for(v in 2:ncol(fit.min)) fit.min[,v] <- as.numeric(as.character(fit.min[,v]))
  for(v in 2:ncol(fit.1se)) fit.1se[,v] <- as.numeric(as.character(fit.1se[,v]))
  
  # choose variables that are (1) not missing, (2) with highest adj R2, and (3) with significant f stat p value
  choose.min <- fit.min[!is.na(fit.min$variables) & fit.min$adj.r2==max(fit.min$adj.r2, na.rm=TRUE) & fit.min$f.pvalue < 0.05,]
  choose.1se <- fit.1se[!is.na(fit.1se$variables) & fit.1se$adj.r2==max(fit.1se$adj.r2, na.rm=TRUE) & fit.1se$f.pvalue < 0.05,]
  
  # store result - lambda min
  if(nrow(choose.min)==0){
    cv2.min[[j]] <- rep(NA,4)
  } else {
    cv2.min[[j]] <- choose.min <- choose.min[!duplicated(choose.min),]
  }
  
  # store result - lambda 1se
  if(nrow(choose.1se)==0){
    cv2.1se[[j]] <- rep(NA,4)
  } else {
    cv2.1se[[j]] <- choose.1se <- choose.1se[!duplicated(choose.1se),]
  }
  
  }

# Label items in list
names(cv0.1se) <- names(cv1.1se) <- names(cv2.1se) <- Ys
names(cv0.min) <- names(cv1.min) <- names(cv2.min) <- Ys

# For each, check for more than one result, if so check if the predictors are the same
#  Deduplicate selected variables (if multiple results returned)
for(i in 1:length(cv0.min)){

  # cv0.min
  if( !is.null(nrow(cv0.min[[i]])) ) {
    if ( nrow(cv0.min[[i]]) > 1 ) {
      var.comp <- strsplit(cv0.min[[i]][,1], " + ", fixed=TRUE)
      var.comp <- lapply(var.comp, function(x) sort(x))
      # if there are the same # of vars then
      if ( apply(as.matrix(sapply(var.comp, length)), MARGIN=2, function(x) sum(length(unique(x)))==1) ) {
        # combine all
        var.comp <- do.call(cbind, var.comp)
        # if all vars are the same
        if ( mean(apply(var.comp, MARGIN=1, function(x) sum(length(unique(x)))==1))==1 ) {
          cv0.min[[i]] <- cv0.min[[i]][sample(1:nrow(cv0.min[[i]]), 1),]
        } 
      }
    }
  }
  
  # cv1.min
  if( !is.null(nrow(cv1.min[[i]])) ) {
    if ( nrow(cv1.min[[i]]) > 1 ) {
      var.comp <- strsplit(cv1.min[[i]][,1], " + ", fixed=TRUE)
      var.comp <- lapply(var.comp, function(x) sort(x))
      # if there are the same # of vars then
      if ( apply(as.matrix(sapply(var.comp, length)), MARGIN=2, function(x) sum(length(unique(x)))==1) ) {
        # combine all
        var.comp <- do.call(cbind, var.comp)
        # if all vars are the same
        if ( mean(apply(var.comp, MARGIN=1, function(x) sum(length(unique(x)))==1))==1 ) {
          cv1.min[[i]] <- cv1.min[[i]][sample(1:nrow(cv1.min[[i]]), 1),]
        } 
      }
    }
  }
  
  # cv2.min
  if( !is.null(nrow(cv2.min[[i]])) ) {
    if ( nrow(cv2.min[[i]]) > 1 ) {
      var.comp <- strsplit(cv2.min[[i]][,1], " + ", fixed=TRUE)
      var.comp <- lapply(var.comp, function(x) sort(x))
      # if there are the same # of vars then
      if ( apply(as.matrix(sapply(var.comp, length)), MARGIN=2, function(x) sum(length(unique(x)))==1) ) {
        # combine all
        var.comp <- do.call(cbind, var.comp)
        # if all vars are the same
        if ( mean(apply(var.comp, MARGIN=1, function(x) sum(length(unique(x)))==1))==1 ) {
          cv2.min[[i]] <- cv2.min[[i]][sample(1:nrow(cv2.min[[i]]), 1),]
        } 
      }
    }
  }
}

# Collapse into a matrix
select0 <- do.call(rbind, cv0.min)
select1 <- do.call(rbind, cv1.min)
select2 <- do.call(rbind, cv2.min)

# Save output - these are the selected covariates by arm
write.csv(select0, "misc_covselect_select0.csv", row.names=TRUE)
write.csv(select1, "misc_covselect_select1.csv", row.names=TRUE)
write.csv(select2, "misc_covselect_select2.csv", row.names=TRUE)

4.18.2 Covariate Adjusted ITT Estimation

The following code estimates covariate adjusted ITT effects and bootstraps 95% confidence intervals. Note: This code chunk takes awhile to run. This code chunk also saves main results used to build Table A15, which is done in the next code chunk below.

##=============================================================================##
## Covariate Adjustment
##=============================================================================##

# define outcomes 
Ys <- c("index.wb", "ncb_wb", "noff_wb",
        "index.wh", "ncb_wh", "noff_wh",
        "index.bh", "ncb_bh", "noff_bh")

# read in covariates from variable selection procedure

select0 <- read.csv("misc_covselect_select0.csv", header=TRUE, colClasses="character")
select1 <- read.csv("misc_covselect_select1.csv", header=TRUE, colClasses="character")
select2 <- read.csv("misc_covselect_select2.csv", header=TRUE, colClasses="character")

names(select0)[1] <- names(select1)[1] <- names(select2)[1] <- "Y"

# create block FE dummy variables
block.dums <- model.matrix(~ -1 + as.factor(block), data=dat)
colnames(block.dums) <- paste("block", 1:17, sep="")
#head(block.dums)
dat <- cbind(dat, block.dums)
#names(dat)

# subset data by treatment arm

sub0 <- dat[dat$TA==0,]
sub1 <- dat[dat$TA==1,]
sub2 <- dat[dat$TA==2,]

# create vector of block fixed effect variables
blockfe <- paste("block", 2:17, sep="")

# create probability of treatment (for 2 group estimator)
# if in regime 1, equal probability of treatment

dat$pt10 <- ifelse(dat$regime==1, .5, ifelse(dat$regime==2, 1/3, .5))
dat$pt20 <- ifelse(dat$regime==1, .5, ifelse(dat$regime==2, 1/3, .5))
dat$pt21 <- ifelse(dat$regime==1, .5, ifelse(dat$regime==2, .5, .5))

# create ipw from probability of treatment (for 2 group estimator)

dat$ipw10 <- ifelse(dat$TA == 1, 1/dat$pt10, ifelse(dat$TA==0, 1/(1-dat$pt10), NA))
dat$ipw20 <- ifelse(dat$TA == 2, 1/dat$pt20, ifelse(dat$TA==0, 1/(1-dat$pt20), NA))
dat$ipw21 <- ifelse(dat$TA == 2, 1/dat$pt21, ifelse(dat$TA==1, 1/(1-dat$pt21), NA))

# create vectors of covariates by treatment arm and outcome
c0 <- lapply(strsplit(select0$variables, " + ", fixed=TRUE), function(x) c(x[!is.na(x)], blockfe))
c1 <- lapply(strsplit(select1$variables, " + ", fixed=TRUE), function(x) c(x[!is.na(x)], blockfe))
c2 <- lapply(strsplit(select2$variables, " + ", fixed=TRUE), function(x) c(x[!is.na(x)], blockfe))

# covariate adjusted estimates with empirical sandwich variance estimator

esv2g.10 <- NULL
esv2g.20 <- NULL
esv2g.21 <- NULL

for(i in 1:length(Ys)){
  esv2g.10[[i]] <- esv2g(data=dat,zvar="TA",Yvar=Ys[i],covs0=c0[[i]],covs1=c1[[i]],z1=1,z0=0,ipwvar="ipw10",ptvar="pt10")
  esv2g.20[[i]] <- esv2g(data=dat,zvar="TA",Yvar=Ys[i],covs0=c0[[i]],covs1=c2[[i]],z1=2,z0=0,ipwvar="ipw20",ptvar="pt20")
  esv2g.21[[i]] <- esv2g(data=dat,zvar="TA",Yvar=Ys[i],covs0=c1[[i]],covs1=c2[[i]],z1=2,z0=1,ipwvar="ipw21",ptvar="pt21")  
}


# bootstrap

set.seed(20150229) # set seed locally for reproducible bootstrap estimates

bs2g.10 <- NULL
bs2g.20 <- NULL
bs2g.21 <- NULL

# Define loop functions to run in parallel
bsfun_1 <- function(i) {
  bs(data=dat, sims=5000, zvar="TA", Yvar=Ys[i], z0=0, z1=1, X0=c0[[i]], X1=c1[[i]], ipwvar="ipw10",ptvar="pt10")
}

bsfun_2 <- function(i) {
  bs(data=dat, sims=5000, zvar="TA", Yvar=Ys[i], z0=0, z1=2, X0=c0[[i]], X1=c2[[i]], ipwvar="ipw20",ptvar="pt20")
}

bsfun_3 <- function(i) {
  bs(data=dat, sims=5000, zvar="TA", Yvar=Ys[i], z0=1, z1=2, X0=c1[[i]], X1=c2[[i]], ipwvar="ipw21",ptvar="pt21")  
}

bs2g.10 <- parallel::mclapply(1:length(Ys), bsfun_1)
bs2g.20 <- parallel::mclapply(1:length(Ys), bsfun_2)
bs2g.21 <- parallel::mclapply(1:length(Ys), bsfun_3)

# assemble results into matrices

bs10.est <- sapply(bs2g.10, function(x) unlist(x[1,]))
bs10.var <- sapply(bs2g.10, function(x) unlist(x[2,]))
bs10.se <- sapply(bs2g.10, function(x) unlist(x[3,]))

bs20.est <- sapply(bs2g.20, function(x) unlist(x[1,]))
bs20.var <- sapply(bs2g.20, function(x) unlist(x[2,]))
bs20.se <- sapply(bs2g.20, function(x) unlist(x[3,]))

bs21.est <- sapply(bs2g.21, function(x) unlist(x[1,]))
bs21.var <- sapply(bs2g.21, function(x) unlist(x[2,]))
bs21.se <- sapply(bs2g.21, function(x) unlist(x[3,]))

es10 <- do.call(rbind, lapply(esv2g.10, function(x) c(x$est, x$se)))
es20 <- do.call(rbind, lapply(esv2g.20, function(x) c(x$est, x$se)))
es21 <- do.call(rbind, lapply(esv2g.21, function(x) c(x$est, x$se)))

# Save bootstrap estimates
save(bs10.est, bs10.var, bs10.se,
     bs20.est, bs20.var, bs20.se,
     bs21.est, bs21.var, bs21.se,
     file="misc_bootstrap_estimates.RData")

# Read in bootstrap estimates
load("misc_bootstrap_estimates.RData")

#============================================================================#
# calculate 95% confidence intervals -- bootstrap
# (1) first order normal approximation
# (2) basic bootstrap interval
# (3) bootstrap percentile interval
# (4) studentized bootstrap interval
# (5) adjusted bootstrap percentile (BCa) interval
#============================================================================#


#==========================#
# (1) normal approximation
#==========================#

# if the ITT estimate is approximately normal, then
# 95% CI is  ITT.hat - b +/- z(alpha)*sqrt(nu)
# where z(alpha) = critical value of z at alpha=0.05
# ITT.hat = estimate (non-bootstrapped, get this from the empirical sandwich estimation output)
# b = mean of bootstrap estimate - ITT.hat
# nu = (1/(R-1))* SUM (over draws) (estimated mean - mean of bootstrap estimates)^2
# R = number of bootstrap draws (sims=5000 for us)

R = 5000

# M-C
bs10.normalci <- list()
for(i in 1:length(Ys)){
  nu <- (1/(R-1)) *   sum((bs10.est[,i]-mean(bs10.est[,i]))^2)
  lci <- es10[i,1] - (mean(bs10.est[,i]) - es10[i,1]) - qnorm(p=0.975, mean=0, sd=1) * sqrt(nu)
  uci <- es10[i,1] - (mean(bs10.est[,i]) - es10[i,1]) + qnorm(p=0.975, mean=0, sd=1) * sqrt(nu)
  bs10.normalci[[i]] <- c(lci, uci)
}

# P-C
bs20.normalci <- list()
for(i in 1:length(Ys)){
  nu <- (1/(R-1)) *   sum((bs20.est[,i]-mean(bs20.est[,i]))^2)
  lci <- es20[i,1] - (mean(bs20.est[,i]) - es20[i,1]) - qnorm(p=0.975, mean=0, sd=1) * sqrt(nu)
  uci <- es20[i,1] - (mean(bs20.est[,i]) - es20[i,1]) + qnorm(p=0.975, mean=0, sd=1) * sqrt(nu)
  bs20.normalci[[i]] <- c(lci, uci)
}

# P-M
bs21.normalci <- list()
for(i in 1:length(Ys)){
  nu <- (1/(R-1)) *   sum((bs21.est[,i]-mean(bs21.est[,i]))^2)
  lci <- es21[i,1] - (mean(bs21.est[,i]) - es21[i,1]) - qnorm(p=0.975, mean=0, sd=1) * sqrt(nu)
  uci <- es21[i,1] - (mean(bs21.est[,i]) - es21[i,1]) + qnorm(p=0.975, mean=0, sd=1) * sqrt(nu)
  bs21.normalci[[i]] <- c(lci, uci)
}


bs10.normalci <- do.call(rbind, bs10.normalci)
bs20.normalci <- do.call(rbind, bs20.normalci)
bs21.normalci <- do.call(rbind, bs21.normalci)

# bs10.normalci; bs20.normalci; bs21.normalci

#==========================#
# (2) basic bootstrap interval
#==========================#

# 2*t0 - quantile(t, c(.975, .025))

bs10.basicci <- list()
bs20.basicci <- list()
bs21.basicci <- list()

for(i in 1:length(Ys)){
  bs10.basicci[[i]] <- 2*es10[i,1] - quantile(bs10.est[,i], c(.975, .025))
  bs20.basicci[[i]] <- 2*es20[i,1] - quantile(bs20.est[,i], c(.975, .025))
  bs21.basicci[[i]] <- 2*es21[i,1] - quantile(bs21.est[,i], c(.975, .025))
}

bs10.basicci <- do.call(rbind, bs10.basicci)
bs20.basicci <- do.call(rbind, bs20.basicci)
bs21.basicci <- do.call(rbind, bs21.basicci)
colnames(bs10.basicci) <- colnames(bs20.basicci) <- colnames(bs21.basicci) <- c("lower","upper")

# bs10.basicci; bs20.basicci; bs21.basicci

#==========================#
# (3) percentile interval
#==========================#

# use empirical quantities at the 2.5th and 97.5th percentiles

bs10.pctci <- t(apply(bs10.est, 2, function(x) quantile(x, c(.025, .975)))) 
bs20.pctci <- t(apply(bs20.est, 2, function(x) quantile(x, c(.025, .975))))
bs21.pctci <- t(apply(bs21.est, 2, function(x) quantile(x, c(.025, .975))))

# bs10.pctci; bs20.pctci; bs21.pctci



#==========================#
# (4) studentized boot ci
#==========================#

# generalization of student t statistic to bootstrap setting
# requires variance for bootstrap ITT etsimates computed from each bs sample

# first calculate test statistic for each bs sample
bs10.t <- bs20.t <- bs21.t <- matrix(NA, ncol=length(Ys), nrow=R)

for(i in 1:length(Ys)){
  bs10.t[,i] <- (bs10.est[,i] - es10[i,1])/bs10.se[,i]
  bs20.t[,i] <- (bs20.est[,i] - es20[i,1])/bs20.se[,i]
  bs21.t[,i] <- (bs21.est[,i] - es21[i,1])/bs21.se[,i]
}

# from the empirical distribution of the test statistics,
# we calculate order statistics/quantiles:
#   t*[(R+1)(1-alpha)]
#   t*[(R+1)(alpha)]

bs10.q <- apply(bs10.t, 2, function(x) quantile(x, c(.975, .025)))
bs20.q <- apply(bs20.t, 2, function(x) quantile(x, c(.975, .025)))
bs21.q <- apply(bs21.t, 2, function(x) quantile(x, c(.975, .025)))

# the studentized CI is given by
# lower: ITT.hat - SE.hat * t*[(R+1)(1-alpha)]
# upper: ITT.hat - SE.hat * t*[(R+1)(alpha)]

bs10.tci <- bs20.tci <- bs21.tci <- NULL
for(i in 1:length(Ys)){
  bs10.tci[[i]] <- es10[i,1] - sd(bs10.est[,i]) * bs10.q[,i]
  bs20.tci[[i]] <- es20[i,1] - sd(bs20.est[,i]) * bs20.q[,i]
  bs21.tci[[i]] <- es21[i,1] - sd(bs21.est[,i]) * bs21.q[,i]  
}

bs10.tci <- do.call(rbind, bs10.tci)
bs20.tci <- do.call(rbind, bs20.tci)
bs21.tci <- do.call(rbind, bs21.tci)
colnames(bs10.tci) <- colnames(bs20.tci) <- colnames(bs21.tci) <- c("lower", "upper")

# bs10.tci; bs20.tci; bs21.tci



#============================================================================#
# calculate 95% confidence intervals -- empirical sandwich variance estimator
#============================================================================#

# calculate df by model

n10 <- rep(NA, length(Ys))
n20 <- rep(NA, length(Ys))
n21 <- rep(NA, length(Ys))

for(i in 1:length(Ys)){
  n10[i] <- sum(apply(dat[dat$TA %in% c(0,1),names(dat) %in% c(Ys[i],c0[[i]],c1[[i]],"TA")] ,1,function(x) sum(is.na(x))==0))
  n20[i] <- sum(apply(dat[dat$TA %in% c(0,2),names(dat) %in% c(Ys[i],c0[[i]],c2[[i]],"TA")] ,1,function(x) sum(is.na(x))==0))
  n21[i] <- sum(apply(dat[dat$TA %in% c(1,2),names(dat) %in% c(Ys[i],c1[[i]],c2[[i]],"TA")] ,1,function(x) sum(is.na(x))==0))  
}

# calculate DF as the two-group sample size, minus the sum of the number of covariates across treatment arms,
# minus 3 (2 for the intercepts of arm-specific models; 1 for the treatment-control difference)
df10 <- n10 - (sapply(c0, length)+sapply(c1, length)) - 3
df20 <- n20 - (sapply(c0, length)+sapply(c2, length)) - 3
df21 <- n21 - (sapply(c1, length)+sapply(c2, length)) - 3

cv10 <- -qt(p=0.025, df=df10, lower.tail=TRUE)
cv20 <- -qt(p=0.025, df=df20, lower.tail=TRUE)
cv21 <- -qt(p=0.025, df=df21, lower.tail=TRUE)

# form confidence interval

es10.ci <- cbind(es10[,1]-abs(cv10)*es10[,2], es10[,1]+abs(cv10)*es10[,2])
es20.ci <- cbind(es20[,1]-abs(cv20)*es20[,2], es20[,1]+abs(cv20)*es20[,2])
es21.ci <- cbind(es21[,1]-abs(cv21)*es21[,2], es21[,1]+abs(cv21)*es21[,2])


#============================================================================#
# plot estimates and confidence intervals
#============================================================================#

histYlabs <- c("Interactions index\n(White-Black)", "Callback\n(White-Black)", "Offer\n(White-Black)",
               "Interactions index\n(White-Hispanic)", "Callback\n(White-Hispanic)", "Offer\n(White-Hispanic)",
               "Interactions index\n(Black-Hispanic)", "Callback\n(Black-Hispanic)", "Offer\n(Black-Hispanic)")

# cbind(Ys, histYlabs)

xrange <- max(abs(range(c(es10[,1], bs10.est))))
xrange <- c(-xrange, xrange)

pdf("misc_bootstrap_bs_10_est.pdf", height=10, width=10)
#par(mfrow=c(3,3))
layout(matrix(c(1:9,0,10,0), nrow=4, ncol=3, byrow=TRUE), widths=c(3,3,3), heights=c(3,3,3,3), respect=FALSE)
for(i in 1:length(Ys)){
  hist(bs10.est[,i], breaks=55, main=histYlabs[i], xlab="bootstrap ITT estimate \n (monitoring v. control)", xlim=xrange, cex.axis=0.8)
  abline(v=es10[i,1], col="red", lwd=2, lty=1)  
  abline(v=mean(bs10.est[,i]), col="black", lwd=2, lty=2)
  abline(v=c(mean(bs10.est[,i])-sd(bs10.est[,i]), mean(bs10.est[,i])+sd(bs10.est[,i])), col="grey80", lwd=1, lty=4)  
  abline(v=es10.ci[i,], col="grey80", lwd=2, lty=2)
  abline(v=bs10.basicci[i,], col=rgb(1,0,1,.5), lty=2)
  abline(v=bs10.pctci[i,], col=rgb(0,0,1,.3), lty=3)
  abline(v=bs10.normalci[i,], col=rgb(0,1,0,.5), lty=4)
  abline(v=bs10.tci[i,], col=rgb(1,0,0,.3), lty=2)  
}
plot(c(0,10), c(0,10), pch="", xaxt="n", yaxt="n", main="", xlab="", ylab="", bty="n")
legend(0,10,legend=c("ITT Estimate",
                     "Mean of Bootstrap ITT Estimates",
                     "+/- 1 sd, Bootstrap ITT Estimates",
                     "95% CI, Empirical Sandwich Var.",
                     "95% CI, Basic Bootstrap",                    
                     "95% CI, Percentile Bootstrap",
                     "95% CI, Normal Approx. Bootstrap",
                     "95% CI, Studentized Bootstrap"),
       col=c("red", "black", "grey80", "grey80", rgb(1,0,1,.5), rgb(0,0,1,.3), rgb(0,1,0,.5), rgb(1,0,0,.3)),
       lwd=rep(1.5, 8),
       lty=c(1,2,4,2,2,3,4,2),
       bty="n",
       seg.len=4)
invisible( dev.off())



xrange <- max(abs(range(c(es20[,1], bs20.est))))
xrange <- c(-xrange, xrange)

pdf("misc_bootstrap_bs_20_est.pdf", height=10, width=10)
#par(mfrow=c(3,3))
layout(matrix(c(1:9,0,10,0), nrow=4, ncol=3, byrow=TRUE), widths=c(3,3,3), heights=c(3,3,3,3), respect=FALSE)
for(i in 1:length(Ys)){
  hist(bs20.est[,i], breaks=55, main=histYlabs[i], xlab="bootstrap ITT estimate \n (punitive v. control)", xlim=xrange, cex.axis=0.8)
  abline(v=es20[i,1], col="red", lwd=2, lty=1)  
  abline(v=mean(bs20.est[,i]), col="black", lwd=2, lty=2)
  abline(v=c(mean(bs20.est[,i])-sd(bs20.est[,i]), mean(bs20.est[,i])+sd(bs20.est[,i])), col="grey80", lwd=1, lty=4)  
  abline(v=es20.ci[i,], col="grey80", lwd=2, lty=2)
  abline(v=bs20.basicci[i,], col=rgb(1,0,1,.5), lty=2)
  abline(v=bs20.pctci[i,], col=rgb(0,0,1,.3), lty=3)
  abline(v=bs20.normalci[i,], col=rgb(0,1,0,.5), lty=4)
  abline(v=bs20.tci[i,], col=rgb(1,0,0,.3), lty=2)  
}
plot(c(0,10), c(0,10), pch="", xaxt="n", yaxt="n", main="", xlab="", ylab="", bty="n")
legend(0,10,legend=c("ITT Estimate",
                     "Mean of Bootstrap ITT Estimates",
                     "+/- 1 sd, Bootstrap ITT Estimates",
                     "95% CI, Empirical Sandwich Var.",
                     "95% CI, Basic Bootstrap",                    
                     "95% CI, Percentile Bootstrap",
                     "95% CI, Normal Approx. Bootstrap",
                     "95% CI, Studentized Bootstrap"),
       col=c("red", "black", "grey80", "grey80", rgb(1,0,1,.5), rgb(0,0,1,.3), rgb(0,1,0,.5), rgb(1,0,0,.3)),
       lwd=rep(1.5, 8),
       lty=c(1,2,4,2,2,3,4,2),
       bty="n",
       seg.len=4)
invisible( dev.off())



xrange <- max(abs(range(c(es21[,1], bs21.est))))
xrange <- c(-xrange, xrange)

pdf("misc_bootstrap_bs_21_est.pdf", height=10, width=10)
#par(mfrow=c(3,3))
layout(matrix(c(1:9,0,10,0), nrow=4, ncol=3, byrow=TRUE), widths=c(3,3,3), heights=c(3,3,3,3), respect=FALSE)
for(i in 1:length(Ys)){
  hist(bs21.est[,i], breaks=55, main=histYlabs[i], xlab="bootstrap ITT estimate \n (punitive v. monitoring)", xlim=xrange, cex.axis=0.8)
  abline(v=es21[i,1], col="red", lwd=2, lty=1)  
  abline(v=mean(bs21.est[,i]), col="black", lwd=2, lty=2)
  abline(v=c(mean(bs21.est[,i])-sd(bs21.est[,i]), mean(bs21.est[,i])+sd(bs21.est[,i])), col="grey80", lwd=1, lty=4)  
  abline(v=es21.ci[i,], col="grey80", lwd=2, lty=2)
  abline(v=bs21.basicci[i,], col=rgb(1,0,1,.5), lty=2)
  abline(v=bs21.pctci[i,], col=rgb(0,0,1,.3), lty=3)
  abline(v=bs21.normalci[i,], col=rgb(0,1,0,.5), lty=4)
  abline(v=bs21.tci[i,], col=rgb(1,0,0,.3), lty=2)  
}
plot(c(0,10), c(0,10), pch="", xaxt="n", yaxt="n", main="", xlab="", ylab="", bty="n")
legend(0,10,legend=c("ITT Estimate",
                     "Mean of Bootstrap ITT Estimates",
                     "+/- 1 sd, Bootstrap ITT Estimates",
                     "95% CI, Empirical Sandwich Var.",
                     "95% CI, Basic Bootstrap",                    
                     "95% CI, Percentile Bootstrap",
                     "95% CI, Normal Approx. Bootstrap",
                     "95% CI, Studentized Bootstrap"),
       col=c("red", "black", "grey80", "grey80", rgb(1,0,1,.5), rgb(0,0,1,.3), rgb(0,1,0,.5), rgb(1,0,0,.3)),
       lwd=rep(1.5, 8),
       lty=c(1,2,4,2,2,3,4,2),
       bty="n",
       seg.len=4)
invisible( dev.off())



#============================================================================#
# assemble table
#============================================================================#

# Prep labels for output tables -- outcome labels, table section titles, table subsection titles

Ylabs <- c("Index measure of favorable in-person interactions",
           "Received post-visit callback",
           "Received post-visit offer")
Ylabs <- rep(Ylabs, 9)
# Ylabs

seclabs <- c("I. Monitoring vs. Control", "II. Punitive vs. Control", "III. Punitive vs. Monitoring")
sublabs <- rep(c("A. White vs. Black", "B. White vs. Hispanic", "C. Black vs. Hispanic"), 3)

# Covariate adjusted estimates - table shell:
#   Col 1 = Covariate adj ITT estimate
#   Col 2 = Empirical sandwich SE estimate
#   Col 3 = Empirical sandwich 95% CI
#   Col 4 = Mean of bootstrap ITT estimates
#   Col 5 = SD of bootstrap ITT estimates
#   Col 6 = 95% CI - Studentized t
#   Col 7 = 95% CI - Basic bootstrap
#   Col 8 = 95% CI - Percentile bootstrap
#   Col 9 = 95% CI - Normal Approx

tab10 <- cbind(round(es10, 3),
               apply(es10.ci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               round(apply(bs10.est, 2, mean),3),
               round(apply(bs10.est, 2, sd),3),
               apply(bs10.tci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs10.basicci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs10.pctci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs10.normalci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") )
)
# tab10

tab20 <- cbind(round(es20, 3),
               apply(es20.ci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               round(apply(bs20.est, 2, mean),3),
               round(apply(bs20.est, 2, sd),3),
               apply(bs20.tci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs20.basicci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs20.pctci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs20.normalci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") )
)
# tab20

tab21 <- cbind(round(es21, 3),
               apply(es21.ci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               round(apply(bs21.est, 2, mean),3),
               round(apply(bs21.est, 2, sd),3),
               apply(bs21.tci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs21.basicci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs21.pctci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") ),
               apply(bs21.normalci, 1, function(x) paste("[",round(x[1],3),",",round(x[2],3),"]",sep="") )
)
# tab21


# stack panels into single table
tab.out <- rbind(tab10, tab20, tab21)

# add parentheses to SE estimates
tab.out[,2] <- paste("(", tab.out[,2], ")", sep="")
tab.out[,5] <- paste("(", tab.out[,5], ")", sep="")

tab.out <- cbind(Ylabs, tab.out)

colnames(tab.out) <- c("Outcome Measure", "Estimate", "SE", "95% CI", "Mean Bootstrap Est.", "Bootstrap SE", "Studentized", "Basic", "Percentile", "Normal Approx.")

# tab.out

# Write an unformatted table
write.csv(tab.out, "out_tablea15_itt_blockfe_covadj_full_UNFORMATTED.csv", row.names=FALSE)

start.rows <- seq(1,27,3)
stop.rows <- start.rows + 2
# start.rows; stop.rows

tab.out2 <- list()
seclabs.index <- 0
for(i in 1:length(start.rows)){
  if(i %in% c(1,4,7)){
    seclabs.index <- seclabs.index + 1
    tab.out2[[i]] <- rbind(c(seclabs[seclabs.index], rep(NA,9)),
                           c(sublabs[i], rep(NA,9)),
                           tab.out[start.rows[i]:stop.rows[i],])        
  } else {
    tab.out2[[i]] <- rbind(c(sublabs[i], rep(NA,9)), tab.out[start.rows[i]:stop.rows[i],])    
  }
}


tab.out2 <- do.call(rbind, tab.out2)
# tab.out2

## Export results to csv

write.csv(tab.out2, "out_tablea15_itt_blockfe_covadj_full.csv", row.names=FALSE)

The following code builds Table A15 which includes the covariate adjusted ITT estimates (produced by the code above) and the unadjusted ITT estimates (from Table A7, for comparison)

### Unadjusted estimates (two-group estimator)

un <- read.csv("out_tablea7_itt_2g_blockfe_nocovs.csv",
               header=TRUE, stringsAsFactors = FALSE)
un <- un[,c(1,2,3,6)]

### Covariate adjusted estimates (two-group estimator)

ca <- read.csv("out_tablea15_itt_blockfe_covadj_full.csv",
               header=TRUE, stringsAsFactors=FALSE)

# get rid of normal approx CIs
ca <- ca[,-10]

### Combine unadjusted with covariate adjusted
x <- cbind(ca[,1], un[,2:4], ca[,2:9])

# clean up colnames
names(x) <- c("Outcome Measure", "Estimate", "SE", "95% CI", "Estimate", "SE", "95% CI", "Mean", "SE", "95% Studentized CI", "95% Basic CI", "95% Percentile CI")

# Show table in Rmd output
kable(x, col.names=c("Outcome Measure", "Unadj. (Est.)", "Unadj. (SE)", "Unadj. (95% CI)",
                     "Cov adj (Est.)", "Cov adj (SE)", "Cov adj (95% CI)",
                     "Cov adj with bootstrap (Mean)", "Cov adj with bootstrap (SE)",
                     "Cov adj with bootstrap (95% Studentized CI)",
                     "Cov adj with bootstrap (95% Basic CI)",
                     "Cov adj with bootstrap (95% Percentile CI)"),
      caption="**Table A15.**")
Table A15.
Outcome Measure Unadj. (Est.) Unadj. (SE) Unadj. (95% CI) Cov adj (Est.) Cov adj (SE) Cov adj (95% CI) Cov adj with bootstrap (Mean) Cov adj with bootstrap (SE) Cov adj with bootstrap (95% Studentized CI) Cov adj with bootstrap (95% Basic CI) Cov adj with bootstrap (95% Percentile CI)
I. Monitoring vs. Control
A. White vs. Black
Index measure of favorable in-person interactions -0.015 0.052 [-0.117, 0.087] -0.013 (0.045) [-0.101,0.076] -0.013 (0.055) [-0.145,0.124] [-0.118,0.099] [-0.124,0.093]
Received post-visit callback -0.002 0.045 [-0.09, 0.086] -0.018 (0.042) [-0.1,0.065] -0.022 (0.049) [-0.131,0.102] [-0.11,0.082] [-0.118,0.074]
Received post-visit offer -0.003 0.036 [-0.074, 0.068] -0.015 (0.033) [-0.079,0.049] -0.016 (0.035) [-0.09,0.064] [-0.082,0.055] [-0.085,0.052]
B. White vs. Hispanic
Index measure of favorable in-person interactions -0.061 0.057 [-0.173, 0.051] -0.098 (0.051) [-0.198,0.003] -0.090 (0.06) [-0.259,0.031] [-0.225,0.014] [-0.209,0.029]
Received post-visit callback -0.036 0.043 [-0.121, 0.049] -0.034 (0.042) [-0.116,0.048] -0.033 (0.046) [-0.135,0.068] [-0.124,0.056] [-0.124,0.057]
Received post-visit offer -0.017 0.034 [-0.084, 0.05] -0.021 (0.032) [-0.083,0.042] -0.021 (0.035) [-0.095,0.056] [-0.089,0.048] [-0.09,0.048]
C. Black vs. Hispanic
Index measure of favorable in-person interactions -0.089 0.052 [-0.191, 0.013] -0.095 (0.048) [-0.189,0] -0.088 (0.056) [-0.233,0.024] [-0.211,0.006] [-0.196,0.021]
Received post-visit callback -0.035 0.042 [-0.118, 0.048] -0.005 (0.039) [-0.083,0.073] -0.002 (0.044) [-0.109,0.094] [-0.095,0.079] [-0.089,0.085]
Received post-visit offer -0.014 0.031 [-0.075, 0.047] 0.004 (0.029) [-0.052,0.061] 0.005 (0.03) [-0.06,0.066] [-0.057,0.06] [-0.052,0.066]
II. Punitive vs. Control
A. White vs. Black
Index measure of favorable in-person interactions 0.007 0.053 [-0.097, 0.111] -0.043 (0.048) [-0.136,0.051] -0.039 (0.059) [-0.196,0.098] [-0.162,0.07] [-0.156,0.076]
Received post-visit callback 0.019 0.042 [-0.064, 0.102] 0.030 (0.039) [-0.046,0.107] 0.026 (0.042) [-0.054,0.125] [-0.046,0.115] [-0.054,0.107]
Received post-visit offer 0.018 0.035 [-0.051, 0.087] 0.023 (0.03) [-0.036,0.083] 0.025 (0.033) [-0.053,0.094] [-0.045,0.085] [-0.039,0.091]
B. White vs. Hispanic
Index measure of favorable in-person interactions 0.048 0.055 [-0.06, 0.156] 0.023 (0.049) [-0.073,0.12] 0.023 (0.056) [-0.11,0.155] [-0.088,0.136] [-0.09,0.135]
Received post-visit callback -0.066 0.041 [-0.147, 0.015] -0.051 (0.037) [-0.123,0.021] -0.055 (0.044) [-0.155,0.064] [-0.133,0.041] [-0.142,0.032]
Received post-visit offer -0.021 0.034 [-0.088, 0.046] 0.000 (0.031) [-0.061,0.061] 0.002 (0.034) [-0.077,0.075] [-0.069,0.066] [-0.066,0.069]
C. Black vs. Hispanic
Index measure of favorable in-person interactions 0.039 0.049 [-0.057, 0.135] 0.037 (0.045) [-0.052,0.125] 0.035 (0.049) [-0.073,0.142] [-0.06,0.132] [-0.059,0.133]
Received post-visit callback -0.085 0.041 [-0.166, -0.004] -0.100 (0.038) [-0.175,-0.025] -0.099 (0.044) [-0.201,0] [-0.185,-0.015] [-0.184,-0.015]
Received post-visit offer -0.039 0.033 [-0.104, 0.026] -0.032 (0.031) [-0.093,0.03] -0.034 (0.034) [-0.106,0.048] [-0.098,0.038] [-0.101,0.035]
III. Punitive vs. Monitoring
A. White vs. Black
Index measure of favorable in-person interactions 0.018 0.058 [-0.096, 0.132] -0.024 (0.06) [-0.142,0.094] -0.019 (0.058) [-0.131,0.083] [-0.142,0.083] [-0.13,0.094]
Received post-visit callback 0.033 0.049 [-0.063, 0.129] 0.038 (0.045) [-0.051,0.127] 0.036 (0.05) [-0.07,0.151] [-0.057,0.139] [-0.063,0.134]
Received post-visit offer 0.026 0.037 [-0.047, 0.099] 0.029 (0.034) [-0.037,0.095] 0.029 (0.034) [-0.042,0.101] [-0.039,0.097] [-0.039,0.097]
B. White vs. Hispanic
Index measure of favorable in-person interactions 0.107 0.063 [-0.017, 0.231] 0.082 (0.061) [-0.038,0.201] 0.083 (0.059) [-0.033,0.196] [-0.034,0.195] [-0.032,0.197]
Received post-visit callback -0.019 0.049 [-0.115, 0.077] -0.061 (0.049) [-0.157,0.036] -0.060 (0.05) [-0.165,0.043] [-0.16,0.037] [-0.158,0.039]
Received post-visit offer 0.002 0.038 [-0.073, 0.077] -0.015 (0.035) [-0.084,0.055] -0.013 (0.039) [-0.103,0.069] [-0.092,0.06] [-0.089,0.063]
C. Black vs. Hispanic
Index measure of favorable in-person interactions 0.139 0.057 [0.027, 0.251] 0.141 (0.06) [0.023,0.26] 0.135 (0.056) [0.043,0.256] [0.04,0.257] [0.025,0.243]
Received post-visit callback -0.052 0.047 [-0.144, 0.04] -0.080 (0.043) [-0.164,0.005] -0.077 (0.045) [-0.178,0.014] [-0.17,0.007] [-0.166,0.011]
Received post-visit offer -0.024 0.036 [-0.095, 0.047] -0.039 (0.034) [-0.105,0.027] -0.038 (0.035) [-0.117,0.031] [-0.109,0.028] [-0.106,0.031]

4.19 Table A16 (p. A-37): Missingness Analysis: Estimated Correlation between Treatment Assignment and Missingness on Subject Index Measure

# Missingness analysis - Part 1: Does treatment predict missingness on index net discrim measure?
# Regression analysis with block FE and IPW

miss.index <- list(wb.10 = lm(is.na(index.wb) ~ TA1 + as.factor(block), data=dat[dat$TA %in% c(0,1),], weights=ipw10),
                   wb.20 = lm(is.na(index.wb) ~ TA2 + as.factor(block), data=dat[dat$TA %in% c(0,2),], weights=ipw20),
                   wb.21 = lm(is.na(index.wb) ~ TA2 + as.factor(block), data=dat[dat$TA %in% c(1,2),], weights=ipw21),
                   wh.10 = lm(is.na(index.wh) ~ TA1 + as.factor(block), data=dat[dat$TA %in% c(0,1),], weights=ipw10),
                   wh.20 = lm(is.na(index.wh) ~ TA2 + as.factor(block), data=dat[dat$TA %in% c(0,2),], weights=ipw20),
                   wh.21 = lm(is.na(index.wh) ~ TA2 + as.factor(block), data=dat[dat$TA %in% c(1,2),], weights=ipw21),
                   bh.10 = lm(is.na(index.bh) ~ TA1 + as.factor(block), data=dat[dat$TA %in% c(0,1),], weights=ipw10),
                   bh.20 = lm(is.na(index.bh) ~ TA2 + as.factor(block), data=dat[dat$TA %in% c(0,2),], weights=ipw20),
                   bh.21 = lm(is.na(index.bh) ~ TA2 + as.factor(block), data=dat[dat$TA %in% c(1,2),], weights=ipw21))

miss.index.out <- cbind(do.call(rbind, lapply(miss.index, function(x) summary(x)$coefficient[2,])), # estimates
                        unlist(lapply(miss.index, function(x) summary(x)$fstatistic[1])), # f test, f statistic
                        unlist(lapply(miss.index, lmp))) # f-test p value

miss.index.out <- apply(miss.index.out, 2, function(x) round(x, 3))

miss.index.out <- cbind(rep(c("Monitoring vs. Control", "Punitive vs. Control", "Punitive vs. Monitoring"), 3),
                        miss.index.out)
colnames(miss.index.out) <- c("Comparison", "Estimate", "SE", "t", "p-value", "F-statistic", "F-test p-value")
rownames(miss.index.out) <- NULL

out.miss <- rbind(c("A. White vs. Black", rep(NA, 6)),
                  miss.index.out[1:3,],
                  c("B. White vs. Hispanic", rep(NA, 6)),
                  miss.index.out[4:6,],
                  c("C. Black vs. Hispanic", rep(NA, 6)),
                  miss.index.out[7:9,])

kable(out.miss, caption="**Table A16**")
Table A16
Comparison Estimate SE t p-value F-statistic F-test p-value
A. White vs. Black
Monitoring vs. Control 0.073 0.042 1.729 0.085 1.481 0.097
Punitive vs. Control 0.036 0.04 0.897 0.37 1.263 0.212
Punitive vs. Monitoring -0.021 0.048 -0.436 0.663 0.936 0.532
B. White vs. Hispanic
Monitoring vs. Control 0.019 0.044 0.428 0.669 1.081 0.369
Punitive vs. Control -0.023 0.04 -0.573 0.567 2.187 0.004
Punitive vs. Monitoring -0.045 0.047 -0.944 0.346 1.293 0.194
C. Black vs. Hispanic
Monitoring vs. Control 0.083 0.043 1.913 0.056 1.651 0.049
Punitive vs. Control 0.038 0.042 0.903 0.367 1.247 0.224
Punitive vs. Monitoring -0.042 0.05 -0.854 0.394 1.091 0.36

4.20 Table A17 (p. A-37): Missingness Analysis: Pairwise Correlations between Missing Subjective Index Measures and Objective Net Discrimination Measures

# Part 2: Is missingness correlated with offers or callbacks?

miss.cor <- cor(cbind(dat[,c("ncb_wb", "noff_wb", 
                             "ncb_wh", "noff_wh",
                             "ncb_bh", "noff_bh" )],
                      is.na(dat$index.wb), is.na(dat$index.wh), is.na(dat$index.bh)))

miss.cor <- miss.cor[7:9,1:6]

miss.cor <- apply(miss.cor, 2, function(x) round(x,2))
rownames(miss.cor) <- paste("Missing Index Measure", c("W-B", "W-H", "B-H"), sep=", ")
miss.cor <- rbind(rep(c("Callbacks", "Offers"), 3), miss.cor)
colnames(miss.cor) <- c("W-B", "W-B", "W-H", "W-H", "B-H", "B-H") # label for tex output

# output for Rmd file only (omit first row of tex table and relabel column names)
kable(miss.cor[-1,], col.names = c("W-B Callbacks", "W-B Offers", "W-H Callbacks", "W-H Offers", "B-H Callbacks", "B-H Offers"), caption="**Table A17**")
Table A17
W-B Callbacks W-B Offers W-H Callbacks W-H Offers B-H Callbacks B-H Offers
Missing Index Measure, W-B -0.05 -0.06 -0.08 -0.11 -0.02 -0.05
Missing Index Measure, W-H -0.11 -0.05 -0.02 0 0.09 0.06
Missing Index Measure, B-H 0 0 0.01 0 0.02 0

4.21 Table A18 (p. A-37): Missingness Analysis: Predicting Missingness on Subjective Indicators As a Function of Tester Race

# Part 3: Missingness as a function of tester race/ethnicity

miss.Ys <- c("sales", "qualpraise", "posbg", "posedit", "prof")
coefs <- fstat <- fp <- list()
for(i in 1:length(miss.Ys)){
  fmla <- paste0("is.na(",miss.Ys[i],") ~ ttype")
  fit <- lm(formula=fmla, data=ctx)
  coefs[[i]] <- apply(summary(fit)$coefficients, 2, function(x) round(x,3))
  fstat[[i]] <- summary(fit)$fstatistic[1]
  fp[[i]] <- lmp(fit)
}

# for subjective measures used to make index - results same across items (keep element #3)
miss3 <- rbind(cbind(rownames(coefs[[1]]), coefs[[1]]),
               c("F-statistic:", round(fstat[[i]], 3), rep(NA, 3)),
               c("F test p-value:", round(fp[[1]], 3), rep(NA, 3)))
miss3[,1] <- gsub("ttypeB", "Hispanic tester", miss3[,1], fixed=TRUE)
miss3[,1] <- gsub("ttypeC", "White tester", miss3[,1], fixed=TRUE)
rownames(miss3) <- NULL
colnames(miss3) <- c("Variable", "Estimate", "SE", "t", "p-value")

kable(miss3, caption="**Table A18**")
Table A18
Variable Estimate SE t p-value
(Intercept) 0.143 0.014 10.267 0
Hispanic tester -0.006 0.02 -0.324 0.746
White tester -0.008 0.02 -0.387 0.699
F-statistic: 0.086
F test p-value: 0.917

4.22 Table A19 (p. A-38): Balance Table for Categorical Covariates

# CREATE BALANCE TABLES (Both categorical and continuous variables)
# unweighted, all arms
bt_unwtd = balanceTables(d=dat, z="TA", xcon=Xs.num, xcat=Xs.cat)
# weighted, all arms
bt_wtd   = balanceTables(d=dat, z="TA", xcon=Xs.num, xcat=Xs.cat, weight="ipw")

# BALANCE TABLES: CATEGORICAL VARIABLES ONLY
bt_xcat = cbind(bt_unwtd$bt_cat[,1:4],
                bt_wtd$bt_cat[,2:4],
                bt_unwtd$bt_cat[,5:7],
                bt_wtd$bt_cat[,5:7],
                bt_unwtd$bt_cat[,8:10],
                bt_wtd$bt_cat[,8:10])
bt_xcat[,1] = c("Frame", "Likely discrimination", "Representative", "Household size", "1", "2", "Tester team gender", "Female", "Male")

kable(bt_xcat[,c(1:7)], 
      caption="**Table A19, Left Panel: Control Group**",
      col.names=c("Variable", "Unweighted Proportion", "Unweighted SE", "Unweighted N",
                  "Weighted Proportion", "Weighted SE", "Weighted N"))
Table A19, Left Panel: Control Group
Variable Unweighted Proportion Unweighted SE Unweighted N Weighted Proportion Weighted SE Weighted N
Frame
Likely discrimination 0.068 0.015 19 0.07 0.015 46
Representative 0.932 0.015 260 0.93 0.015 610
Household size
1 0.774 0.025 216 0.771 0.025 506
2 0.226 0.025 63 0.229 0.025 150
Tester team gender
Female 0.505 0.03 141 0.514 0.03 337
Male 0.495 0.03 138 0.486 0.03 319
kable(bt_xcat[,c(1, 8:13)], 
      caption="**Table A19, Middle Panel: Monitoring Group**",
      col.names=c("Variable", "Unweighted Proportion", "Unweighted SE", "Unweighted N",
                  "Weighted Proportion", "Weighted SE", "Weighted N"))
Table A19, Middle Panel: Monitoring Group
Variable Unweighted Proportion Unweighted SE Unweighted N Weighted Proportion Weighted SE Weighted N
Frame
Likely discrimination 0.069 0.019 12 0.066 0.019 40
Representative 0.931 0.019 162 0.934 0.019 564
Household size
1 0.816 0.029 142 0.823 0.029 497
2 0.184 0.029 32 0.177 0.029 107
Tester team gender
Female 0.54 0.038 94 0.545 0.038 329
Male 0.46 0.038 80 0.455 0.038 275
kable(bt_xcat[,c(1, 14:19)], 
      caption="**Table A19, Right Panel: Punitive Group**",
      col.names=c("Variable", "Unweighted Proportion", "Unweighted SE", "Unweighted N",
                  "Weighted Proportion", "Weighted SE", "Weighted N"))
Table A19, Right Panel: Punitive Group
Variable Unweighted Proportion Unweighted SE Unweighted N Weighted Proportion Weighted SE Weighted N
Frame
Likely discrimination 0.065 0.017 13 0.063 0.017 43
Representative 0.935 0.017 187 0.937 0.017 641
Household size
1 0.795 0.029 159 0.794 0.029 543
2 0.205 0.029 41 0.206 0.029 141
Tester team gender
Female 0.525 0.035 105 0.525 0.035 359
Male 0.475 0.035 95 0.475 0.035 325

4.23 Table A20 (p. A-38 to A-44): Balance Table for Continuous Covariates

bt_xcon = cbind(bt_unwtd$bt_con[,1:2],
                bt_wtd$bt_con[,2],
                bt_unwtd$bt_con[,3],
                bt_wtd$bt_con[,3],
                bt_unwtd$bt_con[,4],
                bt_wtd$bt_con[,4])

# reorder table output by covariate group
xcon_groups = list(Apartment.Characteristics = c(1:3, 53:57),
                   Randomization.Regime = 4:6,
                   Early.Stage.Discrimination = 7:39,
                   Subject.Characteristics = c(40, 61:69),
                   Tester.Call.Order = 41:43,
                   Testers.Assumed.Income = c(44:52, 58:60),
                   Tester.Fixed.Effects = 70:118)

con_tab = list()
for(i in 1:length(xcon_groups)){
  vals = xcon_groups[[i]]
  temp = list()
  for(j in 1:length(vals)){
    temp[[j]] = bt_xcon[c(3*vals[j]-2, 3*vals[j]-1, 3*vals[j]),]
  }
  temp = do.call(rbind, temp)
  temp = rbind(c(gsub(".", " ", names(xcon_groups)[i], fixed=TRUE), rep(NA, ncol(temp)-1)), temp)
  names(temp)[3:7] = c("Z=0 wt", "Z=1", "Z=1 wt", "Z=2", "Z=2 wt")
  con_tab[[i]] = temp
}
con_tab = do.call(rbind, con_tab)
rownames(con_tab) = NULL
colnames(con_tab) <-  c("Variable", "Control-Unweighted", "Control-Weighted", "Monitoring-Unweighted","Monitoring-Weighted", "Punitive-Unweighted", "Punitive-Weighted")

con_tab_labs = Xs.num
con_tab_labs = gsub("tid.", "Tester ID ", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("numbr", "Advertised number of bedrooms", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("o_rent", "Advertised monthly rental price", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("o_sqft", "Advertised square footage", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("regime", "Regime ", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("boro.brx", "Borough: Bronx", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("boro.brk", "Borough: Brooklyn", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("boro.mnh", "Borough: Manhattan", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("boro.que", "Borough: Queens", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("boro.stn", "Borough: Staten Island", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.wb.gt", "Assumed Tester Incomes: White > Black", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.wh.gt", "Assumed Tester Incomes: White > Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.bh.gt", "Assumed Tester Incomes: Black > Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.wb.eq", "Assumed Tester Incomes: White = Black", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.wh.eq", "Assumed Tester Incomes: White = Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.bh.eq", "Assumed Tester Incomes: Black = Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.wb.lt", "Assumed Tester Incomes: White < Black", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.wh.lt", "Assumed Tester Incomes: White < Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("incrank.bh.lt", "Assumed Tester Incomes: Black < Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("inc.w.hi", "Assumed Tester Incomes: White Highest", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("inc.b.hi", "Assumed Tester Incomes: Black Highest", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("inc.h.hi", "Assumed Tester Incomes: Hispanic Highest", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_api", "Modal Perception of Landlord Race among Testers: Asian", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_blk", "Modal Perception of Landlord Race among Testers: Black", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_hsp", "Modal Perception of Landlord Race among Testers: Hispanic/Latino", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_wht", "Modal Perception of Landlord Race among Testers: White", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_age_18to34", "Modal Perception of Landlord Age among Testers: 18 to 34", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_age_35to44", "Modal Perception of Landlord Age among Testers: 35 to 44", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_age_45to64", "Modal Perception of Landlord Age among Testers: 45 to 64", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_age_65over", "Modal Perception of Landlord Age among Testers: 65 and up", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("primary_age_unknown", "Modal Perception of Landlord Age among Testers: Unknown/No Consensus", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("callorder.wb", "Randomized Tester Call Order: White before Black", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("callorder.wh", "Randomized Tester Call Order: White before Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("callorder.bh", "Randomized Tester Call Order: Black before Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("broker", "Subject is a Broker", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("_wb", ": White-Black", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("_wh", ": White-Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("_bh", ": Black-Hispanic", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("nnumattr", "Net Diff. in Num. Attributes Inquired About over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("nnumskep", "Net Diff. in Num. Attributes Eliciting Skeptical Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("npctskep", "Net Diff. in Pct. of Attributes Raised Eliciting Skeptical Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("nnumpos", "Net Diff. in Num. Attributes Eliciting Positive Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("nnumneu", "Net Diff. in Num. Attributes Eliciting Neutral Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("nnumneg", "Net Diff. in Num. Attributes Eliciting Negative Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("npctpos", "Net Diff. in Pct. of Attributes Raised Eliciting Positive Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("npctneu", "Net Diff. in Pct. of Attributes Raised Eliciting Neutral Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("npctneg", "Net Diff. in Pct. of Attributes Raised Eliciting Negative Reaction over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("nanyskep", "Net Diff. in Receiving Any Skeptical Reaction to Attributes over Phone", con_tab_labs, fixed=TRUE)
con_tab_labs = gsub("nanyneg", "Net Diff. in Receiving Any Negative Reaction to Attributes over Phone", con_tab_labs, fixed=TRUE)

for(i in 1:length(Xs.num)){
  indpos=which(con_tab[,1]==Xs.num[i])
  con_tab[indpos,1] = con_tab_labs[i]
}

# produce version to display w/ kable() -- remove all rows w/ all NAs that are included for tex formatting
con_tab_rmd <- con_tab[apply(con_tab, 1, function(J) sum(is.na(J)) != ncol(con_tab) ),]
rownames(con_tab_rmd) <- NULL
kable(con_tab_rmd, caption="**Table A20**")
Table A20
Variable Control-Unweighted Control-Weighted Monitoring-Unweighted Monitoring-Weighted Punitive-Unweighted Punitive-Weighted
Apartment Characteristics
Advertised number of bedrooms 0.925 0.928 0.764 0.755 0.93 0.928
(1.214) (1.217) (1.126) (1.108) (1.201) (1.216)
Advertised monthly rental price 2456.261 2450.424 2430.397 2403.453 2395.57 2392.987
(1271.726) (1227.85) (1224.187) (1205.577) (1115.931) (1113.922)
Advertised square footage 1116.677 1124.844 836.364 856 960.526 992.508
(430.927) (444.738) (331.52) (324.802) (508.235) (516.829)
Borough: Bronx 0.115 0.114 0.103 0.108 0.105 0.108
(0.319) (0.318) (0.305) (0.31) (0.307) (0.311)
Borough: Brooklyn 0.358 0.351 0.379 0.373 0.34 0.345
(0.48) (0.478) (0.487) (0.484) (0.475) (0.476)
Borough: Manhattan 0.341 0.352 0.368 0.371 0.345 0.338
(0.475) (0.478) (0.484) (0.483) (0.477) (0.473)
Borough: Queens 0.14 0.136 0.115 0.116 0.18 0.178
(0.347) (0.343) (0.32) (0.32) (0.385) (0.383)
Borough: Staten Island 0.047 0.047 0.034 0.033 0.03 0.031
(0.211) (0.212) (0.183) (0.179) (0.171) (0.173)
Randomization Regime
Regime 1 0.151 0.192 0.218 0.189 0.26 0.228
(0.358) (0.394) (0.414) (0.392) (0.44) (0.42)
Regime 2 0.649 0.552 0.471 0.543 0.42 0.491
(0.478) (0.498) (0.501) (0.499) (0.495) (0.5)
Regime 3 0.201 0.256 0.31 0.268 0.32 0.281
(0.401) (0.437) (0.464) (0.443) (0.468) (0.45)
Early Stage Discrimination
Net Diff. in Num. Attributes Inquired About over Phone: Black-Hispanic -0.029 -0.07 -0.305 -0.272 -0.14 -0.099
(2.018) (2.089) (2.218) (2.218) (2.299) (2.266)
Net Diff. in Num. Attributes Inquired About over Phone: White-Hispanic 0.161 0.102 -0.149 -0.124 -0.255 -0.235
(2.051) (2.074) (2.204) (2.175) (2.018) (2.008)
Net Diff. in Num. Attributes Inquired About over Phone: White-Black 0.19 0.172 0.155 0.147 -0.115 -0.136
(2.17) (2.17) (2.362) (2.322) (2.446) (2.414)
Net Diff. in Num. Attributes Eliciting Skeptical Reaction over Phone: Black-Hispanic 0 0.008 -0.063 -0.063 -0.075 -0.067
(0.416) (0.447) (0.506) (0.518) (0.609) (0.579)
Net Diff. in Num. Attributes Eliciting Skeptical Reaction over Phone: White-Hispanic 0.057 0.047 -0.04 -0.036 -0.06 -0.044
(0.533) (0.515) (0.448) (0.452) (0.639) (0.63)
Net Diff. in Num. Attributes Eliciting Skeptical Reaction over Phone: White-Black 0.057 0.04 0.023 0.026 0.015 0.023
(0.591) (0.604) (0.416) (0.438) (0.431) (0.442)
Net Diff. in Pct. of Attributes Raised Eliciting Skeptical Reaction over Phone: Black-Hispanic -0.001 -0.001 0.001 0.003 -0.025 -0.025
(0.119) (0.121) (0.202) (0.21) (0.161) (0.161)
Net Diff. in Pct. of Attributes Raised Eliciting Skeptical Reaction over Phone: White-Hispanic 0.01 0.007 -0.015 -0.014 -0.018 -0.016
(0.139) (0.135) (0.128) (0.127) (0.179) (0.179)
Net Diff. in Pct. of Attributes Raised Eliciting Skeptical Reaction over Phone: White-Black 0.012 0.008 -0.016 -0.017 0.007 0.008
(0.143) (0.142) (0.177) (0.186) (0.111) (0.11)
Net Diff. in Num. Attributes Eliciting Positive Reaction over Phone: Black-Hispanic 0.018 0.04 -0.006 -0.018 -0.065 -0.063
(1.03) (1.103) (1.023) (0.978) (1.252) (1.186)
Net Diff. in Num. Attributes Eliciting Positive Reaction over Phone: White-Hispanic 0.029 0.02 0.017 0.035 -0.08 -0.06
(0.768) (0.774) (0.909) (0.907) (1.118) (1.073)
Net Diff. in Num. Attributes Eliciting Positive Reaction over Phone: White-Black 0.011 -0.02 0.023 0.053 -0.015 0.003
(0.961) (1.03) (0.918) (0.905) (1.123) (1.086)
Net Diff. in Num. Attributes Eliciting Neutral Reaction over Phone: Black-Hispanic -0.043 -0.11 -0.241 -0.192 0 0.037
(1.883) (1.936) (2.126) (2.123) (2.136) (2.11)
Net Diff. in Num. Attributes Eliciting Neutral Reaction over Phone: White-Hispanic 0.093 0.044 -0.132 -0.124 -0.17 -0.175
(1.783) (1.791) (2.131) (2.094) (1.854) (1.821)
Net Diff. in Num. Attributes Eliciting Neutral Reaction over Phone: White-Black 0.136 0.154 0.109 0.068 -0.17 -0.212
(2.013) (2.013) (2.231) (2.212) (2.233) (2.2)
Net Diff. in Num. Attributes Eliciting Negative Reaction over Phone: Black-Hispanic -0.004 0 -0.057 -0.061 -0.075 -0.073
(0.216) (0.228) (0.368) (0.37) (0.387) (0.372)
Net Diff. in Num. Attributes Eliciting Negative Reaction over Phone: White-Hispanic 0.039 0.038 -0.034 -0.035 -0.005 0
(0.363) (0.362) (0.386) (0.391) (0.496) (0.499)
Net Diff. in Num. Attributes Eliciting Negative Reaction over Phone: White-Black 0.043 0.038 0.023 0.026 0.07 0.073
(0.386) (0.402) (0.284) (0.298) (0.382) (0.395)
Net Diff. in Pct. of Attributes Raised Eliciting Positive Reaction over Phone: Black-Hispanic -0.003 0.005 0.02 0.014 -0.038 -0.034
(0.268) (0.283) (0.325) (0.311) (0.279) (0.272)
Net Diff. in Pct. of Attributes Raised Eliciting Positive Reaction over Phone: White-Hispanic 0.002 0.001 0.009 0.011 -0.032 -0.027
(0.226) (0.226) (0.266) (0.263) (0.296) (0.284)
Net Diff. in Pct. of Attributes Raised Eliciting Positive Reaction over Phone: White-Black 0.005 -0.003 -0.012 -0.004 0.006 0.007
(0.263) (0.282) (0.292) (0.281) (0.251) (0.247)
Net Diff. in Pct. of Attributes Raised Eliciting Neutral Reaction over Phone: Black-Hispanic -0.067 -0.077 -0.099 -0.077 -0.044 -0.046
(0.533) (0.546) (0.654) (0.641) (0.611) (0.596)
Net Diff. in Pct. of Attributes Raised Eliciting Neutral Reaction over Phone: White-Hispanic 0.011 0.011 -0.048 -0.045 -0.002 -0.015
(0.483) (0.487) (0.546) (0.547) (0.466) (0.457)
Net Diff. in Pct. of Attributes Raised Eliciting Neutral Reaction over Phone: White-Black 0.078 0.089 0.05 0.033 0.043 0.031
(0.592) (0.603) (0.616) (0.61) (0.615) (0.611)
Net Diff. in Pct. of Attributes Raised Eliciting Negative Reaction over Phone: Black-Hispanic -0.005 -0.004 -0.013 -0.015 -0.023 -0.024
(0.073) (0.071) (0.106) (0.107) (0.123) (0.128)
Net Diff. in Pct. of Attributes Raised Eliciting Negative Reaction over Phone: White-Hispanic 0.009 0.009 -0.012 -0.013 -0.007 -0.008
(0.113) (0.109) (0.093) (0.094) (0.151) (0.154)
Net Diff. in Pct. of Attributes Raised Eliciting Negative Reaction over Phone: White-Black 0.013 0.012 0.001 0.002 0.016 0.016
(0.098) (0.099) (0.062) (0.062) (0.097) (0.094)
Net Diff. in Receiving Any Skeptical Reaction to Attributes over Phone: Black-Hispanic -0.011 -0.011 -0.034 -0.033 -0.04 -0.038
(0.199) (0.21) (0.338) (0.344) (0.262) (0.257)
Net Diff. in Receiving Any Skeptical Reaction to Attributes over Phone: White-Hispanic 0.025 0.017 -0.023 -0.02 -0.02 -0.013
(0.287) (0.284) (0.322) (0.325) (0.316) (0.322)
Net Diff. in Receiving Any Skeptical Reaction to Attributes over Phone: White-Black 0.036 0.027 0.011 0.013 0.02 0.025
(0.266) (0.264) (0.304) (0.315) (0.223) (0.231)
Net Diff. in Receiving Any Negative Reaction to Attributes over Phone: Black-Hispanic -0.007 -0.006 -0.034 -0.038 -0.055 -0.056
(0.147) (0.156) (0.238) (0.245) (0.25) (0.248)
Net Diff. in Receiving Any Negative Reaction to Attributes over Phone: White-Hispanic 0.022 0.02 -0.023 -0.025 -0.005 -0.004
(0.223) (0.224) (0.214) (0.218) (0.309) (0.308)
Net Diff. in Receiving Any Negative Reaction to Attributes over Phone: White-Black 0.029 0.026 0.011 0.013 0.05 0.051
(0.223) (0.23) (0.186) (0.191) (0.24) (0.24)
Subject Characteristics
Subject is a Broker 0.86 0.861 0.816 0.82 0.85 0.849
(0.347) (0.346) (0.389) (0.385) (0.358) (0.358)
Modal Perception of Landlord Race among Testers: Asian 0.111 0.111 0.08 0.079 0.085 0.086
(0.315) (0.315) (0.273) (0.271) (0.28) (0.281)
Modal Perception of Landlord Race among Testers: Black 0.122 0.114 0.109 0.109 0.135 0.135
(0.328) (0.318) (0.313) (0.312) (0.343) (0.341)
Modal Perception of Landlord Race among Testers: Hispanic/Latino 0.158 0.159 0.149 0.152 0.12 0.115
(0.365) (0.366) (0.358) (0.36) (0.326) (0.32)
Modal Perception of Landlord Race among Testers: White 0.53 0.537 0.534 0.53 0.56 0.56
(0.5) (0.499) (0.5) (0.5) (0.498) (0.497)
Modal Perception of Landlord Age among Testers: 18 to 34 0.437 0.441 0.5 0.498 0.515 0.512
(0.497) (0.497) (0.501) (0.5) (0.501) (0.5)
Modal Perception of Landlord Age among Testers: 35 to 44 0.262 0.265 0.241 0.243 0.28 0.285
(0.44) (0.442) (0.429) (0.429) (0.45) (0.452)
Modal Perception of Landlord Age among Testers: 45 to 64 0.211 0.207 0.144 0.137 0.12 0.117
(0.409) (0.406) (0.352) (0.345) (0.326) (0.322)
Modal Perception of Landlord Age among Testers: 65 and up 0.011 0.011 0.017 0.017 0.015 0.015
(0.103) (0.103) (0.131) (0.128) (0.122) (0.12)
Modal Perception of Landlord Age among Testers: Unknown/No Consensus 0.079 0.076 0.098 0.104 0.07 0.072
(0.27) (0.266) (0.298) (0.306) (0.256) (0.258)
Tester Call Order
Randomized Tester Call Order: White before Black 0.47 0.48 0.506 0.52 0.53 0.537
(0.5) (0.5) (0.501) (0.5) (0.5) (0.499)
Randomized Tester Call Order: White before Hispanic 0.434 0.434 0.489 0.493 0.555 0.554
(0.496) (0.496) (0.501) (0.5) (0.498) (0.497)
Randomized Tester Call Order: Black before Hispanic 0.541 0.544 0.477 0.487 0.48 0.48
(0.499) (0.498) (0.501) (0.5) (0.501) (0.5)
Testers Assumed Income
Assumed Tester Incomes: White > Black 0.466 0.468 0.431 0.442 0.45 0.455
(0.5) (0.499) (0.497) (0.497) (0.499) (0.498)
Assumed Tester Incomes: White > Hispanic 0.455 0.463 0.506 0.508 0.46 0.461
(0.499) (0.499) (0.501) (0.5) (0.5) (0.499)
Assumed Tester Incomes: Black > Hispanic 0.441 0.445 0.523 0.525 0.47 0.472
(0.497) (0.497) (0.501) (0.5) (0.5) (0.5)
Assumed Tester Incomes: White = Black 0.079 0.081 0.098 0.101 0.075 0.073
(0.27) (0.273) (0.298) (0.302) (0.264) (0.26)
Assumed Tester Incomes: White = Hispanic 0.082 0.085 0.069 0.071 0.07 0.073
(0.276) (0.28) (0.254) (0.257) (0.256) (0.26)
Assumed Tester Incomes: Black = Hispanic 0.154 0.157 0.075 0.073 0.08 0.08
(0.362) (0.364) (0.264) (0.26) (0.272) (0.272)
Assumed Tester Incomes: White < Black 0.455 0.451 0.471 0.457 0.475 0.472
(0.499) (0.498) (0.501) (0.499) (0.501) (0.5)
Assumed Tester Incomes: White < Hispanic 0.462 0.451 0.425 0.421 0.47 0.466
(0.499) (0.498) (0.496) (0.494) (0.5) (0.499)
Assumed Tester Incomes: Black < Hispanic 0.405 0.398 0.402 0.402 0.45 0.447
(0.492) (0.49) (0.492) (0.491) (0.499) (0.498)
Assumed Tester Incomes: White Highest 0.384 0.393 0.385 0.396 0.355 0.354
(0.487) (0.489) (0.488) (0.489) (0.48) (0.478)
Assumed Tester Incomes: Black Highest 0.394 0.39 0.408 0.401 0.39 0.392
(0.49) (0.488) (0.493) (0.49) (0.489) (0.489)
Assumed Tester Incomes: Hispanic Highest 0.38 0.373 0.333 0.329 0.37 0.367
(0.486) (0.484) (0.473) (0.47) (0.484) (0.482)
Tester Fixed Effects
Tester ID A01 0.108 0.11 0.121 0.126 0.11 0.111
(0.31) (0.313) (0.327) (0.332) (0.314) (0.314)
Tester ID A10 0.136 0.122 0.08 0.084 0.11 0.111
(0.344) (0.327) (0.273) (0.278) (0.314) (0.314)
Tester ID A11 0.029 0.024 0.023 0.026 0.035 0.039
(0.167) (0.154) (0.15) (0.161) (0.184) (0.195)
Tester ID A13 0.154 0.168 0.144 0.141 0.17 0.156
(0.362) (0.374) (0.352) (0.348) (0.377) (0.364)
Tester ID A02 0 0 0 0 0.01 0.009
(0) (0) (0) (0) (0.1) (0.093)
Tester ID A21 0.007 0.009 0.011 0.01 0.005 0.004
(0.085) (0.095) (0.107) (0.099) (0.071) (0.066)
Tester ID A22 0.043 0.055 0.121 0.104 0.055 0.048
(0.203) (0.228) (0.327) (0.306) (0.229) (0.214)
Tester ID A03 0.057 0.061 0.069 0.063 0.085 0.085
(0.233) (0.239) (0.254) (0.243) (0.28) (0.279)
Tester ID A04 0.086 0.081 0.069 0.073 0.1 0.111
(0.281) (0.273) (0.254) (0.26) (0.301) (0.314)
Tester ID A05 0.14 0.128 0.155 0.164 0.11 0.12
(0.347) (0.334) (0.363) (0.37) (0.314) (0.325)
Tester ID A06 0.061 0.059 0.052 0.051 0.06 0.058
(0.24) (0.237) (0.222) (0.221) (0.238) (0.235)
Tester ID A07 0.004 0.005 0.011 0.01 0.005 0.004
(0.06) (0.068) (0.107) (0.099) (0.071) (0.066)
Tester ID A08 0.168 0.169 0.138 0.142 0.14 0.137
(0.375) (0.375) (0.346) (0.35) (0.348) (0.345)
Tester ID A09 0.007 0.009 0.006 0.005 0.005 0.004
(0.085) (0.095) (0.076) (0.07) (0.071) (0.066)
Tester ID B01 0.204 0.189 0.27 0.291 0.19 0.203
(0.404) (0.392) (0.445) (0.455) (0.393) (0.403)
Tester ID B11 0.061 0.052 0.029 0.031 0.055 0.063
(0.24) (0.222) (0.168) (0.175) (0.229) (0.243)
Tester ID B12 0.011 0.009 0.006 0.007 0 0
(0.103) (0.095) (0.076) (0.081) (0) (0)
Tester ID B14 0.108 0.116 0.115 0.106 0.115 0.115
(0.31) (0.32) (0.32) (0.308) (0.32) (0.32)
Tester ID B16 0.032 0.027 0.034 0.04 0.035 0.035
(0.177) (0.163) (0.183) (0.195) (0.184) (0.184)
Tester ID B02 0.025 0.032 0.04 0.035 0.04 0.035
(0.157) (0.176) (0.197) (0.183) (0.196) (0.184)
Tester ID B20 0 0 0 0 0.005 0.004
(0) (0) (0) (0) (0.071) (0.066)
Tester ID B23 0.014 0.018 0.011 0.01 0.03 0.026
(0.119) (0.134) (0.107) (0.099) (0.171) (0.16)
Tester ID B24 0.022 0.027 0.063 0.055 0.04 0.035
(0.145) (0.163) (0.244) (0.227) (0.196) (0.184)
Tester ID B25 0.039 0.05 0.057 0.05 0.08 0.07
(0.195) (0.219) (0.233) (0.217) (0.272) (0.256)
Tester ID B27 0.022 0.027 0.046 0.04 0.02 0.018
(0.145) (0.163) (0.21) (0.195) (0.14) (0.131)
Tester ID B03 0.104 0.117 0.086 0.083 0.095 0.091
(0.306) (0.322) (0.281) (0.276) (0.294) (0.287)
Tester ID B04 0.022 0.027 0.034 0.03 0.045 0.039
(0.145) (0.163) (0.183) (0.17) (0.208) (0.195)
Tester ID B06 0.043 0.044 0.04 0.04 0.065 0.064
(0.203) (0.206) (0.197) (0.195) (0.247) (0.246)
Tester ID B07 0.018 0.023 0.023 0.02 0.035 0.031
(0.133) (0.15) (0.15) (0.14) (0.184) (0.173)
Tester ID B08 0.233 0.203 0.126 0.144 0.13 0.146
(0.423) (0.402) (0.333) (0.351) (0.337) (0.354)
Tester ID B09 0.043 0.037 0.017 0.02 0.02 0.023
(0.203) (0.188) (0.131) (0.14) (0.14) (0.151)
Tester ID C01 0.025 0.032 0.029 0.025 0.015 0.013
(0.157) (0.176) (0.168) (0.156) (0.122) (0.114)
Tester ID C10 0.022 0.023 0.046 0.041 0.035 0.034
(0.145) (0.15) (0.21) (0.199) (0.184) (0.18)
Tester ID C12 0.004 0.003 0.011 0.013 0.015 0.015
(0.06) (0.055) (0.107) (0.114) (0.122) (0.12)
Tester ID C13 0.097 0.082 0.063 0.073 0.045 0.053
(0.296) (0.275) (0.244) (0.26) (0.208) (0.223)
Tester ID C14 0.007 0.009 0.057 0.05 0.015 0.013
(0.085) (0.095) (0.233) (0.217) (0.122) (0.114)
Tester ID C15 0.018 0.023 0.052 0.045 0.045 0.039
(0.133) (0.15) (0.222) (0.207) (0.208) (0.195)
Tester ID C02 0.28 0.276 0.207 0.21 0.245 0.249
(0.45) (0.447) (0.406) (0.408) (0.431) (0.432)
Tester ID C27 0.039 0.05 0.023 0.02 0.07 0.061
(0.195) (0.219) (0.15) (0.14) (0.256) (0.24)
Tester ID C29 0.05 0.064 0.069 0.06 0.05 0.044
(0.219) (0.245) (0.254) (0.237) (0.218) (0.205)
Tester ID C03 0.004 0.005 0 0 0.03 0.026
(0.06) (0.068) (0) (0) (0.171) (0.16)
Tester ID C31 0.004 0.005 0 0 0 0
(0.06) (0.068) (0) (0) (0) (0)
Tester ID C33 0.011 0.009 0.006 0.007 0.005 0.006
(0.103) (0.095) (0.076) (0.081) (0.071) (0.076)
Tester ID C04 0.179 0.171 0.126 0.131 0.16 0.161
(0.384) (0.377) (0.333) (0.337) (0.368) (0.368)
Tester ID C05 0.011 0.014 0.029 0.025 0.04 0.035
(0.103) (0.116) (0.168) (0.156) (0.196) (0.184)
Tester ID C06 0.004 0.005 0.006 0.005 0.005 0.004
(0.06) (0.068) (0.076) (0.07) (0.071) (0.066)
Tester ID C07 0.244 0.226 0.253 0.276 0.2 0.225
(0.43) (0.418) (0.436) (0.448) (0.401) (0.418)
Tester ID C08 0 0 0.006 0.005 0 0
(0) (0) (0.076) (0.07) (0) (0)
Tester ID C09 0.004 0.005 0.017 0.015 0.025 0.022
(0.06) (0.068) (0.131) (0.121) (0.157) (0.147)

4.24 Table A21 (p. A-46): Sensitivity Analysis: Estimated Effects of Messaging on Net Discrimination Levels Among Subsample Excluding Likely Discrimination Cases

col.labels <- c("Outcome", "Estimate","SE","t","P") 

outvars <- c("nmeet_", "index.", "ncb_","noff_" )
# ---------------------------------------------------------------------- #
# MODELS WITH ONLY BLOCK FIXED EFFECTS

reg_and_summ <- function(outvar, compare) {  
  depvar <- paste0(outvar, compare)
  
  model.mc <- paste(depvar," ~ TA1 + as.factor(block)", sep = "")
  model.pc <- paste(depvar," ~ TA2 + as.factor(block)", sep = "")
  model.pm <- paste(depvar," ~ TA2 + as.factor(block)", sep = "")
  
  fit.mc <- lm(formula = model.mc, data=dat[dat$TA %in% c(0,1) & dat$frame == "representative",], weights=ipw10)
  fit.pc <- lm(formula = model.pc, data=dat[dat$TA %in% c(0,2) & dat$frame == "representative",], weights=ipw20)
  fit.pm <- lm(formula = model.pm, data=dat[dat$TA %in% c(1,2) & dat$frame == "representative",], weights=ipw21)
  
  itt.mc <- summary(fit.mc)$coefficients[2,]
  itt.pc <- summary(fit.pc)$coefficients[2,]
  itt.pm <- summary(fit.pm)$coefficients[2,]
  
  if(compare == "wb" | compare == "wh"){
  itt.mc[4] <- pt(coef(summary(fit.mc))[,3], summary(fit.mc)$df[2], lower=TRUE)[2] #one sided p
  itt.pc[4] <- pt(coef(summary(fit.pc))[,3], summary(fit.pc)$df[2], lower=TRUE)[2] #one sided p
  }

  list(mc     = itt.mc, pc     = itt.pc, pm     = itt.pm,
         fit.mc = fit.mc, fit.pc = fit.pc, fit.pm = fit.pm )
  
}
itt.wb <- lapply(outvars, reg_and_summ,  compare = "wb" )
itt.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$mc)
itt.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pc)
itt.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$pm)
fit.wb.mc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.mc)
fit.wb.pc <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pc)
fit.wb.pm <- sapply(1:length(outvars), function(i) itt.wb[[i]]$fit.pm)

itt.wh <- lapply(outvars, reg_and_summ,  compare = "wh" )
itt.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$mc)
itt.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pc)
itt.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$pm)
fit.wh.mc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.mc)
fit.wh.pc <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pc)
fit.wh.pm <- sapply(1:length(outvars), function(i) itt.wh[[i]]$fit.pm)

itt.bh <- lapply(outvars, reg_and_summ,  compare = "bh" )
itt.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$mc)
itt.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pc)
itt.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$pm)
fit.bh.mc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.mc)
fit.bh.pc <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pc)
fit.bh.pm <- sapply(1:length(outvars), function(i) itt.bh[[i]]$fit.pm)




#----- CIs -----#
df.w_ <- sapply(list(df.wb.mc = fit.wb.mc, 
                     df.wb.pc = fit.wb.pc, 
                     df.wb.pm = fit.wb.pm,
                     df.wh.mc = fit.wh.mc,
                     df.wh.pc = fit.wh.pc, 
                     df.wh.pm = fit.wh.pm,
                     df.bh.mc = fit.bh.mc,
                     df.bh.pc = fit.bh.pc,
                     df.bh.pm = fit.bh.pm),
                function(fit){
                        unlist(lapply(fit, function(x) summary(x)$df[2]))[2:4]})

df.2g.10 <-  c(df.w_[,c("df.wb.mc", "df.wh.mc", "df.bh.mc")])
df.2g.20 <-  c(df.w_[,c("df.wb.pc", "df.wh.pc", "df.bh.pc")])
df.2g.21 <-  c(df.w_[,c("df.wb.pm", "df.wh.pm", "df.bh.pm")])

cv.2g.10 <- qt(p=0.025, df=df.2g.10, lower.tail=TRUE)
cv.2g.20 <- qt(p=0.025, df=df.2g.20, lower.tail=TRUE)
cv.2g.21 <- qt(p=0.025, df=df.2g.21, lower.tail=TRUE)

cv.2g <- c(cv.2g.10, cv.2g.20, cv.2g.21)
itt.2g.ci <- as.numeric(out2[,2]) + abs(cv.2g)*cbind(-as.numeric(out2[,3]), as.numeric(out2[,3]))
# Stack, create output table
out2 <- rbind(itt.wb.mc, itt.wh.mc, itt.bh.mc,
              itt.wb.pc, itt.wh.pc, itt.bh.pc,
              itt.wb.pm, itt.wh.pm, itt.bh.pm)

out2 <- out2[-seq(1,33,4),]


# Assemble vector of 95% CIs
itt.2g.ci <- as.numeric(out2[,2]) + abs(cv.2g)*cbind(-as.numeric(out2[,3]), as.numeric(out2[,3]))
table_a21_cis <- apply(itt.2g.ci, 1, function(x) paste("[", round(x[1],3) ,", ", round(x[2],3), "]", sep=""))

# Round and format results
for(i in 2:ncol(out2)) out2[,i] <- round(as.numeric(out2[,i]),3)
for(i in 5) out2[,i] <- paste("(",out2[,i],")",sep="")

# Attach CIs
table_a21 <- cbind(out2, table_a21_cis)
# Clean up column names
colnames(table_a21) <- c("Outcome", "Estimate", "SE", "t", "p-value", "95% CI")

# display table in Rmd output file
kable(table_a21[1:9,], caption="**Table A21, Panel I. Monitoring vs. Control**")
Table A21, Panel I. Monitoring vs. Control
Outcome Estimate SE t p-value 95% CI
Index measure of favorable in-person interactions (White vs. Black) -0.011 0.054 -0.201 (0.42) [-0.117, 0.095]
Received post-visit callback (White vs. Black) 0.015 0.047 0.315 (0.624) [-0.077, 0.107]
Received post-visit offer for unit (White vs. Black) 0.008 0.038 0.2 (0.579) [-0.066, 0.082]
Index measure of favorable in-person interactions (White vs. Hispanic) -0.047 0.059 -0.794 (0.214) [-0.163, 0.069]
Received post-visit callback (White vs. Hispanic) -0.022 0.045 -0.501 (0.308) [-0.11, 0.065]
Received post-visit offer for unit (White vs. Hispanic) -0.009 0.035 -0.252 (0.401) [-0.079, 0.061]
Index measure of favorable in-person interactions (Black vs. Hispanic) -0.075 0.054 -1.389 (0.166) [-0.182, 0.032]
Received post-visit callback (Black vs. Hispanic) -0.037 0.043 -0.856 (0.392) [-0.123, 0.048]
Received post-visit offer for unit (Black vs. Hispanic) -0.016 0.032 -0.509 (0.611) [-0.08, 0.047]
kable(table_a21[10:18,], caption="**Table A21, Panel II. Punitive vs. Control**")
Table A21, Panel II. Punitive vs. Control
Outcome Estimate SE t p-value 95% CI
Index measure of favorable in-person interactions (White vs. Black) 0.015 0.056 0.267 (0.605) [-0.095, 0.125]
Received post-visit callback (White vs. Black) 0.017 0.043 0.389 (0.651) [-0.068, 0.102]
Received post-visit offer for unit (White vs. Black) 0.027 0.035 0.773 (0.78) [-0.042, 0.095]
Index measure of favorable in-person interactions (White vs. Hispanic) 0.061 0.059 1.033 (0.849) [-0.055, 0.176]
Received post-visit callback (White vs. Hispanic) -0.067 0.043 -1.58 (0.057) [-0.151, 0.016]
Received post-visit offer for unit (White vs. Hispanic) -0.013 0.035 -0.363 (0.358) [-0.082, 0.057]
Index measure of favorable in-person interactions (Black vs. Hispanic) 0.042 0.052 0.813 (0.417) [-0.06, 0.144]
Received post-visit callback (Black vs. Hispanic) -0.084 0.042 -2.018 (0.044) [-0.166, -0.002]
Received post-visit offer for unit (Black vs. Hispanic) -0.04 0.034 -1.17 (0.243) [-0.107, 0.027]
kable(table_a21[19:27,], caption="**Table A21, Panel III. Punitive vs. Monitoring**")
Table A21, Panel III. Punitive vs. Monitoring
Outcome Estimate SE t p-value 95% CI
Index measure of favorable in-person interactions (White vs. Black) 0.018 0.06 0.302 (0.763) [-0.1, 0.136]
Received post-visit callback (White vs. Black) 0.023 0.048 0.469 (0.639) [-0.072, 0.118]
Received post-visit offer for unit (White vs. Black) 0.032 0.036 0.895 (0.371) [-0.039, 0.104]
Index measure of favorable in-person interactions (White vs. Hispanic) 0.105 0.066 1.58 (0.116) [-0.026, 0.236]
Received post-visit callback (White vs. Hispanic) -0.032 0.051 -0.636 (0.525) [-0.132, 0.068]
Received post-visit offer for unit (White vs. Hispanic) 0.008 0.04 0.2 (0.841) [-0.07, 0.086]
Index measure of favorable in-person interactions (Black vs. Hispanic) 0.127 0.059 2.135 (0.034) [0.01, 0.244]
Received post-visit callback (Black vs. Hispanic) -0.055 0.047 -1.18 (0.239) [-0.147, 0.037]
Received post-visit offer for unit (Black vs. Hispanic) -0.024 0.036 -0.688 (0.492) [-0.094, 0.046]

4.25 Table A22 (p. A-47): Distribution of Subjects by their Perceived Race

dat$subj_race <- rep(NA, nrow(dat))
dat$subj_race <- ifelse(dat$primary_wht == 1 & dat$primary_blk != 1 & dat$primary_api != 1 & dat$primary_hsp != 1, "White", dat$subj_race)
dat$subj_race <- ifelse(dat$primary_wht != 1 & dat$primary_blk == 1 & dat$primary_api != 1 & dat$primary_hsp != 1, "Black", dat$subj_race)
dat$subj_race <- ifelse(dat$primary_wht != 1 & dat$primary_blk != 1 & dat$primary_api != 1 & dat$primary_hsp == 1, "Hispanic", dat$subj_race)
dat$subj_race <- ifelse(is.na(dat$subj_race), "Other", dat$subj_race)

tab_pcvdrace <- cbind(table(dat$subj_race), round(table(dat$subj_race)/nrow(dat), 2))
colnames(tab_pcvdrace) <- c("N", "Proportion")
pcvrace <- tab_pcvdrace
kable(tab_pcvdrace, caption="**Table A22**", col.names=c("Number of Subjects", "Proportion"))
Table A22
Number of Subjects Proportion
Black 75 0.11
Hispanic 83 0.13
Other 157 0.24
White 338 0.52

4.26 Figure A6 (p. A-48): Estimated Effects of Monitoring Messaging on Net Discrimination Levels Relative to Control, by the Perceived Race of the Landlord/Broker

# monitoring-control
fit_mc_llrace <- do.call(rbind, 
                         lapply(names(table(dat$subj_race)), function(J){
                           x<-rbind(summary(lm(ncb_wb ~ TA1 + factor(block), data=dat, subset=(TA %in% c(0,1) & subj_race==J), weights=ipw10))$coefficients["TA1",],
                                    summary(lm(ncb_wh ~ TA1 + factor(block), data=dat, subset=(TA %in% c(0,1) & subj_race==J), weights=ipw10))$coefficients["TA1",],
                                    summary(lm(noff_wb ~ TA1 + factor(block), data=dat, subset=(TA %in% c(0,1) & subj_race==J), weights=ipw10))$coefficients["TA1",],
                                    summary(lm(noff_wh ~ TA1 + factor(block), data=dat, subset=(TA %in% c(0,1) & subj_race==J), weights=ipw10))$coefficients["TA1",]
                           )
                           x<-data.frame(Pair = c("Net Discrimination\nAgainst Blacks (vs. Whites)", 
                                                  "Net Discrimination\nAgainst Hispanics (vs. Whites)", 
                                                  "Net Discrimination\nAgainst Blacks (vs. Whites)", 
                                                  "Net Discrimination\nAgainst Hispanics (vs. Whites)"),
                                         Outcome = c("Callback", "Callback", "Offer", "Offer"),
                                         SubjectRace = J,
                                         x,
                                         stringsAsFactors=FALSE)
                           return(x)
                         }))
# table(dat$subj_race[dat$TA %in% c(0,1)])
fit_mc_llrace$SubjectRace <- ifelse(fit_mc_llrace$SubjectRace == "Black", "Black\n(n=48)", fit_mc_llrace$SubjectRace)
fit_mc_llrace$SubjectRace <- ifelse(fit_mc_llrace$SubjectRace == "Hispanic", "Hispanic\n(n=61)", fit_mc_llrace$SubjectRace)
fit_mc_llrace$SubjectRace <- ifelse(fit_mc_llrace$SubjectRace == "Other", "Other\n(n=115)", fit_mc_llrace$SubjectRace)
fit_mc_llrace$SubjectRace <- ifelse(fit_mc_llrace$SubjectRace == "White", "White\n(n=229)", fit_mc_llrace$SubjectRace)

fit_mc_llrace$SubjectRace <- factor(fit_mc_llrace$SubjectRace, levels=c("White\n(n=229)", "Black\n(n=48)", "Hispanic\n(n=61)", "Other\n(n=115)"))

ggplot(fit_mc_llrace, aes(group=SubjectRace, colour=SubjectRace)) + 
  geom_linerange(aes(x=Outcome, ymin=Estimate-1.96*Std..Error, ymax=Estimate+1.96*Std..Error), lwd=1, position=position_dodge(width=1/2)) +
  geom_linerange(aes(x=Outcome, ymin=Estimate-1.64*Std..Error, ymax=Estimate+1.64*Std..Error), lwd=2, position=position_dodge(width=1/2)) +
  geom_point(aes(x=Outcome, y=Estimate), size=3.5, position=position_dodge(width=1/2)) + 
  scale_y_continuous(limits = c(-.5, .5), breaks=seq(-.5, .5, .1)) +
  xlab("\nNet Discrimination Outcome") +
  ylab("\nEstimated Effect of Sending\nMonitoring Message (vs. Control)\n") +
  geom_hline(yintercept=0, lty=2) + facet_wrap( ~ Pair) +
  scale_colour_brewer(palette = "Dark2", guide = guide_legend(title = "Perceived Race of Landlord/Broker:")) +
  theme_bw(base_size=15) + theme(legend.position="bottom", legend.key = element_blank())

ggsave("out_figurea6_appx_hetfxSUBJRACE_mc.pdf", width=11, height=7)

4.27 Figure A7 (p. A-49): Estimated Effects of Punitive Messaging on Net Discrimination Levels Relative to Control, by the Perceived Race of the Landlord/Broker

# punitive-control
fit_pc_llrace <- 
  do.call(rbind, 
   lapply(names(table(dat$subj_race)),
     function(J){
        x <- rbind(summary(lm(ncb_wb ~ TA2 + factor(block),
                               data=dat, 
                               subset=(TA %in% c(0,2) & subj_race==J),
                               weights=ipw20))$coefficients["TA2",],
                   
                   summary(lm(ncb_wh ~ TA2 + factor(block), 
                               data=dat, 
                               subset = (TA %in% c(0,2) & subj_race==J),
                               weights=ipw20))$coefficients["TA2",],
                   
                   summary(lm(noff_wb ~ TA2 + factor(block), 
                               data = dat,
                               subset =(TA %in% c(0,2) & subj_race==J),
                               weights = ipw20))$coefficients["TA2",],
                   
                   summary(lm(noff_wh ~ TA2 + factor(block), 
                               data = dat, 
                               subset = (TA %in% c(0,2) & subj_race==J), 
                               weights = ipw20))$coefficients["TA2",])
        
        x <- data.frame(Pair = c("Net Discrimination\nAgainst Blacks (vs. Whites)", 
                                 "Net Discrimination\nAgainst Hispanics (vs. Whites)", 
                                 "Net Discrimination\nAgainst Blacks (vs. Whites)", 
                                 "Net Discrimination\nAgainst Hispanics (vs. Whites)"),
                        Outcome = c("Callback", "Callback", "Offer", "Offer"),
                        SubjectRace = J,
                        x,
                        stringsAsFactors = FALSE)
  return(x)
}))

# table(dat$subj_race[dat$TA %in% c(0,2)])
fit_pc_llrace$SubjectRace <- ifelse(fit_pc_llrace$SubjectRace == "Black", "Black\n(n=59)", fit_pc_llrace$SubjectRace)
fit_pc_llrace$SubjectRace <- ifelse(fit_pc_llrace$SubjectRace == "Hispanic", "Hispanic\n(n=59)", fit_pc_llrace$SubjectRace)
fit_pc_llrace$SubjectRace <- ifelse(fit_pc_llrace$SubjectRace == "Other", "Other\n(n=112)", fit_pc_llrace$SubjectRace)
fit_pc_llrace$SubjectRace <- ifelse(fit_pc_llrace$SubjectRace == "White", "White\n(n=249)", fit_pc_llrace$SubjectRace)

fit_pc_llrace$SubjectRace <- factor(fit_pc_llrace$SubjectRace, levels=c("White\n(n=249)", "Black\n(n=59)", "Hispanic\n(n=59)", "Other\n(n=112)"))

ggplot(fit_pc_llrace, aes(group=SubjectRace, colour=SubjectRace)) + 
  geom_linerange(aes(x=Outcome, ymin=Estimate-1.96*Std..Error, ymax=Estimate+1.96*Std..Error), lwd=1, position=position_dodge(width=1/2)) +
  geom_linerange(aes(x=Outcome, ymin=Estimate-1.64*Std..Error, ymax=Estimate+1.64*Std..Error), lwd=2, position=position_dodge(width=1/2)) +
  geom_point(aes(x=Outcome, y=Estimate), size=3.5, position=position_dodge(width=1/2)) + 
  scale_y_continuous(limits = c(-.6, .6), breaks=seq(-.8, .8, .1)) +
  xlab("\nNet Discrimination Outcome") +
  ylab("\nEstimated Effect of Sending\nPunitive Message (vs. Control)\n") +
  geom_hline(yintercept=0, lty=2) + facet_wrap( ~ Pair) +
  scale_colour_brewer(palette = "Dark2", guide = guide_legend(title = "Perceived Race of Landlord/Broker:")) +
  theme_bw(base_size=15) + theme(legend.position="bottom", legend.key = element_blank())

ggsave("out_figurea7_appx_hetfxSUBJRACE_pc.pdf", width=11, height=7)

4.28 Appendix E.15 (p. A-50): Analyses Addressing Spillover Concerns

For transparency, we show the code we use to conduct the analyses reported in Appendix E.15. However, we have omitted the raw data files that are used to conduct these analyses (i.e., every scraped ad from Craigslist) because they contain personal identifying information about human subjects.

setwd("") # set to directory containing all scraped Craigslist ads (before sampling)
# dirs1 - subdirectory listing, one per day

phone <- NULL # one row for each ad
phone_nums <- NULL 
for(i in 1:length(dirs1)) {
  
  thisdir <- dirs1[i]
  subdirs <- list.files(thisdir)
  subdirs <- subdirs[which(!grepl(".csv", subdirs))]
  
  if(length(subdirs) > 0) {
    
    subphone <- array(NA, length(subdirs))
    entry <- array(NA, length(subdirs))
    for(j in 1:length(subdirs)) { # one subdir per listing each day
      
      listing <- list.files(paste0(thisdir, "/", subdirs[j]))
      listing <- listing[which(grepl(".html", listing))]
      if(length(listing) > 1) { listing <- listing[1] }
      list_source <- readLines(paste0(thisdir, "/", subdirs[j], "/", listing))
      
      entry[j] <- paste(thisdir, subdirs[j], sep = "_")
      subphone[j] <- ifelse(length(which(str_extract_all(list_source, 
      "\\(?\\d{3}\\)?[.-]? *\\d{3}[[:space:]]*[.-]+ *[.-[:space:]]?\\d{4}",
      simplify = TRUE) != "")) > 0, 1, 0)
      if(subphone[j] == 1) {
        phone_vec <- str_extract_all(list_source, 
        "\\(?\\d{3}\\)?[.-]? *\\d{3}[[:space:]]*[.-]+ *[.-[:space:]]?\\d{4}",
        simplify = TRUE)
      phone_nums <- c(phone_nums, unique(phone_vec[which(phone_vec != "")])) # keep all phone numbers here (not available)
      }
  }
 phone <- rbind(phone, cbind(entry, subphone))
}
}

phone_df <- as.data.frame(phone)
names(phone_df)[2] <- "hasphonenum"
phone_df$hasphonenum <- as.numeric(as.character(phone_df$hasphonenum))

phone_df
# # A tibble: 85,981 x 2
#             entry hasphonenum
#             <chr>       <dbl>
# 1   2012-04-13_1           1
# 2  2012-04-13_10           1
# 3 2012-04-13_100           0
# 4 2012-04-13_101           1
# 5 2012-04-13_102           1
# 6 2012-04-13_103           1
# 7 2012-04-13_104           0
# 8 2012-04-13_105           1
# 9 2012-04-13_106           0
# 10 2012-04-13_107          1
# # ... with 85,971 more rows


# > length(unique(phone_df$entry))
# [1] 85981
# > sum(phone_df$hasphonenum)
# [1] 47978
# > mean(phone_df$hasphonenum) ### estimated proportion of rental listings requiring contact by phone 
# [1] 0.558007

### probability a subject enters the audit sample
2711 / 85981
# [1] 0.03153022
### probability a subject enters the experiment sample
653 / 85981
# [1] 0.007594701

### estimated percentage of duplicate-subject listings by phone number
# > length(which(duplicated(phone_nums)))/length(phone_nums) # 78.96%

0.7896 * 85981
# [1] 67890.6
85981-67890
# [1] 18091
2711 / 18091
# [1] 0.1498535
653 / 18091
# [1] 0.0360953


### Capture-recapture analysis ###

phone_df <- tbl_df(as.data.frame(phone[which(phone[,2] == "1"),])) # keep only listings with phone numbers
phone_df <- phone_df[which(!duplicated(phone_nums)),] # de-dupe using list of phone numbers from above (not available)

# sample 1 - which are marked

set.seed(9382)

phone_s1 <- sample(phone_df$entry, 1000)
#sum(as.numeric(as.character(phone_s1))) # 538

# sample 2

phone_s2 <- sample(phone_df$entry, 1000)

length(which(phone_s2 %in% phone_s1)) # 95

# estimate pop

1000 / (95/1000) # [1] 10526.32

# using 55.8% est. with phone #, this means:

10526.32 / .558 # 18864.37

4.29 Appendix E.16 (p. A-51): Joint Distribution of the Number of Testers in Matched Trios Who Receive a Callback and an Offer

dat$num_cb <- apply(dat[,c("cb_A", "cb_B", "cb_C")], 1, sum)
dat$num_off <- apply(dat[,c("off_A", "off_B", "off_C")], 1, sum)

appxe16 <- with(dat, table(num_cb, num_off))

appxe16a <- cbind(appxe16, rowSums(appxe16))
appxe16b <- round(cbind(appxe16, rowSums(appxe16)) / rowSums(appxe16) * 100, 2)

appxe16_table <- list()
for(i in 1:nrow(appxe16a)) {
  appxe16_table[[i]] <- rbind(as.character(round(appxe16a[i,], 0)),
                               paste0("(", sprintf("%.2f", appxe16b[i,]) ,")"))
}
appxe16_table <- do.call(rbind, appxe16_table)
appxe16_table <- cbind(c("Number of Testers Receiving a Callback: 0", "",
                         "Number of Testers Receiving a Callback: 1", "",
                         "Number of Testers Receiving a Callback: 2", "",
                         "Number of Testers Receiving a Callback: 3", ""),
                       appxe16_table)

colnames(appxe16_table) <- c("", paste(rep("Number of Testers Receiving an Offer:", 4), 0:3), "Row Totals")
  
kable(appxe16_table, caption="**Appendix E.16: Number of Testers in a Matched Trio Receiving an Offer by the Number of Testers in a Matched Trio Receiving a Callback.** Cells report counts with percentages in parentheses.")
Appendix E.16: Number of Testers in a Matched Trio Receiving an Offer by the Number of Testers in a Matched Trio Receiving a Callback. Cells report counts with percentages in parentheses.
Number of Testers Receiving an Offer: 0 Number of Testers Receiving an Offer: 1 Number of Testers Receiving an Offer: 2 Number of Testers Receiving an Offer: 3 Row Totals
Number of Testers Receiving a Callback: 0 427 0 0 0 427
(100.00) (0.00) (0.00) (0.00) (100.00)
Number of Testers Receiving a Callback: 1 68 75 0 0 143
(47.55) (52.45) (0.00) (0.00) (100.00)
Number of Testers Receiving a Callback: 2 19 20 20 0 59
(32.20) (33.90) (33.90) (0.00) (100.00)
Number of Testers Receiving a Callback: 3 5 8 6 5 24
(20.83) (33.33) (25.00) (20.83) (100.00)