Topics II
Interested in the effect of an intervention on a single unit.
Available data: time series for unit of interest and possible comparison groups
Generate a Frankenstein control from weighting a set of control units so that they collectively track the unit of interest and can plausibly serve as a counterfactual.
Identify weights \(w\) that minimize \(\left(Y_i^s(0) - \sum_{j\neq i} w_j Y_j^s(0)\right)^2\) in time period(s) \(s\), prior to treatment; use these to predict counterfactual outcomes \(Y^t_i(0)\) in later periods, \(t\)
Note, in Abadie and Gardeazabal (2003) weights are chosen to minimize \((X_1 - X_0W)'V(X_1 - X_0W)\) where \(X_1\), \(X_0\) are covariates in the unit of interest and the “donor” units respectively. \(V\) is chosen so that the pre treatment outcomes are as close as possible.
In practice weights are found using a two stage minimization procedure:
So in effect you end up with weights on cases and also weights on variables
The seminal study is: Abadie and Gardeazabal (2003) which examines the impact of conflict on the Basque area
A vignette showing how to use the SCtools
package to implement synthetic controls for this example is (here)[https://cran.r-project.org/web/packages/SCtools/vignettes/replicating-basque.html]
We will walk through this and then interrogate this with a common package and a tidy version
Packages and data:
Some fairly nasty data manipulation:
f_dataprep <- function(data, time_plot = min(data$year):max(data$year))
dataprep(
foo = data,
predictors = c("school.illit", "school.prim", "school.med",
"school.high", "school.post.high",
"invest"),
predictors.op = "mean",
time.predictors.prior = 1964:1969,
special.predictors = list(
list("gdpcap", 1960:1969 ,"mean"),
list("sec.agriculture", seq(1961, 1969, 2), "mean"),
list("sec.energy", seq(1961, 1969, 2), "mean"),
list("sec.industry", seq(1961, 1969, 2), "mean"),
list("sec.construction", seq(1961, 1969, 2), "mean"),
list("sec.services.venta", seq(1961, 1969, 2), "mean"),
list("sec.services.nonventa", seq(1961, 1969, 2), "mean"),
list("popdens", 1969, "mean")),
dependent = "gdpcap",
unit.variable = "regionno",
unit.names.variable = "regionname",
time.variable = "year",
treatment.identifier = 17,
controls.identifier = c(2:16, 18),
time.optimize.ssr = 1960:1969,
time.plot = time_plot)
dataprep.out <- f_dataprep(basque)
X1, X0, Z1, Z0 all come directly from dataprep object.
****************
searching for synthetic control unit
****************
****************
****************
MSPE (LOSS V): 0.008864606
solution.v:
0.02773094 1.194e-07 1.60609e-05 0.0007163836 1.486e-07 0.002423908 0.0587055 0.2651997 0.02851006 0.291276 0.007994382 0.004053188 0.009398579 0.303975
solution.w:
2.53e-08 4.63e-08 6.44e-08 2.81e-08 3.37e-08 4.844e-07 4.2e-08 4.69e-08 0.8508145 9.75e-08 3.2e-08 5.54e-08 0.1491843 4.86e-08 9.89e-08 1.162e-07
This produces the following weights, selected to render the pre-treatment trends of the synthetic and actual controls as similar as possible.
regionno | regionname | w.weight |
---|---|---|
1 | Spain (Espana) | NA |
2 | Andalucia | 0.00 |
3 | Aragon | 0.00 |
4 | Principado De Asturias | 0.00 |
5 | Baleares (Islas) | 0.00 |
6 | Canarias | 0.00 |
7 | Cantabria | 0.00 |
8 | Castilla Y Leon | 0.00 |
9 | Castilla-La Mancha | 0.00 |
10 | Cataluna | 0.85 |
11 | Comunidad Valenciana | 0.00 |
12 | Extremadura | 0.00 |
13 | Galicia | 0.00 |
14 | Madrid (Comunidad De) | 0.15 |
15 | Murcia (Region de) | 0.00 |
16 | Navarra (Comunidad Foral De) | 0.00 |
17 | Basque Country (Pais Vasco) | NA |
18 | Rioja (La) | 0.00 |
Lets plot by hand and compare with package
basque_df <- basque |> filter(regionno == 17) |>
mutate(counterfactual = .15*basque$gdpcap[basque$regionno == 14] + .85*basque$gdpcap[basque$regionno == 10] )
basque_df |> ggplot(aes(year, gdpcap)) + geom_line(color = "blue") +
geom_line(aes(year, counterfactual), color = "red") + theme_bw() +
ylab("real per-capita GDP (1986 USD, thousand)") + xlab("year")
What is your estimate?
How do you do inference?
Estimate could be at a particular point in time or averaging over a range. Inference often based on placebos: assess whether estimated effects are large relative to what you would get if you looked for effects in random places.
This has an RI flavor to it, but we are not exploiting any actual randomization.
Alternatives are to do sensitivity analyses, bootstrap (though this is really capturing sampling variability), and perhaps using estimates of the uncertainty from errors in pre treatment predictions.
tidysynth
implementationtidysynth
a wrapper for synth
and tidier for sure
Vignette on: Impact of Proposition 99 on cigarette consumption in California. https://github.com/edunford/tidysynth
state | year | cigsale | lnincome | beer | age15to24 | retprice |
---|---|---|---|---|---|---|
Rhode Island | 1970 | 123.9 | NA | NA | 0.1831579 | 39.3 |
Tennessee | 1970 | 99.8 | NA | NA | 0.1780438 | 39.9 |
Indiana | 1970 | 134.6 | NA | NA | 0.1765159 | 30.6 |
Nevada | 1970 | 189.5 | NA | NA | 0.1615542 | 38.9 |
Louisiana | 1970 | 115.9 | NA | NA | 0.1851852 | 34.3 |
Oklahoma | 1970 | 108.4 | NA | NA | 0.1754592 | 38.4 |
tidysynth
implementationIn a pipe:
smoking_out <-
smoking %>%
# initial the synthetic control object
synthetic_control(
outcome = cigsale, # outcome
unit = state, # unit index in the panel data
time = year, # time index in the panel data
i_unit = "California", # unit where the intervention occurred
i_time = 1988, # time period when the intervention occurred
generate_placebos=T # generate placebo synthetic controls (for inference)
)
smoking_out <- smoking_out %>%
# Generate the aggregate predictors used to fit the weights
# average log income, retail price of cigarettes, and proportion of the
# population between 15 and 24 years of age from 1980 - 1988
generate_predictor(time_window = 1980:1988,
ln_income = mean(lnincome, na.rm = T),
ret_price = mean(retprice, na.rm = T),
youth = mean(age15to24, na.rm = T)) %>%
# average beer consumption in the donor pool from 1984 - 1988
generate_predictor(time_window = 1984:1988,
beer_sales = mean(beer, na.rm = T)) %>%
# Lagged cigarette sales
generate_predictor(time_window = 1975, cigsale_1975 = cigsale) %>%
generate_predictor(time_window = 1980, cigsale_1980 = cigsale) %>%
generate_predictor(time_window = 1988, cigsale_1988 = cigsale)
sc_helper <- function(data, treatment_time = 2010)
data |>
# initial the synthetic control object
synthetic_control(
outcome = Y, # outcome
unit = state, # unit index in the panel data
time = year, # time index in the panel data
i_unit = "01", # unit where the intervention occurred
i_time = treatment_time, # time period when the intervention occurred
generate_placebos = F # generate placebo synthetic controls
) |>
generate_predictor(time_window = 2000:2020,
lnincome = mean(ln_income, na.rm = T),
youth = mean(age15to24, na.rm = T)) %>%
generate_weights(optimization_window = 2000:2010, # time for optimization
margin_ipop = .02,sigf_ipop = 7,bound_ipop = 6 # options
) |>
generate_control()
design <-
declare_model(
state = add_level(50,
youth_mean = runif(N, .2, .7),
income_mean = runif(N, 1, 5),
b = runif(N, 0, 10)),
time = add_level(21, year = 2000:2020, nest = FALSE),
unit = cross_levels(join_using(state, time))) +
declare_model(
treatment = 1*(state == "01" & (year >=2010)),
age15to24 = youth_mean + .5*runif(N, -.1, .1) - .2*(year- 2000),
ln_income = income_mean + .5*rlnorm(N),
Y = 5*age15to24 + .1*(year- 2000)^2 + 1*ln_income + b* treatment) +
declare_inquiry(b[1]) +
declare_estimator(handler = label_estimator(sc_estimator))
Sometimes you give a medicine but only a nonrandom sample of people actually try to use it. Can you still estimate the medicine’s effect?
X=0 | X=1 | |
---|---|---|
Z=0 | \(\overline{y}_{00}\) (\(n_{00}\)) | \(\overline{y}_{01}\) (\(n_{01}\)) |
Z=1 | \(\overline{y}_{10}\) (\(n_{10}\)) | \(\overline{y}_{11}\) (\(n_{11}\)) |
Say that people are one of 3 types:
Sometimes you give a medicine but only a non random sample of people actually try to use it. Can you still estimate the medicine’s effect?
X=0 | X=1 | |
---|---|---|
Z=0 | \(\overline{y}_{00}\) (\(n_{00}\)) | \(\overline{y}_{01}\) (\(n_{01}\)) |
Z=1 | \(\overline{y}_{10}\) (\(n_{10}\)) | \(\overline{y}_{11}\) (\(n_{11}\)) |
We can figure something about types:
\(X=0\) | \(X=1\) | |
---|---|---|
\(Z=0\) | \(\frac{\frac{1}{2}n_c}{\frac{1}{2}n_c + \frac{1}{2}n_n} \overline{y}^0_{c}+\frac{\frac{1}{2}n_n}{\frac{1}{2}n_c + \frac{1}{2}n_n} \overline{y}_{n}\) | \(\overline{y}_{a}\) |
\(Z=1\) | \(\overline{y}_{n}\) | \(\frac{\frac{1}{2}n_c}{\frac{1}{2}n_c + \frac{1}{2}n_a} \overline{y}^1_{c}+\frac{\frac{1}{2}n_a}{\frac{1}{2}n_c + \frac{1}{2}n_a} \overline{y}_{a}\) |
You give a medicine to 50% but only a non random sample of people actually try to use it. Can you still estimate the medicine’s effect?
\(X=0\) | \(X=1\) | |
---|---|---|
\(Z=0\) | \(\frac{n_c}{n_c + n_n} \overline{y}^0_{c}+\frac{n_n}{n_c + n_n} \overline{y}_n\) | \(\overline{y}_{a}\) |
(n) | (\(\frac{1}{2}(n_c + n_n)\)) | (\(\frac{1}{2}n_a\)) |
\(Z=1\) | \(\overline{y}_{n}\) | \(\frac{n_c}{n_c + n_a} \overline{y}^1_{c}+\frac{n_a}{n_c + n_a} \overline{y}_{a}\) |
(n) | (\(\frac{1}{2}n_n\)) | (\(\frac{1}{2}(n_a+n_c)\)) |
Key insight: the contributions of the \(a\)s and \(n\)s are the same in the \(Z=0\) and \(Z=1\) groups so if you difference you are left with the changes in the contributions of the \(c\)s.
Average in \(Z=0\) group: \(\frac{{n_c} \overline{y}^0_{c}+ \left(n_{n}\overline{y}_{n} +{n_a} \overline{y}_a\right)}{n_a+n_c+n_n}\)
Average in \(Z=1\) group: \(\frac{{n_c} \overline{y}^1_{c} + \left(n_{n}\overline{y}_{n} +{n_a} \overline{y}_a \right)}{n_a+n_c+n_n}\)
So, the difference is the ITT: \(({\overline{y}^1_c-\overline{y}^0_c})\frac{n_c}{n}\)
Last step:
\[ITT = ({\overline{y}^1_c-\overline{y}^0_c})\frac{n_c}{n}\]
\[\leftrightarrow\]
\[LATE = \frac{ITT}{\frac{n_c}{n}}= \frac{\text{Intent to treat effect}}{\text{First stage effect}}\]
With and without an imposition of monotonicity
data("lipids_data")
models <-
list(unrestricted = make_model("Z -> X -> Y; X <-> Y"),
restricted = make_model("Z -> X -> Y; X <-> Y") |>
set_restrictions("X[Z=1] < X[Z=0]")) |>
lapply(update_model, data = lipids_data, refresh = 0)
models |>
query_model(query = list(CATE = "Y[X=1] - Y[X=0]",
Nonmonotonic = "X[Z=1] < X[Z=0]"),
given = list("X[Z=1] > X[Z=0]", TRUE),
using = "posteriors")
With and without an imposition of monotonicity:
model | query | mean | sd |
---|---|---|---|
unrestricted | CATE | 0.70 | 0.05 |
restricted | CATE | 0.71 | 0.05 |
unrestricted | Nonmonotonic | 0.01 | 0.01 |
restricted | Nonmonotonic | 0.00 | 0.00 |
In one case we assume monotonicity, in the other we update on it (easy in this case because of the empirically verifiable nature of one sided non compliance)
There is no foundationless answer to this question.
Belmont principles commonly used for guidance:
Unfortunately, operationalizing these requires further ethical theories. (1) is often operationalized by informed consent (a very liberal idea). (2) and (3) sometimes by more utiliarian principles
The major focus on (1) by IRBs might follow from the view that if subjects consent, then they endorse the ethical calculations made for 2 and 3 — they think that it is good and fair.
Trickiness: can a study be good or fair because of implications for non-subjects?
Many (many) field experiments have nothing like informed consent.
For example, whether the government builds a school in your village, whether an ad appears on your favorite radio show, and so on.
Consider three cases:
Consider three cases:
In all cases, there is no consent given by subjects.
In 2 and 3, the treatment is possibly harmful for subjects, and the results might also be harmful. But even in case 1, there could be major unintended harmful consequences.
In cases 1 and 3, however, the “intervention” is within the sphere of normal activities for the implementer.
Sometimes it is possible to use this point of difference to make a “spheres of ethics” argument for “embedded experimentation.”
Spheres of Ethics Argument: Experimental research that involves manipulations that are not normally appropriate for researchers may nevertheless be ethical if:
Otherwise keep focus on consent and desist if this is not possible
Political science researchers should respect autonomy, consider the wellbeing of participants and other people affected by their research, and be open about the ethical issues they face.
Political science researchers have an individual responsibility to consider the ethics of their research related activities and cannot outsource ethical reflection to review boards, other institutional bodies, or regulatory agencies.
These principles describe the standards of conduct and reflexive openness that are expected of political science researchers. … [In cases of reasonable deviations], researchers should acknowledge and justify deviations in scholarly publications and presentations of their work.
[Note: no general injunction against]
Researchers should generally avoid harm when possible, minimize harm when avoidance is not possible, and not conduct research when harm is excessive.
do not limit concern to physical and psychological risks to the participant.
cases in which research that produces impacts on political processes without consent of individuals directly engaged by the research might be appropriate. [examples]
Studies of interventions by third parties do not usually invoke this principle on impact. [details]
This principle is not intended to discourage any form of political engagement by political scientists in their non-research activities or private lives.
researchers should report likely impacts
Mentors, advisors, dissertation committee members, and instructors
Graduate programs in political science should include ethics instruction in their formal and informal graduate curricula;
Editors and reviewers should encourage researchers to be open about the ethical decisions …
Journals, departments, and associations should incorporate ethical commitments into their mission, bylaws, instruction, practices, and procedures.
Multiple survey experimental designs have been generated to make it easier for subjects to answer sensitive questions
The key idea is to use inference rather than measurement.
Subjects are placed in different conditions and the conditions affect the answers that are given in such a way that you can infer some underlying quantity of interest
This is an obvious DAG but the main point is to be clear that the Value is the quantity of interest and the value is not affected by the treatment, Z.
The list experiment supposes that:
In other words: sensitivities notwithstanding, they are happy for the researcher to make correct inferences about them or their group
Respondents are given a short list and a long list.
The long list differs from the short list in having one extra item—the sensitive item
We ask how many items in each list does a respondent agree with:
How many of these do you agree with:
Short list | Long list | “Effect” | |
---|---|---|---|
“2 + 2 = 4” | “2 + 2 = 4” | ||
“2 * 3 = 6” | “2 * 3 = 6” | ||
“3 + 6 = 8” | “Climate change is real” | ||
“3 + 6 = 8” | |||
Answer | Y(0) = 2 | Y(1) = 4 | Y(1) - Y(0) = 2 |
[Note: this is obviously not a good list. Why not?]
declaration_17.3 <-
declare_model(
N = 500,
control_count = rbinom(N, size = 3, prob = 0.5),
Y_star = rbinom(N, size = 1, prob = 0.3),
potential_outcomes(Y_list ~ Y_star * Z + control_count)
) +
declare_inquiry(prevalence_rate = mean(Y_star)) +
declare_assignment(Z = complete_ra(N)) +
declare_measurement(Y_list = reveal_outcomes(Y_list ~ Z)) +
declare_estimator(Y_list ~ Z, .method = difference_in_means,
inquiry = "prevalence_rate")
diagnosands <- declare_diagnosands(
bias = mean(estimate - estimand),
mean_CI_width = mean(conf.high - conf.low)
)
Design | Inquiry | Bias | Mean CI Width |
---|---|---|---|
declaration_17.3 | prevalence_rate | 0.00 | 0.32 |
(0.00) | (0.00) |
declaration_17.4 <-
declare_model(
N = N,
U = rnorm(N),
control_count = rbinom(N, size = 3, prob = 0.5),
Y_star = rbinom(N, size = 1, prob = 0.3),
W = case_when(Y_star == 0 ~ 0L,
Y_star == 1 ~ rbinom(N, size = 1, prob = proportion_hiding)),
potential_outcomes(Y_list ~ Y_star * Z + control_count)
) +
declare_inquiry(prevalence_rate = mean(Y_star)) +
declare_assignment(Z = complete_ra(N)) +
declare_measurement(Y_list = reveal_outcomes(Y_list ~ Z),
Y_direct = Y_star - W) +
declare_estimator(Y_list ~ Z, inquiry = "prevalence_rate", label = "list") +
declare_estimator(Y_direct ~ 1, inquiry = "prevalence_rate", label = "direct")
rho <- -.8
correlated_lists <-
declare_model(
N = 500,
U = rnorm(N),
control_1 = rbinom(N, size = 1, prob = 0.5),
control_2 = correlate(given = control_1, rho = rho, draw_binary, prob = 0.5),
control_count = control_1 + control_2,
Y_star = rbinom(N, size = 1, prob = 0.3),
potential_outcomes(Y_list ~ Y_star * Z + control_count)
) +
declare_inquiry(prevalence_rate = mean(Y_star)) +
declare_assignment(Z = complete_ra(N)) +
declare_measurement(Y_list = reveal_outcomes(Y_list ~ Z)) +
declare_estimator(Y_list ~ Z)
This is typically used to estimate average levels
However you can use it in the obvious way to get average levels for groups: this is equivalent to calculating group level heterogeneous effects
Extending the idea you can even get individual level estimates: for instance you might use causal forests
You can also use this to estimate the effect of an experimental treatment on an item that’s measured using a list, without requiring individual level estimates:
\[Y_i = \beta_0 + \beta_1Z_i + \beta_2Long_i + \beta_3Z_iLong_i\]
Note that here we looked at “hiders” – people not answering the direct question truthfully
See Li (2019) on bounds when the “no liars” assumption is threatened — this is about whether people respond truthfully to the list experimental question
Errors and diagnostics
See excellent introduction: Lee and Lemieux (2010)
Kids born on 31 August start school a year younger than kids born on 1 September: does starting younger help or hurt?
Kids born on 12 September 1983 are more likely to register Republican than kids born on 10 September 1983: can this identify the effects of registration on long term voting?
A district in which Republicans got 50.1% of the vote get a Republican representative while districts in which Republicans got 49.9% of the vote do not: does having a Republican representative make a difference for these districts?
Setting:
Two arguments:
Continuity: \(\mathbb{E}[Y(1)|X=x]\) and \(\mathbb{E}[Y(0)|X=x]\) are continuous (at \(x=0\)) in \(x\): so \(\lim_{\hat x \rightarrow 0}\mathbb{E}[Y(0)|X=\hat x] = \mathbb{E}[Y(0)|X=\hat 0]\)
Local randomization: tiny things that determine exact values of \(x\) are as if random and so we can think of a local experiment around \(X=0\).
Note:
Exclusion restriction is implicit in continuity: If something else happens at the threshold then the conditional expectation functions jump at the thresholds
Implicit: \(X\) is exogenous in the sense that units cannot adjust \(X\) in order to be on one or the other side of the threshold
Typically researchers show:
Typically researchers show:
In addition:
Sometimes:
Error in library(rdrobust): there is no package called 'rdrobust'
cutoff <- 0.5
bandwidth <- 0.5
control <- function(X) {
as.vector(poly(X, 4, raw = TRUE) %*% c(.7, -.8, .5, 1))}
treatment <- function(X) {
as.vector(poly(X, 4, raw = TRUE) %*% c(0, -1.5, .5, .8)) + .25}
rdd_design <-
declare_model(
N = 1000,
U = rnorm(N, 0, 0.1),
X = runif(N, 0, 1) + U - cutoff,
D = 1 * (X > 0),
Y_D_0 = control(X) + U,
Y_D_1 = treatment(X) + U
) +
declare_inquiry(LATE = treatment(0) - control(0)) +
declare_measurement(Y = reveal_outcomes(Y ~ D)) +
declare_sampling(S = X > -bandwidth & X < bandwidth) +
declare_estimator(Y ~ D*X, term = "D", label = "lm") +
declare_estimator(
Y, X,
term = "Bias-Corrected",
.method = rdrobust_helper,
label = "optimal"
)
Note rdrobust
implements:
See Calonico, Cattaneo, and Titiunik (2014) and related papers ? rdrobust::rdrobust
Estimator | Mean Estimate | Bias | SD Estimate | Coverage |
---|---|---|---|---|
lm | 0.23 | -0.02 | 0.01 | 0.64 |
(0.00) | (0.00) | (0.00) | (0.02) | |
optimal | 0.25 | 0.00 | 0.03 | 0.89 |
(0.00) | (0.00) | (0.00) | (0.01) |
Are popular in political science:
See Keele and Titiunik (2015)