Causal Inference

Topics II

Macartan Humphreys

1 Topics 2

1.1 Synthetic controls

Interested in the effect of an intervention on a single unit.

Available data: time series for unit of interest and possible comparison groups

1.1.1 Core idea

Generate a Frankenstein control from weighting a set of control units so that they collectively track the unit of interest and can plausibly serve as a counterfactual.
Identify weights \(w\) that minimize \(\left(Y_i^s(0) - \sum_{j\neq i} w_j Y_j^s(0)\right)^2\) in time period(s) \(s\), prior to treatment; use these to predict counterfactual outcomes \(Y^t_i(0)\) in later periods, \(t\)
Note, in Abadie and Gardeazabal (2003) weights are chosen to minimize \((X_1 - X_0W)'V(X_1 - X_0W)\) where \(X_1\), \(X_0\) are covariates in the unit of interest and the “donor” units respectively. \(V\) is chosen so that the pre treatment outcomes are as close as possible.

1.2 Optimization problem

In practice weights are found using a two stage minimization procedure:

Given matrix \(V\) choose \(W(V)\) to minimize \((X_1-X_0W)'V(X1-X_0W)\)
Choose \(V\) to minimize \((Z_1-Z_0W(V))'(X1-X_0W(V))\)

So in effect you end up with weights on cases and also weights on variables

1.2.1 Basque example

The seminal study is: Abadie and Gardeazabal (2003) which examines the impact of conflict on the Basque area
A vignette showing how to use the SCtools package to implement synthetic controls for this example is (here)[https://cran.r-project.org/web/packages/SCtools/vignettes/replicating-basque.html]
We will walk through this and then interrogate this with a common package and a tidy version

1.2.2 Basque example

Packages and data:

library(SCtools)
library(Synth)
data("basque")

1.2.3 Basque example

Some fairly nasty data manipulation:

f_dataprep  <- function(data, time_plot = min(data$year):max(data$year))
  dataprep(
  foo = data,
  predictors = c("school.illit", "school.prim", "school.med",
     "school.high", "school.post.high", 
    "invest"),
  predictors.op = "mean",
  time.predictors.prior = 1964:1969,
  special.predictors = list(
    list("gdpcap", 1960:1969 ,"mean"),
    list("sec.agriculture", seq(1961, 1969, 2), "mean"),
    list("sec.energy", seq(1961, 1969, 2), "mean"),
    list("sec.industry", seq(1961, 1969, 2), "mean"),
    list("sec.construction", seq(1961, 1969, 2), "mean"),
    list("sec.services.venta", seq(1961, 1969, 2), "mean"),
    list("sec.services.nonventa", seq(1961, 1969, 2), "mean"),
    list("popdens",               1969,               "mean")),
  dependent = "gdpcap",
  unit.variable = "regionno",
  unit.names.variable = "regionname",
  time.variable = "year",
  treatment.identifier = 17,
  controls.identifier = c(2:16, 18),
  time.optimize.ssr = 1960:1969,
  time.plot = time_plot)

dataprep.out  <- f_dataprep(basque)

1.2.4 Implementation (with messages)

synth.out <- synth(data.prep.obj = dataprep.out, method = "BFGS")


X1, X0, Z1, Z0 all come directly from dataprep object.


**************** 
 searching for synthetic control unit  
 

**************** 
**************** 
**************** 

MSPE (LOSS V): 0.008864606 

solution.v:
 0.02773094 1.194e-07 1.60609e-05 0.0007163836 1.486e-07 0.002423908 0.0587055 0.2651997 0.02851006 0.291276 0.007994382 0.004053188 0.009398579 0.303975 

solution.w:
 2.53e-08 4.63e-08 6.44e-08 2.81e-08 3.37e-08 4.844e-07 4.2e-08 4.69e-08 0.8508145 9.75e-08 3.2e-08 5.54e-08 0.1491843 4.86e-08 9.89e-08 1.162e-07

1.2.5 Results

This produces the following weights, selected to render the pre-treatment trends of the synthetic and actual controls as similar as possible.

regionno	regionname	w.weight
1	Spain (Espana)	NA
2	Andalucia	0.00
3	Aragon	0.00
4	Principado De Asturias	0.00
5	Baleares (Islas)	0.00
6	Canarias	0.00
7	Cantabria	0.00
8	Castilla Y Leon	0.00
9	Castilla-La Mancha	0.00
10	Cataluna	0.85
11	Comunidad Valenciana	0.00
12	Extremadura	0.00
13	Galicia	0.00
14	Madrid (Comunidad De)	0.15
15	Murcia (Region de)	0.00
16	Navarra (Comunidad Foral De)	0.00
17	Basque Country (Pais Vasco)	NA
18	Rioja (La)	0.00

1.2.6 Plotting the counterfactual

Lets plot by hand and compare with package

basque_df <- basque |> filter(regionno == 17) |> 
  mutate(counterfactual = .15*basque$gdpcap[basque$regionno == 14] + .85*basque$gdpcap[basque$regionno == 10] )

basque_df |> ggplot(aes(year, gdpcap)) + geom_line(color = "blue") + 
  geom_line(aes(year, counterfactual), color = "red") + theme_bw() +
  ylab("real per-capita GDP (1986 USD, thousand)") + xlab("year")

1.2.7 Plotting 2

path.plot(synth.res = synth.out, dataprep.res = dataprep.out,
          Ylab = "real per-capita GDP (1986 USD, thousand)", Xlab = "year",
          Ylim = c(0, 12), Legend = c("Basque country",
                                      "synthetic Basque country"), 
          Legend.position = "bottomright")

1.2.8 Then what?

What is your estimate?

How do you do inference?

Estimate could be at a particular point in time or averaging over a range. Inference often based on placebos: assess whether estimated effects are large relative to what you would get if you looked for effects in random places.
This has an RI flavor to it, but we are not exploiting any actual randomization.
Alternatives are to do sensitivity analyses, bootstrap (though this is really capturing sampling variability), and perhaps using estimates of the uncertainty from errors in pre treatment predictions.

1.2.9 `tidysynth` implementation

tidysynth a wrapper for synth and tidier for sure

Vignette on: Impact of Proposition 99 on cigarette consumption in California. https://github.com/edunford/tidysynth

data("smoking")
smoking |> head() |> kable()

state	year	cigsale	lnincome	beer	age15to24	retprice
Rhode Island	1970	123.9	NA	NA	0.1831579	39.3
Tennessee	1970	99.8	NA	NA	0.1780438	39.9
Indiana	1970	134.6	NA	NA	0.1765159	30.6
Nevada	1970	189.5	NA	NA	0.1615542	38.9
Louisiana	1970	115.9	NA	NA	0.1851852	34.3
Oklahoma	1970	108.4	NA	NA	0.1754592	38.4

1.2.10 `tidysynth` implementation

In a pipe:

smoking_out <-
  
  smoking %>%
  
  # initial the synthetic control object
  synthetic_control(
    outcome = cigsale, # outcome
    unit = state, # unit index in the panel data
    time = year, # time index in the panel data
    i_unit = "California", # unit where the intervention occurred
    i_time = 1988, # time period when the intervention occurred
    generate_placebos=T # generate placebo synthetic controls (for inference)
    )

1.2.11 add predictor information

smoking_out <- smoking_out %>%
  
  # Generate the aggregate predictors used to fit the weights
  
  # average log income, retail price of cigarettes, and proportion of the
  # population between 15 and 24 years of age from 1980 - 1988
  generate_predictor(time_window = 1980:1988,
                     ln_income = mean(lnincome, na.rm = T),
                     ret_price = mean(retprice, na.rm = T),
                     youth = mean(age15to24, na.rm = T)) %>%
  
  # average beer consumption in the donor pool from 1984 - 1988
  generate_predictor(time_window = 1984:1988, 
                     beer_sales = mean(beer, na.rm = T)) %>%
  
  # Lagged cigarette sales 
  generate_predictor(time_window = 1975, cigsale_1975 = cigsale) %>%
  generate_predictor(time_window = 1980, cigsale_1980 = cigsale) %>%
  generate_predictor(time_window = 1988, cigsale_1988 = cigsale)

1.2.12 implement

smoking_out <- smoking_out %>%
  
  
  # Generate the fitted weights for the synthetic control
  generate_weights(optimization_window = 1970:1988, # time for optimization
                   margin_ipop = .02,sigf_ipop = 7,bound_ipop = 6 # options
  ) %>%
  
  # Generate the synthetic control
  generate_control()

1.2.13 plot trends

smoking_out %>% plot_trends()

1.2.14 plot weights

smoking_out %>% plot_weights()

1.2.15 plot placebos

smoking_out %>% plot_placebos()

1.2.16 Design declaration

sc_helper <- function(data, treatment_time = 2010)

  data |>
  
  # initial the synthetic control object
  synthetic_control(
    outcome = Y, # outcome
    unit = state, # unit index in the panel data
    time = year, # time index in the panel data
    i_unit = "01", # unit where the intervention occurred
    i_time = treatment_time, # time period when the intervention occurred
    generate_placebos = F  # generate placebo synthetic controls
    )  |>
  
  generate_predictor(time_window = 2000:2020,
                     lnincome = mean(ln_income, na.rm = T),
                     youth = mean(age15to24, na.rm = T)) %>%
  generate_weights(optimization_window = 2000:2010, # time for optimization
                   margin_ipop = .02,sigf_ipop = 7,bound_ipop = 6 # options
  ) |>
  generate_control()

1.2.17 Declare an estimator

sc_estimator <- function(data, treatment_time = 2010) 
  
  data.frame(estimate = 
               sc_helper(data, treatment_time = 2010) |>
               grab_synthetic_control() |> 
               mutate(difference = real_y - synth_y) |>
               filter(time_unit >= treatment_time) |> 
               pull(difference) |> 
               mean()
)

1.2.18 Design

design <-
  declare_model(
    state = add_level(50, 
                      youth_mean = runif(N, .2, .7), 
                      income_mean = runif(N, 1, 5), 
                      b = runif(N, 0, 10)),
    time = add_level(21, year = 2000:2020, nest = FALSE),
    unit = cross_levels(join_using(state, time))) +
  
  declare_model(
    treatment = 1*(state == "01" & (year >=2010)),
    age15to24 = youth_mean + .5*runif(N, -.1, .1) - .2*(year- 2000),
    ln_income = income_mean + .5*rlnorm(N),
    Y = 5*age15to24 + .1*(year- 2000)^2 +  1*ln_income + b* treatment) +
  
  declare_inquiry(b[1]) +
  
  declare_estimator(handler = label_estimator(sc_estimator))

1.2.19 Inspection

draw_data(design) |> sc_helper()  |> plot_trends()

1.2.20 Simulation

simulations <- simulate_design(design, sims = 10)

simulations |> ggplot(aes(estimand, estimate)) + geom_point()

1.3 Noncompliance and the LATE estimand

1.3.1 Local Average Treatment Effects

Sometimes you give a medicine but only a nonrandom sample of people actually try to use it. Can you still estimate the medicine’s effect?

	X=0	X=1
Z=0	\(\overline{y}_{00}\) (\(n_{00}\))	\(\overline{y}_{01}\) (\(n_{01}\))
Z=1	\(\overline{y}_{10}\) (\(n_{10}\))	\(\overline{y}_{11}\) (\(n_{11}\))

Say that people are one of 3 types:

\(n_a\) “always takers” have \(X=1\) no matter what and have average outcome \(\overline{y}_a\)
\(n_n\) “never takers” have \(X=0\) no matter what with outcome \(\overline{y}_n\)
\(n_c\) “compliers have” \(X=Z\) and average outcomes \(\overline{y}^1_c\) if treated and \(\overline{y}^0_c\) if not.

1.3.2 Local Average Treatment Effects

Sometimes you give a medicine but only a non random sample of people actually try to use it. Can you still estimate the medicine’s effect?

	X=0	X=1
Z=0	\(\overline{y}_{00}\) (\(n_{00}\))	\(\overline{y}_{01}\) (\(n_{01}\))
Z=1	\(\overline{y}_{10}\) (\(n_{10}\))	\(\overline{y}_{11}\) (\(n_{11}\))

We can figure something about types:

	\(X=0\)	\(X=1\)
\(Z=0\)	\(\frac{\frac{1}{2}n_c}{\frac{1}{2}n_c + \frac{1}{2}n_n} \overline{y}^0_{c}+\frac{\frac{1}{2}n_n}{\frac{1}{2}n_c + \frac{1}{2}n_n} \overline{y}_{n}\)	\(\overline{y}_{a}\)
\(Z=1\)	\(\overline{y}_{n}\)	\(\frac{\frac{1}{2}n_c}{\frac{1}{2}n_c + \frac{1}{2}n_a} \overline{y}^1_{c}+\frac{\frac{1}{2}n_a}{\frac{1}{2}n_c + \frac{1}{2}n_a} \overline{y}_{a}\)

1.3.3 Local Average Treatment Effects

You give a medicine to 50% but only a non random sample of people actually try to use it. Can you still estimate the medicine’s effect?

	\(X=0\)	\(X=1\)
\(Z=0\)	\(\frac{n_c}{n_c + n_n} \overline{y}^0_{c}+\frac{n_n}{n_c + n_n} \overline{y}_n\)	\(\overline{y}_{a}\)
(n)	(\(\frac{1}{2}(n_c + n_n)\))	(\(\frac{1}{2}n_a\))
\(Z=1\)	\(\overline{y}_{n}\)	\(\frac{n_c}{n_c + n_a} \overline{y}^1_{c}+\frac{n_a}{n_c + n_a} \overline{y}_{a}\)
(n)	(\(\frac{1}{2}n_n\))	(\(\frac{1}{2}(n_a+n_c)\))

Key insight: the contributions of the \(a\)s and \(n\)s are the same in the \(Z=0\) and \(Z=1\) groups so if you difference you are left with the changes in the contributions of the \(c\)s.

1.3.4 Local Average Treatment Effects

Average in \(Z=0\) group: \(\frac{{n_c} \overline{y}^0_{c}+ \left(n_{n}\overline{y}_{n} +{n_a} \overline{y}_a\right)}{n_a+n_c+n_n}\)

Average in \(Z=1\) group: \(\frac{{n_c} \overline{y}^1_{c} + \left(n_{n}\overline{y}_{n} +{n_a} \overline{y}_a \right)}{n_a+n_c+n_n}\)

So, the difference is the ITT: \(({\overline{y}^1_c-\overline{y}^0_c})\frac{n_c}{n}\)

Last step:

\[ITT = ({\overline{y}^1_c-\overline{y}^0_c})\frac{n_c}{n}\]

\[\leftrightarrow\]

\[LATE = \frac{ITT}{\frac{n_c}{n}}= \frac{\text{Intent to treat effect}}{\text{First stage effect}}\]

1.3.5 The good and the bad of LATE

You get a well-defined estimate even when there is non-random take-up
May sometimes be used to assess mediation or knock-on effects
But:
- You need assumptions (monotonicity and the exclusion restriction – where were these used above?)
- Your estimate is only for a subpopulation
- The subpopulation is not chosen by you and is unknown
- Different encouragements may yield different estimates since they may encourage different subgroups

1.3.6 Pearl and Chickering again

With and without an imposition of monotonicity

data("lipids_data")

models <- 
  list(unrestricted =  make_model("Z -> X -> Y; X <-> Y"),
       restricted =  make_model("Z -> X -> Y; X <-> Y") |>
         set_restrictions("X[Z=1] < X[Z=0]")) |> 
  lapply(update_model,  data = lipids_data, refresh = 0) 

models |>
  query_model(query = list(CATE = "Y[X=1] - Y[X=0]", 
                           Nonmonotonic = "X[Z=1] < X[Z=0]"),
              given = list("X[Z=1] > X[Z=0]", TRUE),
              using = "posteriors")

1.3.7 Pearl and Chickering again

With and without an imposition of monotonicity:

model	query	mean	sd
unrestricted	CATE	0.70	0.05
restricted	CATE	0.71	0.05
unrestricted	Nonmonotonic	0.01	0.01
restricted	Nonmonotonic	0.00	0.00

In one case we assume monotonicity, in the other we update on it (easy in this case because of the empirically verifiable nature of one sided non compliance)

1.4 Ethics

1.4.1 Constraint: Is it ethical to manipulate subjects for research purposes?

There is no foundationless answer to this question.
Belmont principles commonly used for guidance:
1. Respect for persons
2. Beneficence
3. Justice
Unfortunately, operationalizing these requires further ethical theories. (1) is often operationalized by informed consent (a very liberal idea). (2) and (3) sometimes by more utiliarian principles
The major focus on (1) by IRBs might follow from the view that if subjects consent, then they endorse the ethical calculations made for 2 and 3 — they think that it is good and fair.
Trickiness: can a study be good or fair because of implications for non-subjects?

1.4.2 Is it ethical to manipulate subjects for research purposes?

Many (many) field experiments have nothing like informed consent.
For example, whether the government builds a school in your village, whether an ad appears on your favorite radio show, and so on.
Consider three cases:
1. You work with a nonprofit to post (true?) posters about the crimes of politicians on billboards to see effects on voters
2. You hire confederates to offer bribes to police officers to see if they are more likely to bend the law for coethnics
3. The British government asks you to work on figuring out how the use of water cannons helps stop rioters rioting

1.4.3 Is it ethical to manipulate subjects for research purposes?

Consider three cases:
- You work with a nonprofit to post (true?) posters about the crimes of politicians on billboards to see effects on voters
- You hire confederates to offer bribes to police officers to see if they are more likely to bend the law for coethnics
- The British government asks you to work on figuring out how the use of water cannons helps stop rioters rioting
In all cases, there is no consent given by subjects.
In 2 and 3, the treatment is possibly harmful for subjects, and the results might also be harmful. But even in case 1, there could be major unintended harmful consequences.
In cases 1 and 3, however, the “intervention” is within the sphere of normal activities for the implementer.

1.4.4 Constraint: Is it ethical to manipulate subjects for research purposes?

Sometimes it is possible to use this point of difference to make a “spheres of ethics” argument for “embedded experimentation.”
Spheres of Ethics Argument: Experimental research that involves manipulations that are not normally appropriate for researchers may nevertheless be ethical if:
- Researchers and implementers agree on a division of responsibility where implementers take on responsibility for actions
- Implementers have legitimacy to make these decisions within the sphere of the intervention
- Implementers are indeed materially independent of researchers (no swapping hats)

1.4.5 Constraint: Is it ethical to manipulate subjects for research purposes?

Difficulty with this argument:
- Question begging: How to determine the legitimacy of the implementer? (Can we rule out Nazi doctors?)

Otherwise keep focus on consent and desist if this is not possible

1.4.6 APSA Guidelines

Available here
Used e.g. by APSR
Below is lightly abbreviated; full text however has detailed guidelines

1.4.7 APSA Ethics: General [Abbr]

Political science researchers should respect autonomy, consider the wellbeing of participants and other people affected by their research, and be open about the ethical issues they face.
Political science researchers have an individual responsibility to consider the ethics of their research related activities and cannot outsource ethical reflection to review boards, other institutional bodies, or regulatory agencies.
These principles describe the standards of conduct and reflexive openness that are expected of political science researchers. … [In cases of reasonable deviations], researchers should acknowledge and justify deviations in scholarly publications and presentations of their work.

1.4.8 APSA Ethics: Power

When designing and conducting research, political scientists should be aware of power differentials between researcher and researched, and the ways in which such power differentials can affect the voluntariness of consent and the evaluation of risk and benefit.

especially with low-power or vulnerable participants
covert or deceptive research with more than minimal harm may sometimes be ethically permissible in research with powerful parties

1.4.10 APSA Ethics: Deception

Political science researchers should carefully consider any use of deception and the ways in which deception can conflict with participant autonomy.

ask: is it plausible that engaged individuals would withhold consent if fully informed consent were sought
disclose, justify,, and describe steps taken to respect participant autonomy.

[Note: no general injunction against]

1.4.11 APSA Ethics: Harm and Trauma

Political science researchers should consider the harms associated with their research.

Researchers should generally avoid harm when possible, minimize harm when avoidance is not possible, and not conduct research when harm is excessive.
do not limit concern to physical and psychological risks to the participant.

Political science researchers should anticipate and protect individual participants from trauma stemming from participation in research.

1.4.12 APSA Ethics: Confidentiality

Political science researchers should generally keep the identities of research participants confidential; when circumstances require, researchers should adopt the higher standard of ensuring anonymity.

Researchers should clearly communicate assurances of confidentiality / anonymity
If confidentiality bit provided, communicate this and justify c./d. consider risks at all stages
Researchers who determine that it would be unethical to share materials derived from human subjects should be prepared to justify their decision to journal editors, to reviewers, etc

1.4.13 APSA Ethics: Impact

Political science researchers conducting studies on political processes should consider the broader social impacts of the research process as well as the impact on the experience of individuals directly engaged by the research. In general, political science researchers should not compromise the integrity of political processes for research purposes without the consent of individuals that are directly engaged by the research process.

cases in which research that produces impacts on political processes without consent of individuals directly engaged by the research might be appropriate. [examples]
Studies of interventions by third parties do not usually invoke this principle on impact. [details]
This principle is not intended to discourage any form of political engagement by political scientists in their non-research activities or private lives.
researchers should report likely impacts

1.4.14 APSA Ethics: Laws, Regulations, and Prospective Review

Political science researchers should be aware of relevant laws and regulations governing their research related activities.

1.4.15 APSA Ethics: Shared Responsibility

The responsibility to promote ethical research goes beyond the individual researcher or research team.

Mentors, advisors, dissertation committee members, and instructors
Graduate programs in political science should include ethics instruction in their formal and informal graduate curricula;
Editors and reviewers should encourage researchers to be open about the ethical decisions …
Journals, departments, and associations should incorporate ethical commitments into their mission, bylaws, instruction, practices, and procedures.

1.5 Survey experiments

Survey experiments are used to measure things: nothing (except answers) should be changed!
If the experiment in the survey is changing things then it is a field experiment in a survey, not a survey experiment

1.5.1 The list experiment: Motivation

Multiple survey experimental designs have been generated to make it easier for subjects to answer sensitive questions
The key idea is to use inference rather than measurement.
Subjects are placed in different conditions and the conditions affect the answers that are given in such a way that you can infer some underlying quantity of interest

1.5.2 The list experiment: Motivation

This is an obvious DAG but the main point is to be clear that the Value is the quantity of interest and the value is not affected by the treatment, Z.

1.5.3 The list experiment: Motivation

The list experiment supposes that:

Subjects do not want to give a direct answer to a question
They nevertheless are willing to truthfully answer an indirect question

In other words: sensitivities notwithstanding, they are happy for the researcher to make correct inferences about them or their group

1.5.4 The list experiment: Strategy

Respondents are given a short list and a long list.
The long list differs from the short list in having one extra item—the sensitive item
We ask how many items in each list does a respondent agree with:
- \(Y_i(0)\) is the number of elements on a short list that a respondent agrees with
- \(Y_i(1)\) is the number of elements on a long list that a respondent agrees with
- \(Y_i(1) - Y_i(0)\) is an indicator for whether an individual agrees with the sensitive item
- \(\mathbb{E}[Y_i(1) - Y_i(0)]\) is the share of people agreeing with sensitive item

1.5.5 The list experiment: Simplified example

How many of these do you agree with:

	Short list	Long list	“Effect”
	“2 + 2 = 4”	“2 + 2 = 4”
	“2 * 3 = 6”	“2 * 3 = 6”
	“3 + 6 = 8”	“Climate change is real”
		“3 + 6 = 8”
Answer	Y(0) = 2	Y(1) = 4	Y(1) - Y(0) = 2

[Note: this is obviously not a good list. Why not?]

1.5.6 The list experiment: Design

declaration_17.3 <-
  declare_model(
    N = 500,
    control_count = rbinom(N, size = 3, prob = 0.5),
    Y_star = rbinom(N, size = 1, prob = 0.3),
    potential_outcomes(Y_list ~ Y_star * Z + control_count) 
  ) +
  declare_inquiry(prevalence_rate = mean(Y_star)) +
  declare_assignment(Z = complete_ra(N)) + 
  declare_measurement(Y_list = reveal_outcomes(Y_list ~ Z)) +
  declare_estimator(Y_list ~ Z, .method = difference_in_means, 
                    inquiry = "prevalence_rate")

diagnosands <- declare_diagnosands(
  bias = mean(estimate - estimand),
  mean_CI_width = mean(conf.high - conf.low)
)

1.5.7 Diagnosis

diagnose_design(declaration_17.3, diagnosands = diagnosands)

Design	Inquiry	Bias	Mean CI Width
declaration_17.3	prevalence_rate	0.00	0.32
		(0.00)	(0.00)

1.5.8 Tradeoffs: is the question really sensitive?

declaration_17.4 <- 
  declare_model(
    N = N,
    U = rnorm(N),
    control_count = rbinom(N, size = 3, prob = 0.5),
    Y_star = rbinom(N, size = 1, prob = 0.3),
    W = case_when(Y_star == 0 ~ 0L,
                  Y_star == 1 ~ rbinom(N, size = 1, prob = proportion_hiding)),
    potential_outcomes(Y_list ~ Y_star * Z + control_count)
  ) +
  declare_inquiry(prevalence_rate = mean(Y_star)) +
  declare_assignment(Z = complete_ra(N)) + 
  declare_measurement(Y_list = reveal_outcomes(Y_list ~ Z),
                      Y_direct = Y_star - W) +
  declare_estimator(Y_list ~ Z, inquiry = "prevalence_rate", label = "list") + 
  declare_estimator(Y_direct ~ 1, inquiry = "prevalence_rate", label = "direct")

1.5.9 Diagnosis

declaration_17.4 |> 
  redesign(proportion_hiding = seq(from = 0, to = 0.3, by = 0.1), 
           N = seq(from = 500, to = 2500, by = 500)) |> 
  diagnose_design()

1.5.10 Negatively correlated items

How would estimates be affected if the items selected for the list were negatively correlated?
How would subject protection be affected?

1.5.11 Negatively correlated items

rho <- -.8 

correlated_lists <- 
  declare_model(
    N = 500,
    U = rnorm(N),
    control_1 = rbinom(N, size = 1, prob = 0.5),
    control_2 = correlate(given = control_1, rho = rho, draw_binary, prob = 0.5),
    control_count = control_1 + control_2,
    Y_star = rbinom(N, size = 1, prob = 0.3),
    potential_outcomes(Y_list ~ Y_star * Z + control_count)
  ) +
  declare_inquiry(prevalence_rate = mean(Y_star)) +
  declare_assignment(Z = complete_ra(N)) + 
  declare_measurement(Y_list = reveal_outcomes(Y_list ~ Z)) +
  declare_estimator(Y_list ~ Z)

1.5.12 Negatively correlated items

draw_data(correlated_lists) |> ggplot(aes(control_count)) + 
  geom_histogram() + theme_bw()

1.5.13 Negatively correlated items

correlated_lists |> redesign(rho = c(-.8, 0, .8)) |> diagnose_design()

These trade-off against each other: the more accuracy you have the less protection you have

1.5.14 Individual or group effects?

This is typically used to estimate average levels
However you can use it in the obvious way to get average levels for groups: this is equivalent to calculating group level heterogeneous effects
Extending the idea you can even get individual level estimates: for instance you might use causal forests
You can also use this to estimate the effect of an experimental treatment on an item that’s measured using a list, without requiring individual level estimates:

\[Y_i = \beta_0 + \beta_1Z_i + \beta_2Long_i + \beta_3Z_iLong_i\]

1.5.15 Hiders and liars

Note that here we looked at “hiders” – people not answering the direct question truthfully
See Li (2019) on bounds when the “no liars” assumption is threatened — this is about whether people respond truthfully to the list experimental question

1.6 Regression discontintuity

Errors and diagnostics

1.6.1 Intuition

The core idea in an RDD design is that if a decision rule assigns units that are almost identical to each other to treatment and control conditions then we can infer effects for those cases by looking at those cases.

See excellent introduction: Lee and Lemieux (2010)

1.6.2 Intuition

Kids born on 31 August start school a year younger than kids born on 1 September: does starting younger help or hurt?
Kids born on 12 September 1983 are more likely to register Republican than kids born on 10 September 1983: can this identify the effects of registration on long term voting?
A district in which Republicans got 50.1% of the vote get a Republican representative while districts in which Republicans got 49.9% of the vote do not: does having a Republican representative make a difference for these districts?

1.6.3 Argument for identification

Setting:

Typically the decision is based on a value on a “running variable”, \(X\). e.g. Treatment if \(X > 0\)
The estimand is \(\mathbb{E}[Y(1) - Y(0)|X=0]\)

Two arguments:

Continuity: \(\mathbb{E}[Y(1)|X=x]\) and \(\mathbb{E}[Y(0)|X=x]\) are continuous (at \(x=0\)) in \(x\): so \(\lim_{\hat x \rightarrow 0}\mathbb{E}[Y(0)|X=\hat x] = \mathbb{E}[Y(0)|X=\hat 0]\)
Local randomization: tiny things that determine exact values of \(x\) are as if random and so we can think of a local experiment around \(X=0\).

1.6.4 Argument for identification

Note:

continuity argument requires continuous \(x\): granularity
also builds off a conditional expectation function defined at \(X=0\)

Exclusion restriction is implicit in continuity: If something else happens at the threshold then the conditional expectation functions jump at the thresholds

Implicit: \(X\) is exogenous in the sense that units cannot adjust \(X\) in order to be on one or the other side of the threshold

1.6.5 Evidence

Typically researchers show:

“First stage” results: assignment to treatment does indeed jump at the threshold
“ITT”: outcomes jump at the threshold
LATE (if fuzzy / imperfect compliance) using IV

1.6.6 Evidence

Typically researchers show:

In addition:

Arguments for no other treatments at the threshold
Arguments for no “sorting” at the threshold
Evidence for no “heaping” at the threshold (McCrary density test)

Sometimes:

argue for why estimates extend beyond the threshold
exclude points at the threshold (!)

1.6.7 Design

library(rdss) # for helper functions
library(rdrobust)

Error in library(rdrobust): there is no package called 'rdrobust'

cutoff <- 0.5
bandwidth <- 0.5

control <- function(X) {
  as.vector(poly(X, 4, raw = TRUE) %*% c(.7, -.8, .5, 1))}
treatment <- function(X) {
  as.vector(poly(X, 4, raw = TRUE) %*% c(0, -1.5, .5, .8)) + .25}

rdd_design <-
  declare_model(
    N = 1000,
    U = rnorm(N, 0, 0.1),
    X = runif(N, 0, 1) + U - cutoff,
    D = 1 * (X > 0),
    Y_D_0 = control(X) + U,
    Y_D_1 = treatment(X) + U
  ) +
  declare_inquiry(LATE = treatment(0) - control(0)) +
  declare_measurement(Y = reveal_outcomes(Y ~ D)) + 
  declare_sampling(S = X > -bandwidth & X < bandwidth) +
  declare_estimator(Y ~ D*X, term = "D", label = "lm") + 
  declare_estimator(
    Y, X, 
    term = "Bias-Corrected",
    .method = rdrobust_helper,
    label = "optimal"
  )

1.6.8 RDD Data plotted

Note rdrobust implements:

local polynomial Regression Discontinuity (RD) point estimators
robust bias-corrected confidence intervals

See Calonico, Cattaneo, and Titiunik (2014) and related papers ? rdrobust::rdrobust

1.6.9 RDD Data plotted

rdd_design  |> draw_data() |> ggplot(aes(X, Y, color = factor(D))) + 
  geom_point(alpha = .3) + theme_bw() +
  geom_smooth(aes(X, Y_D_0)) + geom_smooth(aes(X, Y_D_1)) + theme(legend.position = "none")

1.6.10 RDD diagnosis

rdd_design |> diagnose_design()

Estimator	Mean Estimate	Bias	SD Estimate	Coverage
lm	0.23	-0.02	0.01	0.64
	(0.00)	(0.00)	(0.00)	(0.02)
optimal	0.25	0.00	0.03	0.89
	(0.00)	(0.00)	(0.00)	(0.01)

1.6.11 Bandwidth tradeoff

rdd_design |> 
  redesign(bandwidth = seq(from = 0.05, to = 0.5, by = 0.05)) |> 
  diagnose_designs()

As we increase the bandwidth, the lm bias gets worse, but slowly, while the error falls.
The best bandwidth is relatively wide.
This is more true for the optimal estimator.

1.6.12 Geographic RDs

Are popular in political science:

Put a lot of pressure on assumption of no alternative treatment—including “random” country level shocks!
Put a lot of pressure on no sorting assumptions (why was the border put where it was; why did units settle here or there?)
Put a lot of pressure on SUTVA: people on one side are literally proximate to people on another

See Keele and Titiunik (2015)

Abadie, Alberto, and Javier Gardeazabal. 2003. “The Economic Costs of Conflict: A Case Study of the Basque Country.” American Economic Review 93 (1): 113–32.

Calonico, Sebastian, Matias D Cattaneo, and Rocio Titiunik. 2014. “Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs.” Econometrica 82 (6): 2295–2326.

Keele, Luke J, and Rocio Titiunik. 2015. “Geographic Boundaries as Regression Discontinuities.” Political Analysis 23 (1): 127–55.

Lee, David S, and Thomas Lemieux. 2010. “Regression Discontinuity Designs in Economics.” Journal of Economic Literature 48 (2): 281–355.

Li, Yimeng. 2019. “Relaxing the No Liars Assumption in List Experiment Analyses.” Political Analysis 27 (4): 540–55.

Causal Inference

1 Topics 2

1.1 Synthetic controls

1.1.1 Core idea

1.2 Optimization problem

1.2.1 Basque example

1.2.2 Basque example

1.2.3 Basque example

1.2.4 Implementation (with messages)

1.2.5 Results

1.2.6 Plotting the counterfactual

1.2.7 Plotting 2

1.2.8 Then what?

1.2.9 tidysynth implementation

1.2.10 tidysynth implementation

1.2.11 add predictor information

1.2.12 implement

1.2.13 plot trends

1.2.14 plot weights

1.2.15 plot placebos

1.2.16 Design declaration

1.2.17 Declare an estimator

1.2.18 Design

1.2.19 Inspection

1.2.20 Simulation

1.3 Noncompliance and the LATE estimand

1.3.1 Local Average Treatment Effects

1.3.2 Local Average Treatment Effects

1.3.3 Local Average Treatment Effects

1.3.4 Local Average Treatment Effects

1.3.5 The good and the bad of LATE

1.3.6 Pearl and Chickering again

1.3.7 Pearl and Chickering again

1.4 Ethics

1.4.1 Constraint: Is it ethical to manipulate subjects for research purposes?

1.4.2 Is it ethical to manipulate subjects for research purposes?

1.4.3 Is it ethical to manipulate subjects for research purposes?

1.4.4 Constraint: Is it ethical to manipulate subjects for research purposes?

1.4.5 Constraint: Is it ethical to manipulate subjects for research purposes?

1.4.6 APSA Guidelines

1.4.7 APSA Ethics: General [Abbr]

1.4.8 APSA Ethics: Power

1.4.9 APSA Ethics: Consent

1.4.10 APSA Ethics: Deception

1.4.11 APSA Ethics: Harm and Trauma

1.4.12 APSA Ethics: Confidentiality

1.4.13 APSA Ethics: Impact

1.4.14 APSA Ethics: Laws, Regulations, and Prospective Review

1.4.15 APSA Ethics: Shared Responsibility

1.5 Survey experiments

1.5.1 The list experiment: Motivation

1.5.2 The list experiment: Motivation

1.5.3 The list experiment: Motivation

1.5.4 The list experiment: Strategy

1.5.5 The list experiment: Simplified example

1.5.6 The list experiment: Design

1.5.7 Diagnosis

1.5.8 Tradeoffs: is the question really sensitive?

1.5.9 Diagnosis

1.5.10 Negatively correlated items

1.5.11 Negatively correlated items

1.5.12 Negatively correlated items

1.5.13 Negatively correlated items

1.5.14 Individual or group effects?

1.5.15 Hiders and liars

1.6 Regression discontintuity

1.6.1 Intuition

1.6.2 Intuition

1.6.3 Argument for identification

1.6.4 Argument for identification

1.6.5 Evidence

1.6.6 Evidence

1.6.7 Design

1.6.8 RDD Data plotted

1.6.9 RDD Data plotted

1.6.10 RDD diagnosis

1.6.11 Bandwidth tradeoff

1.6.12 Geographic RDs

1.2.9 `tidysynth` implementation

1.2.10 `tidysynth` implementation