SSDS 2023: Orientation

Macartan Humphreys

Today

  • General aims and structure
  • Expectations
  • Pointers for replication and re-analysis

General aims and structure

Aims

  • Develop skills in empirical methods
  • Exposure to open science practices
  • Strengthen professional networks
  • Engage deeply with fresh, cutting-edge work

Syllabus

is here

The speakers

All amazing and friendly types! Many were previously at CU. A number were students in this class not long ago!

  • 3 April, 2pm: Guy Grossman, Penn
  • 4 April, 12pm: Summer Lindsey, Rutgers
  • 5 April, 2pm: Kate Baldwin, Yale
  • 6 April, 2pm: Sarah Khan, Yale
  • 10 April, 2pm: Tara Slough, NYU
  • 11 April, 12pm: Laura Paler, American
  • 12 April, 2pm: Cyrus Samii, NYU
  • 13 April, 2pm: Salma Mousa, Yale
  • 17 April, 2pm: Ken Scheve, Yale
  • 18 April, 12pm: Gwyneth McClendon, NYU
  • 19 April, 2pm: Internal
  • 20 April, 2pm: Internal
  • 21 April, 2pm: Abhit Bhandari, NYU

Expectations

Expectations

  • All: Work in three “rep teams”
  • For registered students: On other days write short memo responses; 1 page max, providing key reflections on the pieces (not summaries, not opinions). These are due in the class drive by midnight the day before. If you are presenting on a given day, this is not required.
  • For registered students: Prepare a research design. Typically this contains: (i) a theoretical argument or motivation, (ii) a proposed empirical test of that argument, (iii) a formal design object, and (iv) a discussion of policy prescriptions that might result from the argument. We will discuss these in the final days of class. Focus on something that might turn into a research project.

Rep Teams

Teams

Rep team job

  • Engage with both methods and substance of paper
  • Access data (normally available 2 weeks beforehand, but indicate if not!)
  • Run basic replication of main results
  • Draft a pre-re-analysis plan
  • Implement new analyses
  • Share clean re-analysis files with class via drive and ultimately with the speaker (drive details to follow)
  • Prepare a presentation
  • Present to the speaker on the day
  • Join for lunch

Good behavior

  • Do not share data without agreement from speaker
  • Any new findings from the analyses do not belong to the class or to the students who engaged in the replication. You are working with the data for training purposes, not for research purposes.
  • Keep any public commentary anodyne. If you must tweet or post after sessions, make sure it gives speakers no cause for embarrassment.
  • All critiques should be as constructive as possible

How to: Form

Replication files

  • Best shared as self-contained documents for easy third-party viewing, e.g. .html rendered from .qmd or .Rmd (see the header sketch below)
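
For instance, a Quarto header along these lines yields one portable .html file (the title is a placeholder; in .Rmd, the html_document option self_contained plays the same role):

---
title: "Replication of [paper]"
format:
  html:
    embed-resources: true  # bundle all assets into a single .html
---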

Some examples:

Good coding rules

Good coding rules

  • Metadata first
  • Call packages at the beginning: use pacman
  • Put options at the top
  • Call all data files once, at the top. Best is to call from an archive, when possible.
  • Use functions and define them at the top: comment them; useful sometimes to illustrate what they do
  • Replicate first, re-analyze second. Use sections.
  • Name subsections after specific tables, figures, or analyses (see the sketch below)
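
A minimal sketch of a preamble following these rules (the package list, file names, and function are all hypothetical):

# Replication of Author (2023) -------------------------------------
# Rep team: names here; last updated: date here

# Packages: call them once, at the top
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, knitr, estimatr)

# Options
options(digits = 3)
run_slow_chunks <- FALSE  # toggle time-consuming analyses

# Data: read every file here, at the top
dat <- read.csv("input/main_data.csv")

# Functions: define and comment at the top; illustrate where useful
# ate_hat(): difference in means, e.g. ate_hat(dat) returns one number
ate_hat <- function(d) mean(d$Y[d$Z == 1]) - mean(d$Y[d$Z == 0])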

Aim

Nothing local, everything relative: so please do not include hardcoded paths to your computer

  • First best: if someone has access to your .Rmd/.qmd file they can hit render or compile and the whole thing reproduces first time.

  • But: often you need ancillary files for data and code. That’s OK, but the aim should still be that, given a self-contained folder, someone can open a master .Rmd file, hit compile, and get everything. I usually have an input and an output subfolder.
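
One way to keep everything relative is the here package, which resolves paths from the project root (a sketch; file names hypothetical):

library(here)

dat <- read.csv(here("input", "main_data.csv"))   # not "C:/Users/me/..."
write.csv(dat, here("output", "clean_data.csv"))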

Resources and ideas from the Institute for Replication: https://i4replication.org/reproducibility.html

Longer-term goal: a replication package to make it easier to access and share replications like these

How to: Contents

Contents

Sample TOC:

  • Motivation
  • Theory: A DAG
  • Design (e.g. ex ante perspective): A design diagnosis
  • Analysis:
    • replication
    • re-analysis: for robustness, for strengthening
  • Interpretation (explanation, implications, generalizability)

Don’t skip the big picture.

Contents

See: How to critique: https://macartan.github.io/teaching/how-to-critique

Biggest message: be probing but be sympathetic

How to: Theory

  • We strongly recommend that you first try to present all arguments as a DAG:
library(CausalQueries)

make_model(
  "Inequality -> Mobilization -> Democratization <- Threat <- Inequality;
   Pressure -> Democratization") |>
  plot()

How to: Theory

…and then use the DAG to describe new analyses (see the sketch after this list), e.g. questions regarding:

  • identification
  • mechanisms
  • plausible or implausible assumptions
  • possible confounds
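
For instance, a sketch with CausalQueries, where the query (the average effect of Inequality on Democratization) is just illustrative:

library(CausalQueries)

model <- make_model(
  "Inequality -> Mobilization -> Democratization <- Threat <- Inequality;
   Pressure -> Democratization")

# State a question directly on the DAG and evaluate it at the
# model's parameter values
query_model(
  model,
  "Democratization[Inequality=1] - Democratization[Inequality=0]",
  using = "parameters")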

Straight replication

  • In most cases code will be provided
  • We encourage you not to rely on it; try replicating from scratch

If you fail to replicate:

  • this might be your fault!
  • check your code against their code; compare Ns; break it down to estimate single quantities (sketch below)
  • if it’s a small discrepancy don’t worry
  • if it’s large, double check the text and your code and then check with authors (politely) to help make sense of things; always be generous
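
For example, a sketch of that kind of line-by-line check (the reported numbers and variable names here are made up):

library(estimatr)

mine <- lm_robust(Y ~ Z, data = dat)

theirs <- c(estimate = 0.38, n = 100)  # as reported in the paper (hypothetical)

mine$nobs == theirs["n"]                         # same N?
round(coef(mine)["Z"], 2) == theirs["estimate"]  # same point estimate?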

Re-analysis

Two distinct overarching goals:

  1. Check robustness
  2. Strengthen the work

Robustness: Things you might do:

  • Plot the data to understand where effects are coming from and to check for weirdness
  • Swap out datasets (sometimes possible with observational data)
  • Check robustness to specification / assumptions (sketch after this list)
  • Check missingness patterns
  • Assess plausible external validity (e.g. Naoki’s new work)
  • Check for multiple comparisons
  • Check against pre-analysis plans
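
For the specification item above, something along these lines (covariate and cluster names hypothetical):

library(estimatr)

m1 <- lm_robust(Y ~ Z, data = dat)                      # original specification
m2 <- lm_robust(Y ~ Z + age + income, data = dat)       # add controls
m3 <- lm_robust(Y ~ Z, clusters = village, data = dat)  # alternative SEs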

Strengthening: Things you might do

  • Principled exploration of heterogeneity (e.g. using causal forests; sketch below)
  • Principled incorporation of controls (e.g. Lasso approaches)
  • Search for additional implications of theory: if the theory is right what should we see? If the theory is right where should effects be strongest or weakest?
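
For instance, a causal forest sketch using grf (covariate names hypothetical):

library(grf)

X  <- as.matrix(dat[, c("age", "income")])
cf <- causal_forest(X = X, Y = dat$Y, W = dat$Z)

tau_hat <- predict(cf)$predictions   # unit-level effect estimates
average_treatment_effect(cf)         # doubly robust ATE with standard error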

Reanalysis: Avoiding fishing

  • There’s a risk of engaging in a type of fishing when you do replication or re-analysis.
  • Say you decide to control for something and a result disappears: is this a threat to the analysis?
  • Not necessarily
  • In our Research Design book we recommend using design declaration to justify re-analysis decisions

Reanalysis: Avoiding fishing

e.g. 

Home ground dominance. Holding the original M constant (i.e., the home ground of the original study), if you can show that a new answer strategy A’ yields better diagnosands than the original A, then A’ can be justified by home ground dominance.

Robustness to alternative models. A second justification for a change in answer strategy is that you can show that a new answer strategy is robust to both the original model M and a new, also plausible, M’.
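
As a sketch, using the simple two-arm design declared later in these slides: append an alternative answer strategy A’ (here, adjusting for the covariate U; purely illustrative) and compare diagnosands under the original M:

design_AB <- design +
  declare_estimator(Y ~ Z + U, inquiry = "ate",
                    .method = lm_robust, label = "A_prime")

# Diagnosis now reports diagnosands for both estimators; if A' beats A
# on the home ground M, the switch is justified
diagnose_design(design_AB, sims = 500)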

Fin

Q&A

DeclareDesign

book: https://book.declaredesign.org/

The MIDA Framework

Q: Is my research design good?

A: Well let’s simulate it to see how it performs.

Q: What should I put in the simulation?

A: All elements of a research design.

Q: What are the elements of a research design?

A: M! I! D! A!

Four elements of any research design

  • Model: set of models of what causes what and how
  • Inquiry: a question stated in terms of the model
  • Data strategy: the set of procedures we use to gather information from the world (sampling, assignment, measurement)
  • Answer strategy: how we summarize the data produced by the data strategy

Declaration

Telling the computer what M, I, D, and A are.

Diagnosis

Estimating “diagnosands” like power, bias, RMSE, error rates, ethical harm, amount learned.

  • We want to diagnose over model uncertainty

Redesign

Fine-tuning features of the data and answer strategies to understand how they change the diagnosands

  • Different sample sizes
  • Different randomization procedures
  • Different estimation strategies
  • Implementation: effort into compliance versus more effort into sample size

Very often you have to simulate!

  • This is too hard to work out from rules of thumb or power calculators
  • Specialized formulas exist for some diagnosands, but not all.

Key commands for making a design

  • declare_model()
  • declare_inquiry()
  • declare_assignment()
  • declare_measurement()
  • declare_estimator()

and there are more declare_ functions!

Key commands for using a design

  • draw_data(design)
  • draw_estimands(design)
  • draw_estimates(design)
  • get_estimates(design, data)
  • run_design(design), simulate_design(design)
  • diagnose_design(design)
  • redesign(design, N = 200)
  • design |> redesign(N = c(200, 400)) |> diagnose_designs()
  • compare_designs(), compare_diagnoses()

Cheat sheet

https://raw.githubusercontent.com/rstudio/cheatsheets/master/declaredesign.pdf

A simple design

library(DeclareDesign)

N <- 100
b <- .5

design <- 
  declare_model(N = N, U = rnorm(N), 
                potential_outcomes(Y ~ b * Z + U)) +                  # M
  declare_assignment(Z = simple_ra(N), Y = reveal_outcomes(Y ~ Z)) +  # D
  declare_inquiry(ate = mean(Y_Z_1 - Y_Z_0)) +                        # I
  declare_estimator(Y ~ Z, inquiry = "ate", .method = lm_robust)      # A

You now have a two-arm design object in memory!

If you just type design it will run the design: a good check to make sure the design has been declared properly.

Make data from the design

data <- draw_data(design)

data |> head() |> kable()
ID           U       Y_Z_0      Y_Z_1   Z          Y
001   0.0241755   0.0241755  0.5241755   0  0.0241755
002  -0.2978318  -0.2978318  0.2021682   0 -0.2978318
003  -0.2818823  -0.2818823  0.2181177   1  0.2181177
004   1.7400711   1.7400711  2.2400711   1  2.2400711
005   0.6893446   0.6893446  1.1893446   1  1.1893446
006  -0.3798324  -0.3798324  0.1201676   1  0.1201676

Draw estimands

draw_estimands(design) |>
  kable(digits = 2)
inquiry  estimand
ate           0.5

Draw estimates

draw_estimates(design) |> 
  kable(digits = 2) 
estimator  term  estimate  std.error  statistic  p.value  conf.low  conf.high  df  outcome  inquiry
estimator  Z         0.38       0.18       2.08     0.04      0.02       0.73  98        Y      ate

Get estimates

get_estimates(design, data) |>
  kable(digits = 2)
estimator  term  estimate  std.error  statistic  p.value  conf.low  conf.high  df  outcome  inquiry
estimator  Z          0.7        0.2       3.42        0      0.29        1.1  98        Y      ate

Simulate design

simulate_design(design, sims = 3) |>
  kable(digits = 2)
design  sim_ID  inquiry  estimand  estimator  term  estimate  std.error  statistic  p.value  conf.low  conf.high  df  outcome
design       1      ate       0.5  estimator     Z      0.25       0.22       1.12     0.27     -0.19       0.68  98        Y
design       2      ate       0.5  estimator     Z      0.61       0.18       3.33     0.00      0.25       0.98  98        Y
design       3      ate       0.5  estimator     Z      0.26       0.23       1.16     0.25     -0.19       0.71  98        Y

Diagnose design

design |> 
  diagnose_design(sims = 100) 
Mean Estimate    Bias  SD Estimate    RMSE   Power  Coverage
         0.47   -0.03         0.21    0.21    0.67      0.92
       (0.02)  (0.02)       (0.01)  (0.02)  (0.05)    (0.03)

Redesign

new_design <-
  design |> 
  redesign(b = 0)

  • Modify any arguments that are explicitly called on by design steps.
  • Or add, remove, or replace steps

Compare designs

redesign(design, N = 50) %>%
  compare_diagnoses(design)
diagnosand     mean_1  mean_2  mean_difference  conf.low  conf.high
mean_estimand    0.50    0.50             0.00      0.00       0.00
mean_estimate    0.51    0.49            -0.03     -0.06       0.01
bias             0.01   -0.01            -0.03     -0.06       0.01
sd_estimate      0.30    0.20            -0.10     -0.13      -0.08
rmse             0.30    0.20            -0.10     -0.13      -0.08
power            0.43    0.67             0.24      0.18       0.31
coverage         0.94    0.95             0.01     -0.01       0.04