Causal Inference and Experimentation
Introduction to the class
Getting started
- General aims and structure
- Expectations
- Pointers for exercises
- Quick DeclareDesign intro
Aims and items
- Deep understanding of key ideas in causal inference
- Transportable tools for understanding how to evaluate and improve design
- Applied skills for design and analysis
- Exposure to open science practices
- Deeper dive into some specific topics (see survey)
The topics: Fundamentals
Day 1: Intro
Day 2: Causality
Estimation, Inference, and Design
Day 3: Estimation and Inference 1
Day 4: Estimation and Inference 2
Day 5: Design
Expectations
- 5 tasks
- (Required) Work in four “exercise teams”: 1 team (and typically 2 exercises) per session \(\times 4\)
- (Optional) Prepare a research design or short paper, perhaps building on existing work. Typically this contains:
  - a problem statement
  - a description of a method to address the problem
  - analytic or simulation-based results describing properties of the solution
  - a discussion of implications for practice
A passing paper will illustrate subtle features of a method; a good paper will identify unknown properties of a method; an excellent paper will develop a new method.
- Plus general reading and participation.
Exercise team job
Teams should prepare 15-20 minute presentations on set puzzles. Typically the task is to:
- Take a puzzle
- Declare and diagnose a design that shows the issue under study (e.g. some estimator produces unbiased estimates under some condition)
- Modify the design to show behavior when conditions are violated
- Share a report with the class, ideally as a self-contained document for easy third-party viewing, e.g. .html rendered via .qmd or .Rmd
Presentations should be about 10 minutes for a given puzzle.
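The declare, diagnose, modify workflow described above can be sketched as follows. This is a minimal illustration, not a set puzzle: the sample size, effect size, number of simulations, and the particular violation (assignment probabilities that depend on an unobserved variable U) are all assumptions chosen for the example.

```r
library(DeclareDesign)  # also attaches fabricatr, randomizr, and estimatr

set.seed(1)

# Shared steps. N, the 0.5 effect, and the variable names are illustrative.
model <- declare_model(
  N = 100,
  U = rnorm(N),
  potential_outcomes(Y ~ 0.5 * Z + U)
)
inquiry     <- declare_inquiry(ATE = mean(Y_Z_1 - Y_Z_0))
measurement <- declare_measurement(Y = reveal_outcomes(Y ~ Z))
estimator   <- declare_estimator(Y ~ Z, inquiry = "ATE")

# Design 1: complete random assignment, so comparing treated and
# control means should be unbiased for the ATE.
design_random <- model + inquiry +
  declare_assignment(Z = complete_ra(N)) +
  measurement + estimator

# Design 2: violate the condition. Units with high U are more likely
# to be treated, so treatment is confounded with the outcome.
design_confounded <- model + inquiry +
  declare_assignment(Z = simple_ra(N, prob_unit = pnorm(U))) +
  measurement + estimator

# Diagnose both designs by simulation and pull out the bias diagnosand.
diagnosis_random     <- diagnose_design(design_random, sims = 200)
diagnosis_confounded <- diagnose_design(design_confounded, sims = 200)

bias_random     <- diagnosis_random$diagnosands_df$bias
bias_confounded <- diagnosis_confounded$diagnosands_df$bias
```

Comparing bias_random with bias_confounded shows the contrast a presentation would be built around: the first should be close to zero, while the second should be clearly positive because U drives both assignment and outcomes.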
Good coding rules
- Metadata first
- Call packages at the beginning: use pacman
- Put options at the top
- Call all data files once, at the top. Best to call directly from a public archive, when possible.
- Use functions and define them at the top. Comment them; it is sometimes useful to illustrate what they do
- Replicate first, re-analyze second. Use sections.
- (For replications) Have subsections named after specific tables, figures or analyses
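One way the rules above might translate into the top of a script or the first chunk of a .Rmd/.qmd file is sketched here. The metadata fields, the package list, the option values, the helper function, and the section names are all hypothetical, and the built-in mtcars dataset stands in for data that would normally be read once from a public archive.

```r
# ---- Metadata ----
# Title:   Replication and re-analysis (hypothetical example)
# Author:  Your name
# Purpose: Illustrates a file layout only; not a real analysis

# ---- Packages ----
# pacman::p_load() loads packages and installs any that are missing
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman", repos = "https://cloud.r-project.org")
}
pacman::p_load(DeclareDesign)

# ---- Options ----
set.seed(20240101)   # arbitrary seed, for reproducibility
options(digits = 3)
sims <- 100          # number of simulations used throughout

# ---- Data ----
# Stand-in: a built-in dataset. In practice, read each data file once here.
dat <- datasets::mtcars

# ---- Functions ----
# Difference in mean outcome y between units with z == 1 and z == 0
diff_in_means <- function(y, z) {
  mean(y[z == 1]) - mean(y[z == 0])
}
# Illustration of what the function does
example_dim <- diff_in_means(y = c(1, 2, 3, 4), z = c(0, 0, 1, 1))

# ---- 1 Replication ----
# ---- 1.1 Table 1 ----
table_1 <- diff_in_means(dat$mpg, dat$am)

# ---- 2 Re-analysis ----
```

Keeping packages, options, data, and functions in fixed blocks at the top means a reader can see every dependency of the analysis before reaching any results.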
Aim
First best: if someone has access to your .Rmd/.qmd file, they can hit render or compile and the whole thing reproduces the first time. So: nothing local, everything relative. Please do not include hardcoded paths to your computer.
But: often you need ancillary files for data and code. That’s OK, but the aim should still be that, with a self-contained folder, someone can open a main.Rmd file, hit compile, and get everything. I usually have an input and an output subfolder.
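A small sketch of the "nothing local, everything relative" idea follows. It assumes the code runs from the project folder containing main.Rmd; the file names raw_data.csv and session_info.txt are hypothetical.

```r
# Folder names relative to the project root (where main.Rmd lives)
in_dir  <- "input"
out_dir <- "output"

# Create the output folder if it does not exist yet
dir.create(out_dir, showWarnings = FALSE)

# Build paths with file.path() rather than typing absolute paths
raw_file <- file.path(in_dir, "raw_data.csv")       # hypothetical file name
out_file <- file.path(out_dir, "session_info.txt")  # hypothetical file name

# Read raw data only if it is present; the raw file itself is never modified
if (file.exists(raw_file)) {
  raw <- read.csv(raw_file)
}

# Record the R version and loaded packages alongside the results
writeLines(capture.output(sessionInfo()), out_file)
```

Because no path begins with a drive letter or a home directory, the folder can be zipped, cloned, or synced to another machine and rendered without edits.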
Collaborative coding / writing
- Do not get in the business of passing attachments around
- Documents in some cloud: git, OSF, Dropbox, Drive, Nextcloud
- General rule: only post non-sensitive, non-proprietary material
- Share self-contained folders; folders contain a small set of live documents plus an archive. Old versions of documents go in the archive. Only one version of the most recent document is in the main folder.
- Data is in a self-contained folder (in) and is never edited directly
- Push updates to GitHub frequently