Replication reflections

macartan humphreys

Thoughts

The most important dimension of replication, for our fields, is forward looking: the many site field replication design to cumulate knowledge.
I am going to talk here though about more backwards looking replications as these are of particular interest to journals and for individuals.
Note also third category: more learning from reanalysis of existing studies

still very ad hoc approaches
many places
many file structures
not automatically verified
high access costs
shift to AI replication: risk that we outsource not just coding but understanding

with Vermon Washington and Cord Masche

See examples for:

re-analysis tabs:

The hard part:

I would like to live in a world in which:

Critic spots error
Critic informs author
Author takes responsibility: checks, issues a correction on journal website and updates repo files. [Even better – track change edits to article!] Does not minimize issues. Thanks Critic in footnote.
Critic happy with footnote. Speedy process.

People think this will not / does not work because:

My thoughts: 1. Really? 2. Really? 3. For sure!

For 3: we may need threat of third party correction?

Critics dramatize risks
Focus on errors not overall reliability of results; Spend a lot of time cataloging minor errors
Add a lot of additional material to make for a full article– e.g. alternative analyses, checking for heterogeneity

Risks:

Bigger question: What is the unit of research output in the age of AI

If about scope:

Disagreement here: Some favor “gardens of forking paths.”

I worry a lot about:

Scope for negative fishing (unpreregistered re-analyses with many degrees of freedom)
A bad analysis debunking a reasonable analysis because it finds different results

I would like to see:

Expectation that re-analyses are justified on ex ante grounds. e.g. in DeclareDesign framework.

Home ground dominance. Holding the original constant (i.e., the home ground of the original study), if you can show that a new answer strategy \(A'\) yields better diagnosands than the original…
Robustness to alternative models. You can show that a new answer strategy is robust to both the original model and a new, also plausible, \(M'\)
Model plausibility. If the diagnosands for a design with \(A'\) are worse than those under \(M\) but better under \(M'\), then justify by showing \(M'\) more plausible than \(M'\)

We are going to have to become more willing to accept errors
An article that replicates everything and catalogues a million errors, soon there will be
Very many errors in code and in interpretation found there will be
We’ll have to figure out how to sort through them, and it can’t be just machines talking to machines