Here are some pointers on the things to look for when discussing or
reviewing a paper.
Discussanting
Generally discussants have 10-15 minutes to give comments on a
paper, sometimes less. In that time you can make three good comments.
You should not use this time to say everything you liked or did not like
about a paper, and you should not get lost in the weeds. If you describe
errors you have to get to the “so what”: the fact that there is an error
is not in itself of interest. You should select your comments so
that:
- they open up a conversation
- they speak to the major issues the paper addresses
- they provide pointers on how to do better.
The most useful critiques often come from taking a fresh
perspective on a piece of work. This requires stepping back and not
becoming beholden to the author’s spin on their findings. It is often
useful to ask: what is this a case of? What is the general class
of phenomena it speaks to? If you had lots of resources, how would you
address the question? If you could set it up as an experiment, how would
you do it? If you really had to take a policy action based on this work,
which elements would give you pause? But as you take different
perspectives you should still try to speak the same language; otherwise
you can end up talking to yourself and influencing no one.
Remember that as a discussant it is not about you: it is about making the
paper better and helping people understand its strengths and
limitations. Mostly it is about the speaker. If you think the paper is
great you do not have to drum up a critique, but you should still try to
help people see why it is great. Having slides helps organize your
presentation and helps people follow. A single slide with three bullets
on the three big points is enough. If you have a laundry list of smaller
points, share it with the speaker afterwards.
Reviewing
Your ostensible role as a reviewer is to advise an editor on whether
or not to publish. To be useful your conclusions need to be reasoned, and
this requires going into some depth. In practice many see reviewing as an
occasion to give and receive constructive feedback. That’s my take too:
if you have done the hard work of reading and assessing, providing useful
feedback is a relatively low-cost, high-benefit step.
Doing so can also make you think more deeply about the work and improve
your assessments.
For a formal review or referee report you have space to go into much
more depth. A standard approach is to divide these reviews into three
parts.
- The first part can be a single paragraph – it summarizes the key
contribution of the paper as you see it, gives an overall assessment,
and points to the key issues, concerns, or strengths. Don’t forget the
strengths. Try to articulate succinctly what you know now that you
didn’t know before you read the piece. Often a quick summary can draw
attention to strong features you were not conscious of, or make you
realize that what you were impressed by is not so impressive after all.
This is also where you can provide your overall assessment and advice to
the editor. You can be emphatic but still decent: if you
strongly support the paper, show your enthusiasm; if you are strongly
opposed, say that too while signaling your reasoning.
- The second part discusses 3-6 major features of the paper; the
checklist below lists features that could be useful to think through
when selecting themes. Try to organize by theme (measurement,
explanation, etc.).
- The third part is for “smaller issues,” where you can bullet-point
items ranging from ambiguities, to estimation issues, to pointers to other
work.
Bonus points:
- It’s useful to authors when you can point to literature they have
not read, if relevant.
- It’s useful to authors to know what to cut: reviewers tend to worry
about length but still ask for more.
- Tone: Your tone should be such that you would not
feel embarrassed if some day your review gets into the public domain by
mistake.
- You should feel free to ask for extra material such as replication
data or analysis plans. Sometimes reviewing can go quicker if you can
access data.
- Don’t ask the authors to ask and answer a different question;
respond to the paper you have been sent.
- Be generous: share references if they are missing, but don’t assume
that researchers intentionally ignored the work of others (or your
work!); raise ethical issues if you see them, but don’t assume
researchers acted without ethical concern; ask for multiple comparisons
corrections (see the sketch after this list), but don’t assume
deliberately misleading reporting.
- Pronouns: for anonymous review it’s usually safe to use the pronouns
“you” or “they,” even if single authorship has been indicated.
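To make the multiple comparisons request concrete, here is a minimal sketch of what such a correction can look like, using statsmodels in Python. The p-values are invented for illustration, and the choice of the Holm method is an assumption, not a recommendation for any particular paper.

```python
from statsmodels.stats.multitest import multipletests

# hypothetical p-values from a family of related tests reported in a paper
p_values = [0.012, 0.034, 0.041, 0.220, 0.003]

# Holm step-down correction controls the family-wise error rate;
# `reject` flags which tests survive at alpha = 0.05 after correction
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")

for p_raw, p_adj, keep in zip(p_values, p_adjusted, reject):
    print(f"raw p = {p_raw:.3f}, adjusted p = {p_adj:.3f}, survives correction: {keep}")
```

Asking for a correction like this is really asking how many of the reported results survive once the whole family of tests is taken into account.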
The Checklist
Here is my list of things to look out for when reading a paper:
Theory
- Is the theory internally consistent?
- Is it consistent with past literature and findings?
- Is it novel or surprising?
- Are elements that are excluded or simplified plausibly unimportant
for the outcomes?
- Is the theory general or specific? Are there more general theories
on which this theory could draw or contribute?
From Theory to Hypotheses
- Is the theory really needed to generate the hypotheses?
- Does the theory generate more hypotheses than considered?
- Are the hypotheses really implied by the theory? Or are there
ambiguities arising from, say, non-monotonicities or multiple
equilibria?
- Does the theory specify mechanisms?
- Does the theory suggest heterogeneous effects?
Hypotheses
- Are the hypotheses complex (e.g., are two or three hypotheses in fact
bundled together)?
- Are the hypotheses falsifiable?
Evidence I: Design
- External validity: is the population examined representative of the
larger population of interest?
- External validity: Are the conditions under which they are examined
consistent with the conditions of interest?
- Measure validity: Do the measures capture the objects specified by
the theory?
- Consistency: Is the empirical model used consistent with the
theory?
- Mechanisms: Are mechanisms tested? How are they identified?
- Replicability: Has the study been done in a way that it can be
replicated?
- Interpretation: Do the results admit rival interpretations?
Evidence II: Analysis and Testing
- Identification: are there concerns with reverse causality?
- Identification: are there concerns of omitted variable bias?
- Identification: does the model control for pre-treatment variables
only? Does it control or does it match?
- Identification: Are poorly identified claims flagged as such?
- Robustness: Are results robust to changes in the model, to
subsetting the data, to changing the period of measurement or of
analysis, to the addition or exclusion of plausible controls?
- Standard errors: does the calculation of test statistics make use of
the design? Do standard errors take account of plausible clustering
structures or differences in levels? (See the sketch after this list.)
- Presentation: Are the results presented in an intelligible way, e.g.,
using fitted values or graphs? How can this be improved?
- Interpretation: Can “no evidence of an effect” be interpreted as evidence
of only weak effects?
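As an illustration of the standard errors point, here is a minimal sketch of the kind of check a reviewer might have in mind, using statsmodels in Python on invented data in which treatment is assigned at the cluster (here, “village”) level. All variable names and numbers are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_clusters, m = 40, 25                                # 40 villages, 25 respondents each
village = np.repeat(np.arange(n_clusters), m)
d = np.repeat(rng.integers(0, 2, n_clusters), m)      # treatment assigned at the village level
u = np.repeat(rng.normal(0, 1, n_clusters), m)        # village-level shock
y = 0.3 * d + u + rng.normal(0, 1, n_clusters * m)
df = pd.DataFrame({"y": y, "d": d, "village": village})

naive = smf.ols("y ~ d", data=df).fit()               # ignores the clustered design
clustered = smf.ols("y ~ d", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["village"]}
)                                                     # clusters at the level of assignment

print("naive SE:    ", naive.bse["d"])
print("clustered SE:", clustered.bse["d"])
```

When assignment is clustered, the naive standard error on the treatment coefficient will typically be too small; the clustered version reflects the level at which treatment actually varied.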
Evidence III: Other sources of bias
- Fishing: were hypotheses generated prior to testing? Was any
training data separated from test data? (See the sketch after this list.)
- Measurement error: is error from sampling, case selection, or
missing data plausibly correlated with outcomes?
- Spillovers / Contamination: Is it plausible that outcomes in control
units were altered because of the treatment received by the
treated?
- Compliance: Did the treated really get treatment? Did the controls
really not?
- Hawthorne effects: Are subjects modifying behavior simply because
they know they are under study?
- Measurement: Is treatment the only systematic difference between
treatment and control or are there differences in how items were
measured?
- Implications of Bias: Are any sources of bias likely to work for or
against the hypothesis tested?
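On the fishing point, one concrete safeguard a reviewer can look for, or request, is a split between exploration and confirmation data. Here is a minimal sketch assuming a generic pandas data frame; the split fraction, seed, and variable names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# hypothetical dataset standing in for the study's data
rng = np.random.default_rng(1)
df = pd.DataFrame({"y": rng.normal(size=500), "x": rng.normal(size=500)})

# set aside a confirmation sample before any exploration takes place
explore = df.sample(frac=0.5, random_state=2024)
confirm = df.drop(explore.index)

# hypotheses and specifications are refined on `explore` only;
# the tests reported in the paper are then run once on `confirm`
print(len(explore), len(confirm))
```

A paper that documents this kind of separation (or, better, a pre-registered analysis plan) makes the fishing concern much easier to assess.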
Explanation
- Does the evidence support the particular causal account given?
- Are mechanisms examined? Can they be?
- Are there observable implications we might expect to see associated
with different possible mechanisms?
Policy Implications
- Do the policy implications really follow from the results?
- If implemented, would the policy changes have effects other than
those specified by the research?
- Have the policy claims been tested directly?
- Is the author overselling or underselling the findings?