Experimenting
Cox & Reid (2000) define experiments as:
investigations in which an intervention, in all its essential elements, is under the control of the investigator.
Two types of control:
Experimental studies use research designs in which the researcher uses:
Let’s discuss:
Then: a deep dive into actual experiments
Then: Plans for our own
– Model: a set of models of what causes what and how
– Inquiry: a question stated in terms of the model
– Data strategy: the set of procedures we use to gather information from the world (sampling, assignment, measurement)
– Answer strategy: how we summarize the data produced by the data strategy

Design declaration is telling the computer (and readers) what M, I, D, and A are.
Design diagnosis is figuring out how the design will perform under imagined conditions.
Estimating “diagnosands” like power, bias, RMSE, error rates, ethical harm, and “amount learned”.
Diagnosis takes account of model uncertainty: it aims to identify models for which the design works well and models for which it does not.
Redesign is the fine-tuning of the data and answer strategies to understand how changing them affects the diagnosands.
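To make the declare–diagnose–redesign loop concrete, here is a minimal sketch in Python. (The canonical tool is the DeclareDesign package in R; the effect size, sample sizes, and simulation counts below are illustrative assumptions, not values from the text.)

```python
# A minimal M-I-D-A declaration and diagnosis, sketched in Python.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def run_design(n=100, tau=0.3, sims=2000):
    """Simulate one design many times; return estimates and p-values."""
    estimates, pvals = [], []
    for _ in range(sims):
        # M: potential outcomes with a constant treatment effect tau
        y0 = rng.normal(0, 1, n)
        y1 = y0 + tau
        # D: complete random assignment of half the units
        z = rng.permutation(n) < n // 2
        y = np.where(z, y1, y0)
        # A: difference in means, with a t-test for the p-value
        estimates.append(y[z].mean() - y[~z].mean())
        pvals.append(stats.ttest_ind(y[z], y[~z]).pvalue)
    return np.array(estimates), np.array(pvals)

# Diagnosis: diagnosands for the inquiry I (the average treatment effect, tau)
est, p = run_design()
print("bias: ", est.mean() - 0.3)
print("RMSE: ", np.sqrt(((est - 0.3) ** 2).mean()))
print("power:", (p < 0.05).mean())

# Redesign: vary one feature (n) and watch the diagnosands move
for n in (50, 100, 200, 400):
    _, p = run_design(n=n)
    print(f"n = {n}: power = {(p < 0.05).mean():.2f}")
```

A diagnosand here is just a summary of the simulated sampling distribution, so adding a new one (e.g., an error rate) is one extra line; this is what makes diagnosis under many imagined models cheap.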
Good questions studied well
Randomization of
There is no foundationless answer to this question.
The Belmont principles are commonly used for guidance: (1) respect for persons, (2) beneficence, and (3) justice.
Unfortunately, operationalizing these requires further ethical theories. (1) is often operationalized by informed consent (a very liberal idea); (2) and (3) sometimes by more utilitarian principles.
The major focus on (1) by IRBs might follow from the view that if subjects consent, then they endorse the ethical calculations made for (2) and (3): they think that it is good and fair.
Trickiness: can a study be good or fair because of implications for non-subjects?
Many (many) field experiments have nothing like informed consent.
For example, whether the government builds a school in your village, whether an ad appears on your favorite radio show, and so on.
Consider three cases:
In all cases, there is no consent given by subjects.
In cases 2 and 3, the treatment is possibly harmful for subjects, and the results might also be harmful. But even in case 1, there could be major unintended harmful consequences.
In cases 1 and 3, however, the “intervention” is within the sphere of normal activities for the implementer.
Sometimes it is possible to use this point of difference to make a “spheres of ethics” argument for “embedded experimentation.”
Spheres of Ethics Argument: Experimental research that involves manipulations that are not normally appropriate for researchers may nevertheless be ethical if:
Otherwise, keep the focus on consent, and desist if consent is not possible.
Political science researchers should respect autonomy, consider the wellbeing of participants and other people affected by their research, and be open about the ethical issues they face.
Political science researchers have an individual responsibility to consider the ethics of their research related activities and cannot outsource ethical reflection to review boards, other institutional bodies, or regulatory agencies.
These principles describe the standards of conduct and reflexive openness that are expected of political science researchers. … [In cases of reasonable deviations], researchers should acknowledge and justify deviations in scholarly publications and presentations of their work.
[Note: no general injunction against]
Researchers should generally avoid harm when possible, minimize harm when avoidance is not possible, and not conduct research when harm is excessive.
do not limit concern to physical and psychological risks to the participant.
cases in which research that produces impacts on political processes without consent of individuals directly engaged by the research might be appropriate. [examples]
Studies of interventions by third parties do not usually invoke this principle on impact. [details]
This principle is not intended to discourage any form of political engagement by political scientists in their non-research activities or private lives.
researchers should report likely impacts
Mentors, advisors, dissertation committee members, and instructors
Graduate programs in political science should include ethics instruction in their formal and informal graduate curricula;
Editors and reviewers should encourage researchers to be open about the ethical decisions …
Journals, departments, and associations should incorporate ethical commitments into their mission, bylaws, instruction, practices, and procedures.
Experimental researchers are deeply engaged in the movement towards more transparent social science research.
Contentious issues (mostly):
Data. How soon should you make your data available? My view: as soon as possible, along with working papers and before publication. Before it affects policy in any case. Own the ideas, not the data.
Where should you make your data available? Dataverse is focal for political science. Not a personal website (mea culpa).
What data should you make available? Disagreement is over how raw your data should be. My view: as raw as you can, but at least post-cleaning and pre-manipulation.
Should you register? Hard to find reasons against, but the case is strongest in the testing phase rather than the exploratory phase.
Registration: When should you register? My view: before treatment assignment. (Not just before analysis, mea culpa.)
Registration: Should you deviate from a preanalysis plan if you change your mind about optimal estimation strategies? My view: yes, but make the case and describe both sets of results.
File drawer bias (Publication bias)
Analysis bias (Fishing)
– Say in truth \(X\) affects \(Y\) in 50% of cases.
– Researchers conduct multiple excellent studies. But they only write up the 50% that produce “positive” results.
– Even if each individual study is indisputably correct, the account in the research record (that \(X\) affects \(Y\) in 100% of cases) will be wrong.
Exacerbated by:
– Publication bias – the positive results get published
– Citation bias – the positive results get read and cited
– Chatter bias – the positive results get blogged, tweeted, and TEDed.
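The arithmetic above in simulation form, as a hedged sketch (the sample size, effect size, and publication filter are illustrative assumptions, not values from the text): when only significant positive results are written up, the published record says \(X\) affects \(Y\) every time, even though the truth is 50%.

```python
# File-drawer bias in miniature: X truly affects Y in half of all studies,
# but only studies with significant positive results get written up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, n_studies = 200, 1000

published, real_among_published = 0, []
for _ in range(n_studies):
    has_effect = rng.random() < 0.5      # truth: X matters in 50% of cases
    tau = 0.5 if has_effect else 0.0
    z = rng.permutation(n) < n // 2
    y = rng.normal(0, 1, n) + tau * z
    est = y[z].mean() - y[~z].mean()
    p = stats.ttest_ind(y[z], y[~z]).pvalue
    if p < 0.05 and est > 0:             # the publication filter
        published += 1
        real_among_published.append(has_effect)

print("share of studies run in which X matters: 0.50")
print("share of published studies reporting an effect: 1.00 (by construction)")
print("published:", published, "of", n_studies)
print("share of published effects that are real:",
      round(float(np.mean(real_among_published)), 2))
```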
– Say in truth \(X\) affects \(Y\) in 50% of cases.
– But say that in each case researchers enjoy discretion to select measures for \(X\) or \(Y\), or to select statistical models after seeing \(X\) and \(Y\).
– Then, with enough discretion, 100% of analyses may report positive effects, even if all studies get published.
– Try the Exact Fishy Test (https://macartan.shinyapps.io/fish/)
– What’s the problem with this test?
When your conclusions do not really depend on the data. E.g.:
– some evidence will always support your proposition
– some interpretation of evidence will always support your proposition
Knowing the mapping from data to inference in advance gives a handle on the false positive rate.
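A back-of-the-envelope illustration (assuming independent outcomes, an assumption made here for arithmetic only): testing a null effect on one prespecified outcome gives a 5% false positive rate; with discretion to pick the best of \(K = 10\) outcomes it rises to \(1 - 0.95^{10} \approx 0.40\).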
Source: Gerber and Malhotra
Implications are:
Summary: we do not know when we can or cannot trust claims made by researchers.
[Not a tradition-specific claim]
Simple idea:
Lots of misunderstandings around registration
Fishing can happen in very subtle ways, and may seem natural and justifiable.
Example:
Our journal review process is largely organized around advising researchers on how to adjust analyses in light of findings in the data.
Frequentists can do it.
Bayesians can do it too.
Qualitative researchers can also do it.
You can even do it with descriptive statistics.
The key distinction is between prospective and retrospective studies.
Not between experimental and observational studies.
A reason (from the medical literature) why registration is especially important for experiments: because you owe it to subjects
A reason why registration is less important for experiments: because it is more likely that the intended analysis is implied by the design in an experimental study. Researcher degrees of freedom may be greatest for observational qualitative analyses.
Registration will produce some burden but does not require the creation of content that is not needed anyway.
It does shift preparation of analyses forward.
And it can also increase the burden of developing analysis plans even for projects that don't work. But that is, in part, the point.
The upside is that the ultimate analyses may be much easier.
In neither case would the creation of a registration facility prevent exploration.
What it might do is make it less credible for someone to claim that they have tested a proposition when in fact the proposition was developed using the data used to test it.
Registration communicates whether researchers are engaged in exploration or not. We love exploration and should be proud of it.
Incentives and strategies
| Inquiry | In the preanalysis plan | In the paper | In the appendix |
|---|---|---|---|
| Gender effect | X | X | |
| Age effect | X | | |
| Inquiry | Following A from the PAP | Following A from the paper | Notes |
|---|---|---|---|
| Gender effect | estimate = 0.6, s.e. = 0.31 | estimate = 0.6, s.e. = 0.25 | Difference due to change in control variables [provide cross-references to tables and code] |