Note: you cannot just ignore the blocks because assignment is no longer independent of potential outcomes: you might be sampling units with different potential outcomes with different probabilities.
However, the formula above works fine because assignment is random conditional on blocks.
```r
# with estimatr
estimatr::difference_in_means(Y ~ Z, blocks = X, data = df) |>
  tidy() |>
  kable(digits = 2)
```
| term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
|------|---------:|----------:|----------:|--------:|---------:|----------:|---:|---------|
| Z    | 0.72     | 0.11      | 6.66      | 0       | 0.51     | 0.94      | 496 | Y      |
1.1.8 ATE with IPW
This also corresponds to the difference in the weighted average of treatment outcomes (with weights given by the inverse of the probability that each unit is assigned to treatment) and control outcomes (with weights given by the inverse of the probability that each unit is assigned to control).
The average difference in means estimator is the same as what you would get if you weighted inversely by shares of units in different conditions inside blocks.
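The `ip` weights used in the next chunk are not constructed above; here is a minimal sketch of how such inverse propensity weights can be built from within-block assignment shares, assuming `df` contains the block variable `X`:

```r
# Sketch (hypothetical construction): ip = 1 / Pr(Z = z | X)
library(dplyr)

df <- df |>
  group_by(X) |>
  mutate(prob = mean(Z),                                # Pr(Z = 1 | X)
         ip = ifelse(Z == 1, 1/prob, 1/(1 - prob))) |>  # invert the probability
  ungroup()
```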
```r
# with estimatr
estimatr::difference_in_means(Y ~ Z, weights = ip, data = df) |>
  tidy() |>
  kable(digits = 2)
```
| term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
|------|---------:|----------:|----------:|--------:|---------:|----------:|---:|---------|
| Z    | 0.74     | 0.11      | 6.65      | 0       | 0.52     | 0.96      | 498 | Y      |
1.1.10 ATE with IPW
But inverse propensity weighting is a more general principle, which can be used even if you do not have blocks.
The intuition for it comes straight from sampling weights — you weight up in order to recover an unbiased estimate of the potential outcomes for all units, whether or not they are assigned to treatment.
With sampling weights, however, you can include units even if their weight is 1 (that is, units selected with certainty). Why can you not include such units when doing inverse propensity weighting?
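As a minimal sketch of the general principle, with hypothetical data and known unit-level assignment probabilities `pi` strictly inside \((0,1)\):

```r
# Sketch: IPW with heterogeneous, known assignment probabilities
set.seed(1)
n  <- 1000
pi <- runif(n, .2, .8)                 # Pr(Z = 1), strictly between 0 and 1
Y1 <- rnorm(n, mean = 1)               # potential outcome under treatment
Y0 <- rnorm(n, mean = 0)               # potential outcome under control
Z  <- rbinom(n, 1, pi)
Y  <- Z*Y1 + (1 - Z)*Y0

mean(Z*Y/pi) - mean((1 - Z)*Y/(1 - pi))  # IPW estimate of the ATE (about 1)
```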
1.1.11 Illustration: Estimating treatment effects with terrible treatment assignments: Fixer
Say you made a mess and used a randomization that was correlated with some variable, \(U\). For example:
The randomization is done in a way that introduces a correlation between Treatment Assignment and Potential Outcomes
Then possibly, even though there is no true causal effect, we naively estimate a large one — enormous bias
However since we know the assignment procedure we can fully correct for the bias
1.1.12 Illustration: Estimating treatment effects with terrible treatment assignments: Fixer
In the next example, we do this using “inverse propensity score weighting.”
This is exactly analogous to standard survey weighting — since we selected different units for treatment with different probabilities, we weight them differently to recover the average outcome among treated units (same for control).
1.1.13 Basic randomization: Fixer
Bad assignment: some randomization process you can’t understand (but can replicate) that results in unequal probabilities.
```r
draw_data(design_2) |>
  ggplot(aes(probs, weights, color = factor(Z))) +
  geom_point()
```
1.1.17 Basic randomization: Fixer
Improved results
```r
diagnosis <- diagnose_design(design_2)

diagnosis$simulations_df |>
  ggplot(aes(estimate)) +
  geom_histogram() +
  facet_grid(estimator ~ .) +
  geom_vline(xintercept = 0, color = "red")
```
1.1.18 IPW with one unit!
This example is surprising but it helps you see the logic of why inverse weighting gets unbiased estimates (and why that might not guarantee a reasonable answer)
Imagine there is one unit with potential outcomes \(Y(1) = 2, Y(0) = 1\). So the unit level treatment effect is 1.
You toss a coin.
If you assign to treatment you estimate: \(\hat\tau = \frac{2}{0.5} = 4\)
If you assign to control you estimate: \(\hat\tau = -\frac{1}{0.5} = -2\)
So your expected estimate is: \[0.5 \times 4 + 0.5 \times (-2) = 1\]
Great on average but always lousy
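A quick simulation of this example (a sketch):

```r
# The one-unit example by simulation: unbiased on average, never close
set.seed(1)
estimates <- replicate(10000, if (rbinom(1, 1, 0.5) == 1) 2/0.5 else -1/0.5)
mean(estimates)    # about 1, the true unit-level effect
table(estimates)   # but every single estimate is either 4 or -2
```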
1.1.19 Generalization: why IPW works
Say a given unit is assigned to treatment with probability \(\pi_i\)
We estimate the average \(Y(1)\) using
\[\hat{\overline{Y_1}} = \frac{1}{n}\left(\sum_i \frac{Z_iY_i(1)}{\pi_i}\right)\]

With independent assignment the expected value of \(\hat{\overline{Y_1}}\) is just:

\[\mathbb{E}\left[\hat{\overline{Y_1}}\right] = \frac{1}{n}\sum_i \frac{\mathbb{E}[Z_i]\,Y_i(1)}{\pi_i} = \frac{1}{n}\sum_i \frac{\pi_i Y_i(1)}{\pi_i} = \frac{1}{n}\sum_i Y_i(1) = \overline{Y_1}\]
Note we needed \(\pi_i >0\) and also \(\pi_i <1\) everywhere. Why?
We used independence here; sampling theory is used to show similar results for e.g. complete randomization
For blocked randomization this is easy to see
1.2 Design-based Estimation of Variance
Let’s talk about “inference”
1.2.1 Var(ATE)
Recall that the treatment effect is estimated by comparing a sample of outcomes under treatment to a sample of outcomes under control
Say that there is no “error”
Why would this procedure produce uncertainty?
1.2.2 Var(ATE)
Why would this procedure produce uncertainty?
The uncertainty comes from being uncertain about the average outcome under control from observations of the control units, and from being uncertain about the average outcome under treatment from observation of the treated units
In other words, it comes from the variance in the treatment outcomes and variance in the control outcomes (and not, for example, from variance in the treatment effect)
1.2.3 Var(ATE)
In classical statistics we characterize our uncertainty over an estimate using an estimate of variance of the sampling distribution of the estimator.
The key idea is that we want to be able to say how likely we would be to get such an estimate if the distribution of estimates associated with our design looked a given way.
More specifically we want to estimate “standard error” or the “standard deviation of the sampling distribution”
(See Wooldridge (2023), where the standard error is understood as the “estimate of the standard deviation of the sampling distribution”)
1.2.4 Variance and standard errors
Given:
\(\hat\tau\) is an estimate for \(\tau\)
\(\overline{x}\) is the average value of \(x\)
The variance of the estimator over \(n\) repeated ‘runs’ of a design is: \(Var(\hat{\tau}) = \frac{1}{n}\sum_i(\hat\tau_i - \overline{\hat\tau})^2\)
If we have a good measure for the shape of the sampling distribution we can start to make statements of the form:
What are the chances that an estimate would be this large or larger?
If the sampling distribution is roughly normal, as it may be with large samples, then we can use procedures such as: “there is a 5% probability that an estimate would be more than 1.96 standard errors away from the mean of the sampling distribution”
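As a sketch of what this sampling distribution is, with hypothetical fixed potential outcomes (nothing here is from the original):

```r
# Simulate the sampling distribution of difference in means under
# repeated complete random assignment, holding potential outcomes fixed
set.seed(1)
Y0 <- rnorm(100)
Y1 <- Y0 + 1                            # constant true effect of 1
taus <- replicate(5000, {
  Z <- sample(rep(0:1, each = 50))      # complete random assignment
  Y <- Z*Y1 + (1 - Z)*Y0
  mean(Y[Z == 1]) - mean(Y[Z == 0])
})
sd(taus)   # the standard deviation of the sampling distribution
```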
1.2.6 Var(ATE)
Key idea: You can estimate variance straight from the data, given knowledge of the assignment process and assuming well defined potential outcomes.
Recall in general \(Var(x) = \frac{1}{n}\sum_i(x_i - \overline{x})^2\). Here the \(x_i\)s are the treatment effect estimates we might get under different random assignments, \(n\) is the number of different possible assignments (assumed here all equally likely, but otherwise we can weight), and \(\overline{x}\) is the truth.
For intuition imagine we have just two units \(A\), \(B\), with potential outcomes \(A_1\), \(A_0\), \(B_1\), \(B_0\).
When there are two units with outcomes \(x_1, x_2\), the variance simplifies to: \[Var(x) = \left(\frac{x_1 - x_2}{2}\right)^2\]
In the two unit case the two possible treatment estimates are: \(\hat{\tau}_1=A_1 - B_0\) and \(\hat{\tau}_2=B_1 - A_0\), depending on what gets put into treatment. So the variance is:
\[Var(\hat{\tau}) = \left(\frac{\hat{\tau}_1 - \hat{\tau}_2}{2}\right)^2 = \left(\frac{(A_1 - B_0) - (B_1 - A_0)}{2}\right)^2 =\left(\frac{(A_1 - B_1) + (A_0 - B_0)}{2}\right)^2 \] which we can re-write as:
\[Var(\hat{\tau}) = \left(\frac{A_1 - B_1}{2}\right)^2 + \left(\frac{A_0 - B_0}{2}\right)^2 + 2\left(\frac{A_1 - B_1}{2}\right)\left(\frac{A_0-B_0}{2}\right)\] The first two terms correspond to the variance of \(Y(1)\) and the variance of \(Y(0)\). The last term is a bit pesky though: it corresponds to twice the covariance of \(Y(1)\) and \(Y(0)\).
In the two unit case estimation is quite challenging because we do not have an estimate for any of the three terms: we cannot estimate the variance in the treatment group or in the control group, because we have only one observation in each; and we cannot estimate the covariance, because we never observe both potential outcomes for any unit.
Things do look a bit better however with more units…
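For reference (the intervening derivation is not reproduced here), the Neyman estimator plugs in the sample variances of observed outcomes in each arm, with \(n_1\) treated and \(n_0\) control units, and drops the term that depends on the unobservable covariance of \(Y(1)\) and \(Y(0)\):

\[\widehat{Var}_{\text{Neyman}}(\hat\tau) = \frac{\widehat{Var}(Y_i \mid Z_i = 1)}{n_1} + \frac{\widehat{Var}(Y_i \mid Z_i = 0)}{n_0}\]

Dropping that term is what makes the estimator (weakly) conservative, as the next illustration shows.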
1.2.12 Illustration of Neyman Conservative Estimator
An illustration of how conservative the conservative estimator of variance really is (the numbers are correlations between \(Y(1)\) and \(Y(0)\)).
We confirm that:
the estimator is conservative
the estimator is more conservative for negative correlations between \(Y(0)\) and \(Y(1)\), e.g. if those cases that do particularly badly in control are the ones that do particularly well in treatment, and
with \(\tau\) and \(V(Y(0))\) fixed, high positive correlations are associated with the highest variance.
1.2.13 Illustration of Neyman Conservative Estimator
| \(\tau\) | \(\rho\) | \(\sigma^2_{Y(1)}\) | \(\Delta\) | \(\sigma^2_{\tau}\) | \(\widehat{\sigma}^2_{\tau}\) | \(\widehat{\sigma}^2_{\tau(\text{Neyman})}\) |
|---:|---:|---:|---:|---:|---:|---:|
| 1.00 | -1.00 | 1.00 | -0.04 | 0.00 | -0.00 | 0.04 |
| 1.00 | -0.67 | 1.00 | -0.03 | 0.01 | 0.01 | 0.04 |
| 1.00 | -0.33 | 1.00 | -0.03 | 0.01 | 0.01 | 0.04 |
| 1.00 | 0.00 | 1.00 | -0.02 | 0.02 | 0.02 | 0.04 |
| 1.00 | 0.33 | 1.00 | -0.01 | 0.03 | 0.03 | 0.04 |
| 1.00 | 0.67 | 1.00 | -0.01 | 0.03 | 0.03 | 0.04 |
| 1.00 | 1.00 | 1.00 | 0.00 | 0.04 | 0.04 | 0.04 |
Here \(\rho\) is the unobserved correlation between \(Y(1)\) and \(Y(0)\); and \(\Delta\) is the final term in the sample variance equation that we cannot estimate.
1.2.14 Illustration of Neyman Conservative Estimator
1.2.15 Tighter Bounds On Variance Estimate
The conservative variance comes from the fact that you do not know the covariance between \(Y(1)\) and \(Y(0)\).
Intuitively, if you know that the variance of \(Y(1)\) is 0, then the covariance also has to be zero.
This basic insight opens a way of calculating bounds on the variance of the sample average treatment effect.
1.2.16 Tighter Bounds On Variance Estimate
Example:
Take a million-observation dataset, with treatment randomly assigned
Assume \(Y(0)=0\) for everyone and \(Y(1)\) distributed normally with mean 0 and standard deviation of 1000.
Note here the covariance of \(Y(1)\) and \(Y(0)\) is 0.
Note the true variance of the estimated sample average treatment effect should be (approx) \(\frac{Var(Y(1))}{1000000} + \frac{Var(Y(0))}{1000000} = 1 + 0 = 1\), for an se of \(1\).
But using the Neyman estimator (or OLS!) we estimate (approx) \(\frac{Var(Y(1))}{1000000/2} + \frac{Var(Y(0))}{1000000/2} = 2\), for an se of \(\sqrt{2}\).
But we can recover the truth knowing the covariance between \(Y(1)\) and \(Y(0)\) is 0.
The sharp bounds on the variance are \([1,1]\), but the conservative estimate of the variance is \(2\) (an se of \(\sqrt{2}\)).
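A sketch of this example in code:

```r
# Sketch: Y(0) = 0 for all units, Y(1) ~ N(0, 1000^2)
set.seed(1)
N  <- 1000000
Y1 <- rnorm(N, 0, 1000)
Z  <- sample(rep(0:1, each = N/2))
Y  <- Z * Y1                          # Y(0) is 0 for everyone

# Neyman estimate of the variance: about 2 (se about sqrt(2)),
# though the true design-based variance is about 1
var(Y[Z == 1])/(N/2) + var(Y[Z == 0])/(N/2)
```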
1.2.19 Asymptotics
It is a remarkable thing that you can estimate the standard error straight from the data
However, once you want to use the standard error to do hypothesis testing you generally end up looking up distributions (\(t\)-distributions or normal distributions)
That’s a little disappointing and has been one of the criticisms made by Deaton and Cartwright (2018)
However you can do hypothesis testing even without an estimate of the standard error.
Up next
1.3 Randomization Inference
A procedure for using the randomization distribution to calculate \(p\) values
1.3.1 Calculate a \(p\) value in your head
Illustrating \(p\) values via “randomization inference”
Say you randomized assignment to treatment and your data looked like this.
| Unit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|--------------|---|---|---|---|---|---|---|---|---|----|
| Treatment    | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0  |
| Health score | 4 | 2 | 3 | 1 | 2 | 3 | 4 | 8 | 7 | 6  |
Then:
Does the treatment improve your health?
What’s the \(p\) value for the null that treatment had no effect on anybody?
1.3.2 Calculate a \(p\) value in your head
Illustrating \(p\) values via “randomization inference”
Say you randomized assignment to treatment and your data looked like this.
| Unit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|--------------|---|---|---|---|---|---|---|---|---|----|
| Treatment    | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0  |
| Health score | 4 | 2 | 3 | 1 | 2 | 3 | 4 | 8 | 7 | 6  |
Then:
Does the treatment improve your health?
What’s the \(p\) value for the null that treatment had no effect on anybody?
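For a check, here is a sketch of the enumeration behind both answers: with exactly one treated unit there are only ten equally likely assignments, and under the sharp null the difference in means is monotone in the treated unit’s score.

```r
# Exact p values by enumeration: the p value is the share of the ten
# possible assignments that put treatment on a unit with a score at
# least as high as the one observed
y <- c(4, 2, 3, 1, 2, 3, 4, 8, 7, 6)
mean(y >= y[8])   # first table: unit 8 treated, p = 0.1
mean(y >= y[9])   # second table: unit 9 treated, p = 0.2
```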
1.3.3 Randomization Inference: Some code
In principle it is very easy.
These few lines generate data, compute the difference-in-means estimate, and then an ri estimate of \(p\):
```r
# data
set.seed(1)
df <- fabricate(N = 1000, Z = rep(c(0, 1), N/2), Y = .1*Z + rnorm(N))

# test stat
test.stat <- function(df) with(df, mean(Y[Z == 1]) - mean(Y[Z == 0]))

# test stat distribution
ts <- replicate(4000, df |> mutate(Z = sample(Z)) |> test.stat())

# test
mean(ts >= test.stat(df))  # One sided p value
```
[1] 0.025
1.3.4 Randomization Inference: Some code
The \(p\) value is the mass to the right of the vertical line:
```r
hist(ts); abline(v = test.stat(df), col = "red")
```
1.3.5 Using ri2
You can do the same using Alex Coppock’s ri2 package
```r
library(ri2)

# Declare the assignment
assignment <- declare_ra(N = 1000, m = 500)

# Implement
ri2_out <- conduct_ri(
  formula = Y ~ Z,
  declaration = assignment,
  sharp_hypothesis = 0,
  data = df,
  p = "upper",
  sims = 4000)
```
1.3.6 Using ri2
| term | estimate | upper_p_value |
|------|---------:|--------------:|
| Z    | 0.1321367 | 0.02225 |
You’ll notice a slightly different answer. This is because, although the procedure is “exact”, it is subject to simulation error.
1.3.7 Randomization Inference
Randomization inference can get more complicated when you want to test a null other than the sharp null of no effect.
Say you wanted to test the null that the effect is 2 for all units. How do you do it?
Say you wanted to test the null that an interaction effect is zero. How do you do it?
In both cases: by filling in a potential outcomes schedule given the hypothesis in question and then generating a test statistic (see the table and the sketch that follow)
| Observed Y(0) | Observed Y(1) | Y(0) (null: effect 0) | Y(1) (null: effect 0) | Y(0) (null: effect 2) | Y(1) (null: effect 2) |
|---:|---:|---:|---:|---:|---:|
| 1 | NA | 1 | 1 | 1 | 3 |
| 2 | NA | 2 | 2 | 2 | 4 |
| NA | 4 | 4 | 4 | 2 | 4 |
| NA | 3 | 3 | 3 | 1 | 3 |
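A minimal sketch of the procedure for the null that the effect is 2 for all units, reusing `df` and `test.stat` from the earlier chunk:

```r
# Fill in the potential outcomes implied by a constant effect of 2,
# then rebuild the null distribution of the test statistic under
# re-randomization
tau0 <- 2
ts_null <- replicate(4000, {
  Y0 <- df$Y - df$Z * tau0               # implied Y(0) under this null
  Z_sim <- sample(df$Z)                  # a fresh random assignment
  Y_sim <- Y0 + Z_sim * tau0             # outcomes implied by that assignment
  mean(Y_sim[Z_sim == 1]) - mean(Y_sim[Z_sim == 0])
})
# two-sided p value: distance of the observed statistic from tau0
mean(abs(ts_null - tau0) >= abs(test.stat(df) - tau0))
```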
1.3.8 ri and CIs
It is possible to use this procedure to generate confidence intervals with a natural interpretation.
The key idea is that we can use the same procedure to assess the probability of the data given a sharp null of no effect, but also given a sharp null of any other *constant* effect.
We can then see what set of effects we reject and what set we accept
We are left with a set of values that we cannot reject at the 0.05 level.
Warning: calculating confidence intervals this way, as sketched below, can be computationally intensive
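A sketch of the inversion (a hypothetical helper, reusing `df` from the earlier chunk; with real data you would permute according to the actual declaration):

```r
# For each candidate constant effect tau0, run a randomization test of
# the sharp null "effect = tau0" and keep the values not rejected
p_value <- function(tau0, df, sims = 1000) {
  obs <- with(df, mean(Y[Z == 1]) - mean(Y[Z == 0]))
  ts  <- replicate(sims, {
    Y0 <- df$Y - df$Z * tau0     # implied Y(0) under the null
    Zs <- sample(df$Z)
    Ys <- Y0 + Zs * tau0
    mean(Ys[Zs == 1]) - mean(Ys[Zs == 0])
  })
  mean(abs(ts - tau0) >= abs(obs - tau0))   # two-sided p value
}

candidates <- seq(-0.1, 0.3, by = 0.02)
ps <- sapply(candidates, p_value, df = df)
range(candidates[ps >= 0.05])               # approximate 95% CI
```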
1.3.11 ri with DeclareDesign
DeclareDesign can do randomization inference natively
The trick is to ensure that, when calculating the \(p\) values, the only stochastic component is the assignment to treatment
1.3.12 ri with DeclareDesign (advanced)
Here we get minimum detectable effects by using a design with two stage simulations, so that we can estimate the sampling distribution of summaries of the sampling distribution generated from reassignments.
```r
test_stat <- function(data)
  with(data, data.frame(estimate = mean(Y[Z == 1]) - mean(Y[Z == 0])))

b <- 0

design <-
  declare_model(N = 100, Z = complete_ra(N), Y = b*Z + rnorm(N)) +
  declare_estimator(handler = label_estimator(test_stat), label = "actual") +
  declare_measurement(Z = sample(Z)) +  # this is the permutation step
  declare_estimator(handler = label_estimator(test_stat), label = "null")
```
1.3.13 ri with DeclareDesign (advanced)
Simulations data frame from the two step simulation. Note the computational intensity: the number of runs is the product of the sims vector. I speed things up by using a simple estimation function and parallelization.
If you want to figure out more precisely what value of `b` gives 80% or 90% power, you can narrow down the range of `b`.
1.3.16 ri interactions
Let’s now imagine a world with two treatments where we are interested in using ri to assess the interaction. (Code from Coppock, ri2.)
```r
set.seed(1)
N <- 100
declaration <- randomizr::declare_ra(N = N, m = 50)

data <- fabricate(
  N = N,
  Z = conduct_ra(declaration),
  X = rnorm(N),
  Y = .9 * X + .2 * Z + .1 * X * Z + rnorm(N))
```
1.3.17 ri interactions
The approach is to declare a null model that is nested in the full model. The \(F\) statistic from the model comparison is then taken as the test statistic, and its distribution is built up under re-randomizations.
```r
conduct_ri(
  model_1 = Y ~ Z + X,
  model_2 = Y ~ Z + X + Z * X,
  declaration = declaration,
  assignment = "Z",
  sharp_hypothesis = coef(lm(Y ~ Z, data = data))[2],
  data = data,
  sims = 1000) |>
  summary() |>
  kable()
```
| term | estimate | two_tailed_p_value |
|------|---------:|-------------------:|
| F-statistic | 1.954396 | 0.171 |
1.3.18 ri interactions with DeclareDesign
Let’s imagine a true model with interactions. We take an estimate. We then ask how likely that estimate is from a null model with constant effects
Note: this is quite a sharp hypothesis
```r
df <- fabricate(
  N = 1000,
  Z1 = rep(c(0, 1), N/2),
  Z2 = sample(Z1),
  Y = Z1 + Z2 - .15*Z1*Z2 + rnorm(N))

my_estimate <- (lm(Y ~ Z1*Z2, data = df) |> coef())[4]

null_model <- function(df) {
  M0 <- lm(Y ~ Z1 + Z2, data = df)
  d1 <- coef(M0)[2]
  d2 <- coef(M0)[3]
  df |> mutate(
    Y_Z1_0_Z2_0 = Y - Z1*d1 - Z2*d2,
    Y_Z1_1_Z2_0 = Y + (1 - Z1)*d1 - Z2*d2,
    Y_Z1_0_Z2_1 = Y - Z1*d1 + (1 - Z2)*d2,
    Y_Z1_1_Z2_1 = Y + (1 - Z1)*d1 + (1 - Z2)*d2)
}
```
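A sketch of how the filled-in schedule might then be used (this usage step is my own illustration, not shown in the original): re-randomize both treatments and recompute the interaction estimate under the constant-effects null.

```r
# Build the null distribution of the interaction coefficient
null_df <- null_model(df)
ts <- replicate(1000, {
  d <- null_df |> mutate(
    Z1 = sample(Z1), Z2 = sample(Z2),       # fresh assignments
    Y = Y_Z1_0_Z2_0*(1 - Z1)*(1 - Z2) + Y_Z1_1_Z2_0*Z1*(1 - Z2) +
        Y_Z1_0_Z2_1*(1 - Z1)*Z2 + Y_Z1_1_Z2_1*Z1*Z2)
  coef(lm(Y ~ Z1*Z2, data = d))[4]          # interaction estimate
})
mean(abs(ts) >= abs(my_estimate))           # two-sided p value
```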
1.3.19 ri interactions with DeclareDesign
Let’s imagine a true model with interactions. We take an estimate. We then ask how likely that estimate is from a null model with constant effects
In practice (unless you have a design declaration), it is a good idea to create a \(P\) matrix when you do your randomization.
This records the set of possible randomizations you might have had, or a sample of these.
So, again: assignments have to be replicable
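For example, a sketch using randomizr (the `obtain_permutation_matrix()` function draws or enumerates possible assignments from a declaration):

```r
# Store a sample of possible assignments alongside the actual one so
# the randomization distribution can be rebuilt at analysis time
library(randomizr)

declaration <- declare_ra(N = 1000, m = 500)
P <- obtain_permutation_matrix(declaration, maximum_permutations = 4000)
dim(P)   # N rows, one column per stored assignment
```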
1.3.22 ri Applications
Recall the silly randomization procedure from the earlier slide.
Say you forgot to take account of the wacky assignment in your estimates and you estimate 0.15.
Does the treatment improve your health?: \(p=?\)
1.3.23 ri Applications
Randomization procedures are sometimes funky in lab experiments
Using randomization inference would force a focus on the true assignment of individuals to treatments
Fake (but believable) example follows
1.3.24 ri Applications
| Session | Capacity | T1 | T2 | T3 |
|----------|---------:|---:|---:|---:|
| Thursday | 40 | 10 | 30 | 0 |
| Friday | 40 | 10 | 0 | 30 |
| Saturday | 10 | 10 | 0 | 0 |

Optimal assignment to treatment given constraints due to facilities
| Subject type | N | Available |
|--------------|--:|-----------|
| A | 30 | Thurs, Fri |
| B | 30 | Thurs, Sat |
| C | 30 | Fri, Sat |

Constraints due to subjects
1.3.25 ri Applications
If you think hard about assignment you might come up with an allocation like this.
Allocations

| Subject type | N | Available | Thurs | Fri | Sat |
|--------------|--:|-----------|------:|----:|----:|
| A | 30 | Thurs, Fri | 15 | 15 | NA |
| B | 30 | Thurs, Sat | 25 | NA | 5 |
| C | 30 | Fri, Sat | NA | 25 | 5 |

Assignment of people to days
1.3.26 ri Applications
That allocation balances as much as possible. Given the allocation you might randomly assign individuals to different days as well as randomly assigning them to treatments within days. If you then figure out assignment propensities, this is what you would get:
Assignment Probabilities

| Subject type | N | Available | T1 | T2 | T3 |
|--------------|--:|-----------|------:|------:|------:|
| A | 30 | Thurs, Fri | 0.250 | 0.375 | 0.375 |
| B | 30 | Thurs, Sat | 0.375 | 0.625 | 0.000 |
| C | 30 | Fri, Sat | 0.375 | NA | 0.625 |
1.3.27 ri Applications
Even under the assumption that the day of measurement does not matter, these assignment probabilities have big implications for analysis.
Assignment Probabilities

| Subject type | N | Available | T1 | T2 | T3 |
|--------------|--:|-----------|------:|------:|------:|
| A | 30 | Thurs, Fri | 0.250 | 0.375 | 0.375 |
| B | 30 | Thurs, Sat | 0.375 | 0.625 | 0.000 |
| C | 30 | Fri, Sat | 0.375 | NA | 0.625 |
Only the type \(A\) subjects could have received any of the three treatments.
There are no two treatments for which it is possible to compare outcomes for subpopulations \(B\) and \(C\)
A comparison of \(T1\) versus \(T2\) can only be made for population \(A \cup B\)
However subpopulation \(A\) is assigned to \(T1\) (versus \(T2\)) with probability \(2/5\), while subpopulation \(B\) is assigned with probability \(3/8\)
1.3.28 ri Applications
Implications for design: need to uncluster treatment delivery
Implications for analysis: need to take account of propensities
Idea: Wacky assignments happen but if you know the propensities you can do the analysis.
1.3.29 ri Applications: Indirect assignments
A particularly interesting application is where a random assignment combines with existing features to determine an assignment to an “indirect” treatment.
For instance: \(n\) of \(N\) are assigned to a treatment.
You are interested in whether “having a friend assigned to treatment” makes a difference to a subject. Or maybe “a friend of a friend”
That means the subject has a complex clustered assignment that depends on how many friends they have
A bit mind-boggling, but:
Rerun your assignment many times and each time figure out whether a subject is assigned to an indirect treatment or not
Calculate the implied quantity of interest for each assignment
Assess the place of the actual quantity in the sampling distribution (see the sketch below)
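A sketch with a hypothetical friendship network (the `friends` adjacency matrix and all parameters here are illustrative, not from the original):

```r
# Re-run the direct assignment many times; each run determines who is
# indirectly treated (has at least one treated friend)
set.seed(1)
N <- 100
friends <- matrix(rbinom(N^2, 1, 0.05), N, N)  # hypothetical network
diag(friends) <- 0

indirect <- replicate(10000, {
  Z <- sample(rep(0:1, each = N/2))   # replicate the actual procedure
  as.numeric(friends %*% Z > 0)       # 1 if any friend is treated
})
probs <- rowMeans(indirect)   # per-unit propensity of indirect treatment
summary(probs)                # varies with the number of friends
```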
1.4 Covariate Adjustment
1.4.1 Example
Consider for example this data.
You randomly pair offerers and receivers in a dictator game (in which offerers decide how much of $1 to give to receivers).
Your population comes from two groups (80% Baganda and 20% Banyankole) so in randomly assigning partners you are randomly determining whether a partner is a coethnic or not.
You find that in non-coethnic pairings 35% is offered, in coethnic pairings 48% is offered.
Should you believe it?
1.4.2 Covariate Adjustment
Population: randomly matched Baganda (80% of pop) and Banyankole (20% of pop)
You find: in non-coethnic pairings 35% is offered, in coethnic pairings 48% is offered.
But a closer look at the data reveals…
| Offers by | To: Baganda | To: Banyankole |
|------------|------------:|---------------:|
| Baganda | 64% | 16% |
| Banyankole | 16% | 4% |

Figure 1: Number of Games

| Offers by | To: Baganda | To: Banyankole |
|------------|------------:|---------------:|
| Baganda | 50 | 50 |
| Banyankole | 20 | 20 |

Figure 2: Average Offers
So that’s a problem
1.4.3 Covariate Adjustment
Control?
With such data you might be tempted to ‘control’ for the covariate (here: ethnic group), using regression.
But, perhaps surprisingly, it turns out that regression with covariates does not estimate average treatment effects.
It does estimate an average of treatment effects, but one with weights chosen to minimize variance, which is not necessarily an estimator of your estimand.
Instead you can use the formula above for \(\hat{\tau}_{ATE}\) (the blocked difference in means) to estimate the ATE
alternatively…
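As a sketch, here is a hypothetical reconstruction of the 100 games that reproduces the misleading raw comparison and then applies the blocked (group-weighted) estimate:

```r
library(dplyr)

# 100 games with the shares and average offers shown in the tables above
games <- data.frame(
  offerer  = rep(c("Baganda", "Banyankole"), times = c(80, 20)),
  coethnic = rep(c(1, 0, 0, 1), times = c(64, 16, 16, 4)),
  offer    = rep(c(50, 20), times = c(80, 20)))

# Naive comparison reproduces the misleading 48 vs 35
games |> group_by(coethnic) |> summarize(mean_offer = mean(offer))

# Blocked estimate: within-group effects, weighted by group shares
games |>
  group_by(offerer) |>
  summarize(effect = mean(offer[coethnic == 1]) - mean(offer[coethnic == 0]),
            share  = n()/nrow(games)) |>
  summarize(ATE = weighted.mean(effect, share))   # 0: no coethnicity effect
```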
1.4.5 Covariate adjustment via saturated regression
Alternatively you can use propensity weights.
Alternatively you can use a regression that includes both the treatment and the treatment interacted with the covariates.
In practice this is best done by demeaning the covariates; doing this lets you read off the average effect from the main term. Key resource: Lin (2012)
You should have noticed that the logic for controlling for a covariate here is equivalent to the logic we saw for heterogeneous assignment propensities. These are really the same thing.
1.4.6 Covariate adjustment via saturated regression
Returning to prior example:
```r
df <- fabricatr::fabricate(
  N = 500,
  X = rep(0:1, N/2),
  Z = rbinom(N, 1, .2 + .3*X),
  Y = rnorm(N) + Z*X)

lm_robust(Y ~ Z*X_c, data = df |> mutate(X_c = X - mean(X))) |>
  tidy() |>
  kable(digits = 2)
```
| term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
|------|---------:|----------:|----------:|--------:|---------:|----------:|---:|---------|
| (Intercept) | -0.02 | 0.06 | -0.42 | 0.68 | -0.14 | 0.09 | 496 | Y |
| Z | 0.59 | 0.10 | 6.05 | 0.00 | 0.40 | 0.78 | 496 | Y |
| X_c | 0.16 | 0.11 | 1.42 | 0.15 | -0.06 | 0.39 | 496 | Y |
| Z:X_c | 0.64 | 0.20 | 3.28 | 0.00 | 0.26 | 1.02 | 496 | Y |
1.4.7 Covariate adjustment via saturated regression
It’s all good. But you need to match the estimator to the inquiry: demean for average marginal effects; do not demean for conditional marginal effects.
1.4.12 Recap
If you have different groups with different assignment propensities you can do any or all of these:
Blocked differences in means
Inverse propensity weighting
Saturated regression (Lin)
More… (coming)
You cannot (reliably):
Ignore the groups
Include them in a regression (without interactions)
1.5 To control or not control
When does controlling for covariates improve things, and when does it make things worse?
1.5.1 Considerations
Even though randomization ensures no bias, you may sometimes want to “control” for covariates in order to improve efficiency (see the discussion of blocking above).
Or you may have to take account of the fact that the assignment to treatment is correlated with a covariate (as above).
In observational work you might also figure out that you have to control for a covariate to justify inferences (refer to our discussion of the backdoor criterion)
1.5.2 Observational work
Observational motivation: Controls can provide grounds for identification
But recall – they can also destroy identification
For a great walk through of what you can draw from graphical models for the decision to control, see Cinelli, Forney, and Pearl (2022).
Aside: these implications generally refer to using controls as covariates, e.g. by implementing blocked differences in means or similar. For a Bayesian model of the form used in CausalQueries, the information from “bad controls” is used wisely.
1.5.3 Experimental work
Conditional Bias and Precision Gains from Controls
Experimental motivation: Controls can reduce noise and improve precision. This is an argument for using variables that are correlated with the output (not with the treatment).
1.5.4 Precision Gains from Controls
However: Introducing controls can create complications
As argued by Freedman (summary from Lin (2012)), we can get: “worsened asymptotic precision, invalid measures of precision, and small-sample bias”\(^*\)
These adverse effects are essentially removed with an interacted model
See discussions in Imbens and Rubin (2015) (7.6, 7.7) and especially Theorem 7.2 for the asymptotic variance of the estimator
\(^*\) though note that the precision concern does not hold when treatment and control groups are equally sized
The design implements estimation controlling and not controlling for \(X\) and also keeps track of the results of a test for the relation between \(Z\) and \(X\).
1.5.12 Simulations
We run many simulations over a range of designs:
```r
simulations <-
  list(design |> redesign(a = 0, b = 0),
       design |> redesign(a = 1, b = 0),
       design |> redesign(a = 0, b = 1)) |>
  simulate_design(sims = 20000)
```
1.5.13 Standard errors
We see that the standard errors are larger when you control in cases in which the control is not predictive of the outcome but is correlated with the treatment. Otherwise they can be smaller.
Challenge: Use DeclareDesign to compare performance of drtmle and lm_lin
1.7 Principle: Keep the reporting close to the design
1.7.1 Design-based analysis
Report the analysis that is implied by the design.
| | T2 = N | T2 = Y | All | Diff |
|---|---|---|---|---|
| T1 = N | \(\overline{y}_{00}\) (sd) | \(\overline{y}_{01}\) (sd) | \(\overline{y}_{0x}\) (sd) | \(d_2 \mid T1=0\) (sd) |
| T1 = Y | \(\overline{y}_{10}\) (sd) | \(\overline{y}_{11}\) (sd) | \(\overline{y}_{1x}\) (sd) | \(d_2 \mid T1=1\) (sd) |
| All | \(\overline{y}_{x0}\) (sd) | \(\overline{y}_{x1}\) (sd) | \(\overline{y}\) (sd) | \(d_2\) (sd) |
| Diff | \(d_1 \mid T2=0\) (sd) | \(d_1 \mid T2=1\) (sd) | \(d_1\) (sd) | \(d_1 d_2\) (sd) |
This is instantly recognizable from the design and returns all the benefits of the factorial design including all main effects, conditional causal effects, interactions and summary outcomes. It is much clearer and more informative than a regression table.
Cinelli, Carlos, Andrew Forney, and Judea Pearl. 2022. “A Crash Course in Good and Bad Controls.” *Sociological Methods & Research*, 00491241221099552.
Deaton, Angus, and Nancy Cartwright. 2018. “Understanding and Misunderstanding Randomized Controlled Trials.” *Social Science & Medicine* 210: 2–21.
Freedman, David A. 2008. “On Regression Adjustments to Experimental Data.” *Advances in Applied Mathematics* 40 (2): 180–93.
Imbens, Guido W, and Donald B Rubin. 2015. *Causal Inference in Statistics, Social, and Biomedical Sciences*. Cambridge University Press.
Lin, Winston. 2012. “Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman’s Critique.” *arXiv Preprint* arXiv:1208.2301.
Robins, James M, Andrea Rotnitzky, and Lue Ping Zhao. 1994. “Estimation of Regression Coefficients When Some Regressors Are Not Always Observed.” *Journal of the American Statistical Association* 89 (427): 846–66.
Samii, Cyrus, and Peter M Aronow. 2012. “On Equivalencies Between Design-Based and Regression-Based Variance Estimators for Randomized Experiments.” *Statistics & Probability Letters* 82 (2): 365–70.