# Chapter 5 Querying models

Models can be queried using the `query_distribution` and `query_model` functions. The difference between these functions is that `query_distribution` examines a single query and returns the full set of draws from the distribution of the estimand (prior or posterior), whereas `query_model` takes a collection of queries and returns a dataframe with summary statistics on the queries.

The simplest queries ask about causal estimands given particular parameter values and case level data. The first example in the next section gives one surprising result of this form.

## 5.1 Case level queries

The `query_model` function takes causal queries and conditions (`given`), together with a specification of the parameters to be used (`using`). The result is a dataframe which can be displayed as a table.

For a case level query we can make the query *given* a particular parameter vector, as below:

```
make_model("X-> M -> Y <- X") %>%
  # Rule out negative effects, so that the model is monotonic
  set_restrictions(c(decreasing("X", "M"),
                     decreasing("M", "Y"),
                     decreasing("X", "Y"))) %>%
  # One query, evaluated under three sets of case level data
  query_model(queries = "Y[X=1]> Y[X=0]",
              given = c("X==1 & Y==1",
                        "X==1 & Y==1 & M==1",
                        "X==1 & Y==1 & M==0"),
              using = c("parameters")) %>%
  kable(
    caption = paste("In a monotonic model with flat priors, knowledge",
                    "that $M=1$ *reduces* confidence that $X=1$ caused $Y=1$"))
```

| Query | Given | Using | mean |
|---|---|---|---|
| Q 1 | X==1 & Y==1 | parameters | 0.615 |
| Q 1 | X==1 & Y==1 & M==1 | parameters | 0.600 |
| Q 1 | X==1 & Y==1 & M==0 | parameters | 0.667 |

This example shows how inferences change given additional data on \(M\) in a monotonic \(X \rightarrow M \rightarrow Y \leftarrow X\) model. Surprisingly, observing \(M=1\) *reduces* the belief that \(X\) caused \(Y\) (from 0.615 to 0.600), whereas observing \(M=0\) raises it (to 0.667). The reason is that when \(M=1\), \(M\) rather than \(X\) may have been responsible for \(Y=1\).
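The three values in the table can be checked by brute force. The following is a minimal base R sketch, not the package's own computation. It assumes that the parameters put equal weight on every nodal type that survives the monotonicity restrictions (three types for \(M\), six for \(Y\)), enumerates all eighteen combinations, and reports the share of consistent combinations in which \(X\) has a positive effect on \(Y\). The names `m_types`, `y_types`, `outcome`, and `prob_caused` are illustrative only.

```r
# Assumption: equal weight on every permitted nodal type ("flat" parameters).

# M responds to X; written as c(M when X = 0, M when X = 1).
m_types <- list(never = c(0, 0), increasing = c(0, 1), always = c(1, 1))

# Y responds to (X, M); the six functions that decrease in neither argument.
y_types <- list(
  zero = function(x, m) 0,
  and  = function(x, m) x * m,
  x    = function(x, m) x,
  m    = function(x, m) m,
  or   = function(x, m) max(x, m),
  one  = function(x, m) 1
)

# One row per (M type, Y type) pair, all equally likely.
cases <- expand.grid(m_type = names(m_types), y_type = names(y_types),
                     stringsAsFactors = FALSE)

# Value of Y for row i when X is set to x, letting M respond to X.
outcome <- function(i, x) {
  m <- m_types[[cases$m_type[i]]][x + 1]
  y_types[[cases$y_type[i]]](x, m)
}

rows <- seq_len(nrow(cases))
cases$M  <- vapply(rows, function(i) m_types[[cases$m_type[i]]][2], numeric(1))
cases$Y1 <- vapply(rows, outcome, numeric(1), x = 1)
cases$Y0 <- vapply(rows, outcome, numeric(1), x = 0)

# Among rows consistent with X = 1, Y = 1 (and optionally a value of M),
# the share in which Y would have been 0 had X been 0.
prob_caused <- function(keep) {
  observed <- cases$Y1 == 1 & keep
  mean(cases$Y1[observed] > cases$Y0[observed])
}

result <- round(c(no_M = prob_caused(TRUE),
                  M_1  = prob_caused(cases$M == 1),
                  M_0  = prob_caused(cases$M == 0)), 3)
print(result)  # 0.615, 0.600, 0.667, as in the table
```

The counts behind these shares are 8 of 13, 6 of 10, and 2 of 3. When \(M=1\), half of the consistent cases have an \(M\) that is always 1, and for those cases only two of the five permitted \(Y\) types make \(X\) matter.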

## 5.2 Posterior queries

Queries can also draw directly from the posterior distribution provided by `stan`. In this next example we illustrate the joint distribution of the posterior over causal effects, drawing directly from the posterior dataframe generated by `update_model`:

```
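# Simulated data in which Y is identical to a randomly assigned X
data <- fabricate(N = 100, X = complete_ra(N), Y = X)

# Let the assignment of X depend on whether X has a positive effect on Y
model <- make_model("X->Y") %>%
  set_confound(list(X = "Y[X=1]>Y[X=0]")) %>%
  update_model(data, iter = 4000)

# Scatter plot of posterior draws, one point per draw
model$posterior_distribution %>%
  data.frame() %>%
  ggplot(aes(X_1.1 - X_1.0, Y.01 - Y.10)) +
  geom_point()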
```

We see that beliefs about the size of the overall effect are related to beliefs that \(X\) is assigned differently when there is a positive effect.

## 5.3 Query distribution

`query_distribution` works similarly, except that it takes a single query and returns the draws from the distribution of that estimand rather than a summary. For instance:

```
make_model("X -> Y") %>%
  query_distribution(increasing("X", "Y"), using = "priors") %>%
  hist(main = "Prior on Y increasing in X")
```

`## Prior distribution added to model`
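To give a sense of what such a vector of draws contains, here is a hedged base R sketch that does not call the package. It assumes that the prior over the four nodal types of \(Y\) is a flat Dirichlet; under that assumption the share of the "increasing" type has a Beta(1, 3) distribution with mean 0.25. The names `y_types`, `lambda`, and `draws` are illustrative.

```r
set.seed(1)
n_draws <- 10000

# Nodal types of Y in X -> Y, named by (Y when X = 0, Y when X = 1).
y_types <- c("00", "10", "01", "11")

# Assumed flat Dirichlet(1, 1, 1, 1) prior, drawn as normalised Gamma(1) variates.
g <- matrix(rgamma(n_draws * 4, shape = 1), ncol = 4,
            dimnames = list(NULL, y_types))
lambda <- g / rowSums(g)

# Y is increasing in X only for type "01", so the value of the query in
# each draw is the share of that type.
draws <- lambda[, "01"]

# Summaries of the draws; hist(draws) would plot them.
print(c(mean = mean(draws), sd = sd(draws)))
```

A summary function in the style of `query_model` would report only statistics such as the mean and standard deviation printed here, whereas the full vector `draws` is the kind of object that can be passed to `hist`.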

## 5.4 Token and general causation

Note that in all these cases we use the same technology to make case level and population inferences. Indeed, the case level query is just a conditional population query. As an illustration, imagine we have a model of the form \(X \rightarrow M \rightarrow Y\) and are interested in whether \(X\) caused \(Y\) in a case in which \(M=1\). We answer the question by asking “what would be the probability that \(X\) caused \(Y\) in a case in which \(X=M=Y=1\)?” (line 3 below). This speculative answer is the same as the answer we would get were we to ask the same question after updating our model with the knowledge that in a particular case, indeed, \(X=M=Y=1\). See below:

```
model <- make_model("X->M->Y") %>%
  set_restrictions(c(decreasing("X", "M"), decreasing("M", "Y"))) %>%
  update_model(data = data.frame(X = 1, M = 1, Y = 1), iter = 8000)
query_model(
  model,
  queries = "Y[X=1]> Y[X=0]",
  given = c("X==1 & Y==1", "X==1 & Y==1 & M==1"),
  using = c("priors", "posteriors"),
  expand_grid = TRUE)
```

| Query | Given | Using | mean | sd |
|---|---|---|---|---|
| Q 1 | X==1 & Y==1 | priors | 0.209 | 0.207 |
| Q 1 | X==1 & Y==1 | posteriors | 0.224 | 0.212 |
| Q 1 | X==1 & Y==1 & M==1 | priors | 0.248 | 0.221 |
| Q 1 | X==1 & Y==1 & M==1 | posteriors | 0.252 | 0.221 |

We see that, once we condition on \(X=M=Y=1\), the inference is approximately the same using the prior and the posterior distributions (0.248 and 0.252).
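The equivalence can also be illustrated without the package. The sketch below is a base R approximation under stated assumptions, not the `stan` computation used above: it assumes flat Dirichlet priors over the three permitted nodal types of \(M\) and of \(Y\), and it replaces the `stan` update with importance weighting of prior draws by the likelihood of a single \(X=M=Y=1\) case. The names `rdirichlet3`, `lam_M`, `lam_Y`, `p_m1y1`, and `q` are illustrative.

```r
set.seed(2)
n <- 200000

# Assumed flat Dirichlet draws over the three permitted types of a node:
# t00 = never 1, t01 = increasing in its parent, t11 = always 1.
rdirichlet3 <- function(n) {
  g <- matrix(rgamma(n * 3, shape = 1), ncol = 3,
              dimnames = list(NULL, c("t00", "t01", "t11")))
  g / rowSums(g)
}
lam_M <- rdirichlet3(n)
lam_Y <- rdirichlet3(n)

# Pr(M = 1, Y = 1 | X = 1) in each draw. The probability that X = 1 is
# omitted because it depends only on X's own parameters, which are
# independent of the query.
p_m1y1 <- (lam_M[, "t01"] + lam_M[, "t11"]) * (lam_Y[, "t01"] + lam_Y[, "t11"])

# Pr(X caused Y | X = M = Y = 1): M and Y must both be of the increasing type.
q <- lam_M[, "t01"] * lam_Y[, "t01"] / p_m1y1

# Speculative answer, averaging over prior draws.
prior_mean <- mean(q)

# Answer after learning that one case has X = M = Y = 1: reweight the
# prior draws by the likelihood of that case.
posterior_mean <- weighted.mean(q, w = p_m1y1)

print(round(c(prior = prior_mean, posterior = posterior_mean), 3))
```

Both means should be close to 0.25. Under these assumptions, observing the case changes beliefs about how common \(M=1\) and \(Y=1\) are, but not about the share of increasing types among the types consistent with \(M=1\) and \(Y=1\), which is all that the conditional query depends on.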

## 5.5 Complex queries

The Billy Suzy bottle breaking example illustrates complex queries. See Section 7.2.