2024-11-12
📖 Key reading
Bueno de Mesquita & Fowler - Thinking Clearly with Data - Chapters 2 & 9
Definition
A is caused by B when A would not have happened if B had not happened.
⚠️ The problem
We cannot observe both potential outcomes for an individual.
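The individual treatment effect is \(\tau_i = Y_{i}^{1} - Y_{i}^{0}\), and the average treatment effect (ATE) is its mean across all units:
\[ ATE = \text{mean}(Y_{i}^{1}-Y_{i}^{0}) = \text{mean}(Y_{i}^{1}) - \text{mean}(Y_{i}^{0}) \]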
The ATT (average treatment effect on the treated) is the mean of \(\tau_i\) among the units that received treatment.
The ATU (average treatment effect on the untreated) is the mean of \(\tau_i\) among the units that did not receive treatment.
The ATT and ATU can differ from the ATE when selection into treatment is biased (i.e., non-random).
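In symbols, writing \(D_i = 1\) for treated units and \(D_i = 0\) for untreated units (the indicator used in the tables in this section), these definitions can be sketched as:

\[ ATT = \text{mean}(\tau_i \mid D_i = 1), \qquad ATU = \text{mean}(\tau_i \mid D_i = 0) \]

The ATE is then a weighted average of the two, where \(p\) (a symbol introduced here, not in the original slides) is the share of units that are treated:

\[ ATE = p \cdot ATT + (1 - p) \cdot ATU \]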
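We have some variables for ten units \(i\): a covariate \(X_i\), a treatment indicator \(D_i\) (1 = treated, 0 = untreated), the potential outcomes without and with treatment, \(Y_{i}^{0}\) and \(Y_{i}^{1}\), and the individual treatment effect \(\tau_i\):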
\(i\) | \(X_i\) | \(D_i\) | \(Y_{i}^{0}\) | \(Y_{i}^{1}\) | \(\tau_i\) |
---|---|---|---|---|---|
\(i_1\) | 5 | | 5 | 4 | |
\(i_2\) | 4 | | 3 | 3 | |
\(i_3\) | 3 | | 4 | 5 | |
\(i_4\) | 4 | | 4 | 3 | |
\(i_5\) | 5 | | 5 | 5 | |
\(i_6\) | 2 | | 3 | 3 | |
\(i_7\) | 3 | | 2 | 3 | |
\(i_8\) | 2 | | 1 | 2 | |
\(i_9\) | 1 | | 1 | 1 | |
\(i_{10}\) | 1 | | 1 | 2 | |
\(i\) | \(X_i\) | \(D_i\) | \(Y_{i}^{0}\) | \(Y_{i}^{1}\) | \(\tau_i\) |
---|---|---|---|---|---|
\(i_1\) | 5 | | 5 | 4 | -1 |
\(i_2\) | 4 | | 3 | 3 | 0 |
\(i_3\) | 3 | | 4 | 5 | 1 |
\(i_4\) | 4 | | 4 | 3 | -1 |
\(i_5\) | 5 | | 5 | 5 | 0 |
\(i_6\) | 2 | | 3 | 3 | 0 |
\(i_7\) | 3 | | 2 | 3 | 1 |
\(i_8\) | 2 | | 1 | 2 | 1 |
\(i_9\) | 1 | | 1 | 1 | 0 |
\(i_{10}\) | 1 | | 1 | 2 | 1 |
\(ATE = mean(\tau_i) = 0.2\)
\(mean(Y_{i}^{1}) = 3.1\)
\(mean(Y_{i}^{0}) = 2.9\)
→ \(ATE = 3.1-2.9 = 0.2\)
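The two routes to the ATE can be checked numerically. This is a minimal sketch in plain Python using the potential outcomes from the table; the variable names (`y0`, `y1`, `tau`) and the `mean` helper are illustrative choices, not part of the original slides.

```python
# Potential outcomes for units i_1 .. i_10, copied from the table.
y0 = [5, 3, 4, 4, 5, 3, 2, 1, 1, 1]  # Y_i^0: outcome without treatment
y1 = [4, 3, 5, 3, 5, 3, 3, 2, 1, 2]  # Y_i^1: outcome with treatment


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Individual treatment effects: tau_i = Y_i^1 - Y_i^0.
tau = [with_t - without_t for with_t, without_t in zip(y1, y0)]

# Route 1: average the individual effects.
ate_from_tau = mean(tau)

# Route 2: difference of the two potential-outcome means.
ate_from_means = mean(y1) - mean(y0)

print(tau)
print(round(ate_from_tau, 2), round(ate_from_means, 2))
```

Both routes give 0.2, but only because this constructed example lists both potential outcomes for every unit.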
\(i\) | \(X_i\) | \(D_i\) | \(Y_{i}^{0}\) | \(Y_{i}^{1}\) |
---|---|---|---|---|
\(i_1\) | 5 | 0 | 5 | ? |
\(i_2\) | 4 | 0 | 3 | ? |
\(i_3\) | 3 | 0 | 4 | ? |
\(i_4\) | 4 | 1 | ? | 3 |
\(i_5\) | 5 | 0 | 5 | ? |
\(i_6\) | 2 | 1 | ? | 3 |
\(i_7\) | 3 | 1 | ? | 3 |
\(i_8\) | 2 | 0 | 1 | ? |
\(i_9\) | 1 | 1 | ? | 1 |
\(i_{10}\) | 1 | 1 | ? | 2 |
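The question marks follow a simple rule: treatment reveals \(Y_{i}^{1}\) and hides \(Y_{i}^{0}\), and the reverse holds for untreated units. A short sketch of that rule, assuming the treatment assignment shown in the table (the names `d` and `y_obs` are illustrative):

```python
# Potential outcomes (known only because the example is constructed).
y0 = [5, 3, 4, 4, 5, 3, 2, 1, 1, 1]
y1 = [4, 3, 5, 3, 5, 3, 3, 2, 1, 2]

# Treatment indicator D_i from the table: 1 = treated, 0 = untreated.
d = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

# The observed outcome is Y_i^1 for treated units and Y_i^0 otherwise,
# i.e. Y_i = D_i * Y_i^1 + (1 - D_i) * Y_i^0.
y_obs = [di * with_t + (1 - di) * without_t
         for di, without_t, with_t in zip(d, y0, y1)]

print(y_obs)
```

The resulting list is the single \(Y_i\) column that an analyst would actually see.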
\(i\) | \(X_i\) | \(D_i\) | \(Y_i\) |
---|---|---|---|
\(i_1\) | 5 | 0 | 5 |
\(i_2\) | 4 | 0 | 3 |
\(i_3\) | 3 | 0 | 4 |
\(i_4\) | 4 | 1 | 3 |
\(i_5\) | 5 | 0 | 5 |
\(i_6\) | 2 | 1 | 3 |
\(i_7\) | 3 | 1 | 3 |
\(i_8\) | 2 | 0 | 1 |
\(i_9\) | 1 | 1 | 1 |
\(i_{10}\) | 1 | 1 | 2 |
\(mean(Y_i|D_i = 1)\) = 2.4
\(mean(Y_i|D_i = 0)\) = 3.6
Estimated ATE = \(2.4 - 3.6 = -1.2\)
What’s the problem?
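One way to see the problem is to split the naive difference in means into the ATT plus a selection-bias term, the gap in untreated outcomes \(Y_{i}^{0}\) between the treated and untreated groups. The sketch below does this for the assignment above. It relies on knowing both potential outcomes, which is possible only in a constructed example, and the names (`group`, `selection_bias`) are illustrative rather than taken from the slides.

```python
# Constructed example: both potential outcomes are listed for every unit.
y0 = [5, 3, 4, 4, 5, 3, 2, 1, 1, 1]
y1 = [4, 3, 5, 3, 5, 3, 3, 2, 1, 2]
d = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]  # treatment indicator D_i


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def group(values, flag):
    """Values for the units whose D_i equals flag."""
    return [v for v, di in zip(values, d) if di == flag]


tau = [with_t - without_t for with_t, without_t in zip(y1, y0)]

# What observed data delivers: mean(Y | D = 1) - mean(Y | D = 0).
naive = mean(group(y1, 1)) - mean(group(y0, 0))

# Quantities that need the unobservable potential outcomes.
att = mean(group(tau, 1))
atu = mean(group(tau, 0))
selection_bias = mean(group(y0, 1)) - mean(group(y0, 0))

print(round(naive, 2), round(att, 2), round(atu, 2), round(selection_bias, 2))
```

With these numbers the ATT and ATU both equal 0.2, so the whole gap between -1.2 and 0.2 comes from the selection-bias term of -1.4: the treated units would have had lower outcomes even without treatment.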
\(i\) | \(X_i\) | \(D_i\) | \(Y_i\) |
---|---|---|---|
\(i_1\) | 5 | 0 | 5 |
\(i_2\) | 4 | 1 | 3 |
\(i_3\) | 3 | 1 | 5 |
\(i_4\) | 4 | 0 | 4 |
\(i_5\) | 5 | 1 | 5 |
\(i_6\) | 2 | 0 | 3 |
\(i_7\) | 3 | 0 | 2 |
\(i_8\) | 2 | 0 | 1 |
\(i_9\) | 1 | 1 | 1 |
\(i_{10}\) | 1 | 1 | 2 |
\(mean(Y_i|D_i = 1)\) = 3.2
\(mean(Y_i|D_i = 0)\) = 3
Estimated ATE = \(3.2 - 3 = 0.2\)
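The slides do not say how this second assignment was generated. Assuming it is meant to illustrate an assignment unrelated to the potential outcomes, the sketch below enumerates every way of treating exactly five of the ten units and averages the resulting difference in means. The helper `diff_in_means` and the five-of-ten design are assumptions made for illustration.

```python
from itertools import combinations

# Potential outcomes (known only because the example is constructed).
y0 = [5, 3, 4, 4, 5, 3, 2, 1, 1, 1]
y1 = [4, 3, 5, 3, 5, 3, 3, 2, 1, 2]
n = len(y0)


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def diff_in_means(treated):
    """Difference in observed means when the units in `treated` get D_i = 1."""
    treated = set(treated)
    y_treated = [y1[i] for i in range(n) if i in treated]
    y_control = [y0[i] for i in range(n) if i not in treated]
    return mean(y_treated) - mean(y_control)


# The assignment in the table: i_2, i_3, i_5, i_9, i_10 (0-based indices).
table_estimate = diff_in_means([1, 2, 4, 8, 9])

# Every possible assignment that treats exactly 5 of the 10 units.
estimates = [diff_in_means(c) for c in combinations(range(n), 5)]
average_estimate = mean(estimates)

print(round(table_estimate, 2), len(estimates), round(average_estimate, 2))
```

Individual assignments can land above or below the true ATE, but averaged over all 252 equally likely assignments the estimate equals 0.2.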
ggplot2 is based on the Grammar of Graphics, hence “gg”.
GV249 AT5 | 📨 email l.m.metson@lse.ac.uk 🤔 Question? 🙋 raise your hand or 🖥️ use the Moodle Forum.