Causal Inference
Random notes on causal inference.
Panel Data
Difference in Difference
We can impute the untreated potential outcome of the treatment group in the post-treatment period by taking the treatment group's pre-treatment mean (\(T_{post} = 0\)) and adding the change observed in the control group between \(T_{post} = 0\) and \(T_{post} = 1\):
\[\begin{split}E[Y_{(0)} | D = 1, T_{post} = 1] = & E[Y | D = 1, T_{post} = 0] + \\
& (E[Y | D = 0, T_{post} = 1] - E[Y | D = 0, T_{post} = 0])\end{split}\]
The imputation is intuitive: we take \(E[Y | D = 1, T_{post} = 0]\) as a baseline and treat the change in the control group as a trend shared by both the treatment group and the control group.
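This shared-trend reading is usually called the parallel trends assumption. As a sketch in the same notation, and assuming additionally that treatment has no effect before it starts (so \(Y = Y_{(0)}\) for the treatment group at \(T_{post} = 0\) and for the control group in both periods), it says that the untreated outcome would have changed by the same amount in both groups:
\[\begin{split}& E[Y_{(0)} | D = 1, T_{post} = 1] - E[Y_{(0)} | D = 1, T_{post} = 0] = \\
& E[Y_{(0)} | D = 0, T_{post} = 1] - E[Y_{(0)} | D = 0, T_{post} = 0]\end{split}\]
Replacing every observed \(Y_{(0)}\) with \(Y\) and moving \(E[Y | D = 1, T_{post} = 0]\) to the right-hand side recovers the imputation formula.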
The average treatment effect on the treated (\(ATT\)) is defined as:
\[\begin{split}ATT &= E[Y_{it,(1)} - Y_{it,(0)} | D = 1, T_{post} = 1] \\
&= E[Y_{it,(1)} | D = 1, T_{post} = 1] - E[Y_{it,(0)} | D = 1, T_{post} = 1] \\
&= E[Y | D = 1, T_{post} = 1] - E[Y_{it,(0)} | D = 1, T_{post} = 1]\end{split}\]
The last equality holds because \(Y_{(1)} = Y\) for the treatment group after treatment, so the first term is observed directly, while the second term is the counterfactual we are trying to impute. Substituting the imputed \(E[Y_{(0)} | D = 1, T_{post} = 1]\) into the ATT gives the difference-in-differences representation:
\[\begin{split}ATT &= (E[Y | D = 1, T_{post} = 1] - E[Y | D = 1, T_{post} = 0]) \\
&\quad - (E[Y | D = 0, T_{post} = 1] - E[Y | D = 0, T_{post} = 0])\end{split}\]
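To make the formula concrete, here is a minimal Python sketch that computes the four cell means, imputes the counterfactual, and forms the ATT. The function name `did_estimate`, the `(d, t_post, y)` row layout, and the toy numbers are all assumptions made for illustration; they do not come from any particular library or dataset.

```python
from statistics import mean


def did_estimate(rows):
    """Difference-in-differences from raw observations.

    rows: iterable of (d, t_post, y) tuples (hypothetical layout), where
    d is 1 for the treatment group and 0 for the control group, and
    t_post is 1 after treatment and 0 before.
    """
    # Collect outcomes into the four (group, period) cells.
    cells = {(d, t): [] for d in (0, 1) for t in (0, 1)}
    for d, t_post, y in rows:
        cells[(d, t_post)].append(y)

    # Sample analogue of E[Y | D = d, T_post = t] for each cell.
    m = {key: mean(values) for key, values in cells.items()}

    # Trend in the control group, assumed to be shared by both groups.
    control_trend = m[(0, 1)] - m[(0, 0)]

    # Imputed E[Y_(0) | D = 1, T_post = 1]: treated baseline plus the trend.
    imputed_y0 = m[(1, 0)] + control_trend

    # ATT: observed treated post-period mean minus the imputed counterfactual.
    att = m[(1, 1)] - imputed_y0
    return {"imputed_y0": imputed_y0, "att": att}


# Toy data, made up for illustration only.
toy_rows = [
    (1, 0, 10.0), (1, 0, 12.0),  # treatment group, before
    (1, 1, 18.0), (1, 1, 20.0),  # treatment group, after
    (0, 0, 8.0), (0, 0, 10.0),   # control group, before
    (0, 1, 11.0), (0, 1, 13.0),  # control group, after
]

result = did_estimate(toy_rows)
print(result)
```

On the toy data the cell means are 11 and 19 for the treatment group and 9 and 12 for the control group, so the imputed counterfactual is \(11 + (12 - 9) = 14\) and the ATT is \(19 - 14 = 5\), which matches \((19 - 11) - (12 - 9)\).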