# Causal Inference on Distribution Functions


### Wasserstein Space

- $\cI$: an interval of $\IR$
- $V_1$, $V_2$: random variables taking values in $\cI$ with finite second moments
- $\lambda_1, \lambda_2$: the cumulative distribution functions of $V_1$ and $V_2$

the 2-Wasserstein distance between $\lambda_1$ and $\lambda_2$ is defined as

\[W_2(\lambda_1, \lambda_2) = \left(\inf_{\lambda_{12}\in \Lambda(\lambda_1,\lambda_2)}\int_{\cI\times \cI}(s-t)^2d\lambda_{12}(s, t)\right)^{1/2}\,,\]

where $\Lambda(\lambda_1,\lambda_2)$ is the set of joint distributions on $\cI\times\cI$ with marginals $\lambda_1$ and $\lambda_2$.

the Wasserstein distance corresponds to the minimum effort required to transport the mass of $\lambda_1$ so as to produce the mass distribution of $\lambda_2$
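On the real line the infimum has a closed form: the optimal coupling matches quantiles, so $W_2^2(\lambda_1,\lambda_2)=\int_0^1(\lambda_1^{-1}(u)-\lambda_2^{-1}(u))^2du$. The following is a minimal sketch of that formula for two empirical distributions, assuming equally weighted samples of the same size; the function name `wasserstein2_empirical` and the toy samples are illustrative and not taken from the paper.

```python
import math


def wasserstein2_empirical(xs, ys):
    """2-Wasserstein distance between two empirical distributions.

    Assumption of this sketch: both samples have the same number of
    equally weighted atoms, so quantile matching reduces to pairing
    order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    # Pair the k-th smallest atom of xs with the k-th smallest atom of ys.
    squared_cost = sum((x - y) ** 2 for x, y in zip(xs_sorted, ys_sorted))
    return math.sqrt(squared_cost / len(xs))


sample_a = [0.1, 0.4, 0.7, 1.0]
# A pure shift by 0.5 (listed in a different order) should cost 0.5,
# up to floating-point rounding.
sample_b = [x + 0.5 for x in reversed(sample_a)]
print(wasserstein2_empirical(sample_a, sample_b))
```

Sorting both samples is what implements the optimal transport plan here; in more than one dimension no such shortcut exists and the infimum has to be solved as an optimisation problem.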

## Causal inference on distribution functions

since both $Y_i(1)$ and $Y_i(0)$ take values in the Wasserstein space, we define their means using their Wasserstein barycentres:

\[\newcommand\oE{\mathrm{E}\!\!\circ}\]

\[\mu_a = \oE Y(a) = \argmin_{v\in\cW_2(\cI)}\bbE[W_2^2(Y(a), v)]\]

ideally, a causal effect definition in the Wasserstein space should satisfy the following desiderata:

- (a) when $\oE Y(1) = \oE Y(0)$, the causal effect equals zero
- (b) in the degenerate case where $Y_i(a) = \delta_{y_i(a)}$, it reduces to the classical scenario
- (c) the average causal effect is a contrast between the averages of potential outcomes in two hypothetical populations
- (d) the average causal effect equals the average of individual causal effects

a causal effect defined through these barycentres satisfies desiderata (a)-(c) but, in general, fails to satisfy desideratum (d)
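For distributions on the real line, the Wasserstein barycentre used above can be computed explicitly: its quantile function is the pointwise average of the individual quantile functions. Below is a minimal sketch of that fact, assuming each potential outcome is represented by an equally weighted sample of the same size; `barycentre_atoms` and the toy data are hypothetical names and values chosen for illustration.

```python
def barycentre_atoms(samples):
    """Atoms of the 2-Wasserstein barycentre of empirical distributions.

    Assumption of this sketch: every distribution is given as a sample
    with the same number of equally weighted atoms, so averaging
    quantile functions reduces to averaging order statistics.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes samples of equal size")
    ordered = [sorted(s) for s in samples]
    m = len(ordered)
    # k-th atom of the barycentre = mean of the k-th order statistics.
    return [sum(s[k] for s in ordered) / m for k in range(n)]


# Two units whose outcomes are distributions with three atoms each.
units = [[0.0, 1.0, 2.0], [12.0, 10.0, 11.0]]
print(barycentre_atoms(units))  # [5.0, 6.0, 7.0]

# Degenerate case of desideratum (b): point masses at 2 and 4.
print(barycentre_atoms([[2.0], [4.0]]))  # [3.0]
```

In the degenerate case each outcome is a single point mass, and the barycentre is the point mass at the ordinary mean, which is the sense in which the classical scenario is recovered.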

the paper introduces a novel definition of average causal effect, called the causal effect map

Let $\lambda$ be a continuous distribution function. The individual causal effect map of $A$ on $Y$ is defined as

\[\Delta_i^\lambda (\cdot) = Y_i(1)^{-1} \circ \lambda(\cdot) -Y_i(0)^{-1}\circ \lambda(\cdot)\]

where $\lambda$ is a reference distribution and $Y_i(a)^{-1}$ is the quantile function of $Y_i(a)$.
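To make the definition concrete, the sketch below evaluates $\Delta_i^\lambda$ for one unit whose two potential outcomes are empirical distributions. It reads $Y_i(a)^{-1}$ as the left-continuous quantile function and, purely as an assumption for illustration, takes the reference $\lambda$ to be the uniform distribution on $[0,1]$; the helper names `empirical_quantile`, `uniform_cdf` and `causal_effect_map` are hypothetical.

```python
import math


def empirical_quantile(sample, u):
    """Left-continuous inverse of the empirical CDF at level u in (0, 1]."""
    if not 0.0 < u <= 1.0:
        raise ValueError("quantile level must lie in (0, 1]")
    xs = sorted(sample)
    k = max(1, math.ceil(u * len(xs)))  # index of the order statistic
    return xs[k - 1]


def uniform_cdf(t):
    """Reference distribution lambda: uniform on [0, 1] (an assumption)."""
    return min(1.0, max(0.0, t))


def causal_effect_map(y1_sample, y0_sample, ref_cdf):
    """Return t -> Y(1)^{-1}(lambda(t)) - Y(0)^{-1}(lambda(t))."""
    def delta(t):
        u = ref_cdf(t)
        return empirical_quantile(y1_sample, u) - empirical_quantile(y0_sample, u)
    return delta


# Toy unit: treatment doubles every quantile of the control distribution.
y0 = [1.0, 2.0, 3.0, 4.0]
y1 = [2.0, 4.0, 6.0, 8.0]
delta = causal_effect_map(y1, y0, uniform_cdf)
print([delta(t) for t in (0.25, 0.5, 0.75, 1.0)])  # [1.0, 2.0, 3.0, 4.0]
```

The output is a function of $t$ rather than a single number: in this toy example the effect grows with the quantile level, which a scalar contrast would not show.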