# Causal Inference on Distribution Functions

### Wasserstein Space

Let

• $\cI$ be an interval of $\IR$,
• $V_1$ and $V_2$ be random variables taking values in $\cI$ with finite second moments,
• $\lambda_1, \lambda_2$ be the cumulative distribution functions of $V_1$ and $V_2$.

Then the 2-Wasserstein distance between $\lambda_1$ and $\lambda_2$ is defined as

$W_2(\lambda_1, \lambda_2) = \left(\inf_{\lambda_{12}\in \Lambda(\lambda_1,\lambda_2)}\int_{\cI\times \cI}(s-t)^2d\lambda_{12}(s, t)\right)^{1/2}\,.$

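Here $\Lambda(\lambda_1, \lambda_2)$ denotes the set of joint distributions on $\cI\times\cI$ with marginals $\lambda_1$ and $\lambda_2$. The Wasserstein distance corresponds to the minimum effort required to transport the mass of $\lambda_1$ so as to produce the mass distribution of $\lambda_2$.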
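On the real line the infimum is attained by the monotone coupling, which gives the quantile form $W_2^2(\lambda_1, \lambda_2) = \int_0^1 (\lambda_1^{-1}(p) - \lambda_2^{-1}(p))^2\,dp$. For two empirical distributions with the same number of equally weighted atoms, this amounts to pairing the sorted samples. The following is a minimal sketch under that equal-sample-size assumption; the function name `w2_empirical` is hypothetical and not taken from the paper.

```python
import math


def w2_empirical(xs, ys):
    """2-Wasserstein distance between two empirical distributions on the real line.

    Assumes both samples have the same size and equal weights, so the
    optimal transport plan simply pairs the order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    # Mean squared difference between matched order statistics.
    mean_sq = sum((s - t) ** 2 for s, t in zip(xs_sorted, ys_sorted)) / len(xs)
    return math.sqrt(mean_sq)


# Every atom of the first sample is moved by exactly 3 units.
shifted = w2_empirical([0.0, 1.0, 2.0], [3.0, 5.0, 4.0])
# The same atoms in a different order describe the same distribution.
same = w2_empirical([1.0, 2.0], [2.0, 1.0])
```

In this example `shifted` is 3 and `same` is 0, since the order in which the atoms are listed does not matter.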

## Causal inference on distribution functions

Since both $Y_i(1)$ and $Y_i(0)$ take values in the Wasserstein space, we define their means using their Wasserstein barycentres:

$\newcommand\oE{\mathrm{E}\!\!\circ}$ $\mu_a = \oE Y(a) = \argmin_{v\in\cW_2(\cI)}\bbE[W_2^2(Y(a), v)]$
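For distributions on the real line, the barycentre has a simple description: its quantile function is the average of the individual quantile functions. The sketch below illustrates this with an empirical average over a small, made-up sample of normal distributions, replacing the population expectation with a sample mean. The helper `barycentre_quantile` and the sample itself are assumptions for illustration only.

```python
from statistics import NormalDist, fmean


def barycentre_quantile(dists, p):
    """Quantile of the empirical Wasserstein barycentre at level p.

    In one dimension the barycentre's quantile function is the pointwise
    average of the quantile functions of the input distributions.
    """
    return fmean(d.inv_cdf(p) for d in dists)


# Hypothetical sample of distribution-valued outcomes.
sample = [NormalDist(0.0, 1.0), NormalDist(2.0, 3.0), NormalDist(4.0, 2.0)]

median = barycentre_quantile(sample, 0.5)
# Level corresponding to one standard deviation above the mean.
p_one_sd = NormalDist().cdf(1.0)
upper = barycentre_quantile(sample, p_one_sd)
```

For normal inputs the barycentre is again normal, with mean and standard deviation equal to the averages of the individual means and standard deviations (here both equal to 2), so `median` is about 2 and `upper` is about 4.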

Ideally, a definition of causal effect in the Wasserstein space should satisfy the following desiderata:

• (a) when $\oE Y(1) = \oE Y(0)$, the causal effect equals zero
• (b) in the degenerate case where each $Y_i(a) = \delta_{y_i(a)}$ is a point mass, the definition reduces to the classical scenario
• (c) the average causal effect is a contrast between the averages of potential outcomes in two hypothetical populations
• (d) the average causal effect equals the average of individual causal effects

A causal effect defined in this way satisfies desiderata (a)-(c); in general, however, it fails to satisfy desideratum (d).

The paper introduces a novel definition of the average causal effect, called the causal effect map.

Let $\lambda$ be a continuous distribution function. The individual causal effect map of $A$ on $Y$ is defined as

$\Delta_i^\lambda (\cdot) = Y_i(1)^{-1} \circ \lambda(\cdot) -Y_i(0)^{-1}\circ \lambda(\cdot)$

where $\lambda$ serves as a reference distribution and $Y_i(a)^{-1}$ denotes the quantile function of $Y_i(a)$.
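To make the definition concrete, the sketch below evaluates the individual causal effect map for one unit whose potential outcomes are normal distributions, with the standard normal as the reference $\lambda$. All distributions and parameter values here are made up for illustration, and `causal_effect_map` is a hypothetical helper name.

```python
from statistics import NormalDist


def causal_effect_map(y1, y0, reference):
    """Return the map x -> Y(1)^{-1}(lambda(x)) - Y(0)^{-1}(lambda(x))."""

    def delta(x):
        # Map x to a probability level through the reference distribution,
        # then compare the two potential-outcome quantiles at that level.
        p = reference.cdf(x)
        return y1.inv_cdf(p) - y0.inv_cdf(p)

    return delta


# Hypothetical potential outcomes for a single unit.
y_treated = NormalDist(mu=3.0, sigma=2.0)
y_control = NormalDist(mu=1.0, sigma=1.0)

delta = causal_effect_map(y_treated, y_control, NormalDist())
values = [delta(x) for x in (-1.0, 0.0, 1.0)]
```

With a standard normal reference, $Y_i(a)^{-1}\circ\lambda(x) = m_a + s_a x$ for a normal outcome with mean $m_a$ and standard deviation $s_a$, so the map in this example is approximately $x \mapsto 2 + x$: the intercept captures the shift in location and the slope captures the change in spread.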
