WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Causal Inference on Distribution Functions

Tags: Causal Inference, Wasserstein Space

This post is for Lin, Z., Kong, D., & Wang, L. (2023). Causal inference on distribution functions. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(2), 378–398.

Wasserstein Space

  • Let $\cI$ be an interval of $\IR$,
  • $V_1$ and $V_2$ be random variables taking values in $\cI$ with finite second moments,
  • $\lambda_1, \lambda_2$ be the cumulative distribution functions of $V_1$ and $V_2$, respectively.

The 2-Wasserstein distance between $\lambda_1$ and $\lambda_2$ is defined as

\[W_2(\lambda_1, \lambda_2) = \left(\inf_{\lambda_{12}\in \Lambda(\lambda_1,\lambda_2)}\int_{\cI\times \cI}(s-t)^2d\lambda_{12}(s, t)\right)^{1/2}\,.\]

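Here $\Lambda(\lambda_1,\lambda_2)$ denotes the set of joint distributions on $\cI\times\cI$ with marginals $\lambda_1$ and $\lambda_2$. The Wasserstein distance corresponds to the minimum effort required to transport the mass of $\lambda_1$ so as to produce the mass distribution of $\lambda_2$.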
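For distributions on the real line, the infimum is attained by the monotone coupling, so $W_2$ is the $L^2$ distance between the two quantile functions. The sketch below uses that fact for two empirical samples of equal size, where the coupling simply pairs order statistics. The function name `w2_empirical` and the simulated samples are illustrative assumptions, not part of the paper.

```python
import numpy as np

def w2_empirical(x, y):
    """2-Wasserstein distance between two equal-size empirical samples on the line."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # On the real line the optimal coupling pairs the k-th smallest points.
    return float(np.sqrt(np.mean((x - y) ** 2)))

rng = np.random.default_rng(0)
v1 = rng.normal(size=500)
v2 = v1 + 2.0  # same sample shifted by 2 (hypothetical data)
print(w2_empirical(v1, v2))
```

A pure location shift by $c$ moves every point by $c$, so the printed distance should be close to $2$ here.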

Causal inference on distribution functions

Since both $Y_i(1)$ and $Y_i(0)$ take values in the Wasserstein space, we define their means using their Wasserstein barycentres:

\[\newcommand\oE{\mathrm{E}\!\!\circ}\] \[\mu_a = \oE Y(a) = \argmin_{v\in\cW_2(\cI)}\bbE[W_2^2(Y(a), v)]\,.\]
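On the real line the barycentre has a simple form: its quantile function is the pointwise average of the individual quantile functions. A minimal sketch of the empirical version follows, assuming each unit's distribution is observed through a sample. The helper `barycentre_quantiles`, the probability grid, and the simulated units are hypothetical choices made for illustration.

```python
import numpy as np

def barycentre_quantiles(samples, probs):
    """Quantiles of the empirical Wasserstein barycentre on a grid of probabilities."""
    # One row of quantiles per unit, then average across units at each probability.
    q = np.array([np.quantile(np.asarray(s, dtype=float), probs) for s in samples])
    return q.mean(axis=0)

probs = np.linspace(0.01, 0.99, 99)
rng = np.random.default_rng(1)
base = rng.normal(size=1000)
# Three hypothetical units whose distributions differ only by a location shift.
units = [base + shift for shift in (0.0, 1.0, 2.0)]
mu_q = barycentre_quantiles(units, probs)
```

With shifts $0$, $1$ and $2$, the barycentre is the base distribution shifted by $1$, which differs from the mixture of the three distributions.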

Ideally, a causal effect definition in the Wasserstein space should satisfy the following desiderata:

  • (a) when $\oE Y(1) = \oE Y(0)$, the causal effect equals zero
  • (b) in the degenerate case where $Y_i(a) = \delta_{y_i(a)}$, the definition reduces to the classical scenario
  • (c) the average causal effect is a contrast between the averages of potential outcomes in two hypothetical populations
  • (d) the average causal effect equals the average of individual causal effects

A causal effect defined in this way satisfies desiderata (a)-(c), but in general it fails to satisfy desideratum (d).

The paper introduces a novel definition of average causal effect, called the causal effect map.

Let $\lambda$ be a continuous distribution function. The individual causal effect map of $A$ on $Y$ is defined as

\[\Delta_i^\lambda (\cdot) = Y_i(1)^{-1} \circ \lambda(\cdot) -Y_i(0)^{-1}\circ \lambda(\cdot)\]

where $Y_i(a)^{-1}$ denotes the quantile function of $Y_i(a)$ and $\lambda$ serves as a reference distribution.
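The definition can be evaluated numerically as a difference of quantile functions at the level $\lambda(t)$. The sketch below assumes both potential outcomes of a unit are available as samples, which is possible only in simulation, and uses the logistic distribution function as an arbitrary continuous reference. The names `logistic_cdf` and `effect_map` are hypothetical and not taken from the paper.

```python
import numpy as np

def logistic_cdf(t):
    """A continuous reference distribution function (an arbitrary choice)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(t, dtype=float)))

def effect_map(y1_sample, y0_sample, t, ref_cdf=logistic_cdf):
    """Empirical individual causal effect map evaluated at the points t."""
    p = ref_cdf(t)  # probability levels lambda(t), strictly inside (0, 1)
    return np.quantile(y1_sample, p) - np.quantile(y0_sample, p)

rng = np.random.default_rng(2)
y0 = rng.normal(size=1000)   # simulated control distribution
y1 = 2.0 * y0 + 1.0          # simulated treated distribution: scale and shift
t = np.linspace(-3.0, 3.0, 7)
delta = effect_map(y1, y0, t)
```

If treatment only shifts the distribution by a constant $c$, the map is constant and equal to $c$, which matches the degenerate case in desideratum (b). In the scale-and-shift example above the effect varies with the quantile level.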

Published in categories Note