Conditional Independence Test in Single-cell Multiomics
This note is for Boyeau, P., Bates, S., Ergen, C., Jordan, M. I., & Yosef, N. (2023). Calibrated Identification of Feature Dependencies in Single-cell Multiomics.
- $X\in \IR^{N\times G}$: a matrix of features
- $y\in \IR^N$: the response variable
- $s\in \IR^{N\times T}$: observed nuisance factors
goal: detect features in $X$ that are associated with the response variable $y$ while controlling for the nuisance factors $s$
this is a conditional independence test: for each gene $g$,
\[H_{0g}: x_g\perp y\mid x_{-g}, s\]
where $x_{-g}$ denotes all features except $x_g$.
the premise of the conditional randomization test (CRT) is that while it is difficult to directly assess how the distribution of the response variable $y$ depends on $x$, it is easier to describe how the features of $X$ depend on each other
two ingredients:
- a generative model for $X$ to capture the dependencies between features
- an importance score to evaluate their association with $y$
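To make the two ingredients concrete, here is a minimal sketch of a Monte Carlo CRT p-value. It is not the paper's implementation: the generative model is a toy one in which the features are independent standard normals (so sampling $x_g$ given $x_{-g}$ is trivial), the importance score is the absolute correlation with $y$, and the nuisance factors $s$ are left out. The names `crt_pvalue`, `sample_xg`, and `score` are hypothetical.

```python
import numpy as np

def crt_pvalue(X, y, g, sample_xg, score, n_draws=200, seed=0):
    """Monte Carlo CRT p-value for H_0g (nuisance factors s omitted in this toy)."""
    rng = np.random.default_rng(seed)
    t_obs = score(X, y, g)              # importance score on the observed data
    X_tilde = X.copy()
    t_null = np.empty(n_draws)
    for b in range(n_draws):
        # ingredient 1: resample feature g from the generative model of X
        X_tilde[:, g] = sample_xg(X, g, rng)
        # ingredient 2: recompute the importance score on the resampled data
        t_null[b] = score(X_tilde, y, g)
    # the +1 terms count the observed score among the null draws
    return (1 + np.sum(t_null >= t_obs)) / (1 + n_draws)

def sample_xg(X, g, rng):
    # ASSUMPTION (toy model): features are independent N(0, 1),
    # so x_g | x_{-g} is simply N(0, 1)
    return rng.normal(size=X.shape[0])

def score(X, y, g):
    # ASSUMPTION: absolute Pearson correlation as the importance score
    return abs(np.corrcoef(X[:, g], y)[0, 1])

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = 2.0 * X[:, 0] + rng.normal(size=500)   # only feature 0 drives y
p_values = [crt_pvalue(X, y, g, sample_xg, score) for g in range(4)]
```

In this toy setup only feature 0 should receive a small p-value. In a real analysis the sampler would come from a fitted generative model that captures dependencies between features, which is what the next section describes.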
Generative model
an unobserved low-dimensional variable $z$ is assumed to capture the state of each cell and provide a concise summary of the biological variation among cells.
we assume that cells are i.i.d. and that, for each cell, the model factorizes as
\[p(x, z\mid s) = p(z)\left(\prod_{g=1}^G p(x_g\mid z, s)\right)\]
where
- $p(z)$ is the latent variable prior
- $p(x_g\mid z, s)$ is the likelihood for gene $g$
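The factorization means the log joint density is the log prior plus a sum of per-gene log-likelihoods. The sketch below illustrates this for a single cell under assumptions that are not taken from the paper: a standard normal prior on $z$, a Poisson likelihood for each gene, and a log-linear decoder with hypothetical weights `W` and `B`.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(0)
G, D, T = 5, 2, 1                           # genes, latent dimension, nuisance factors
W = rng.normal(scale=0.3, size=(D, G))      # hypothetical decoder weights for z
B = rng.normal(scale=0.3, size=(T, G))      # hypothetical decoder weights for s

def rate(z, s):
    # ASSUMPTION: log-linear Poisson rate for each gene, shape (G,)
    return np.exp(z @ W + s @ B)

def log_prior(z):
    # ASSUMPTION: p(z) is a standard normal
    return -0.5 * np.sum(z ** 2) - 0.5 * z.size * np.log(2 * np.pi)

def log_lik_per_gene(x, z, s):
    # log p(x_g | z, s) for every gene g; genes are independent given (z, s)
    lam = rate(z, s)
    log_fact = np.array([lgamma(k + 1.0) for k in x])
    return x * np.log(lam) - lam - log_fact

def log_joint(x, z, s):
    # log p(x, z | s) = log p(z) + sum_g log p(x_g | z, s)
    return log_prior(z) + np.sum(log_lik_per_gene(x, z, s))

# simulate one cell from the model, then evaluate its log joint density
z = rng.normal(size=D)
s = np.array([1.0])
x = rng.poisson(rate(z, s))
lj = log_joint(x, z, s)
```

Because the genes are conditionally independent given $z$ and $s$, a single gene can be resampled from its own likelihood term without touching the others.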