SuSiE: Sum of Single Effects Model
Two key ideas:
- writing the sparse vector of regression coefficients as a sum of "single-effect" vectors, each with one non-zero element
- a new fitting procedure, iterative Bayesian stepwise selection (IBSS): a Bayesian analogue of stepwise selection methods; instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select
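A minimal numpy sketch of IBSS under simplifying assumptions: centered $X$ and $y$, a known residual variance `sigma2`, a $N(0, \sigma_0^2)$ prior on the single effect, and a fixed number of effects `L`. The function names and hyperparameters here are mine, not the paper's; this is an illustration of the idea, not the reference implementation.

```python
import numpy as np

def single_effect_regression(X, y, sigma2=1.0, sigma0_2=1.0):
    """Bayesian single-effect regression: assume exactly one variable has a
    non-zero effect, with prior b ~ N(0, sigma0_2). Returns the posterior
    distribution over which variable it is (alpha) and the posterior mean
    effect given inclusion (mu)."""
    xtx = np.sum(X**2, axis=0)
    bhat = (X.T @ y) / xtx                    # per-variable OLS estimates
    s2 = sigma2 / xtx                         # their sampling variances
    # log Bayes factor of each variable against the null
    log_bf = (0.5 * np.log(s2 / (s2 + sigma0_2))
              + 0.5 * bhat**2 / s2 * sigma0_2 / (s2 + sigma0_2))
    alpha = np.exp(log_bf - log_bf.max())
    alpha /= alpha.sum()                      # PIPs for this single effect
    mu = bhat * sigma0_2 / (sigma0_2 + s2)    # posterior mean given inclusion
    return alpha, mu

def ibss(X, y, L=5, n_iter=50, sigma2=1.0, sigma0_2=1.0):
    """Iterative Bayesian stepwise selection: repeatedly fit each of the L
    single effects to the residual left over by the other L-1 effects."""
    n, p = X.shape
    alpha = np.full((L, p), 1.0 / p)
    mu = np.zeros((L, p))
    for _ in range(n_iter):
        for l in range(L):
            # residual of y with effect l removed from the current fit
            r = y - X @ np.sum(alpha * mu, axis=0) + X @ (alpha[l] * mu[l])
            alpha[l], mu[l] = single_effect_regression(X, r, sigma2, sigma0_2)
    return alpha, mu
```

Each row of `alpha` is a distribution over variables for one effect, which is exactly the "distribution on variables" that replaces the hard selection step of classical stepwise regression.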
The multiple regression model is
\[y = Xb + e\]
We refer to variables $j$ with non-zero effects $b_j$ as "effect variables".
Assume variables 1 and 4 are the two effect variables, but each is completely correlated with another non-effect variable, say $x_1 = x_2$ and $x_3 = x_4$.
we may conclude that there are (at least) two effect variables, and that
\[(b_1\neq 0 \text{ or } b_2 \neq 0)\text{ and } (b_3\neq 0\text{ or } b_4\neq 0)\]
The goal is to provide methods that directly produce this kind of inferential statement.
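The identifiability problem can be seen directly in a toy example (hypothetical data, not from the paper): when two columns are identical, any split of the coefficient between them produces exactly the same fitted values, so no amount of data can say which one carries the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.standard_normal(100)
x2 = x1.copy()                        # a completely correlated pair
y = 2.0 * x1 + 0.1 * rng.standard_normal(100)

# Two different coefficient assignments yield identical fits,
# so the data cannot distinguish x1 from x2 as the effect variable.
fit_a = 2.0 * x1 + 0.0 * x2
fit_b = 0.0 * x1 + 2.0 * x2
```

This is why the honest statement is the disjunction "$b_1 \neq 0$ or $b_2 \neq 0$" rather than a claim about either variable alone.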
two approaches:
- select groups of variables
- Bayesian approach
The Bayesian approach reports the marginal posterior inclusion probability (PIP) of each variable:
\[PIP_j = \Pr(b_j\neq 0\mid X, y)\]
A level $\rho$ credible set is a subset of variables that has probability $\rho$ or greater of containing at least one effect variable. Equivalently, the probability that all variables in the credible set have zero regression coefficients is $1-\rho$ or less.
- Primary aim: report as many credible sets as the data support, each with as few variables as possible.
- Secondary aim: prioritize the variables within each credible set, assigning each a probability that reflects the strength of the evidence for that variable being an effect variable.
Posterior under single-effect regression model
- $\alpha$ is the vector of PIPs
- from $\alpha$, one can compute a level $\rho$ credible set, $CS(\alpha, \rho)$
The sum of single-effects regression model
Numerical Comparisons
The effect vector $b$ is specified by two parameters:
- $S$: the number of effect variables
- $\phi$: the proportion of variance in $y$ explained by $X$
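One way to generate such a $b$ (a sketch under my own assumptions: i.i.d. standard normal $X$, residual variance fixed at 1, effects rescaled so that $\mathrm{var}(Xb) / (\mathrm{var}(Xb) + 1) = \phi$; the specifics of the paper's simulations may differ):

```python
import numpy as np

def simulate(n=500, p=1000, S=2, phi=0.5, rng=None):
    """Draw (X, y) with S effect variables and proportion of variance
    explained equal to phi, using unit residual variance."""
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((n, p))
    b = np.zeros(p)
    idx = rng.choice(p, size=S, replace=False)  # effect variables
    b[idx] = rng.standard_normal(S)
    g = X @ b
    # rescale b so that var(Xb) / (var(Xb) + 1) = phi
    b *= np.sqrt(phi / ((1.0 - phi) * g.var()))
    y = X @ b + rng.standard_normal(n)
    return X, y, b
```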