# Prediction Risk for the Horseshoe Regression

##### Posted on Mar 24, 2022
Tags: Horseshoe, Ridge

Develop theoretical results on prediction risk in the high-dimensional linear regression model

$y = X\beta + \epsilon$

Consider the quadratic predictive risk

$R = E_{y^\star, y\mid X, \beta}(y^\star - X\hat\beta)^2$

focus on comparing estimators $\hat\beta$ in a non-asymptotic fixed $n$, fixed $p > n$ setting.

global shrinkage: shrinkage estimators with a single tuning parameter

• ridge regression
• principal components regression

purely global shrinkage regression methods suffer from two major difficulties

• the amount of relative shrinkage is monotone in the singular values of the design matrix
• the shrinkage is determined by a single tuning parameter

A finite sample unbiased estimate of $R$ is given by Stein’s unbiased risk estimate or SURE
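The note does not write SURE out, so here is a minimal sketch for the simplest case, a ridge fit. It assumes a linear fit $\hat y = Hy$, a known noise variance $\sigma^2$, and risk summed over the $n$ observations, in which case $\mathrm{RSS} + 2\sigma^2\,\mathrm{tr}(H)$ is unbiased for the predictive risk. The helper name `ridge_sure`, the penalty `k`, and the toy data are illustrative and not part of the original note.

```python
import numpy as np

def ridge_sure(X, y, k, sigma2):
    """RSS + 2 * sigma2 * df for a ridge fit with penalty k (hypothetical helper)."""
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    shrink = d**2 / (d**2 + k)          # per-component shrinkage factors
    y_hat = U @ (shrink * (U.T @ y))    # ridge fitted values H y
    df = shrink.sum()                   # trace of the hat matrix H
    rss = np.sum((y - y_hat) ** 2)
    return rss + 2.0 * sigma2 * df, df

# Toy data with p > n; sigma2 is assumed known here.
rng = np.random.default_rng(0)
n, p = 8, 20
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
sure, df = ridge_sure(X, y, k=2.0, sigma2=1.0)
```

The degrees of freedom term `df` is what penalizes weaker shrinkage: as `k` goes to zero with $p > n$, the fit interpolates and `df` approaches $n$.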

### contributions

analyze the finite sample predictive risk of global shrinkage regression methods, examine where these methods fall short, and demonstrate a remedy using local shrinkage parameters

• theoretical findings: an orthogonalized representation that allows shrinkage regression estimates to be viewed as posterior means under some suitable priors.
• provide explicit finite sample risk comparisons between the global ridge and global-local horseshoe regressions

## Shrinkage regression estimates as posterior means

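Let $X=UDW^T$ be the (thin) singular value decomposition of $X$, where for $p > n$ and full row rank $U$ and $D$ are $n\times n$ and $W$ is $p\times n$. Let $Z=UD$ be $n\times n$, and $\alpha = W^T\beta$ be $n\times 1$. Then the regression model becomes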

$y=Z\alpha + \epsilon$
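A quick numerical check of this reparameterization, as a sketch with NumPy. The toy sizes and random data are assumptions for illustration only; the point is that $X\beta = Z\alpha$ and that $Z^TZ = D^2$ is diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 12                      # toy sizes with p > n (illustrative)
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)

# Thin SVD: U is n x n, d holds the n singular values, Wt is W^T (n x p).
U, d, Wt = np.linalg.svd(X, full_matrices=False)

Z = U * d            # scales column j of U by d[j], i.e. U @ diag(d)
alpha = Wt @ beta    # orthogonalized coefficients, alpha = W^T beta

# The mean of y is unchanged, and the new design has orthogonal columns.
same_mean = np.allclose(X @ beta, Z @ alpha)
orthogonal = np.allclose(Z.T @ Z, np.diag(d**2))
```

Because $Z^TZ$ is diagonal, the $n$ components of $\alpha$ can be estimated and shrunk one at a time.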

The estimates of many shrinkage regression methods can be expressed as the posterior mean of the orthogonalized regression coefficients $\alpha$ under the following hierarchical model:

\begin{align*} (\hat\alpha_i\mid \alpha_i, \sigma^2) &\sim_{ind} N(\alpha_i, \sigma^2d_i^{-2})\\ (\alpha_i\mid \sigma^2,\tau^2, \lambda_i^2) &\sim_{ind} N(0, \sigma^2\tau^2\lambda_i^2) \end{align*}

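where $\sigma^2,\tau^2 > 0$, $d_i$ is the $i$-th diagonal entry of $D$, and $\hat\alpha = D^{-1}U^Ty$ is the least squares estimate of $\alpha$,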

• $\tau$: controls the amount of shrinkage
• fixed $\lambda_i^2$: depend on the method at hand

several examples:

• ridge: $\lambda_i^2=1$
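For fixed $\lambda_i^2$ the two normal layers are conjugate, so the posterior mean is $\hat\alpha_i$ times the factor $\tau^2\lambda_i^2d_i^2/(1+\tau^2\lambda_i^2d_i^2)$. This formula is a standard normal-normal result rather than something quoted from the note. The sketch below checks, on assumed toy data, that with $\lambda_i^2=1$ it reproduces the classical ridge fit with penalty $k = 1/\tau^2$ (the name `k` is illustrative).

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 15                      # toy sizes with p > n (illustrative)
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
tau2 = 0.5                        # global shrinkage parameter tau^2 (assumed value)

U, d, Wt = np.linalg.svd(X, full_matrices=False)
alpha_hat = (U.T @ y) / d         # least squares estimate of alpha

lam2 = np.ones(n)                 # ridge: lambda_i^2 = 1 for every component
shrink = tau2 * lam2 * d**2 / (1.0 + tau2 * lam2 * d**2)
alpha_post = shrink * alpha_hat   # posterior mean of alpha
fit_post = (U * d) @ alpha_post   # Z @ alpha_post

# Classical ridge in the original coordinates, with penalty k = 1 / tau^2.
k = 1.0 / tau2
beta_ridge = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
fit_ridge = X @ beta_ridge
```

Since `np.linalg.svd` returns singular values in decreasing order, `shrink` is decreasing too: components with small $d_i$ are always shrunk the most, which is the monotonicity limitation of global shrinkage mentioned earlier.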

## Stein’s unbiased risk estimate for the horseshoe regression

the global-local horseshoe shrinkage regression extends the global shrinkage regression models by putting a local (component-specific), heavy-tailed half-Cauchy prior on the $\lambda_i$ terms, which allows these terms to be learned from the data

\begin{align*} (\hat\alpha_i\mid \alpha_i, \sigma^2) &\sim_{ind} N(\alpha_i, \sigma^2d_i^{-2})\\ (\alpha_i\mid \sigma^2, \tau^2, \lambda_i^2) & \sim_{ind} N(0, \sigma^2\tau^2\lambda_i^2)\\ \lambda_i & \sim_{ind} C^+(0, 1) \end{align*}

where $C^+(0, 1)$ denotes a standard half-Cauchy random variable with density $p(\lambda_i) = (2/\pi)(1+\lambda_i^2)^{-1}$.
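To see what learning $\lambda_i$ from the data does, the posterior mean of $\alpha_i$ can be computed numerically for a single component. The sketch below is one possible way to do this, not the method used in the paper. It integrates $\lambda_i$ out on a grid using the substitution $\lambda_i = \tan\theta$, under which the half-Cauchy prior becomes uniform on $(0, \pi/2)$. The function name `horseshoe_shrink_factor` and the default values of $\tau$ and $\sigma$ are assumptions.

```python
import numpy as np

def horseshoe_shrink_factor(alpha_hat, d, tau=1.0, sigma=1.0, n_grid=20000):
    """Posterior mean of alpha_i divided by alpha_hat_i (hypothetical helper)."""
    # Midpoints of (0, pi/2); lambda = tan(theta) is half-Cauchy when theta is uniform.
    theta = (np.arange(n_grid) + 0.5) * (np.pi / 2) / n_grid
    lam2 = np.tan(theta) ** 2
    # Marginal variance of alpha_hat_i given lambda_i, with alpha_i integrated out.
    var = sigma**2 * (1.0 / d**2 + tau**2 * lam2)
    log_m = -0.5 * np.log(var) - 0.5 * alpha_hat**2 / var
    w = np.exp(log_m - log_m.max())          # unnormalized posterior weights on the grid
    # Shrinkage factor for a fixed lambda_i (same form as in the global model).
    kappa = tau**2 * lam2 * d**2 / (1.0 + tau**2 * lam2 * d**2)
    return float(np.sum(w * kappa) / np.sum(w))

small = horseshoe_shrink_factor(0.1, d=1.0)  # weak signal
large = horseshoe_shrink_factor(5.0, d=1.0)  # strong signal
```

Unlike the ridge factor, which depends only on $d_i$ and $\tau$, this factor also depends on $\hat\alpha_i$: for the same $d_i$, `small` is well below `large`, so weak components are shrunk heavily while strong ones are left nearly untouched.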

Published in categories Note