Generalizing Ridge Regression
This note covers Chapter 3 of van Wieringen, W. N. (2021), Lecture notes on ridge regression, arXiv:1509.09169 [stat].
Consider
\[(Y-X\beta)'W(Y-X\beta) + (\beta-\beta_0)'\Delta(\beta - \beta_0),\]
which comprises a weighted least squares criterion and a generalized ridge penalty.
- $W$: an $n\times n$-dimensional, diagonal matrix with $W_{ii}\in[0,1]$ representing the weight of the $i$-th observation.
- $\Delta$: a $p\times p$-dimensional, symmetric, positive definite matrix. It allows:
- different penalization per regression parameter
- joint (or correlated) shrinkage among the elements of $\beta$
- $\beta_0$: a user-specified, non-random target towards which $\beta$ is shrunken as the penalty parameter increases
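As a minimal sketch of evaluating this criterion in NumPy (the function name `gen_ridge_loss` and the argument layout are my own, not from the lecture notes):

```python
import numpy as np

def gen_ridge_loss(beta, X, Y, W, Delta, beta0):
    """Weighted least squares term plus generalized ridge penalty:
    (Y - X beta)' W (Y - X beta) + (beta - beta0)' Delta (beta - beta0)."""
    r = Y - X @ beta        # residuals
    d = beta - beta0        # deviation from the shrinkage target
    return r @ W @ r + d @ Delta @ d
```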
The solution is
\[\hat\beta(\Delta) = (X'WX + \Delta)^{-1}(X'WY + \Delta\beta_0).\]
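A minimal numerical sketch of this closed-form estimator (names and toy data are my own; it solves the linear system rather than forming the explicit inverse):

```python
import numpy as np

def gen_ridge(X, Y, W, Delta, beta0):
    """Generalized ridge estimate: (X'WX + Delta)^{-1} (X'WY + Delta beta0)."""
    XtW = X.T @ W                      # X'W
    A = XtW @ X + Delta                # X'WX + Delta
    b = XtW @ Y + Delta @ beta0        # X'WY + Delta beta0
    return np.linalg.solve(A, b)

# Toy illustration (made-up data).
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5, 0.0])
Y = X @ beta_true + rng.normal(scale=0.3, size=n)

W = np.eye(n)            # equal observation weights
Delta = 2.0 * np.eye(p)  # ordinary ridge: Delta = lambda * I
beta0 = np.zeros(p)      # shrink towards zero

print(gen_ridge(X, Y, W, Delta, beta0))
```

With $W = I$, $\Delta = \lambda I$, and $\beta_0 = 0$ this reduces to the ordinary ridge estimator $(X'X + \lambda I)^{-1}X'Y$.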
Examples:
- Fused ridge estimation (see the sketch after this list)
- A ridge to homogeneity
- Co-data: groups of covariates are deemed to be differentially important for the explanation of the response.
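For the fused ridge case, one illustrative (not necessarily the notes' exact parameterization) choice of $\Delta$ is built from a first-difference matrix $D$, so that adjacent coefficients are shrunken towards each other; a small multiple of the identity keeps $\Delta$ positive definite:

```python
import numpy as np

def fused_ridge_penalty(p, lam_fuse, lam_ridge=1e-4):
    """Illustrative Delta = lam_fuse * D'D + lam_ridge * I,
    where D is the (p-1) x p first-difference matrix.
    D'D alone is only positive semi-definite, hence the small ridge term."""
    D = np.eye(p - 1, p, k=1) - np.eye(p - 1, p)   # row j encodes beta_{j+1} - beta_j
    return lam_fuse * (D.T @ D) + lam_ridge * np.eye(p)

print(fused_ridge_penalty(p=4, lam_fuse=5.0))
```

Plugging such a $\Delta$ into $\hat\beta(\Delta)$ above penalizes differences between neighbouring elements of $\beta$, which is the idea behind fused shrinkage.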