WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Generalizing Ridge Regression

Tags: Ridge

This note covers Chapter 3 of van Wieringen, W. N. (2021). Lecture notes on ridge regression. arXiv:1509.09169 [stat].

Consider the loss function

\[(Y-X\beta)'W(Y-X\beta) + (\beta-\beta_0)'\Delta(\beta - \beta_0)\]

which comprises a weighted least squares criterion and a generalized ridge penalty.

  • $W$: an $n\times n$ diagonal matrix with $W_{ii}\in[0,1]$ representing the weight of the $i$-th observation.
  • $\Delta$: a $p\times p$ symmetric, positive definite matrix. It allows
    • different penalization per regression parameter
    • joint (or correlated) shrinkage among the elements of $\beta$
  • $\beta_0$: a user-specified, non-random target towards which $\beta$ is shrunk as the penalty increases

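Setting the gradient with respect to $\beta$ to zero gives the normal equations $(X'WX + \Delta)\beta = X'WY + \Delta\beta_0$, so the minimizer is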

\[\hat\beta(\Delta) = (X'WX + \Delta)^{-1}(X'WY + \Delta\beta_0)\]
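
As a quick numerical check of this closed form, here is a minimal NumPy sketch. The function name `generalized_ridge` and the simulated data are my own choices for illustration, not something taken from the lecture notes. It solves the normal equations directly and then checks that the gradient of the loss vanishes at the estimate.

```python
import numpy as np

def generalized_ridge(X, y, W, Delta, beta0):
    """Minimize (y - X b)' W (y - X b) + (b - beta0)' Delta (b - beta0)."""
    A = X.T @ W @ X + Delta              # p x p, positive definite when Delta is
    rhs = X.T @ W @ y + Delta @ beta0
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(A, rhs)

# Simulated data (purely illustrative).
rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.normal(size=(n, p))
y = X @ np.arange(1.0, p + 1) + rng.normal(size=n)

W = np.diag(rng.uniform(0.0, 1.0, size=n))   # observation weights in [0, 1]
Delta = 2.0 * np.eye(p)                      # ordinary ridge as a special case
beta0 = np.ones(p)                           # shrinkage target

beta_hat = generalized_ridge(X, y, W, Delta, beta0)

# Gradient of the loss at beta_hat; it should be numerically zero.
grad = -2 * X.T @ W @ (y - X @ beta_hat) + 2 * Delta @ (beta_hat - beta0)

# With a very heavy penalty the estimate is pulled onto the target beta0.
beta_heavy = generalized_ridge(X, y, W, 1e8 * np.eye(p), beta0)
```

Taking $W = I$, $\Delta = \lambda I$ and $\beta_0 = 0$ recovers the ordinary ridge estimator.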

Examples:

  • Fused ridge estimation
\[\Vert Y-X\beta\Vert_2^2 + \lambda \sum_{j=2}^p\Vert \beta_j-\beta_{j-1}\Vert^2_2\]
  • A ridge to homogeneity

  • Codata: groups of covariates are deemed to be differentially important for the explanation of the response.
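
The fused ridge example fits the generalized form with $W = I$, $\beta_0 = 0$ and $\Delta = \lambda D'D$, where $D$ is the $(p-1)\times p$ first-difference matrix. One caveat: $D'D$ is only positive semi-definite (constant vectors are not penalized), so this sits slightly outside the positive definite assumption above. The estimator is still well defined whenever $X'X + \lambda D'D$ is invertible, for instance when $X$ has full column rank. A minimal NumPy sketch, with helper names and simulated data that are my own:

```python
import numpy as np

def first_difference_matrix(p):
    """(p-1) x p matrix D with (D b)_j = b_{j+1} - b_j."""
    D = np.zeros((p - 1, p))
    idx = np.arange(p - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def fused_ridge(X, y, lam):
    """Fused ridge as generalized ridge with Delta = lam * D'D, W = I, beta0 = 0."""
    D = first_difference_matrix(X.shape[1])
    Delta = lam * D.T @ D
    return np.linalg.solve(X.T @ X + Delta, X.T @ y)

# Simulated data (purely illustrative); X has full column rank here.
rng = np.random.default_rng(1)
n, p = 40, 6
X = rng.normal(size=(n, p))
y = X @ np.linspace(-1.0, 1.0, p) + rng.normal(size=n)

# The sum of squared successive differences equals the quadratic form b' D'D b.
b = rng.normal(size=p)
D = first_difference_matrix(p)
penalty_sum = np.sum(np.diff(b) ** 2)
penalty_quad = b @ (D.T @ D) @ b

# A larger lambda pulls neighbouring coefficients towards each other.
beta_small = fused_ridge(X, y, 1e-2)
beta_large = fused_ridge(X, y, 1e6)
spread_small = beta_small.max() - beta_small.min()
spread_large = beta_large.max() - beta_large.min()
```

Here `spread_large` should be far smaller than `spread_small`, since a heavy fusion penalty drives all coefficients towards a common value.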


Published in categories Note