# Generalizing Ridge Regression

##### Posted on Dec 14, 2021
Tags: Ridge

This note covers Chapter 3 of van Wieringen, W. N. (2021). Lecture notes on ridge regression. arXiv:1509.09169 [stat].

Consider the loss function

$(Y-X\beta)'W(Y-X\beta) + (\beta-\beta_0)'\Delta(\beta - \beta_0)$

which comprises a weighted least squares criterion and a generalized ridge penalty.

• $W$: an $n\times n$ diagonal matrix with $W_{ii}\in[0,1]$, the weight of the $i$-th observation.
• $\Delta$: a $p\times p$ symmetric, positive definite matrix, which allows
  • different penalization per regression parameter
  • joint (or correlated) shrinkage among the elements of $\beta$
• $\beta_0$: a user-specified, non-random target towards which $\beta$ is shrunk as the penalty parameter increases

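Setting the gradient with respect to $\beta$ to zero, $-2X'W(Y-X\beta) + 2\Delta(\beta-\beta_0) = 0$, gives the solution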

$\hat\beta(\Delta) = (X'WX + \Delta)^{-1}(X'WY + \Delta\beta_0)$
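The closed form can be checked numerically. Below is a minimal NumPy sketch, not code from the lecture notes: the function name `generalized_ridge`, the simulated data, and the choice $\Delta = \lambda I$ are all assumptions made for illustration. It computes the estimator, evaluates the gradient of the loss at the estimate, and uses a very large penalty to illustrate the shrinkage towards $\beta_0$.

```python
import numpy as np

def generalized_ridge(X, y, Delta, W=None, beta0=None):
    """Minimize (y - Xb)' W (y - Xb) + (b - beta0)' Delta (b - beta0).

    Hypothetical helper for illustration; W defaults to the identity
    and beta0 to the zero vector, which gives ordinary ridge when
    Delta = lambda * I.
    """
    n, p = X.shape
    W = np.eye(n) if W is None else W
    beta0 = np.zeros(p) if beta0 is None else beta0
    lhs = X.T @ W @ X + Delta          # X'WX + Delta
    rhs = X.T @ W @ y + Delta @ beta0  # X'WY + Delta beta0
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(lhs, rhs)

# Simulated data (assumed, for illustration only).
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=n)
W = np.diag(rng.uniform(0.0, 1.0, size=n))  # observation weights in [0, 1]
beta0 = np.full(p, 0.3)                     # shrinkage target

lam = 2.0
beta_hat = generalized_ridge(X, y, lam * np.eye(p), W=W, beta0=beta0)

# Gradient of the loss at the estimate; it should vanish up to rounding.
grad = -2 * X.T @ W @ (y - X @ beta_hat) + 2 * lam * (beta_hat - beta0)

# A very large penalty pulls the estimate towards the target beta0.
beta_big = generalized_ridge(X, y, 1e8 * np.eye(p), W=W, beta0=beta0)

# Special case W = I, beta0 = 0, Delta = lam * I: ordinary ridge.
beta_ridge = generalized_ridge(X, y, lam * np.eye(p))
```

Solving the linear system with `np.linalg.solve` is a design choice for numerical stability; it gives the same estimate as the explicit inverse in the formula.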

Examples:

• Fused ridge estimation
$\Vert Y-X\beta\Vert_2^2 + \lambda \sum_{j=2}^p\Vert \beta_j-\beta_{j-1}\Vert^2_2$
• A ridge to homogeneity

• Co-data: groups of covariates are deemed to be differentially important for the explanation of the response.
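
To see how the fused ridge example fits the generalized framework, note that its penalty can be written as $\lambda\beta' D'D\beta$, where $D$ is the $(p-1)\times p$ first-difference matrix, so that $W = I$, $\beta_0 = 0$ and $\Delta = \lambda D'D$. The sketch below is an illustration under assumed, simulated data, with hypothetical helper names `first_difference_matrix` and `fused_ridge`. One caveat: $D'D$ is only positive semi-definite (a constant $\beta$ incurs no penalty), so this choice sits slightly outside the positive definite assumption; the estimator is still defined whenever $X'X + \Delta$ is invertible.

```python
import numpy as np

def first_difference_matrix(p):
    """(p-1) x p matrix D with (D @ beta)[j] = beta[j+1] - beta[j]."""
    return np.diff(np.eye(p), axis=0)

def fused_ridge(X, y, lam):
    """Fused ridge as generalized ridge with Delta = lam * D'D.

    Hypothetical helper; assumes X'X + Delta is invertible.
    """
    D = first_difference_matrix(X.shape[1])
    Delta = lam * D.T @ D
    return np.linalg.solve(X.T @ X + Delta, X.T @ y)

# Simulated data (assumed): neighbouring coefficients are similar.
rng = np.random.default_rng(1)
n, p = 40, 6
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# The sum-of-squared-differences penalty equals the quadratic form.
D = first_difference_matrix(p)
beta = rng.normal(size=p)
penalty_sum = np.sum((beta[1:] - beta[:-1]) ** 2)
penalty_quad = beta @ (D.T @ D) @ beta

# Tiny penalty: essentially least squares.
beta_small = fused_ridge(X, y, 1e-8)
# Huge penalty: neighbouring coefficients are fused to a common value.
beta_large = fused_ridge(X, y, 1e6)
```

As the penalty grows, the estimate is pulled towards a constant vector rather than towards zero, which is the behaviour the fused penalty is designed to produce.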
