# Predictive Degrees of Freedom


This note is on Luan, B., Lee, Y., & Zhu, Y. (2021). Predictive Model Degrees of Freedom in Linear Regression. arXiv:2106.15682 [Math].

The classical model degrees of freedom fails to capture the intrinsic differences among interpolating models, since it depends only on in-sample prediction.

One of its underlying assumptions is that the covariate values in the test data are fixed and identical to those in the training data (the **Fixed-X setting**).
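For reference, the classical fixed-X definition is the normalized covariance between fitted and observed responses, which for a linear smoother $\hat\mu = Hy$ with $\mathrm{Var}(y_i) = \sigma^2$ reduces to the trace of the hat matrix:

\[df_F = \frac{1}{\sigma^2}\sum_{i=1}^n \mathrm{Cov}(\hat\mu_i, y_i) = \tr(H)\]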

The paper instead defines model complexity via out-of-sample prediction, where the test features differ from those in the training data (the **Random-X setting**).

*only defined for linear smoothers*

Consider a linear procedure with hat matrix $H$, so that $\hat\mu = Hy$. For each $x_\star \in \IR^d$, there exists $h_\star \in \IR^n$, depending only on $X$ and $x_\star$, such that

\[\hat\mu_\star = h_\star^Ty\]

Define the model degrees of freedom under the Random-X setting as

\[df_R = \tr(H) + \frac n2\left(E(\Vert h_\star\Vert^2\mid X) - \frac 1n \tr(H^TH)\right)\]
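As a quick numerical sanity check (a sketch, not from the paper), $df_R$ can be estimated for OLS: there $h_\star = X(X^TX)^{-1}x_\star$, and $E(\Vert h_\star\Vert^2 \mid X)$ can be approximated by averaging over fresh draws of $x_\star$. The Gaussian design and the choice of $n$, $d$ below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5

# Training design: rows i.i.d. N(0, I_d) (an assumption for illustration)
X = rng.standard_normal((n, d))
XtX_inv = np.linalg.inv(X.T @ X)

# OLS hat matrix H = X (X^T X)^{-1} X^T; tr(H) = d since H is a rank-d projection
H = X @ XtX_inv @ X.T
df_fixed = np.trace(H)

# For OLS, mu_hat(x*) = h*^T y with h* = X (X^T X)^{-1} x*.
# Monte Carlo estimate of E(||h*||^2 | X) over test points x* ~ N(0, I_d):
m = 200_000
X_test = rng.standard_normal((m, d))
H_star = X_test @ XtX_inv @ X.T          # row i holds h*(x_i)^T
mean_sq_norm = (H_star ** 2).sum(axis=1).mean()

# Random-X degrees of freedom, plugging into the definition above
df_R = df_fixed + n / 2 * (mean_sq_norm - np.trace(H.T @ H) / n)

print(df_fixed)   # = d (trace of a rank-d projection)
print(df_R)       # strictly larger than df_fixed for OLS
```

Since $x_\star \sim N(0, I_d)$ here, $E(\Vert h_\star\Vert^2 \mid X) = \tr\big((X^TX)^{-1}\big)$ exactly, so the Monte Carlo average should be close to `np.trace(XtX_inv)`; the gap between $df_R$ and $\tr(H)$ reflects the extra price of predicting at new feature values.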