Predictive Degrees of Freedom
The classical notion of model degrees of freedom fails to explain the intrinsic differences among interpolating models, since it is based on in-sample prediction only.
One of its underlying assumptions is that the covariate values in the test data are fixed and identical to those in the training data (the Fixed-X setting).
The paper considers out-of-sample prediction to define model complexity, where the test features differ from those in the training data (the Random-X setting).
This notion is defined only for linear smoothers.
Consider a linear procedure with hat matrix $H$ such that $\hat\mu = Hy$. For each $x_\star \in \IR^d$, there exists $h_\star \in \IR^n$, depending only on $X$ and $x_\star$, such that\[\hat\mu_\star = h_\star^T y.\]
Define the model degrees of freedom under the Random-X setting as\[df_R = \tr(H) + \frac n2\left(E(\Vert h_\star\Vert^2\mid X) - \frac 1n \tr(H^TH)\right).\]
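As an illustration, $df_R$ can be estimated numerically for a concrete linear smoother. The sketch below uses ridge regression, for which $H = X(X^TX + \lambda I)^{-1}X^T$ and $h_\star^T = x_\star^T(X^TX + \lambda I)^{-1}X^T$, and approximates $E(\Vert h_\star\Vert^2 \mid X)$ by Monte Carlo. The choice of ridge and the assumption that test points $x_\star$ are drawn from the same distribution as the training rows are illustrative assumptions, not prescribed by the definition above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 10, 1.0  # illustrative sizes and ridge penalty

X = rng.standard_normal((n, d))
# Ridge smoother: A = (X^T X + lam I)^{-1} X^T, so H = X A and mu_hat = H y
A = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)  # shape (d, n)
H = X @ A

# Monte Carlo estimate of E(||h_star||^2 | X); test points are drawn from
# the same distribution as the training rows (an assumption made here)
m = 100_000
X_star = rng.standard_normal((m, d))
h_norms_sq = np.sum((X_star @ A) ** 2, axis=1)  # each row is h_star^T
E_h_sq = h_norms_sq.mean()

df_F = np.trace(H)  # classical (Fixed-X) degrees of freedom, tr(H)
df_R = df_F + (n / 2) * (E_h_sq - np.trace(H.T @ H) / n)
print(f"Fixed-X df: {df_F:.3f}, Random-X df: {df_R:.3f}")
```

The gap between `df_R` and `df_F` reflects how much harder prediction at fresh covariate values is than refitting at the training points; under the Fixed-X assumption the extra term vanishes by construction.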