Debiased ML via NN for GLM
These are notes on Chernozhukov, V., Newey, W. K., Quintas-Martinez, V., & Syrgkanis, V. (2021). Automatic Debiased Machine Learning via Neural Nets for Generalized Linear Regression. ArXiv:2104.14737 [Econ, Math, Stat].
The paper gives debiased machine learners of parameters of interest that depend on generalized linear regressions.
Machine learners provide remarkably good predictions in a variety of settings but are inherently biased.
The bias arises from using regularization and/or model selection to control the variance of the prediction.
Confidence intervals based on estimators with approximately balanced variance and squared bias will tend to have poor coverage.
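The coverage claim can be made concrete with a small calculation. The sketch below is not from the paper: it assumes the estimator is exactly normal with a known standard error, and that its bias equals a fixed multiple of that standard error (the hypothetical argument `bias_over_se`). Under those assumptions the coverage of a nominal 95% interval has a closed form.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def coverage(bias_over_se, z=1.96):
    """Coverage of the interval theta_hat +/- z * se.

    Assumes theta_hat - theta ~ N(bias, se^2), so the interval covers theta
    exactly when a standard normal draw lies in [-z - r, z - r],
    where r = bias / se.
    """
    return normal_cdf(z - bias_over_se) - normal_cdf(-z - bias_over_se)


# Balanced variance and squared bias corresponds to bias/se = 1.
for r in (0.0, 0.5, 1.0, 2.0):
    print(f"bias/se = {r:.1f}: coverage = {coverage(r):.3f}")
```

With no bias the interval covers at the nominal 95%. When the bias is as large as the standard error, coverage falls to roughly 83%, and it keeps falling as the ratio grows.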
Consider iid observations $W_1,\ldots, W_n$ with $W_i$ having CDF $F_0$.
Take a function $\gamma$ to depend on a vector of regressors $X$,
impose the restriction that $\gamma$ is in a set of functions $\Gamma$ that is linear and closed in mean square, and
specify that the estimator $\hat\gamma$ is an element of $\Gamma$ with probability one and has a probability limit $\gamma(F)$ when $F$ is the distribution of a single observation $W_i$.
Suppose that $\gamma(F)$ satisfies an orthogonality condition where a residual $\rho(W,\gamma)$ with finite second moment is orthogonal in the population to all $b\in \Gamma$:
\[E_F[b(X)\rho(W,\gamma(F))] = 0\]
for all $b\in \Gamma$, with $\gamma(F)\in\Gamma$. Examples of such residuals:
- $\rho(W,\gamma)=Y-\gamma(X)$: the orthogonality condition is necessary and sufficient for $\gamma(F)$ to be the least squares projection of $Y$ on $\Gamma$.
- $\rho(W,\gamma)=p-\mathbf{1}(Y<\gamma(X))$: quantile conditions.
- $\rho(W,\gamma)=\lambda(\gamma(X))[Y-\mu(\gamma(X))]$: first order conditions for generalized linear models.
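As a rough numerical illustration of the orthogonality condition, the sketch below checks its sample analogue for two of the residuals listed above. It makes several assumptions that are not in the paper: $\Gamma$ is the finite-dimensional span of the basis $b(X)=(1, X)$ rather than a neural net class, the data are simulated, and the GLM is a logistic regression with $\lambda\equiv 1$ and $\mu$ the sigmoid. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)

# Assumed basis for a finite-dimensional Gamma: b(X) = (1, X).
B = np.column_stack([np.ones(n), x])

# Least squares residual: rho(W, gamma) = Y - gamma(X).
y = np.sin(x) + rng.normal(size=n)
beta_ls, *_ = np.linalg.lstsq(B, y, rcond=None)
moment_ls = B.T @ (y - B @ beta_ls) / n  # sample mean of b(X) * rho


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


# GLM residual with lambda = 1 and mu = sigmoid (logistic regression).
y_bin = (rng.uniform(size=n) < sigmoid(0.5 + x)).astype(float)
beta_glm = np.zeros(2)
for _ in range(25):  # Newton steps on the logistic log-likelihood
    mu = sigmoid(B @ beta_glm)
    score = B.T @ (y_bin - mu)
    hessian = (B * (mu * (1.0 - mu))[:, None]).T @ B
    beta_glm = beta_glm + np.linalg.solve(hessian, score)
moment_glm = B.T @ (y_bin - sigmoid(B @ beta_glm)) / n

print("least squares moments:", moment_ls)
print("logistic GLM moments: ", moment_glm)
```

Both moment vectors come out numerically zero. Note that the least squares fit satisfies the condition even though the true regression, $\sin(X)$, is not in the span: the condition characterizes the projection on $\Gamma$, not the true conditional mean.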