Confidence Intervals of Smoothed Isotonic Regression
This note is for Groeneboom, P., & Jongbloed, G. (2023). Confidence intervals in monotone regression (arXiv:2303.17988). arXiv.
The ordinary nonparametric bootstrap, based on the nonparametric least squares estimator (LSE) $\hat f_n$, is inconsistent in the monotone regression setting.
The paper shows that a consistent bootstrap can instead be based on a smoothed version of $\hat f_n$, called the SLSE (smoothed least squares estimator).
- The asymptotic pointwise distribution of the SLSE is derived.
- The confidence intervals based on the smoothed bootstrap are compared to intervals based on the (not necessarily monotone) Nadaraya-Watson estimator, and the effect of Studentization is investigated.
- A method for automatic bandwidth choice is also given.
Consider the monotone regression setting where we observe independent pairs $(X_i, Y_i)$ of random variables, where the $X_i$ are i.i.d. with non-vanishing density $g$ on $[0, 1]$ and
\[Y_i = f_0(X_i) + \varepsilon_i, \quad 1\le i\le n\]
- the regression function $f_0:[0, 1]\rightarrow \mathbb{R}$ is nondecreasing or nonincreasing
- the $\varepsilon_i$ are i.i.d. sub-Gaussian with expectation 0 and variance $\sigma_0^2$, independent of the $X_i$’s.
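
To make the setting concrete, here is a minimal simulation sketch of this model. The specific choices are assumptions for illustration only and are not taken from the paper: a uniform design (so $g \equiv 1$), the nondecreasing function $f_0(t) = t^2$, Gaussian errors (a special case of sub-Gaussian), and the hypothetical function name `simulate_monotone_regression`.

```python
import numpy as np


def simulate_monotone_regression(n, sigma0=0.1, seed=0):
    """Draw n pairs (X_i, Y_i) with Y_i = f_0(X_i) + eps_i (illustrative choices)."""
    rng = np.random.default_rng(seed)
    # Assumption: uniform design on [0, 1], i.e. density g = 1 (non-vanishing).
    x = rng.uniform(0.0, 1.0, size=n)
    # Assumption: f_0(t) = t^2, a nondecreasing function on [0, 1].
    f0 = x ** 2
    # Assumption: Gaussian errors, mean 0 and variance sigma0^2, independent of x.
    eps = rng.normal(0.0, sigma0, size=n)
    return x, f0 + eps


x, y = simulate_monotone_regression(200)
print(x[:3], y[:3])
```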
Aim: construct pointwise nonparametric confidence intervals for $f_0(t)$
The basic monotone least squares estimator (LSE) $\hat f_n$ of $f_0$ is the so-called isotonic regression of $(X_i, Y_i)$. The estimator is defined as the minimizer of
\[\sum_{i=1}^n (Y_i-f(X_i))^2\]
over all nondecreasing functions $f$.
The LSE can be computed in a straightforward way, using the so-called cumulative sum diagram (cusum diagram).
Suppose $X_1, \ldots, X_n$ are ordered in the sense that $X_1 < X_2 <\ldots < X_n$, and relabel the $Y_i$’s accordingly (so that $Y_i$ is the response observed at $X_i$). The cusum diagram is then the set of points
\[(0, 0), \quad \left(i, \sum_{j\le i} Y_j\right), \quad i=1,\ldots, n\]
and the monotone least squares estimator $\hat f_n(X_i)$ is given by the left-continuous slope of the greatest convex minorant of the cusum diagram, evaluated at $i$.
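
The cusum construction can be sketched in a few lines of NumPy. This is an illustrative implementation for the nondecreasing case only, not code from the paper; the function names `gcm_left_slopes` and `isotonic_lse` are hypothetical. It builds the greatest convex minorant as the lower convex hull of the cusum points and reads off the left slope at each $i$.

```python
import numpy as np


def gcm_left_slopes(cusum):
    """Left slopes at i = 1..n of the greatest convex minorant of (i, cusum[i])."""
    n = len(cusum) - 1
    hull = [0]  # indices of the vertices of the minorant found so far
    for i in range(1, n + 1):
        # Drop the last vertex b while slope(a, b) >= slope(b, i):
        # in that case b lies on or above the chord from a to i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            if (cusum[b] - cusum[a]) * (i - b) >= (cusum[i] - cusum[b]) * (b - a):
                hull.pop()
            else:
                break
        hull.append(i)
    slopes = np.empty(n)
    for a, b in zip(hull[:-1], hull[1:]):
        # The segment from a to b gives the left slope at i = a+1, ..., b.
        slopes[a:b] = (cusum[b] - cusum[a]) / (b - a)
    return slopes


def isotonic_lse(x, y):
    """Nondecreasing LSE evaluated at the sorted design points."""
    order = np.argsort(x)
    x_sorted = np.asarray(x, dtype=float)[order]
    y_sorted = np.asarray(y, dtype=float)[order]
    # Cusum diagram: (0, 0) followed by (i, Y_1 + ... + Y_i).
    cusum = np.concatenate(([0.0], np.cumsum(y_sorted)))
    return x_sorted, gcm_left_slopes(cusum)


x_sorted, fit = isotonic_lse([0.3, 0.1, 0.4, 0.2], [2.0, 1.0, 4.0, 3.0])
print(fit)  # the violating pair (3, 2) is pooled to its mean 2.5
```

For a nonincreasing $f_0$, one option is to apply the same routine to $-Y_i$ and negate the result.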
- Section 2:
  - define a smoothed least squares estimator (SLSE) for $f_0$
  - show that the estimator is asymptotically normally distributed with rate $n^{2/5}$ and derive its asymptotic bias and variance
  - consider the smooth but not necessarily monotone Nadaraya-Watson (NW) estimator of $f_0$
- Section 3:
  - based on the SLSE and the NW estimator, propose bootstrap methods to construct confidence sets for $f_0(t)$
  - prove a theorem stating that the bootstrap method based on the SLSE works asymptotically, with specific choices for the various bandwidths involved
- Section 4:
  - address the problem of bandwidth selection in practice
  - propose a smoothed bootstrap approach
- Section 5:
  - illustrate the method using a climate change dataset