# Monotone B-spline Smoothing

##### Posted on Mar 09, 2021 (Update: Mar 12, 2021)

This note is based on He, X., & Shi, P. (1998). Monotone B-Spline Smoothing. Journal of the American Statistical Association, 93(442), 643–650. The reproduced simulations are based on the updated algorithm in Ng, P., & Maechler, M. (2007). A fast and efficient implementation of qualitatively constrained quantile smoothing splines. Statistical Modelling, 7(4), 315–328.

$\newcommand\RMSE{\mathrm{RMSE}}$

## Introduction

There are various smoothing techniques:

• kernel smoothing
• nearest neighbors
• smoothing splines
• local polynomials
• B-spline approximation
• neural networks

The focus here is on estimating regression curves that are known, or required, to be monotone.

Examples:

• growth curves, such as the weight or height of growing organisms over time
• the item characteristic curve in item response theory, which gives the probability that an examinee with a given latent ability answers an item correctly

### Isotonic regression

The paper points out that isotonic regression undersmooths the data and is very sensitive to outlying observations at the endpoints of the design space.

A natural idea is to combine smoothing with isotonic regression, and Mammen (1991) investigates the asymptotic performance of two strategies (sketched in R after the list):

• $m_{SI}$: smooth first, then isotonize
• $m_{IS}$: isotonize first, then smooth
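
A minimal R sketch of the two strategies, using `ksmooth` for the kernel smoother and `isoreg` for isotonic regression; the simulated curve, noise level, and bandwidth are my own choices for illustration:

```r
# Simulated example (curve, noise, and bandwidth are assumptions of this sketch)
set.seed(1)
n <- 100
x <- sort(runif(n))
y <- x^2 + rnorm(n, sd = 0.1)            # true curve g(x) = x^2 is monotone

# m_SI: smooth first (kernel smoother), then isotonize the smooth
sm   <- ksmooth(x, y, kernel = "normal", bandwidth = 0.2, x.points = x)
m_SI <- isoreg(sm$x, sm$y)$yf

# m_IS: isotonize first, then smooth the resulting step function
iso  <- isoreg(x, y)$yf
m_IS <- ksmooth(x, iso, kernel = "normal", bandwidth = 0.2, x.points = x)$y
```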

### $I$ splines

$I$ splines are obtained by integrating $B$ splines with positive coefficients to ensure monotonicity.

But the class of $I$ splines is relatively small compared to the class of monotone splines, so there is always a possibility that the fit to the data could be improved by allowing more general monotone splines.

### Proposal

The paper proposes a method based on the constrained least absolute deviation (LAD) principle in the space of B-spline functions:

• LAD: can be solved by linear programs
• $L_1$ loss function: the resulting fit approximates the conditional median function rather than the conditional mean

## Method

Suppose we have $n$ pairs of observations $\{(x_i, y_i), i=1,\ldots,n\}$ with

$y_i = g(x_i) + u_i, \quad i=1,\ldots,n$

Without loss of generality, restrict $x\in [0, 1]$.

Let $0 = t_0 < t_1 < \cdots < t_{k_n}=1$ be a partition of $[0, 1]$, and let $\pi_1(x),\ldots, \pi_N(x)$ be the B-spline basis functions of order $p+1$ on this partition, where $N=k_n+p=(k_n-1)+(p+1)$ ($k_n-1$ interior knots plus $p+1$ for the boundary). Choose quadratic splines, $p=2$, and write

$\pi(x) = (\pi_1(x), \pi_2(x), \ldots, \pi_N(x))$
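
As a quick check of the dimension count, here is a sketch building this quadratic basis with `splines::bs`; the uniform knot placement is for illustration only (the paper places knots at percentile ranks, discussed below):

```r
library(splines)
kn <- 5                                # number of subintervals k_n
p  <- 2                                # quadratic splines
t  <- seq(0, 1, length.out = kn + 1)   # partition 0 = t_0 < ... < t_{k_n} = 1
x  <- seq(0, 1, by = 0.01)
# k_n - 1 interior knots; intercept = TRUE keeps all N = k_n + p basis functions
Pi <- bs(x, knots = t[2:kn], degree = p, intercept = TRUE,
         Boundary.knots = c(0, 1))
dim(Pi)                                # 101 x 7, i.e. length(x) x (k_n + p)
```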

Estimate $g$ by $\hat g_n(x)=\pi(x)^T\hat\alpha$, where $\hat\alpha$ minimizes

$\sum_{i=1}^n\vert y_i-\pi(x_i)^T\alpha\vert$

subject to monotonicity of $\hat g_n$, or equivalently

$\pi'(t_j)^T\alpha\ge 0, \quad j=0,\ldots,k_n,$

because the derivative of a quadratic spline is piecewise linear, so nonnegativity at the knots implies nonnegativity on all of $[0, 1]$.

Rewrite it as

$\sum_{i=1}^n (r_i^+ + r_i^-)$

subject to

$r_i^+\ge 0, r_i^-\ge 0, r_i^+ - r_i^- = y_i-\pi(x_i)^T\alpha, i=1,\ldots,n$

and

$\pi'(t_j)^T\alpha\ge 0, j=0,\ldots,k_n$
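
To make the linear program explicit, here is a sketch that solves it directly with `lpSolve`, splitting $\alpha$ into positive and negative parts so that all variables are nonnegative. The data generation, knot count, and use of `splineDesign` for the derivative rows are my own choices; the paper's actual implementation is a far more efficient LP algorithm:

```r
library(splines)
library(lpSolve)
set.seed(2)
n <- 50
x <- sort(runif(n))
y <- x^3 + rnorm(n, sd = 0.1)                    # monotone true curve (assumed)
kn <- 4; p <- 2
t   <- seq(0, 1, length.out = kn + 1)
aug <- c(rep(0, p + 1), t[2:kn], rep(1, p + 1))  # full knot sequence
Pi  <- splineDesign(aug, x, ord = p + 1)         # n x N design, N = k_n + p
Dt  <- splineDesign(aug, t, ord = p + 1,         # rows pi'(t_j)^T, j = 0..k_n
                    derivs = rep(1, kn + 1))
N   <- ncol(Pi)
# variables z = (alpha+, alpha-, r+, r-), all >= 0, with alpha = alpha+ - alpha-
obj  <- c(rep(0, 2 * N), rep(1, 2 * n))          # minimize sum(r+ + r-)
A_eq <- cbind(Pi, -Pi, diag(n), -diag(n))        # Pi alpha + r+ - r- = y
A_ge <- cbind(Dt, -Dt, matrix(0, kn + 1, 2 * n)) # pi'(t_j)^T alpha >= 0
sol  <- lp("min", obj,
           const.mat = rbind(A_eq, A_ge),
           const.dir = c(rep("=", n), rep(">=", kn + 1)),
           const.rhs = c(y, rep(0, kn + 1)))
alpha <- sol$solution[1:N] - sol$solution[(N + 1):(2 * N)]
ghat  <- drop(Pi %*% alpha)                      # monotone LAD spline fit
```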

The paper proposes placing the knots equally spaced in percentile ranks, i.e., at sample quantiles of the $x_i$.

View the problem of choosing the number of knots $k$ as model selection: minimize

$IC(k) = \log\left(\sum_i\vert y_i-\hat g_n(x_i)\vert\right) + 2(k+2)/n$

where the factor $2$ is chosen mainly based on the authors' experience.

A possible shortcoming is that the degrees of freedom should depend on the number of ties among the coefficients $\alpha$.
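
A sketch of this selection rule, fitting `cobs` with a fixed number of quantile-spaced knots for each candidate $k$ and evaluating $IC(k)$; the candidate range is arbitrary, and `cobs` may adjust knots internally, so this only approximates the paper's procedure:

```r
library(cobs)
ks <- 3:10                                       # candidate numbers of subintervals
ic <- sapply(ks, function(k) {
  fit <- cobs(x, y, constraint = "increase", nknots = k + 1,
              method = "quantile", print.mesg = FALSE, print.warn = FALSE)
  log(sum(abs(y - fit$fitted))) + 2 * (k + 2) / length(y)
})
k_best <- ks[which.min(ic)]
```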

Monotone B-spline smoothing generalizes directly to the estimation of monotone quantile functions by replacing $\vert \cdot \vert$ with the check function

$\rho_\tau(r) = r(\tau - I(r < 0))$

In some applications, the estimate must satisfy certain boundary conditions, such as $0\le g(0)\le g(1)\le 1$ for a probability curve.
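
With `cobs`, the quantile version amounts to setting `tau`; boundary conditions like the one above can be imposed through the `pointwise` argument. The row layout of `pointwise` below is my reading of that interface, so treat it as an assumption:

```r
library(cobs)
# Monotone 90th-percentile curve: replace |.| by the check function via tau
fit90 <- cobs(x, y, constraint = "increase", tau = 0.9)

# Boundary conditions (assumed reading of the pointwise interface):
# rows are (kind, x, y) with kind = 1 for ">=" and kind = -1 for "<="
bc  <- rbind(c( 1, 0, 0),   # g(0) >= 0
             c(-1, 1, 1))   # g(1) <= 1
fit <- cobs(x, y, constraint = "increase", pointwise = bc)
```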

## Asymptotic results

The paper assumes i.i.d. errors, proves uniform convergence of $\hat g_n$, and gives the convergence rates.

## Simulations

Compare the root mean squared error (RMSE) with competitors:

$\RMSE(q) = \left\{(n-2q)^{-1}\sum_{i=q+1}^{n-q}(\hat g_n(x_i)-g(x_i))^2\right\}^{1/2}$

where trimming $q$ points at each end measures performance in the interior of the design space; $\RMSE(0)$ is the classical $\RMSE$.

Note that the error is measured against the true curve, $\hat g_n(x_i) - g(x_i)$, rather than against the observations, $\hat g_n(x_i) - y_i$, where $y_i = g(x_i) + u_i$.
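
A direct transcription of $\RMSE(q)$ in R, assuming $\hat g_n$ and $g$ are evaluated at the sorted design points:

```r
# RMSE(q): trim q design points at each end of the (sorted) design space
rmse_q <- function(ghat, g_true, q = 0) {
  n   <- length(ghat)
  idx <- (q + 1):(n - q)
  sqrt(mean((ghat[idx] - g_true[idx])^2))
}
```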

The original competitors are

• kernel smoothing with the WARPing algorithm, which can determine the bandwidth automatically
• a monotone kernel estimator ($m_{IS}$), which first performs isotonic regression and then smooths

I did not find an available implementation of the WARPing algorithm, so I pick local polynomial regression fitting (loess in R) instead, and I also consider $m_{SI}$ in addition to $m_{IS}$.

The paper's method is implemented by calling cobs() from Ng & Maechler (2007)'s COBS package.
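
A condensed sketch of one replicate of the comparison; the true curve `g`, the noise level, and the `rmse_q` helper defined above are assumptions of this sketch rather than the paper's exact simulation design:

```r
library(cobs)
g <- function(x) x^3                           # assumed true curve for this sketch
set.seed(3)
x <- sort(runif(100)); y <- g(x) + rnorm(100, sd = 0.1)

fit_cobs  <- cobs(x, y, constraint = "increase", print.mesg = FALSE)
fit_loess <- loess(y ~ x)

rmse_q(fit_cobs$fitted,       g(x), q = 2)     # proposed monotone LAD spline
rmse_q(predict(fit_loess, x), g(x), q = 2)     # local polynomial competitor
```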

The numeric results for the proposed method come out quite close to those reported in the original paper.
