
Test of Monotonicity and Convexity by Splines

Tags: Monotone Function, Shape Restrictions

This note is for Wang, J. C., & Meyer, M. C. (2011). Testing the monotonicity or convexity of a function using regression splines. The Canadian Journal of Statistics / La Revue Canadienne de Statistique, 39(1), 89–107.

The paper uses quadratic B-spline basis functions for the test of monotonicity and cubic B-splines for the test of convexity.

Suppose that

\[y_i = f(x_i) + \sigma\varepsilon_i, \qquad i = 1,\ldots,n\]

where $x_i\in [0, 1]$ and the $\varepsilon_i$ are iid with mean zero and unit variance.

The authors propose to test $H_1: f'(x)\ge 0$ for all $x$ (i.e., $f$ is non-decreasing) against the alternative $H_2$ that $H_1$ does not hold. The test is built on regression splines with

  • knots $0 = t_1 < \cdots < t_k = 1$
  • degree-$p$ B-spline basis functions $B_{1p}(x), \ldots, B_{mp}(x)$ for $m= k+p - 1$

There are $k - 2$ interior knots, and a degree-$p$ spline has order $p+1$, so the number of basis functions is

\[m = (k - 2) + (p + 1) = k + p - 1\]

See also: https://esl.hohoweiya.xyz/05-Basis-Expansions-and-Regularization/Appendix-Computations-for-B-splines/index.html
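As a quick sanity check (my own sketch, not from the paper), the count $m = k + p - 1$ can be verified with `splines::bs`: requesting the full basis (`intercept = TRUE`) with $k-2$ interior knots returns $(k-2) + p + 1 = k + p - 1$ columns.

```r
library(splines)

k <- 6; p <- 2                              # 6 knots, quadratic splines
knots <- seq(0, 1, length.out = k)          # 0 = t_1 < ... < t_k = 1
x <- seq(0, 1, length.out = 101)
# full degree-p B-spline basis with the k - 2 interior knots
B <- bs(x, knots = knots[2:(k - 1)], degree = p,
        Boundary.knots = c(0, 1), intercept = TRUE)
ncol(B)                                     # (k - 2) + (p + 1) = k + p - 1 = 7
```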

Define the $m\times k$ matrix $S$ of slopes at the knots by $S_{jl} = B'_{jp}(t_l)$. If the spline function

\[g(x) = \sum_{j=1}^m b_jB_{jp}(x)\]

is non-decreasing, then $S^\top b\ge 0$. For quadratic splines this condition is also sufficient: the derivative of a quadratic spline is piecewise linear, so non-negative slopes at the knots imply non-negative slopes everywhere.

See also: Wang, L., Fan, X., Li, H., & Liu, J. S. (2024). Monotone Cubic B-Splines with a Neural-Network Generator. Journal of Computational and Graphical Statistics, 0(0), 1–15. https://doi.org/10.1080/10618600.2024.2431070

Similarly, if $T$ is the matrix of second derivatives of the basis functions at the knots, that is, $T_{jl}=B_{jp}^{\prime\prime}(t_l)$, then a convex spline function satisfies $T^\top b\ge 0$. For cubic (and lower-order) regression splines the condition is both necessary and sufficient, since the second derivative is then piecewise linear.
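Both matrices can be computed numerically; here is a small sketch of my own (not the paper's code) using `splines::splineDesign`, which evaluates B-spline basis derivatives directly. The helper `basis_deriv_at_knots` and the equally spaced knots are illustrative choices.

```r
library(splines)

# derivative of order `deriv` of all m = k + p - 1 degree-p basis functions,
# evaluated at the knots t_1, ..., t_k; returns an m x k matrix
basis_deriv_at_knots <- function(t, p, deriv) {
  k <- length(t)
  aug <- c(rep(t[1], p), t, rep(t[k], p))   # boundary knots repeated p extra times
  t(splineDesign(aug, t, ord = p + 1, derivs = rep(deriv, k)))
}

tk <- seq(0, 1, length.out = 6)
S    <- basis_deriv_at_knots(tk, p = 2, deriv = 1)  # slopes, quadratic case
Tmat <- basis_deriv_at_knots(tk, p = 3, deriv = 2)  # 2nd derivatives, cubic case
dim(S); dim(Tmat)                                   # 7 x 6 and 8 x 6, i.e. m x k
```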

The constrained estimate $\hat b^\star$ that minimizes the sum of squared errors subject to the monotonicity constraints (or the convexity constraints) is found using standard quadratic programming routines. Briefly, the constrained solution is a least-squares projection onto a linear space of (possibly) smaller dimension.
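A minimal sketch of such a constrained fit, assuming a fixed design and using `quadprog::solve.QP` (my own choice of routine; the monotone truth $\exp(x)$ and the equally spaced knots are just for illustration):

```r
library(splines)
library(quadprog)

set.seed(1)
n <- 100; p <- 2; k <- 6
x <- sort(runif(n))
y <- exp(x) + rnorm(n)                               # a monotone truth, sigma = 1

tk  <- seq(0, 1, length.out = k)
aug <- c(rep(0, p), tk, rep(1, p))
Delta <- splineDesign(aug, x, ord = p + 1)           # n x m design matrix of B_jp(x_i)
S     <- t(splineDesign(aug, tk, ord = p + 1, derivs = rep(1, k)))  # m x k slope matrix

# minimize ||y - Delta b||^2 subject to S^T b >= 0
fit <- solve.QP(Dmat = crossprod(Delta),
                dvec = drop(crossprod(Delta, y)),
                Amat = S,                            # constraints: t(Amat) %*% b >= bvec
                bvec = rep(0, k))
b_star <- fit$solution                               # constrained coefficients
```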

The testing procedure is outlined below (a code sketch of steps 1 and 2 follows the list):

  1. obtain the unconstrained minimizer $\hat b$ and determine the minimum of the slopes at the knots, denoted $s_\min = \min(S^\top \hat b)$
  2. if $s_\min$ is non-negative, do not reject $H_1$ (the paper writes the null hypothesis as $H_1$ and the alternative as $H_2$)
  3. otherwise, estimate the distribution of the minimum slope under the null hypothesis
  4. if $s_\min$ is smaller than the estimated $\alpha$-quantile of that null distribution, reject $H_1$
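A sketch of steps 1 and 2, reusing the illustrative fixed-design setup from the sketch above (mine, not the paper's code):

```r
library(splines)

set.seed(1)
n <- 100; p <- 2; k <- 6
x <- sort(runif(n)); y <- exp(x) + rnorm(n)          # same illustrative data as above
tk  <- seq(0, 1, length.out = k)
aug <- c(rep(0, p), tk, rep(1, p))
Delta <- splineDesign(aug, x, ord = p + 1)
S     <- t(splineDesign(aug, tk, ord = p + 1, derivs = rep(1, k)))

b_hat <- qr.solve(Delta, y)          # step 1: unconstrained least-squares fit
s_min <- min(crossprod(S, b_hat))    # minimum slope at the knots
s_min >= 0                           # step 2: if TRUE, do not reject H1
```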

The paper proposes two approaches for obtaining the null distribution:

  • the first approach is an analytic method, which requires a normality assumption or a normal approximation
  • the second method is resampling-based

Denote the resamples of $\{x_i\}$ by $\{x_1^\star, x_2^\star, \ldots, x_n^\star\}$.

The resamples could be drawn from the empirical distribution of $\{x_i\}$, or its smoothed counterpart, or could simply be the original sample if the $x_i$'s are considered fixed design points. The resamples of $y$ are simulated using

\[y_i^\star = \sum_{j=1}^m \hat b_j^\star B_{jp}(x_i^\star) + \varepsilon_i^\star\]

Fit the unconstrained regression spline model (with the same knot placement) to each resampled data set $\{(x_1^\star, y_1^\star), (x_2^\star, y_2^\star),\ldots, (x_n^\star, y_n^\star)\}$, and calculate the minimum slope estimate $s_\min^\star$ from each bootstrap resample.
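A sketch of the resampling approach for the fixed-design case (my own reading of the procedure; resampling the constrained-fit residuals with replacement is only one of several choices for generating $\varepsilon_i^\star$):

```r
library(splines)
library(quadprog)

set.seed(1)
n <- 100; p <- 2; k <- 6; nsim <- 1000
x <- sort(runif(n)); y <- exp(x) + rnorm(n)          # same illustrative data as above
tk  <- seq(0, 1, length.out = k)
aug <- c(rep(0, p), tk, rep(1, p))
Delta <- splineDesign(aug, x, ord = p + 1)
S     <- t(splineDesign(aug, tk, ord = p + 1, derivs = rep(1, k)))

min_slope <- function(yy) min(crossprod(S, qr.solve(Delta, yy)))

# constrained fit under H1 gives the null regression function
b_star <- solve.QP(crossprod(Delta), drop(crossprod(Delta, y)),
                   Amat = S, bvec = rep(0, k))$solution
f_null <- drop(Delta %*% b_star)
res    <- y - f_null                  # residuals from the constrained fit

s_min  <- min_slope(y)
s_star <- replicate(nsim, min_slope(f_null + sample(res, n, replace = TRUE)))
mean(s_star <= s_min)                 # bootstrap p-value; reject H1 if small (e.g. < 0.05)
```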

Theoretical Properties

Assuming that the errors $\varepsilon_i$ are iid normal random variables with mean 0 and variance $\sigma^2$, the vector of unconstrained slope estimates $S^\top\hat b$ follows a multivariate normal distribution conditional on the knot placement and covariate values, that is

\[S^\top\hat b \mid k; t_1,\ldots, t_k; x_1,\ldots, x_n\sim N(S^\top\beta, S^\top (\Delta^\top\Delta)^{-1}S\sigma^2)\]

where $\Delta$ is the $n\times m$ spline design matrix with $\Delta_{ij} = B_{jp}(x_i)$ and $\beta$ is the coefficient vector of the spline approximation to $f$; thus

\[P(s_\min \le r\mid k; t_1,\ldots, t_k) = P_{\beta,\sigma^2}(r) := 1 - \int\cdots\int_{\{z \,:\, z - r\mathbf{1}\ge 0\}} \phi(z; S^\top\beta, S^\top(\Delta^\top\Delta)^{-1}S\sigma^2)\,dz\]
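Since $\phi$ here is a multivariate normal density, the integral can be evaluated with `mvtnorm::pmvnorm`; below is a small sketch of my own, where `b_plug` and `sigma2` stand for whatever plug-in estimates of $\beta$ and $\sigma^2$ one uses in practice:

```r
library(mvtnorm)

# P(s_min <= r) = 1 - P(all components of S^T b exceed r)
p_smin_le <- function(r, b_plug, sigma2, Delta, S) {
  mu    <- drop(crossprod(S, b_plug))                          # S^T beta
  Sigma <- sigma2 * crossprod(S, solve(crossprod(Delta), S))   # S^T (Delta^T Delta)^{-1} S sigma^2
  Sigma <- (Sigma + t(Sigma)) / 2                              # enforce exact symmetry
  1 - pmvnorm(lower = rep(r, length(mu)), upper = rep(Inf, length(mu)),
              mean = mu, sigma = Sigma)[1]
}
```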

Extension to partial linear models

\[y_i = f(x_i) + z_i^\top\gamma + \sigma\varepsilon_i\]
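In this extension the linear covariates simply augment the spline design matrix; a rough sketch of the unconstrained fit (my own, with hypothetical inputs `Delta` and `z`):

```r
# Delta: n x m spline design matrix; z: n x q covariate matrix (no intercept column)
fit_plm <- function(Delta, z, y) {
  X  <- cbind(Delta, z)                   # joint design matrix
  cf <- qr.solve(X, y)                    # unconstrained least squares
  list(b     = cf[seq_len(ncol(Delta))],  # spline coefficients (constraints act on these)
       gamma = cf[-seq_len(ncol(Delta))]) # linear coefficients
}
```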

Code

The code is available at https://www.stat.colostate.edu/~meyer/testmonotonicity.R

I also back up the code at https://github.com/szcf-weiya/MonotoneDecomposition.jl/blob/master/src/testmonotonicity.R

Let’s review the code.

#  dir=0 if decreasing; otherwise will fit increasing function
#
#  matrix of covariates z does not contain one-vector and must
#  combine with the splines vectors to form linearly independent set
#
#  IF no covariates, set z=0
#
#  Scatterplot is (x[i],y[i]); k=number of knots (k-2 interior knots), 
#
# w is a vector of weights:
#           var(y[i]) is inversely proportional to w[i]

montest = function(x, y, z, w, k, nsim, dir){
}

More step-by-step comments can be found here.

Simulations and Discussion

The simulations use seven different regression functions with independent normal errors and $\sigma=1$, each with five different sample sizes.

They use two pre-specified numbers of knots ($k=4$ and $k=6$) and, in addition, a data-driven choice of $k$; for the latter, one might use a method such as generalized cross-validation (GCV) to choose $k$.
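As a rough illustration of what a GCV choice of $k$ could look like for the unconstrained fit (my own sketch, with $\mathrm{GCV}(k) = (\mathrm{RSS}/n)/(1 - m/n)^2$ and equally spaced knots):

```r
library(splines)

gcv_choose_k <- function(x, y, kgrid = 4:10, p = 2) {
  scores <- sapply(kgrid, function(k) {
    tk  <- seq(min(x), max(x), length.out = k)
    aug <- c(rep(tk[1], p), tk, rep(tk[k], p))
    Delta <- splineDesign(aug, x, ord = p + 1)
    rss <- sum(lm.fit(Delta, y)$residuals^2)
    m   <- ncol(Delta)                            # k + p - 1 fitted coefficients
    (rss / length(y)) / (1 - m / length(y))^2     # GCV score
  })
  kgrid[which.min(scores)]
}
```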

However, the GCV method tends to choose a smaller $k$, and the resulting test size is unacceptably large.

If the GCV-$k$ method is implemented, the user should be aware of the inflated test size; the authors do not recommend GCV-$k$ for the smaller sample sizes.

