WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Notes: Model-Free Scoring System for Risk Prediction

Posted on October 17, 2017



  • $T$: survival time
  • $\mathbf X$: $p$-vector of covariates
  • $S(\mathbf X)$: scoring system, where higher scores imply higher risk levels and shorter survival time
  • $C$: censoring time, independent of $T$ conditional on $\mathbf X$
  • $(Z_i, \delta_i, \mathbf X_i), i=1,2,\ldots, n$: observed data, i.i.d. copies of $(Z,\delta, \mathbf X)$, where $Z=\min(T,C)$ and $\delta = I(T\le C)$ is the event indicator; the censoring time $C$ is allowed to depend on the covariates $\mathbf X$.
  • $\mathcal R(t) = \{j: Z_j>t\}$: the risk set at time $t$

The time-dependent ROC curve and its area, $AUC(t)$, are then defined in terms of $S(\mathbf X)$ and $T$.
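One common form, assumed here because it matches the risk-set estimator used later in these notes, is the incident/dynamic definition: cases are subjects who fail at time $t$, and controls are subjects who are still event-free after $t$. The threshold $c$ and the subject indices $i, j$ are notation introduced for this sketch.

$$
\mathrm{sensitivity}(c,t) = P\{S(\mathbf X) > c \mid T = t\},\qquad
\mathrm{specificity}(c,t) = P\{S(\mathbf X) \le c \mid T > t\},
$$

$$
AUC(t) = P\{S(\mathbf X_i) > S(\mathbf X_j) \mid T_i = t,\ T_j > t\}.
$$

The ROC curve at time $t$ plots sensitivity against $1-\mathrm{specificity}$ as $c$ varies, and $AUC(t)$ is the area under that curve.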


Let $t_1<\ldots <t_M$ be the ordered unique failure times among $\{Z_1,\cdots, Z_n\}$, $M\le n$.

At each failure time $t_m$, let $i$ denote a subject who fails at $t_m$, and write the score as $S(\mathbf X;\mathbf \beta)$ with parameter vector $\mathbf \beta$. The subjects in the risk set $\mathcal R(t_m)$ can then be divided into two groups,

  • $\mathcal R^L(t_m)$, the set of patients at relatively lower risk, whose scores are lower than $S(\mathbf X_i;\mathbf \beta)$
  • $\mathcal R^H(t_m)$, the set of patients at relatively higher risk compared with subject $i$.

Use the proportion of low-risk patients in the risk set, $\frac{\vert \mathcal R^L(t)\vert}{\vert \mathcal R(t)\vert}$, as an estimator of $AUC(t)$.

Then construct a pseudo-likelihood function from these proportions.
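Taking the product of the risk-set proportions over the failure times gives one plausible form of this pseudo-likelihood. The exact expression in the paper may differ, for example in how ties are handled, so the symbols $L$ and $\ell$ below should be read as notation assumed for this sketch.

$$
L(\mathbf\beta) = \prod_{m=1}^{M} \frac{\vert \mathcal R^L(t_m)\vert}{\vert \mathcal R(t_m)\vert},
\qquad
\ell(\mathbf\beta) = \log L(\mathbf\beta) = \sum_{m=1}^{M} \left\{ \log \vert \mathcal R^L(t_m)\vert - \log \vert \mathcal R(t_m)\vert \right\}.
$$

Only $\vert \mathcal R^L(t_m)\vert$ depends on $\mathbf\beta$, through indicators of the form $I\{S(\mathbf X_j;\mathbf\beta) < S(\mathbf X_i;\mathbf\beta)\}$ for $j \in \mathcal R(t_m)$.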

Then estimate $\mathbf \beta$ by maximizing the log-pseudo-likelihood function.
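To make the objective concrete, here is a minimal Python sketch that evaluates a log-pseudo-likelihood of this kind for a given $\mathbf\beta$. Several things are assumptions of the sketch and not taken from the paper: the linear score `X @ beta`, the function name `log_pseudo_likelihood`, summing over every observed failure (so tied failures each contribute a term), and skipping failures whose risk set is empty. The risk set uses the strict inequality $Z_j > t$ given in the notation above.

```python
import numpy as np

def log_pseudo_likelihood(beta, X, Z, delta):
    """Sum over observed failures of log(|R^L| / |R|).

    Assumes a linear score S(X; beta) = X @ beta (an illustrative choice).
    """
    scores = X @ beta
    total = 0.0
    for i in np.flatnonzero(delta):          # subjects with an observed failure
        at_risk = Z > Z[i]                   # risk set {j : Z_j > Z_i}
        n_risk = at_risk.sum()
        if n_risk == 0:
            continue                         # nobody left to compare against
        # lower-risk group: at-risk subjects scoring below subject i
        n_low = np.sum(at_risk & (scores < scores[i]))
        if n_low == 0:
            return -np.inf                   # log(0): beta ranks this failure last
        total += np.log(n_low / n_risk)
    return total
```

A score that ranks every failing subject above everyone still at risk gives the largest possible value, 0, while a score that ranks some failing subject below the whole risk set drives the value to $-\infty$.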

Because the indicator functions make this objective discontinuous in $\mathbf \beta$ and hard to optimize, replace each indicator with a smooth kernel approximation, which yields a smoothed log-pseudo-likelihood.
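The smoothing step can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the logistic sigmoid stands in for whatever kernel the authors use, the bandwidth `h`, the guard constant `eps`, and the function name are hypothetical, and the score is again taken to be linear, `X @ beta`.

```python
import numpy as np

def smoothed_log_pseudo_likelihood(beta, X, Z, delta, h=0.1, eps=1e-12):
    """Log-pseudo-likelihood with I(S_i > S_j) replaced by sigmoid((S_i - S_j) / h).

    Assumes S(X; beta) = X @ beta; the sigmoid kernel and bandwidth h are
    illustrative choices.
    """
    scores = X @ beta
    diff = scores[:, None] - scores[None, :]            # diff[i, j] = S_i - S_j
    # sigmoid(diff / h), written with tanh for numerical stability
    smooth_low = 0.5 * (1.0 + np.tanh(diff / (2.0 * h)))
    at_risk = Z[None, :] > Z[:, None]                   # at_risk[i, j] = (Z_j > Z_i)
    n_risk = at_risk.sum(axis=1)
    n_low = (smooth_low * at_risk).sum(axis=1)          # smoothed |R^L| per subject
    use = (delta == 1) & (n_risk > 0)                   # observed failures only
    # eps keeps the logarithm finite when the smoothed count underflows to 0
    return float(np.sum(np.log((n_low[use] + eps) / n_risk[use])))
```

As `h` shrinks, the sigmoid approaches the indicator and the smoothed objective approaches the unsmoothed one; a larger `h` gives a smoother surface at the cost of a cruder approximation.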

Variable Selection

For variable selection, maximize a penalized version of the log-pseudo-likelihood, in which $\lambda_n$ is a tuning parameter and $J(\cdot)$ is a penalty function, here the adaptive LASSO penalty.
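In symbols, a plausible form of this penalized objective is shown below. The names $Q$, $\ell$, and $\tilde{\mathbf\beta}$ are notation assumed for this sketch: $\ell(\mathbf\beta)$ denotes the (smoothed) log-pseudo-likelihood, and $\tilde{\mathbf\beta}$ an initial estimate such as the unpenalized maximizer, which is the usual way adaptive LASSO weights are built. The paper's scaling of the penalty term may differ.

$$
Q(\mathbf\beta) = \ell(\mathbf\beta) - \lambda_n J(\mathbf\beta),
\qquad
J(\mathbf\beta) = \sum_{k=1}^{p} \frac{\vert \beta_k \vert}{\vert \tilde\beta_k \vert}.
$$

Coefficients with a small initial estimate $\vert\tilde\beta_k\vert$ receive a heavy penalty and tend to be shrunk to zero, while coefficients with a large initial estimate are penalized lightly.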

Solve the resulting optimization problem with a coordinate descent algorithm.
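A coordinate-wise scheme for a penalized objective of this kind can be sketched as below. Everything specific here is an assumption made for illustration and not the authors' algorithm: the linear score, the sigmoid smoothing with bandwidth `h`, fixing the first coefficient for identifiability, the bounded one-dimensional search over `[-width, width]`, and all function names. Each update is accepted only if it improves the objective, and exact zero is always tried as a candidate so that coefficients can be removed.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def smoothed_lpl(beta, X, Z, delta, h=0.5, eps=1e-12):
    """Sigmoid-smoothed log-pseudo-likelihood, assuming S(X; beta) = X @ beta."""
    s = X @ beta
    low = 0.5 * (1.0 + np.tanh((s[:, None] - s[None, :]) / (2.0 * h)))
    at_risk = Z[None, :] > Z[:, None]                 # at_risk[i, j] = (Z_j > Z_i)
    n_risk = at_risk.sum(axis=1)
    use = (delta == 1) & (n_risk > 0)
    n_low = (low * at_risk).sum(axis=1)
    return float(np.sum(np.log((n_low[use] + eps) / n_risk[use])))

def penalized_objective(beta, X, Z, delta, lam, weights, h=0.5):
    """Smoothed log-pseudo-likelihood minus a weighted L1 (adaptive LASSO style) penalty."""
    return smoothed_lpl(beta, X, Z, delta, h) - lam * np.sum(weights * np.abs(beta))

def coordinate_ascent(X, Z, delta, lam, weights, beta0, h=0.5, n_sweeps=20, width=5.0):
    """Cycle through coordinates, maximizing the objective in one coefficient at a time.

    beta[0] is held fixed (an assumed identifiability constraint).
    """
    beta = np.asarray(beta0, dtype=float).copy()
    best = penalized_objective(beta, X, Z, delta, lam, weights, h)
    for _ in range(n_sweeps):
        for k in range(1, beta.size):
            def neg_obj(b, k=k):
                trial = beta.copy()
                trial[k] = b
                return -penalized_objective(trial, X, Z, delta, lam, weights, h)
            res = minimize_scalar(neg_obj, bounds=(-width, width), method="bounded")
            for cand in (res.x, 0.0):                 # also try dropping the variable
                val = -neg_obj(cand)
                if val > best:                        # accept only improvements
                    beta[k], best = cand, val
    return beta, best
```

Because updates are accepted only when they increase the objective, the returned value is never worse than the starting value. In practice `weights` would be built from an initial unpenalized fit and `lam` chosen by a criterion such as cross-validation; both are left to the caller here.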

Asymptotic Results


Score System Without Variable Selection

case 1 - case 5

Variable Selection Examples

case 6 - case 9


Shen W, Ning J, Yuan Y, Lok AS, Feng Z. Model‐free scoring system for risk prediction with application to hepatocellular carcinoma study. Biometrics. 2017 Jul 25.

Published in categories Biostatistics