# Model-Free Scoring System for Risk Prediction


## Method

### Notation

- $T$: survival time
- $\mathbf X$: $p$-vector of covariates
- $S(\mathbf X)$: scoring system, where higher scores imply higher risk levels and shorter survival time
- $C$: censoring time, independent of $T$ conditional on $\mathbf X$
- $(Z_i, \delta_i, \mathbf X_i),\ i=1,2,\ldots,n$: observed data, i.i.d. copies of $(Z, \delta, \mathbf X)$, where $Z=\min(T,C)$ and $\delta = I(T\le C)$ is the event indicator; the censoring time $C$ is allowed to depend on the covariates $\mathbf X$.
- $\mathcal R(t) = \lbrace j: Z_j>t\rbrace$: the risk set

Then the time-dependent ROC curve of the score $S(\mathbf X)$ is defined by
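
The risk sets used in the Estimation section suggest the incident/dynamic formulation, in which the cases at time $t$ are the subjects failing at $t$ and the controls are those still event-free. Assuming that is the intended definition (the threshold $c$ and the names $TP_t$, $FP_t$ are notation introduced here, not taken from the source), it can be written as:

$$
\begin{aligned}
TP_t(c) &= P\big(S(\mathbf X) > c \mid T = t\big), \\
FP_t(c) &= P\big(S(\mathbf X) > c \mid T > t\big), \\
ROC_t(p) &= TP_t\big(FP_t^{-1}(p)\big), \quad p \in [0,1], \\
AUC(t) &= \int_0^1 ROC_t(p)\,dp = P\big(S(\mathbf X_i) > S(\mathbf X_j) \mid T_i = t,\ T_j > t\big).
\end{aligned}
$$

The last equality holds for continuously distributed scores; it says that $AUC(t)$ is the probability that a subject failing at $t$ has a higher score than a subject who survives beyond $t$.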

### Estimation

Let $t_1<\ldots <t_M$ be the ordered unique failure times among $\lbrace Z_1,\ldots, Z_n\rbrace$, with $M\le n$.

At each time point $t_m$, the subjects in the risk set $\mathcal R(t_m)$ can be divided into two groups relative to the subject $i$ who fails at $t_m$:

- $\mathcal R^L(t_m)$, the set of patients with relatively lower risk, whose scores are lower than $S(\mathbf X_i;\mathbf \beta)$
- $\mathcal R^H(t_m)$, the set of patients with relatively higher risk than subject $i$.

Use the proportion of low-risk patients in the risk set, $\frac{\vert \mathcal R^L(t)\vert}{\vert \mathcal R(t)\vert}$, as an estimator of $AUC(t)$.
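
To make the estimator concrete, here is a minimal NumPy sketch that computes this proportion at every failure time. The function name `low_risk_proportions` is hypothetical, the scores are passed in directly (no particular form of $S(\mathbf X;\mathbf\beta)$ is assumed), and averaging over tied failure times is a choice made for this sketch rather than something stated in the source.

```python
import numpy as np

def low_risk_proportions(z, delta, score):
    """Return (failure_times, proportions), with |R^L(t_m)| / |R(t_m)| at each t_m."""
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    score = np.asarray(score, dtype=float)

    failure_times = np.unique(z[delta == 1])      # sorted t_1 < ... < t_M
    proportions = np.full(failure_times.shape, np.nan)

    for m, t in enumerate(failure_times):
        in_risk_set = z > t                       # R(t_m) = {j : Z_j > t_m}
        if not in_risk_set.any():
            continue                              # empty risk set: leave NaN
        failing = (z == t) & (delta == 1)         # subject(s) i failing at t_m
        # Fraction of the risk set scoring strictly below each failing subject,
        # averaged over tied failures (an assumption of this sketch).
        fractions = [(score[in_risk_set] < s_i).mean() for s_i in score[failing]]
        proportions[m] = np.mean(fractions)
    return failure_times, proportions

# Toy data: four subjects, the third one is censored.
times, props = low_risk_proportions(z=[1, 2, 3, 4], delta=[1, 1, 0, 1], score=[4, 1, 3, 2])
```

In the toy data, the subject failing at time 1 has the highest score, so everyone still at risk counts as low-risk; the subject failing at time 2 has the lowest score, so nobody does. The last failure time has an empty risk set and is left as NaN.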

And construct the following pseudo-likelihood function

Then estimate $\mathbf \beta$ by maximizing the log-pseudo-likelihood function.

Because the indicator function makes this objective discontinuous in $\mathbf \beta$ and hard to optimize, replace the indicator with a smooth kernel, which gives a smoothed approximation of the above log-pseudo-likelihood function.
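
The smoothing idea can be sketched as follows. Everything specific here is an assumption, not the source's formula: the score is taken to be linear, $S(\mathbf X;\mathbf\beta)=\mathbf X^\top\mathbf\beta$; the log-pseudo-likelihood is taken to be the sum, over observed failures, of the log proportion of low-risk subjects in the risk set; and the indicator $I(u>0)$ is replaced by a logistic kernel with a hypothetical bandwidth `h`.

```python
import numpy as np

def smooth_indicator(u, h):
    """Logistic-kernel approximation to I(u > 0); sharper as h -> 0."""
    return 0.5 * (1.0 + np.tanh(u / (2.0 * h)))   # numerically stable sigmoid(u / h)

def smoothed_log_pl(beta, X, z, delta, h=0.1):
    """Smoothed log-pseudo-likelihood under the assumptions stated above."""
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    score = X @ np.asarray(beta, dtype=float)     # assumed linear score
    total = 0.0
    for i in np.flatnonzero(delta == 1):          # loop over observed failures
        in_risk_set = z > z[i]                    # R(Z_i) = {j : Z_j > Z_i}
        if not in_risk_set.any():
            continue                              # nothing to compare against
        # Smoothed version of |R^L| / |R|: mean of soft I(score_j < score_i).
        frac = smooth_indicator(score[i] - score[in_risk_set], h).mean()
        total += np.log(max(frac, 1e-12))         # guard against log(0)
    return total

# Toy check: scores that rank subjects in failure order should score higher.
X = [[4.0], [3.0], [2.0], [1.0]]
z = [1, 2, 3, 4]
delta = [1, 1, 1, 1]
good = smoothed_log_pl([1.0], X, z, delta)    # high covariate value -> early failure
bad = smoothed_log_pl([-1.0], X, z, delta)    # reversed ranking
```

A smaller `h` tracks the indicator more closely but makes the surface rougher; the resulting function is differentiable in `beta`, so it can be handed to a gradient-based or coordinate-wise optimizer.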

### Variable Selection

Maximize the following penalized objective function,

where $\lambda_n$ is a tuning parameter and $J(\cdot)$ is a penalty function, here the adaptive LASSO penalty.
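
For reference, the adaptive LASSO penalty in its standard form weights each coefficient by the inverse of an initial estimate. The symbols $\tilde{\mathbf\beta}$ (an initial unpenalized estimate), $\gamma$, $\tilde\ell$ (the smoothed log-pseudo-likelihood) and $Q$ are notation introduced here, and the exact scaling of the penalty term in the source's objective is an assumption:

$$
J(\mathbf\beta) = \sum_{j=1}^{p} w_j \,\vert \beta_j \vert, \qquad
w_j = \frac{1}{\vert \tilde\beta_j \vert^{\gamma}}, \quad \gamma > 0,
\qquad
Q(\mathbf\beta) = \tilde\ell(\mathbf\beta) - \lambda_n J(\mathbf\beta).
$$

Coefficients with a small initial estimate receive a large weight and are pushed to exactly zero, while large coefficients are penalized only lightly.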

Solve the maximization with a coordinate descent algorithm.
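
A generic coordinate-wise scheme for an $\ell_1$-penalized maximization might look like the sketch below. It is not the authors' algorithm: the helper names `golden_max` and `coordinate_ascent`, the search `radius`, and the use of golden-section search for each one-dimensional update are all assumptions, and the line search is only reliable when the objective is unimodal along each coordinate.

```python
import math

def golden_max(f, lo, hi, tol=1e-7):
    """Maximize a unimodal 1-D function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:                      # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                            # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2

def coordinate_ascent(objective, beta0, weights, lam, radius=5.0, sweeps=50, tol=1e-6):
    """Cyclically maximize objective(beta) - lam * sum_j w_j |beta_j| one coordinate at a time."""
    beta = list(beta0)

    def penalized(b):
        return objective(b) - lam * sum(w * abs(v) for w, v in zip(weights, b))

    for _ in range(sweeps):
        max_change = 0.0
        for k in range(len(beta)):
            def along_k(v, k=k):
                return penalized(beta[:k] + [v] + beta[k + 1:])
            best = golden_max(along_k, beta[k] - radius, beta[k] + radius)
            if along_k(0.0) >= along_k(best):
                best = 0.0               # snap to exact zero when it is at least as good
            max_change = max(max_change, abs(best - beta[k]))
            beta[k] = best
        if max_change < tol:
            break                        # no coordinate moved noticeably in this sweep
    return beta

# Toy concave objective whose penalized maximizer is known by soft-thresholding.
def toy_objective(b):
    return -0.5 * ((b[0] - 2.0) ** 2 + (b[1] - 0.1) ** 2)

beta_hat = coordinate_ascent(toy_objective, [0.0, 0.0], weights=[1.0, 1.0], lam=0.5)
```

On the toy problem the first coefficient is shrunk from 2 towards 1.5 and the second is set exactly to zero, which is the variable-selection behaviour the penalty is meant to produce. For the scoring problem, `objective` would be the smoothed log-pseudo-likelihood and `weights` the adaptive LASSO weights.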

## Asymptotic Results

## Simulation

### Score System Without Variable Selection

Cases 1 to 5

### Variable Selection Examples

Cases 6 to 9