WeiYa's Work Yard

A traveler with endless curiosity, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

MMRM: Mixed-Models for Repeated Measures

Posted on (Update: )
Tags: Mixed-Models

This post is based on vignettes of MMRM R package: https://openpharma.github.io/mmrm/main/index.html

Laid and Ware (1982) introduced the basic linear mixed-effects model for a single level of grouping to be any model that expresses an $n_i$-dimensional response (column) vector $y_i$ for the $i$-th subject (or, more generally, group or unit) as

\[y_i = X_i\beta + Z_ib_i + \epsilon_i, i=1,\ldots,M \\ b_i\sim N(0, \Psi), \varepsilon_i\sim N(0, \sigma^2I)\]

MMRM model

directly model the variance-covariance structure of the response, i.e., to account for within-subject dependencies using the within-group component $\Lambda_i$.

This yields the MMRM with

\[y_i = X_i\beta + \epsilon_i, \epsilon_i\sim N(0, \Sigma_i), i=1,\ldots,M\]

In total,

\[Y = X\beta + \epsilon\,,\]

where $\epsilon\sim N(0, \Omega)$ with $\Omega\in \IR^{N\times N}$ being a block-diagonal matrix, containing the subject-specific $\Sigma_i$ on the diagonal.

Covariance Structures

In many covariance structures, use the decomposition

\[\Sigma = DPD\]

where the diagonal standard deviation matrix is

$D = \diag\{\sigma_1,\ldots, \sigma_m\}$ with entries $\sigma_i > 0$ and the symmetric correlation matrix $P = {\rho_{ij}}$ with entries $\rho_{ij}\in (-1, 1)$.

  • for $\sigma$, use the natural logarithm $\log(\sigma)$ to map them to $\IR$
  • for correlation parameters $\rho$, use the transformation
\[\theta = f(\rho) = \sign(\rho) \sqrt{\frac{\rho^2}{1-\rho^2}}\]

which maps the correlation parameter to $\theta\in \IR$. It has the inverse

\[\rho = f^{-1}(\theta) = \frac{\theta}{\sqrt{1+\theta^2}}.\]

Unstructured (us)

then we have $k=m(m+1)/2$ parameters

Homogeneous (ad) and Heterogeneous Ante-dependence (adh)

the ante-dependence correlation structure (Gabriel 1962) is useful for balanced designs where the observations are not necessarily equally spaced in time.

Use an order one ante-dependence model, where the correlation matrix $P$ has elements

\[\rho_{ij} = \prod_{k=i}^{j-1}\rho_k\,.\]

then

\[\theta = (\log(\sigma_1),\ldots, \log(\sigma_m), f(\rho_1), \ldots, f(\rho_{m-1}))\]

in total $2m-1$ variance parameters.

Homogeneous (toep) and Heterogeneous Toeplitz (toeph)

Toeplitz matrices are diagonal-constant matrices. The correlation between two time points only depends on the distance between them, i.e., $\rho_{ij}=\rho_{\vert i-j\vert }$.

then

\[\theta = (\log(\sigma_1),\ldots, \log(\sigma_m), f(\rho_1), \ldots, f(\rho_{m-1}))\]

in total $2m-1$ variance parameters.

Homogeneous (cs) and Heterogeneous Compound Symmetry (csh)

the compound symmetry covariance structures assume a constant correlation between different time points

\[\rho_{ij} = \rho\]

then in total there are $k=m+1$ variance parameters.

Spatial Covariance Structure: spatial exponential (sp_exp)

spatial covariance structures, unlike other covariance structures, does not require that the timepoints are consistent between subjects. Instead, as long as the distance between visits can be quantified in terms of time and/or other coordinates, the spatial covariance structure can be applied.

for spatial exponential, the covariance structure is define as follows:

\[\rho_{ij} = \rho^{d_{ij}}\]

where $d_{ij}$ is the distance between time point $i$ and time point $j$.

total number of parameters $k=2$ is needed.

Hypothesis Testing

Contained Effect

For effect $E_2$, it is said to contain $E_1$ if

  1. the involved numerical covariates are identical for these two effects
  2. all the categorical covariates associated with $E_1$ are also associated with $E_2$

For example, for the following model formula

\[Y = A + B + A * B\]

using $E_A, E_B$ and $E_{A*B}$ to denote the effect for $A, B$ and $A * B$ respectively, we have

  • if $A$ and $B$ are both categorical, then $E_{A*B}$ contains $E_A$ and $E_B$
  • if $A$ and $B$ are both numerical, then $E_{A*B}$ do not contain $E_A$ or $E_B$
  • if $A$ is categorical and $B$ is numerical, then $E_{A*B}$ contains $E_B$ but not $E_A$

Model Fitting Algorithm

Let $i=1,\ldots, n$ denote the subjects from which we observe multiple observations $j=1,\ldots,m_i$ for total $m_i$ time points $t_{ij}\in {t_1,\ldots,t_m}$.

Note that the number of time points for a specific subject $m_i$ can be smaller than $m$, when only a subset of the possible $m$ time points have been observed.

Linear Model

For each subject $i$ we observe a vector $Y_i = (y_{i1}, \ldots, y_{im_i})^\top \in \IR^{m_i}$

and given a design matrix $X_i \in \IR^{m_i\times p}$

and a corresponding coefficient vector $\beta \in \IR^p$ we assume that the observations are multivariate normal distributed:

\[Y_i \sim N(X_i\beta, \Sigma_i)\]

where the covariance matrix $\Sigma_i\in \IR^{m_i\times m_i}$ is derived by subsetting the overall covariance matrix $\Sigma \in \IR^{m\times m}$ appropriately by

\[\Sigma_i = G_i^{-1/2} S_i^T\Sigma S_iG_i^{-1/2}\]

where the subsetting matrix $S_i \in {0, 1}^{m\times m_i}$. For example, assume a subject was observed on time points 1, 3, 4 out of total 5 then the subsetting matrix is

\[S_i = \begin{bmatrix} 1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{bmatrix}\]

$G_i \in \IR^{m_i\times m_i}_{>0}$ is the diagonal weight matrix, which is the identity matrix if no weights are specified.

Combine together

\[Y = X\beta + \epsilon\]

where $N=\sum_{i=1}^n m_i$

the MLE estimate of $\beta$ is the WLSE

for REML estimate, first obtain the marginal likelihood of the variance parameter $\theta$ by integrating out the remaining parameters $\beta$ from the likelihood.

Prediction of conditional mean

since residuals can be correlated, potentially existing observed outcomes of the same individuals can be informative for predicting the unobserved values of the same individual.

assume that the data is sorted such that $Y_{ij} = y_{ij}, j=k+1,k+2,\ldots, p$ are observed and $Y_{ij}, j=1,2,\ldots, k$ are not. The special case of all outcomes being unobserved (new individual) is covered with $k=p$

Let further

\[\Sigma_i(X_i, \theta) = \begin{pmatrix} \Sigma_{i}^{new, new}(X_i, \theta) & \Sigma_i^{new, old}(X_i, \theta)\\ \Sigma_{i}^{old, new}(X_i, \theta) & \Sigma_i^{old, old}(X_i, \theta) \end{pmatrix}\]

be a block decomposition.

Predictions can then be made based on the conditional distribution

Prediction Sampling for Prediction Interval

with the conditional variance formula

\[\Var(Y_i) = \Var(E(Y_i\mid \theta)) + E(Var(Y_i\mid \theta))\]

Prediction of Conditional Mean for New Subjects

If there are no observations for a subject, then the prediction is quite simple.

\[Y_i = X_i\hat\beta\]

An Example

> head(fev_data)
  USUBJID AVISIT ARMCD                      RACE    SEX  FEV1_BL
1     PT1   VIS1   TRT Black or African American Female 25.27144
2     PT1   VIS2   TRT Black or African American Female 25.27144
3     PT1   VIS3   TRT Black or African American Female 25.27144
4     PT1   VIS4   TRT Black or African American Female 25.27144
5     PT2   VIS1   PBO                     Asian   Male 45.02477
6     PT2   VIS2   PBO                     Asian   Male 45.02477
      FEV1    WEIGHT VISITN    VISITN2
1       NA 0.6767231      1 -0.6264538
2 39.97105 0.8006186      2  0.1836433
3       NA 0.7087859      3 -0.8356286
4 20.48379 0.8088997      4  1.5952808
5       NA 0.4651848      1  0.3295078

> length(fev_data$FEV1)
[1] 800
> sum(is.na(fev_data$FEV1))
[1] 263

# the missing value for each visit

> table(fev_data$AVISIT[is.na(fev_data$FEV1)])

VIS1 VIS2 VIS3 VIS4 
  66   60   71   66 

use an MMRM with the given covariates and an unstructured covariance matrix for the timepoints (AVISIT) within the subjects (USUBJID). By default, this uses restricted maximum likelihood (REML)

> fit
mmrm fit

Formula:     FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID)
Data:        fev_data (used 537 observations from 197 subjects with maximum 4 
timepoints)
Covariance:  unstructured (10 variance parameters)
Inference:   REML
Deviance:    3386.45

Coefficients: 
                  (Intercept) RACEBlack or African American 
                   30.7774065                     1.5305950 
                    RACEWhite                     SEXFemale 
                    5.6435679                     0.3260274 
                     ARMCDTRT                    AVISITVIS2 
                    3.7744139                     4.8396039 
                   AVISITVIS3                    AVISITVIS4 
                   10.3421671                    15.0537863 
          ARMCDTRT:AVISITVIS2           ARMCDTRT:AVISITVIS3 
                   -0.0420899                    -0.6938068 
          ARMCDTRT:AVISITVIS4 
                    0.6241229 

Model Inference Optimization:
Converged with code 0 and message: No message provided.

we can see the coefficients table with Satterthwaite degrees of freedom as well as the covariance matrix estimate

> summary(fit)
mmrm fit

Formula:     FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID)
Data:        fev_data (used 537 observations from 197 subjects with maximum 4 
timepoints)
Covariance:  unstructured (10 variance parameters)
Method:      Satterthwaite
Vcov Method: Asymptotic
Inference:   REML

Model selection criteria:
     AIC      BIC   logLik deviance 
  3406.4   3439.3  -1693.2   3386.4 

Coefficients: 
                               Estimate Std. Error        df t value Pr(>|t|)
(Intercept)                    30.77741    0.88657 218.79000  34.715  < 2e-16
RACEBlack or African American   1.53059    0.62446 168.67000   2.451 0.015263
RACEWhite                       5.64357    0.66559 157.14000   8.479 1.56e-14
SEXFemale                       0.32603    0.53194 166.14000   0.613 0.540776
ARMCDTRT                        3.77441    1.07416 145.55000   3.514 0.000589
AVISITVIS2                      4.83960    0.80173 143.87000   6.036 1.27e-08
AVISITVIS3                     10.34217    0.82269 155.56000  12.571  < 2e-16
AVISITVIS4                     15.05379    1.31288 138.46000  11.466  < 2e-16
ARMCDTRT:AVISITVIS2            -0.04209    1.12933 138.56000  -0.037 0.970324
ARMCDTRT:AVISITVIS3            -0.69381    1.18764 158.17000  -0.584 0.559924
ARMCDTRT:AVISITVIS4             0.62412    1.85096 129.71000   0.337 0.736520
                                 
(Intercept)                   ***
RACEBlack or African American *  
RACEWhite                     ***
SEXFemale                        
ARMCDTRT                      ***
AVISITVIS2                    ***
AVISITVIS3                    ***
AVISITVIS4                    ***
ARMCDTRT:AVISITVIS2              
ARMCDTRT:AVISITVIS3              
ARMCDTRT:AVISITVIS4              
---
Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1

Covariance estimate:
        VIS1    VIS2    VIS3    VIS4
VIS1 40.5544 14.3960  4.9760 13.3779
VIS2 14.3960 26.5714  2.7836  7.4773
VIS3  4.9760  2.7836 14.8980  0.9036
VIS4 13.3779  7.4773  0.9036 95.5565

Published in categories Note