High Dimensional Covariance Matrix Estimation

Posted on Mar 19, 2019

This note is based on Cai TT, Ren Z, Zhou HH. Estimating structured high-dimensional covariance and precision matrices: Optimal rates and adaptive estimation. Electronic Journal of Statistics. 2016;10(1):1-59.

Introduction

The standard and most natural estimator, the sample covariance matrix, performs poorly in high-dimensional settings and can lead to invalid conclusions.

When $p/n\rightarrow c\in (0,\infty]$, the largest eigenvalue of the sample covariance matrix is not a consistent estimate of the largest eigenvalue of the population covariance matrix, and the eigenvectors of the sample covariance matrix can be nearly orthogonal to the truth.
When $p > n$, the sample covariance matrix is not invertible, so it cannot be used directly in the many applications that require an estimate of the precision matrix.
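
A minimal NumPy sketch of both failures, assuming Gaussian data with identity covariance (so every population eigenvalue equals 1). The sizes `n = 50`, `p = 200`, the seed, and the helper name `sample_covariance` are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance of the rows of X (n observations, p variables)."""
    Xc = X - X.mean(axis=0)               # center each variable
    return Xc.T @ Xc / (X.shape[0] - 1)

rng = np.random.default_rng(0)
n, p = 50, 200                            # illustrative sizes with p > n
X = rng.standard_normal((n, p))           # population covariance is the identity
S = sample_covariance(X)

rank = np.linalg.matrix_rank(S)           # at most n - 1 after centering, so S is singular
top_eig = np.linalg.eigvalsh(S)[-1]       # largest sample eigenvalue (population value is 1)

print(f"rank of S: {rank} (p = {p})")
print(f"largest sample eigenvalue: {top_eig:.2f}")
```

The rank deficiency means `S` has no inverse, and the largest sample eigenvalue lands far above the population value of 1 even though the data carry no spike at all.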

Structural assumptions are used to overcome this difficulty:

bandable covariance matrices
sparse covariance matrices
spiked covariance matrices
covariances with a tensor product structure
sparse precision matrices
bandable precision matrix via Cholesky decomposition
latent graphical model

Regularization methods:

banding method
tapering
thresholding
penalized likelihood estimation
regularized principal components
penalized regression for precision matrix estimation
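
The first three methods act entrywise on the sample covariance matrix and can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the tapering weights shown are one common choice (weight 1 up to lag $k/2$, then linear decay to 0 at lag $k$), and keeping the diagonal untouched in the thresholding rule is an assumption made here for convenience.

```python
import numpy as np

def _lag_matrix(p):
    """Matrix of |i - j| for a p x p array."""
    idx = np.arange(p)
    return np.abs(idx[:, None] - idx[None, :])

def band(S, k):
    """Banding: keep entries with |i - j| <= k and zero out the rest."""
    return np.where(_lag_matrix(S.shape[0]) <= k, S, 0.0)

def taper(S, k):
    """Tapering: weight 1 for |i - j| <= k/2, linear decay to 0 at |i - j| = k."""
    w = np.clip(2.0 - 2.0 * _lag_matrix(S.shape[0]) / k, 0.0, 1.0)
    return S * w

def hard_threshold(S, lam):
    """Hard thresholding: zero out off-diagonal entries with |s_ij| <= lam."""
    T = np.where(np.abs(S) > lam, S, 0.0)
    np.fill_diagonal(T, np.diag(S))       # assumption: the diagonal is never thresholded
    return T

# Small usage example on a hand-written symmetric matrix.
S = np.array([[1.0, 0.5, 0.1],
              [0.5, 1.0, 0.5],
              [0.1, 0.5, 1.0]])
print(band(S, 1))
print(taper(S, 2))
print(hard_threshold(S, 0.2))
```

Banding and tapering depend on the index distance $|i-j|$, so they only make sense when the variables have a natural order; thresholding looks at the magnitude of each entry alone.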

Theoretical studies characterize the fundamental difficulty of the various estimation problems in terms of minimax risks:

the optimal rates of convergence for estimating a class of high-dimensional bandable covariance matrices under the spectral norm and Frobenius norm losses
optimal estimation of sparse covariance and sparse precision matrices under a range of losses
optimal estimation of a Toeplitz covariance matrix
minimax estimation for a large class of sparse spiked covariance matrices under the spectral norm loss

Goal: survey recent optimality results on the estimation of structured high-dimensional covariance and precision matrices, and discuss some key technical tools used in the theoretical analyses.

The optimal procedures for estimating the bandable, Toeplitz, and sparse covariance matrices are obtained by smoothing or thresholding the sample covariance matrices based on various sparsity assumptions.
In contrast, estimation of sparse spiked covariance matrices, which have sparse principal components, requires significantly different techniques to achieve optimality results.

Notation:

a random sample of size $n$: $\{X^{(1)},\ldots,X^{(n)}\}$
a $p$-dimensional random vector $X=(X_1,\ldots,X_p)'$ that follows some distribution with covariance matrix $\Sigma=(\sigma_{ij})$
Goal: Estimate the covariance matrix $\Sigma$ and its inverse.
$\ell_\omega$ operator norm: $\Vert M\Vert_{\ell_\omega}=\max_{\Vert x\Vert_\omega=1}\Vert Mx\Vert_\omega$
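
For $\omega \in \{1, 2, \infty\}$ the operator norm has a closed form: the maximum absolute column sum, the largest singular value (the spectral norm), and the maximum absolute row sum, respectively. A small check with `numpy.linalg.norm`, using an arbitrary $2\times 2$ matrix chosen only for illustration:

```python
import numpy as np

M = np.array([[1.0, -2.0],
              [3.0,  4.0]])

norm_l1 = np.linalg.norm(M, 1)          # max absolute column sum: max(4, 6) = 6
norm_l2 = np.linalg.norm(M, 2)          # largest singular value (spectral norm)
norm_linf = np.linalg.norm(M, np.inf)   # max absolute row sum: max(3, 7) = 7

print(norm_l1, norm_l2, norm_linf)
```

For a symmetric matrix the $\ell_1$ and $\ell_\infty$ operator norms coincide, and in general $\Vert M\Vert_{\ell_2}^2 \le \Vert M\Vert_{\ell_1}\Vert M\Vert_{\ell_\infty}$, which is one reason the $\ell_1$ norm is a convenient upper bound for the spectral norm of covariance matrices.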

The distribution of a random vector $X$ is said to be sub-Gaussian with constant $\rho > 0$ if $$ P\{\vert v'(X-EX)\vert > t\} \le 2e^{-t^2\rho/2}\,, $$ for all $t > 0$ and all deterministic unit vectors $v$, i.e., $\Vert v\Vert=1$.

Estimation of structured covariance matrices

Bandable covariance matrices

Sparse covariance matrices

Here, unlike in the bandable case, no information on the “order” among the variables is available.

Sparse spiked covariance matrices

Spiked covariance matrix

\[\Sigma = \sum_{i=1}^r\lambda_iv_iv_i'+I\,,\]

where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_r > 0$ and the vectors $v_1,\ldots,v_r$ are orthonormal. Since the spectrum of $\Sigma$ has $r$ spikes, this is called the spiked covariance model.
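
A short numerical sketch of why the spectrum has $r$ spikes: pairing each $\lambda_i$ with its own direction $v_i$, the eigenvalues of $\Sigma$ are $\lambda_i + 1$ for $i = 1,\ldots,r$ and $1$ for the remaining $p - r$ directions. The dimensions, the spike sizes, and the helper name `spiked_covariance` below are illustrative assumptions, and the orthonormal vectors are generated by a QR factorization of a random matrix.

```python
import numpy as np

def spiked_covariance(lambdas, V):
    """Sigma = sum_i lambda_i v_i v_i' + I, where the columns of V are orthonormal."""
    p = V.shape[0]
    return (V * lambdas) @ V.T + np.eye(p)   # V * lambdas scales column i by lambda_i

rng = np.random.default_rng(0)
p, r = 10, 3                                  # illustrative dimensions
V, _ = np.linalg.qr(rng.standard_normal((p, r)))  # reduced QR: p x r with orthonormal columns
lambdas = np.array([5.0, 3.0, 1.0])           # illustrative spike sizes, in decreasing order

Sigma = spiked_covariance(lambdas, V)
eigs = np.linalg.eigvalsh(Sigma)[::-1]        # eigenvalues in descending order

print(np.round(eigs, 6))
```

The top $r$ eigenvalues stand out from a flat bulk at 1, and the corresponding eigenvectors span the same subspace as $v_1,\ldots,v_r$.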

Estimation of structured precision matrices

Published in categories Memo
