High Dimensional Covariance Matrix Estimation
This note is based on Cai TT, Ren Z, Zhou HH. Estimating structured high-dimensional covariance and precision matrices: Optimal rates and adaptive estimation. Electronic Journal of Statistics. 2016;10(1):1-59.
Introduction
The standard and most natural estimator, the sample covariance matrix, performs poorly in high-dimensional settings and can lead to invalid conclusions.
- When $p/n\rightarrow c\in (0,\infty]$, the largest eigenvalue of the sample covariance matrix is not a consistent estimate of the largest eigenvalue of the population covariance matrix, and the eigenvectors of the sample covariance matrix can be nearly orthogonal to the truth.
- When $p > n$, the sample covariance matrix is not invertible, and thus cannot be used in the many applications that require an estimate of the precision matrix.
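Both failures are easy to see numerically. The following is a minimal sketch, assuming Gaussian data with $\Sigma = I_p$ and the illustrative sizes $n = 50$, $p = 200$ (neither choice comes from the paper). Every population eigenvalue equals 1, yet the largest sample eigenvalue lands far above 1 (for this model it concentrates near $(1+\sqrt{p/n})^2$), and the rank of the sample covariance matrix is at most $n-1$, so it cannot be inverted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                        # illustrative sizes with p > n
X = rng.standard_normal((n, p))       # rows are i.i.d. N(0, I_p), so Sigma = I_p

S = np.cov(X, rowvar=False)           # p x p sample covariance matrix
rank_S = np.linalg.matrix_rank(S)     # at most n - 1 because of centering
top_eig = np.linalg.eigvalsh(S)[-1]   # largest sample eigenvalue (population value is 1)

print("rank of S:", rank_S, "out of p =", p)
print("largest sample eigenvalue:", top_eig)
```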
Structural assumptions used to overcome this difficulty:
- bandable covariance matrices
- sparse covariance matrices
- spiked covariance matrices
- covariances with a tensor product structure
- sparse precision matrices
- bandable precision matrix via Cholesky decomposition
- latent graphical model
Regularization methods:
- banding method
- tapering
- thresholding
- penalized likelihood estimation
- regularized principal components
- penalized regression for precision matrix estimation
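The first three methods in this list act entrywise on the sample covariance matrix and can be sketched in a few lines. This is only an illustration: the function names `band`, `taper`, and `hard_threshold` are made up for this note, the trapezoidal taper weights are one common choice, and leaving the diagonal untouched when thresholding is a simplification made here. In practice the bandwidth `k` and threshold `t` have to be tuned.

```python
import numpy as np

def band(S, k):
    """Banding: keep entries with |i - j| <= k, set the rest to zero."""
    p = S.shape[0]
    dist = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return np.where(dist <= k, S, 0.0)

def taper(S, k):
    """Tapering: weight 1 up to lag k/2, linear decay to 0 at lag k."""
    p = S.shape[0]
    dist = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    weights = np.clip(2.0 - dist / (k / 2.0), 0.0, 1.0)
    return weights * S

def hard_threshold(S, t):
    """Hard thresholding: zero out off-diagonal entries with |s_ij| <= t."""
    T = np.where(np.abs(S) > t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))   # the diagonal is kept as is
    return T

# Small hand-made example matrix (not real data).
S = np.array([[1.00, 0.50, 0.20, 0.05],
              [0.50, 1.00, 0.50, 0.20],
              [0.20, 0.50, 1.00, 0.50],
              [0.05, 0.20, 0.50, 1.00]])

print(band(S, 1))
print(taper(S, 2))
print(hard_threshold(S, 0.1))
```

Banding and tapering rely on the ordering of the variables, since they shrink entries according to the lag $|i-j|$, whereas thresholding looks only at the magnitude of each entry.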
Theoretical studies of the fundamental difficulty of the various estimation problems in terms of the minimax risks:
- the optimal rates of convergence for estimating a class of high-dimensional bandable covariance matrices under the spectral norm and Frobenius norm losses.
- optimal estimation of sparse covariance and sparse precision matrices under a range of losses
- optimal estimation of a Toeplitz covariance matrix
- the minimax estimation for a large class of sparse spiked covariance matrices under the spectral norm loss.
Goal: survey recent optimality results on the estimation of structured high-dimensional covariance and precision matrices, and discuss some key technical tools used in the theoretical analyses.
- The optimal procedures for estimating the bandable, Toeplitz, and sparse covariance matrices are obtained by smoothing or thresholding the sample covariance matrices based on various sparsity assumptions.
- In contrast, estimation of sparse spiked covariance matrices, which have sparse principal components, requires significantly different techniques to achieve optimality results.
Notation:
- a random sample $\{X^{(1)},\ldots,X^{(n)}\}$ of size $n$
- each observation is a copy of a $p$-dimensional random vector $X=(X_1,\ldots,X_p)'$, which follows some distribution with covariance matrix $\Sigma=(\sigma_{ij})$.
- Goal: Estimate the covariance matrix $\Sigma$ and its inverse.
- $\ell_\omega$ operator norm: $\Vert M\Vert_{\ell_\omega}=\max_{\Vert x\Vert_\omega=1}\Vert Mx\Vert_\omega$
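For $\omega \in \{1, 2, \infty\}$ the operator norm has a closed form: $\Vert M\Vert_{\ell_1}$ is the maximum absolute column sum, $\Vert M\Vert_{\ell_\infty}$ is the maximum absolute row sum, and $\Vert M\Vert_{\ell_2}$ is the largest singular value (the spectral norm). A small check with NumPy, using an arbitrary $2\times 2$ matrix chosen only for illustration:

```python
import numpy as np

M = np.array([[1.0, -2.0],
              [3.0,  4.0]])

norm_l1 = np.linalg.norm(M, ord=1)          # maximum absolute column sum
norm_l2 = np.linalg.norm(M, ord=2)          # spectral norm: largest singular value
norm_linf = np.linalg.norm(M, ord=np.inf)   # maximum absolute row sum

print(norm_l1, norm_l2, norm_linf)
```

For a symmetric matrix the $\ell_1$ and $\ell_\infty$ norms coincide, and in general $\Vert M\Vert_{\ell_2}^2 \le \Vert M\Vert_{\ell_1}\Vert M\Vert_{\ell_\infty}$, which is why the $\ell_1$ norm is often used to bound the spectral norm.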
Estimation of structured covariance matrices
Bandable covariance matrices
Sparse covariance matrices
In contrast to the bandable case, there is no information on the “order” among the variables.
Sparse spiked covariance matrices
Spiked covariance matrix
\[\Sigma = \sum_{i=1}^r\lambda_iv_iv_i'+I\,,\]where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_r > 0$ and the vectors $v_1,\ldots,v_r$ are orthonormal. Since the spectrum of $\Sigma$ has $r$ spikes, this is called the spiked covariance model.
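To see where the name comes from, one can build such a matrix and inspect its spectrum: with each direction $v_i$ carrying its own strength $\lambda_i$, the eigenvalues of $\Sigma$ are $\lambda_1+1,\ldots,\lambda_r+1$ followed by $p-r$ ones. A minimal sketch, where $p=6$, $r=2$, and the spike strengths are arbitrary illustrative values and the orthonormal vectors are drawn at random:

```python
import numpy as np

rng = np.random.default_rng(0)
p, r = 6, 2
lam = np.array([5.0, 2.0])                  # spike strengths, lambda_1 >= lambda_2 > 0

# Orthonormal v_1, ..., v_r: columns of Q from a reduced QR decomposition.
V, _ = np.linalg.qr(rng.standard_normal((p, r)))

Sigma = (V * lam) @ V.T + np.eye(p)         # sum_i lambda_i v_i v_i' + I
eigvals = np.linalg.eigvalsh(Sigma)[::-1]   # eigenvalues in decreasing order

print(np.round(eigvals, 6))                 # r spikes above a flat bulk of ones
```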