# High Dimensional Covariance Matrix Estimation

This note is based on Cai TT, Ren Z, Zhou HH. Estimating structured high-dimensional covariance and precision matrices: Optimal rates and adaptive estimation. Electronic Journal of Statistics. 2016;10(1):1-59.

## Introduction

The standard and most natural estimator, the sample covariance matrix, performs poorly in high-dimensional settings and can lead to invalid conclusions.

- When $p/n\rightarrow c\in (0,\infty]$, the largest eigenvalue of the sample covariance matrix is not a consistent estimate of the largest eigenvalue of the population covariance matrix, and the eigenvectors of the sample covariance matrix can be nearly orthogonal to the truth.
- When $p > n$, the sample covariance matrix is not invertible, and thus cannot be used in the many applications that require an estimate of the precision matrix.
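
The two failure modes above can be seen in a small simulation. The following is a minimal sketch, assuming Gaussian data with identity covariance (so every population eigenvalue equals 1); the sizes `n`, `p` and the helper name `sample_covariance` are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance of an n x p data matrix (rows are observations)."""
    Xc = X - X.mean(axis=0, keepdims=True)   # center each variable
    return Xc.T @ Xc / (X.shape[0] - 1)

rng = np.random.default_rng(0)
n, p = 50, 200                       # p > n: the high-dimensional regime
X = rng.standard_normal((n, p))      # assumed model: true covariance is I_p
S = sample_covariance(X)

# After centering, S has rank at most n - 1 < p, so it is singular.
rank_S = np.linalg.matrix_rank(S)

# eigvalsh returns eigenvalues in ascending order; the last is the largest.
top_eig = np.linalg.eigvalsh(S)[-1]

print("rank of S:", rank_S, "out of p =", p)
print("largest sample eigenvalue:", top_eig, "(population value is 1)")
```

With these sizes the largest sample eigenvalue lands far above the population value of 1, and the rank deficiency means `S` has no inverse to use as a precision matrix estimate.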

Structural assumptions are used to overcome this difficulty:

- bandable covariance matrices
- sparse covariance matrices
- spiked covariance matrices
- covariances with a tensor product structure
- sparse precision matrices
- bandable precision matrix via Cholesky decomposition
- latent graphical model

Regularization methods:

- banding method
- tapering
- thresholding
- penalized likelihood estimation
- regularized principal components
- penalized regression for precision matrix estimation
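
The first three methods act entrywise on the sample covariance matrix. The sketch below is a minimal illustration, not the exact estimators analyzed in the paper: the function names, the linear taper weights, and the decision to threshold every entry (diagonal included) are assumptions made for the example.

```python
import numpy as np

def _index_distance(p):
    """Matrix of |i - j| for a p x p matrix."""
    idx = np.arange(p)
    return np.abs(idx[:, None] - idx[None, :])

def band(S, k):
    """Banding: keep entries with |i - j| <= k and zero out the rest."""
    dist = _index_distance(S.shape[0])
    return np.where(dist <= k, S, 0.0)

def taper(S, k):
    """Tapering with linear weights (an assumed choice of weights):
    weight 1 for |i - j| <= k/2, decaying linearly to 0 at |i - j| = k."""
    dist = _index_distance(S.shape[0])
    w = np.clip(2.0 - 2.0 * dist / k, 0.0, 1.0)
    return w * S

def hard_threshold(S, lam):
    """Hard thresholding: keep entries with |s_ij| > lam, zero out the rest."""
    return np.where(np.abs(S) > lam, S, 0.0)
```

Banding and tapering rely on a natural ordering of the variables, since they use the distance $|i-j|$ from the diagonal; thresholding only looks at the magnitude of each entry and so needs no ordering.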

Theoretical studies of the fundamental difficulty of the various estimation problems, in terms of minimax risks:

- the optimal rates of convergence for estimating a class of high-dimensional bandable covariance matrices under the spectral norm and Frobenius norm losses
- optimal estimation of sparse covariance and sparse precision matrices under a range of losses
- optimal estimation of a Toeplitz covariance matrix
- minimax estimation for a large class of sparse spiked covariance matrices under the spectral norm loss

Goal: survey recent optimality results on the estimation of structured high-dimensional covariance and precision matrices, and discuss some of the key technical tools used in the theoretical analyses.

- The optimal procedures for estimating the bandable, Toeplitz, and sparse covariance matrices are obtained by smoothing or thresholding the sample covariance matrices based on various sparsity assumptions.
- In contrast, estimation of sparse spiked covariance matrices, which have sparse principal components, requires significantly different techniques to achieve optimality results.

Notation:

- A random sample $\{X^{(1)},\ldots,X^{(n)}\}$ of size $n$.
- The $p$-dimensional random vector $X=(X_1,\ldots,X_p)'$ follows some distribution with covariance matrix $\Sigma=(\sigma_{ij})$.
- **Goal:** Estimate the covariance matrix $\Sigma$ and its inverse.
- $\ell_\omega$ operator norm: $\Vert M\Vert_{\ell_\omega}=\max_{\Vert x\Vert_\omega=1}\Vert Mx\Vert_\omega$
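
For $\omega\in\{1,2,\infty\}$ the operator norm has a closed form (maximum absolute column sum, largest singular value, and maximum absolute row sum, respectively), and NumPy computes each directly. A small worked example, with an arbitrary matrix `M` chosen only for illustration:

```python
import numpy as np

M = np.array([[1.0, -2.0],
              [3.0,  4.0]])

l1 = np.linalg.norm(M, 1)         # max absolute column sum: max(1+3, 2+4)
linf = np.linalg.norm(M, np.inf)  # max absolute row sum: max(1+2, 3+4)
spec = np.linalg.norm(M, 2)       # spectral norm: largest singular value
fro = np.linalg.norm(M, "fro")    # Frobenius norm: sqrt of sum of squares

print(l1, linf, spec, fro)
```

The spectral norm never exceeds the Frobenius norm, and it satisfies $\Vert M\Vert_{\ell_2}^2\le\Vert M\Vert_{\ell_1}\Vert M\Vert_{\ell_\infty}$, which is often used to bound spectral norm loss through the easier $\ell_1$ norm.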

### Estimation of structured covariance matrices

#### Bandable covariance matrices

#### Sparse covariance matrices

Unlike the bandable case, no information on the “order” among the variables is assumed.

#### Sparse spiked covariance matrices

A spiked covariance matrix has the form

\[\Sigma = \sum_{i=1}^r\lambda_iv_iv_i'+I\,,\]

where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_r > 0$ and the vectors $v_1,\ldots,v_r$ are orthonormal. Since the spectrum of $\Sigma$ has $r$ spikes, this is called the spiked covariance model.
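
To make the "spikes" concrete, the sketch below builds a spiked covariance matrix and inspects its spectrum. The dimensions, the spike values, and the helper name `spiked_covariance` are hypothetical choices for illustration; the orthonormal vectors are obtained from a QR decomposition of a random matrix.

```python
import numpy as np

def spiked_covariance(lams, V):
    """Sigma = sum_i lam_i v_i v_i' + I, where the columns of V (p x r)
    are assumed to be orthonormal."""
    lams = np.asarray(lams, dtype=float)
    p = V.shape[0]
    # (V * lams) scales column i of V by lam_i, so this equals V diag(lams) V'.
    return (V * lams) @ V.T + np.eye(p)

rng = np.random.default_rng(0)
p, r = 20, 3
# Reduced QR of a random p x r matrix gives p x r orthonormal columns.
V, _ = np.linalg.qr(rng.standard_normal((p, r)))
lams = np.array([5.0, 3.0, 1.0])          # lambda_1 >= lambda_2 >= lambda_3 > 0

Sigma = spiked_covariance(lams, V)
eigs = np.linalg.eigvalsh(Sigma)[::-1]    # eigenvalues in descending order

print("top r eigenvalues:", eigs[:r])
print("remaining eigenvalues:", eigs[r:])
```

The top $r$ eigenvalues are $\lambda_i + 1$ and the remaining $p - r$ eigenvalues all equal 1, which is the spike pattern the model is named after.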