# High Dimensional Covariance Matrix Estimation

## Introduction

The standard and most natural estimator, the sample covariance matrix, performs poorly and can lead to invalid conclusions in high-dimensional settings.

• When $p/n\rightarrow c\in (0,\infty]$, the largest eigenvalue of the sample covariance matrix is not a consistent estimate of the largest eigenvalue of the population covariance matrix, and the eigenvectors of the sample covariance matrix can be nearly orthogonal to the truth.
• When $p > n$, the sample covariance matrix is not invertible, and thus cannot be used in the many applications that require an estimate of the precision matrix.
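
As a quick numerical illustration of the two points above, the following sketch simulates Gaussian data whose population covariance is the identity (so every true eigenvalue equals 1) and inspects the sample covariance matrix. The sizes `n = 50` and `p = 200`, the seed, and the identity covariance are illustrative assumptions, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200  # illustrative sizes with p > n (assumption)

# Population covariance is the identity, so every true eigenvalue is 1.
X = rng.standard_normal((n, p))

# p x p sample covariance matrix (rows of X are observations).
S = np.cov(X, rowvar=False)

rank_S = np.linalg.matrix_rank(S)
eigvals = np.linalg.eigvalsh(S)  # eigenvalues in ascending order
largest = eigvals[-1]

print("p =", p, "rank of S =", rank_S)
print("largest sample eigenvalue:", largest)
```

Because the data are centered by the sample mean, the rank of `S` is at most `n - 1`, so `S` is singular whenever `p > n`. The largest sample eigenvalue should also come out well above the true value 1.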

The difficulty can be overcome by imposing structural assumptions:

• bandable covariance matrices
• sparse covariance matrices
• spiked covariance matrices
• covariances with a tensor product structure
• sparse precision matrices
• bandable precision matrix via Cholesky decomposition
• latent graphical model

Regularization methods:

• banding method
• tapering
• thresholding
• penalized likelihood estimation
• regularized principal components
• penalized regression for precision matrix estimation
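
The first three methods in the list above act entrywise on the sample covariance matrix and are easy to sketch. The code below is a minimal illustration under stated assumptions: the function names `band`, `taper`, and `hard_threshold` are hypothetical, the tapering weights follow one common linear-decay choice, the threshold rule is hard thresholding with the diagonal left untouched, and the tuning parameters in the usage lines are arbitrary.

```python
import numpy as np

def _index_distance(p):
    """Matrix of |i - j| for i, j = 0, ..., p - 1."""
    idx = np.arange(p)
    return np.abs(np.subtract.outer(idx, idx))

def band(S, k):
    """Banding: keep entries with |i - j| <= k, set the rest to zero."""
    dist = _index_distance(S.shape[0])
    return np.where(dist <= k, S, 0.0)

def taper(S, k):
    """Tapering: weight 1 for |i - j| <= k/2, linear decay to 0 at |i - j| = k."""
    dist = _index_distance(S.shape[0])
    weights = np.clip(2.0 - dist / (k / 2.0), 0.0, 1.0)
    return weights * S

def hard_threshold(S, lam):
    """Thresholding: zero out off-diagonal entries with |s_ij| < lam."""
    T = np.where(np.abs(S) >= lam, S, 0.0)
    np.fill_diagonal(T, np.diag(S))  # keep the variances as they are
    return T

# Toy symmetric matrix standing in for a sample covariance (assumption).
S = np.array([[1.00, 0.50, 0.20, 0.05],
              [0.50, 1.00, 0.50, 0.20],
              [0.20, 0.50, 1.00, 0.50],
              [0.05, 0.20, 0.50, 1.00]])

S_band = band(S, 1)
S_taper = taper(S, 4)
S_thr = hard_threshold(S, 0.3)
```

Banding and tapering rely on the ordering of the variables, since they shrink entries according to the distance $\vert i-j\vert$, whereas thresholding depends only on the magnitude of each entry.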

Theoretical studies of the fundamental difficulty of the various estimation problems in terms of minimax risks:

• the optimal rates of convergence for estimating a class of high-dimensional bandable covariance matrices under the spectral norm and Frobenius norm losses
• optimal estimation of sparse covariance and sparse precision matrices under a range of losses
• optimal estimation of a Toeplitz covariance matrix
• the minimax estimation for a large class of sparse spiked covariance matrices under the spectral norm loss.

Goal: survey recent optimality results on the estimation of structured high-dimensional covariance and precision matrices, and discuss some key technical tools used in the theoretical analyses.

• The optimal procedures for estimating the bandable, Toeplitz, and sparse covariance matrices are obtained by smoothing or thresholding the sample covariance matrices based on various sparsity assumptions.
• In contrast, estimation of sparse spiked covariance matrices, which have sparse principal components, requires significantly different techniques to achieve optimality results.

Notation:

• A random sample of size $n$: $\{X^{(1)},\ldots,X^{(n)}\}$
• The $p$-dimensional random vector $X=(X_1,\ldots,X_p)'$ follows some distribution with covariance matrix $\Sigma=(\sigma_{ij})$.
• Goal: Estimate the covariance matrix $\Sigma$ and its inverse.
• $\ell_\omega$ operator norm: $\Vert M\Vert_{\ell_\omega}=\max_{\Vert x\Vert_\omega=1}\Vert Mx\Vert_\omega$
• Sub-Gaussian distribution: the distribution of a random vector $X$ is said to be sub-Gaussian with constant $\rho > 0$ if $$P\{\vert v'(X-EX)\vert > t\} \le 2e^{-t^2\rho/2}\,,$$ for all $t > 0$ and all deterministic vectors $v$ with $\Vert v\Vert=1$.
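
To make the $\ell_\omega$ operator norm in the notation above concrete, here is a small sketch for $\omega = 1, 2, \infty$, using `numpy.linalg.norm` on a matrix. The example matrix `M` is an arbitrary assumption chosen for illustration.

```python
import numpy as np

# Arbitrary symmetric example matrix (assumption, not from the source).
M = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

norm_l1 = np.linalg.norm(M, 1)         # omega = 1: maximum absolute column sum
norm_l2 = np.linalg.norm(M, 2)         # omega = 2: spectral norm (largest singular value)
norm_linf = np.linalg.norm(M, np.inf)  # omega = infinity: maximum absolute row sum

print(norm_l1, norm_l2, norm_linf)
```

For a symmetric matrix the $\ell_1$ and $\ell_\infty$ operator norms coincide and bound the spectral norm from above, which is why bounds in the $\ell_1$ norm are often used to control the spectral norm.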

### Estimation of structured covariance matrices

#### Sparse covariance matrices

There is no information on the “order” among the variables.

#### Sparse spiked covariance matrices

Spiked covariance matrix

$\Sigma = \sum_{i=1}^r\lambda_iv_iv_i'+I\,,$

where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_r > 0$ and the vectors $v_1,\ldots,v_r$ are orthonormal. Since the spectrum of $\Sigma$ has $r$ spikes, this is called the spiked covariance model.
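
The spiked structure can be checked numerically with a short sketch: build $\Sigma$ from orthonormal vectors and spike sizes, then look at its spectrum. The dimension `p = 6`, the spikes `lam = (5, 2)`, and the use of a QR factorization to generate orthonormal vectors are illustrative assumptions. The vectors here are dense, so this shows the spiked model only, not the sparse variant.

```python
import numpy as np

rng = np.random.default_rng(1)
p, r = 6, 2                # illustrative dimension and number of spikes (assumption)
lam = np.array([5.0, 2.0]) # lambda_1 >= lambda_2 > 0 (assumption)

# Orthonormal v_1, ..., v_r as the columns of V, from a reduced QR factorization.
V, _ = np.linalg.qr(rng.standard_normal((p, r)))

# Sigma = sum_i lambda_i v_i v_i' + I
Sigma = (V * lam) @ V.T + np.eye(p)

# Eigenvalues in descending order.
eigvals = np.linalg.eigvalsh(Sigma)[::-1]
print(np.round(eigvals, 6))
```

The top $r$ eigenvalues equal $\lambda_i + 1$ and the remaining $p - r$ eigenvalues equal 1, which is the "spike" pattern in the spectrum.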

### Estimation of structured precision matrices
