# Canonical Variate Analysis

##### Posted on Jul 16, 2019
Tags: Canonical

This note is based on Campbell, N. A. (1979). Canonical variate analysis: some practical aspects. 243.

Consider $g$ groups of data, with $v$ variables measured on each of $n_k$ individuals for the $k$-th group.

• $x_{km}$: the vector of observations on the $m$-th individual in the $k$-th group

Define the sums of squares and products (SSQPR) matrix for the $k$-th group as

$S_k = \sum_{m=1}^{n_k}(x_{km}-\bar x_k)(x_{km}-\bar x_k)^T\,,$

where

$\bar x_k = n_k^{-1}\sum_{m=1}^{n_k}x_{km}$

and write

$W=\sum_{k=1}^gS_k=S$

for the within-groups SSQPR matrix on

$n_w=\sum_{k=1}^g(n_k-1)$

degrees of freedom.

Define the between-groups SSQPR matrix as

$B = \sum_{k=1}^gn_k(\bar x_k-\bar x_T)(\bar x_k-\bar x_T)^T$

where

$\bar x_T = n_T^{-1}\sum_{k=1}^gn_k\bar x_k$

and

$n_T = \sum_{k=1}^gn_k = n\,.$
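The quantities above are straightforward to compute directly. Here is a minimal numpy sketch; the group sizes and synthetic data are made up for illustration, not taken from the paper:

```python
# Sketch with synthetic data: the groups, sizes, and seed are
# illustrative assumptions, not from Campbell's paper.
import numpy as np

rng = np.random.default_rng(0)
g, v = 3, 4                                  # g groups, v variables
groups = [rng.normal(loc=k, size=(20 + 5 * k, v)) for k in range(g)]

n_k = np.array([x.shape[0] for x in groups])
xbar_k = np.array([x.mean(axis=0) for x in groups])   # group means
n_T = int(n_k.sum())
xbar_T = (n_k[:, None] * xbar_k).sum(axis=0) / n_T    # grand mean

# Within-groups SSQPR matrix: W = sum_k S_k
W = sum((x - xb).T @ (x - xb) for x, xb in zip(groups, xbar_k))

# Between-groups SSQPR matrix
B = sum(nk * np.outer(xb - xbar_T, xb - xbar_T)
        for nk, xb in zip(n_k, xbar_k))

n_w = n_T - g                                # within-groups degrees of freedom
```

As a sanity check, $W+B$ equals the total SSQPR matrix about the grand mean.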

The simplest formulation of canonical variate analysis is the distribution-free one of finding that linear combination of the original variables which maximizes the variation between groups, relative to the variation within groups.

That is, find the canonical vector $c_1$ which maximizes the ratio $c_1^TBc_1/c_1^TWc_1$; the vector is usually scaled so that $c_1^TWc_1=n_w$. The maximized ratio gives the first canonical root $f_1$.

Use of Lagrange multipliers leads directly to the eigenanalysis

$(B-fW)c=0\,.$

Let $h=\min(v, g-1)$,

• $C=[c_1,\ldots,c_h]$
• $F=\operatorname{diag}(f_1,\ldots,f_h)$, the diagonal matrix of canonical roots

Then

$BC=WCF$

with

$C^TWC=n_wI$

and

$C^TBC=n_wF\,,$

so the canonical variates are uncorrelated both within and between groups, and have unit variance within groups.
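These relations can be checked numerically. One way to solve $(B-fW)c=0$ is `scipy.linalg.eigh` with $W$ as the second matrix, which returns $W$-orthonormal eigenvectors; a sketch with made-up grouped data:

```python
# Sketch: solving (B - fW)c = 0 as a generalized symmetric
# eigenproblem. W and B are built from made-up grouped data.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
g, v = 3, 4
groups = [rng.normal(loc=k, size=(25, v)) for k in range(g)]
xbar_k = [x.mean(axis=0) for x in groups]
xbar_T = np.vstack(groups).mean(axis=0)      # grand mean
W = sum((x - xb).T @ (x - xb) for x, xb in zip(groups, xbar_k))
B = sum(x.shape[0] * np.outer(xb - xbar_T, xb - xbar_T)
        for x, xb in zip(groups, xbar_k))
n_w = sum(x.shape[0] - 1 for x in groups)

# eigh(B, W) solves B c = f W c with W-orthonormal eigenvectors
f, C = eigh(B, W)
order = np.argsort(f)[::-1]                  # largest root first
h = min(v, g - 1)
f, C = f[order][:h], C[:, order][:, :h]
C *= np.sqrt(n_w)                            # rescale so C^T W C = n_w I
```

After rescaling, $C^TWC=n_wI$ and $C^TBC=n_wF$ hold as stated above.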

Write

$T = B+W\,,$

then an equivalent formulation is to maximize the ratio $c_1^TBc_1/c_1^TTc_1$, leading to the eigenanalysis

$(B-r^2T)c=0\,.$

The ratio $r_1^2$ is the square of the first sample canonical correlation coefficient. The vector $c_1$ is scaled so that $c_1^TTc_1=n_w(1-r_1^2)^{-1}=n_w(1+f_1)$, so that again $c_1^TBc_1=n_wr_1^2(1-r_1^2)^{-1}=n_wf_1$ and $c_1^TWc_1=n_w$.
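The equivalence of the two formulations rests on the relation $r_i^2=f_i/(1+f_i)$, which can be verified numerically (a sketch; the grouped data are illustrative assumptions):

```python
# Sketch: the roots of (B - r^2 T)c = 0 relate to those of
# (B - fW)c = 0 via r^2 = f/(1+f). Data are made up.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(2)
g, v = 3, 4
groups = [rng.normal(loc=k, size=(25, v)) for k in range(g)]
xbar_k = [x.mean(axis=0) for x in groups]
xbar_T = np.vstack(groups).mean(axis=0)
W = sum((x - xb).T @ (x - xb) for x, xb in zip(groups, xbar_k))
B = sum(x.shape[0] * np.outer(xb - xbar_T, xb - xbar_T)
        for x, xb in zip(groups, xbar_k))
T = B + W
h = min(v, g - 1)

# canonical roots f_i and squared canonical correlations r_i^2
f = np.sort(eigh(B, W, eigvals_only=True))[::-1][:h]
r2 = np.sort(eigh(B, T, eigvals_only=True))[::-1][:h]
```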

Now assume that $x_{km}\sim N_v(\mu_k,\Sigma)$. The maximized likelihood when the $\mu_k$ are unrestricted is

$(2\pi)^{-nv/2}\vert n^{-1}W\vert^{-n/2} e^{-nv/2}$

with $v(v+1)/2+gv$ estimated parameters. (The factor $e^{-nv/2}$ arises because substituting the MLE $\hat\Sigma=n^{-1}W$ into the exponent gives $-\tfrac12\sum_{k,m}(x_{km}-\bar x_k)^T\hat\Sigma^{-1}(x_{km}-\bar x_k)=-\tfrac12\operatorname{tr}(\hat\Sigma^{-1}W)=-\tfrac n2\operatorname{tr}(I_v)=-nv/2$.)

The maximized likelihood for the hypothesis specifying equality of the $\mu_k$ is

$(2\pi)^{-nv/2}\vert n^{-1}(W+B)\vert^{-n/2} e^{-nv/2}$

with $v(v+1)/2 + v$ estimated parameters. This leads to the well-known likelihood ratio statistic given by $\vert W\vert/\vert W+B\vert$, commonly referred to as Wilks' $\Lambda$. The statistic $\Lambda$ may be written as

$\Lambda = \vert W\vert / \vert W+B\vert = \vert W\vert / \vert T\vert =\vert I+W^{-1}B\vert^{-1} = \prod_{i=1}^h(1+f_i)^{-1} = \prod_{i=1}^h(1-r_i^2)\,.$

(The last two equalities hold because the eigenvalues of $W^{-1}B$ are the canonical roots $f_i$, with $f_i=0$ for $i>h$, and a determinant is the product of its eigenvalues; $1+f_i=(1-r_i^2)^{-1}$ gives the final form.)
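The equivalent expressions for $\Lambda$ can also be checked numerically (a sketch; $W$ and $B$ come from made-up grouped data):

```python
# Sketch: three equivalent computations of Wilks' Lambda.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)
g, v = 3, 4
groups = [rng.normal(loc=k, size=(25, v)) for k in range(g)]
xbar_k = [x.mean(axis=0) for x in groups]
xbar_T = np.vstack(groups).mean(axis=0)
W = sum((x - xb).T @ (x - xb) for x, xb in zip(groups, xbar_k))
B = sum(x.shape[0] * np.outer(xb - xbar_T, xb - xbar_T)
        for x, xb in zip(groups, xbar_k))

lam_det = np.linalg.det(W) / np.linalg.det(W + B)     # |W| / |W+B|
lam_inv = 1.0 / np.linalg.det(np.eye(v) + np.linalg.solve(W, B))
f = eigh(B, W, eigvals_only=True)            # all v roots; v-h are ~0
lam_roots = np.prod(1.0 / (1.0 + f))         # prod (1+f_i)^{-1}
```

The product can be taken over all $v$ roots because the roots beyond $h$ vanish and contribute factors of 1.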

Published in categories Note