Functional PCA
Posted on
This post is based on Ramsay, J. O., & Silverman, B. W. (2005). Functional data analysis (Second edition). New York, NY: Springer.
PCA and eigenanalysis
Define the covariance function $v(s,t)$ by
\[v(s, t) = N^{-1}\sum_{i=1}^Nx_i(s)x_i(t)\,.\]Each of those principal component weight functions $\xi_j(s)$ satisfies the equation
\[\begin{equation} \int v(s, t)\xi(t) dt = \rho\xi(s)\label{eq:8.9} \end{equation}\]for an appropriate eigenvalue $\rho$. The left side is an integral transform $V$ of the weight function $\xi$ defined by
\[V\xi = \int v(\cdot, t)\xi(t)dt\,.\]The integral transform is called the covariance operator $V$. Therefore we may also express the eigenequation directly as
\[V\xi = \rho\xi\,,\]where $\xi$ is now an eigenfunction rather than an eigenvector.
An important difference between the multivariate and functional eigenanalysis problems, concerning the maximum number of different eigenvalue-eigenfunction pairs. The counterpart of the number of variables $p$ in the multivariate case is the number of function $x_i$ are not linearly dependent, the operator $V$ will have rank $N-1$, and there will be only $N-1$ nonzero eigenvalues.
Computation
Suppose we have a set of $N$ curves $x_i$, and that preliminary steps such as curve registration and the possible subtraction of the mean curve from each (curve centering) have been completed. Let $v(s,t)$ be the sample covariance function of the observed data. In all cases, convert the continuous functional eigenanalysis problem to an approximately equivalent matrix eigenanalysis task.
Discretizing the functions
A simple approach is to discretize the observed functions $x_i$ to a fine grid of $n$ equally spaced values $s_j$ that span the interval $\cal T$. This produces eigenvalues and eigenvectors satisfying
\[Vu=\lambda u\]for $n$-vectors $u$.
The sample variance-covariance matrix $V=N^{-1}X’X$ will have elements $v(s_j,s_k)$ where $v(s,t)$ is the sample covariance function. Given any function $\xi$, let $\tilde \bxi$ be the $n$-vector of values $\xi(s_j)$. Let $w=T/n$ where $T$ is the length of the interval $\cal T$. Then for each $s_j$,
\[\V\xi(s_j) = \int v(s_j, s)\xi(s)ds\approx w\sum v(s_j, s_k)\tilde \xi_k\,,\]so the functional eigenequation $V\xi=\rho\xi$ has the approximate discrete form
\[w \V\tilde\bxi = \rho\tilde \b\xi\,.\]Basis function expansion of the functions
One way of reducing the eigenequation \eqref{eq:8.9} to discrete or matrix form is to express each function $x_i$ as a linear combination of known basis functions $\phi_k$.
Suppose that each function has basis expansion
\[x_i(t) = \sum_{k=1}^Kc_{ik}\phi_k(t)\,.\]Write in vector-form,
\[\x=\C\bphi\,,\]where the coefficient matrix $\C$ is $N\times K$. In matrix terms, the variance-covariance function is
\[v(s, t) = N^{-1}\bphi(s)'\C'\C\bphi(t)\,.\]Define the order $K$ symmetric matrix $\W$ to have entries
\[w_{k_1,k_2} = \int \phi_{k_1}\phi_{k_2}\]or $\W=\int \bphi\bphi’$. For the orthonormal Fourier series, $\W=\I$. Now suppose that an eigenfunction $\xi$ for the eigenequation \eqref{eq:8.9} has an expansion
\[\xi(s) = \sum_{k=1}^Kb_k\phi_k(s)\]or in matrix form, $\xi(s) = \bphi(s)’\b$. This yields
\[\int v(s,t)\xi(t)dt = \int N^{-1}\bphi(s)'\C'\C\bphi(t)\bphi(t)'\b dt = \bphi(s)'N^{-1}\C'\C\W\b\,,\]and \eqref{eq:8.9} becomes
\[\bphi(s)' N^{-1}\C'\C\W\b = \rho \bphi(s)'\b\,.\]Since this equation must hold for all $s$, this implies the purely matrix equation
\[N^{-1}\C'\C\W\b = \rho\b\,,\]and the constrain $\Vert\xi\Vert=1$ implies that $\b’\W\b=1$. Define $\u=\W^{1/2}\b$, solve the equivalent symmetric eigenvalue problem
\[N^{-1}\W^{1/2}\C'\C\W^{1/2}\u=\rho\u\]and compute $\b=\W^{-1/2}\u$ for each eigenvector.
Two special cases:
- the basis is orthonormal, $\W=\I$
- view the observed functions $x_i$ as their own basis expansions, $\C=\I$.