Generalized Functional Linear Models with Semiparametric Single-index Interactions
This post is based on Li, Y., Wang, N., & Carroll, R. J. (2010). Generalized Functional Linear Models With Semiparametric Single-Index Interactions. Journal of the American Statistical Association, 105(490), 621–633.
Introduce a class of functional generalized linear models
- scalar response: depends on multiple covariates, a finite number of latent features in the functional predictor, and the interaction between the two
- some covariates are functional
- the interaction between the multiple covariates and the functional predictor is modelled semiparametrically with a single-index structure
- propose a two-step estimation procedure based on local estimating equations, and investigate two situations:
- when the basis functions are predetermined (Fourier or wavelet basis functions) and the functional features of interest are known
- when the basis functions are data driven, such as with functional principal components.
- asymptotic properties are developed, showing that
- when the basis functions are data-driven, the parameter estimates have an increased asymptotic variance due to the estimation error of the basis functions
- illustrate the methods with a simulation study and apply them to an empirical dataset
Introduction
Most existing work (as of the paper's publication in 2010) does not readily accommodate an interaction between the functional predictor and other covariates. The paper addresses this issue by introducing a novel class of models.
In these models, the response depends on latent features of the functional data and on their interaction with other, possibly multivariate, covariates. The features are the projections of the functional data onto orthonormal basis functions.
Organization:
- section 2: define the modeling framework
- split into two tracks
- section 3: the situation where the functional predictor is fully observed and the basis functions are predetermined
- section 4: the functional features/basis functions are data-driven and need to be estimated
- section 5: numerical performance
Model and Data Structure
Data: $(Y_i, X_i, \Z_i), i=1,\ldots,n$
- $Y$: response
- $X$: longitudinal covariate process
- $\Z$: other covariates
Model the relationship between $Y$ and $\{X(\cdot), \Z\}$ by imposing structure on the conditional mean and variance of $Y$. The key component of the mean function is a semiparametric functional linear model in which the coefficient function varies with both $t$ and $\Z_1$, the part of $\Z$ that interacts with $X$. Denote $E(Y\mid X, \Z)=\mu_Y(X,\Z)$,
\[\begin{align*} g\{\mu_Y(X,\Z)\} &= \int_\cT\fA(t, \Z_1)X(t)dt + \beta^T\Z\\ \var(Y\mid X,\Z) &=\sigma_Y^2V\{\mu_Y(X,\Z)\} \end{align*}\]where $g(\cdot)$ and $V(\cdot)$ are known functions, $\fA(\cdot, \Z_1)$ and $\beta$ are unknown, and $\cT=[a, b]$ is a fixed interval.
Suppose $\psi_1(t),\psi_2(t),\ldots,\psi_p(t)$ are $p$ orthonormal functions on $\cT$, and
\[\xi_j = \int \psi_j(t)[X(t) - \mu_X(t) ]dt\]It is commonly assumed that the conditional distribution of $Y$ given $X(\cdot)$ and $\Z$ only depends on $\xi=(\xi_1,\ldots,\xi_p)^T$ and $\Z$.
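The feature definition above can be sketched numerically. The following is a minimal illustration, assuming the curves are observed without noise on a common grid, the sample mean curve stands in for $\mu_X$, and the $\psi_j$ are Fourier functions on $[a, b]$. The function names (`fourier_basis`, `trapezoid_weights`, `projection_scores`) are hypothetical and not taken from the paper.

```python
import numpy as np

def fourier_basis(t, p, a=0.0, b=1.0):
    """First p orthonormal Fourier functions on [a, b] (constant term excluded)."""
    length = b - a
    rows = []
    for j in range(1, p + 1):
        k = (j + 1) // 2                      # frequency: 1, 1, 2, 2, ...
        arg = 2 * np.pi * k * (t - a) / length
        wave = np.sin(arg) if j % 2 == 1 else np.cos(arg)
        rows.append(np.sqrt(2.0 / length) * wave)
    return np.vstack(rows)                    # shape (p, len(t))

def trapezoid_weights(t):
    """Quadrature weights of the trapezoidal rule on the grid t."""
    w = np.zeros_like(t, dtype=float)
    dt = np.diff(t)
    w[:-1] += dt / 2
    w[1:] += dt / 2
    return w

def projection_scores(X, t, psi):
    """xi[i, j] approximates the integral of psi_j(t) * (X_i(t) - mean curve)."""
    w = trapezoid_weights(t)
    Xc = X - X.mean(axis=0)                   # assumption: sample mean replaces mu_X
    return (Xc * w) @ psi.T                   # shape (n, p)

# Toy usage: curves built from two known features plus a constant mean.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 201)
psi = fourier_basis(t, p=2)
true_scores = rng.normal(size=(50, 2))
X = true_scores @ psi + 1.0
xi = projection_scores(X, t, psi)             # recovers the centered true scores
```

Because the mean curve is estimated from the sample, the recovered scores match the true ones only after centering them across subjects.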
Two typical structural considerations, corresponding to the two tracks of the paper:
- fixed features: the $\psi_j$ are known basis functions,
- data-driven: the $\psi_j$ are the leading principal components of $X(t)$
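For the data-driven track, the leading principal components have to be estimated from the curves. The following is a rough sketch of one common way to do this, assuming densely observed, noise-free curves on a uniform grid and a plain eigendecomposition of the discretized sample covariance. It is not the estimator analysed in the paper, and `fpca_basis` is a hypothetical name.

```python
import numpy as np

def fpca_basis(X, t, p):
    """Estimate the p leading eigenfunctions of cov(X(s), X(t)) on a uniform grid."""
    dt = t[1] - t[0]                          # assumption: uniform grid spacing
    Xc = X - X.mean(axis=0)                   # center by the sample mean curve
    cov = Xc.T @ Xc / X.shape[0]              # sample covariance surface on the grid
    evals, evecs = np.linalg.eigh(cov * dt)   # discretized covariance operator
    order = np.argsort(evals)[::-1][:p]       # largest eigenvalues first
    psi_hat = evecs[:, order].T / np.sqrt(dt) # rescale so sum(psi^2) * dt = 1
    scores = Xc @ psi_hat.T * dt              # Riemann-sum projection scores
    return evals[order], psi_hat, scores

# Toy usage: two-component curves; eigenfunctions are identified only up to sign.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
X = (2.0 * rng.normal(size=(200, 1)) * np.sin(2 * np.pi * t)
     + rng.normal(size=(200, 1)) * np.cos(2 * np.pi * t))
evals, psi_hat, xi_hat = fpca_basis(X, t, p=2)
```

The estimated `psi_hat` carries sampling error, which is the source of the extra asymptotic variance mentioned in the summary for the data-driven case.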
To make the model parsimonious without imposing strong parametric structural assumptions, and to allow interactions between $X$ and $\Z_1$, write $\fA$ in terms of the basis functions and assume that
\[\fA(t, \Z_1) = \psi^T(t)\alpha(\Z_1;\theta)\,,\]where $\psi(t)=(\psi_1(t),\ldots,\psi_p(t))^T$, and
\[\alpha(\Z_1;\theta) = \alpha_1 + S(\Z_1;\theta)\alpha_2\,,\]where $S(\Z_1;\theta)$ is a semiparametric function with a single-index structure, and $(\alpha_1,\alpha_2)$ are unknown coefficient vectors.
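As a worked step, substituting this structure into the mean model shows how the functional term collapses to a finite sum over the features. The sketch below writes $\psi(t)=(\psi_1(t),\ldots,\psi_p(t))^T$ for the vector of basis functions and assumes, for simplicity, that $X$ is centered so that $\mu_X(t)=0$. That centering is an assumption made here, not a statement from the paper.

\[\begin{align*} \int_\cT\fA(t, \Z_1)X(t)dt &= \alpha^T(\Z_1;\theta)\int_\cT\psi(t)X(t)dt = \alpha^T(\Z_1;\theta)\xi\,,\\ g\{\mu_Y(X,\Z)\} &= \alpha_1^T\xi + S(\Z_1;\theta)\alpha_2^T\xi + \beta^T\Z\\ &= \sum_{j=1}^p\{\alpha_{1j} + \alpha_{2j}S(\Z_1;\theta)\}\xi_j + \beta^T\Z\,. \end{align*}\]

Each feature $\xi_j$ thus has a main effect $\alpha_{1j}$ and an interaction effect $\alpha_{2j}S(\Z_1;\theta)$ that changes with $\Z_1$.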
The parameters $\alpha_{2j}$, $\theta$ and the function $S(\cdot)$ determine how the covariate $\Z_{1i}$ interacts with $X_i$ in direction (that is, component) $j$, through the value of $\alpha_{2j}\xi_{ij}S(\Z_{1i};\theta)$:
- when $\Z_1$ is a scalar, $S(\Z_1)$ is a nonparametric function
- when $\Z_1$ is a $d_1\times 1$ vector, $S(\Z_1;\theta)=S(\theta^T\Z_1)$, where $\theta$ is a single-index weight vector subject to the usual single-index constraints $\Vert \theta\Vert=1$ and $\theta_1 > 0$.
The model is flexible in the following ways:
- if $S(z;\theta)=0$ for all $z$, it reduces to the functional generalized linear model
- the nonparametric function $S(\cdot)$ makes it possible to model nonlinear interactions flexibly, while the single index $\theta^T\Z_1$ accommodates a multivariate $\Z_1$ without suffering from the curse of dimensionality.
- the model is not limited to functional data: in terms of the features alone it reads $g\{E(Y\mid \xi, \Z)\}=\alpha^T(\Z_1;\theta)\xi + \beta^T\Z$, so $\xi$ could be any vector of covariates.
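To make the feature-level form of the model concrete, here is a small simulation sketch. Everything specific in it is an assumption chosen for illustration: the logit link with a binary response, $S(u)=\sin(u)$, the parameter values, and the choice of $\Z$ as an intercept plus $\Z_1$. The helper names `normalize_index` and `linear_predictor` are hypothetical.

```python
import numpy as np

def normalize_index(theta):
    """Impose the single-index constraints: unit norm and positive first entry."""
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    return theta if theta[0] > 0 else -theta

def linear_predictor(xi, Z, Z1, alpha1, alpha2, beta, theta, S):
    """eta_i = {alpha1 + S(theta^T Z1_i) * alpha2}^T xi_i + beta^T Z_i."""
    s = S(Z1 @ theta)                          # one interaction weight per subject
    return xi @ alpha1 + s * (xi @ alpha2) + Z @ beta

rng = np.random.default_rng(1)
n, p = 500, 2
xi = rng.normal(size=(n, p))                   # stand-in for the latent features
Z1 = rng.normal(size=(n, 2))
Z = np.column_stack([np.ones(n), Z1])          # assumption: Z = (intercept, Z1)
theta = normalize_index([3.0, 4.0])            # scaled to unit norm
alpha1 = np.array([1.0, -0.5])
alpha2 = np.array([0.8, 0.0])                  # interaction only in direction 1
beta = np.array([0.2, 0.5, -0.5])

eta = linear_predictor(xi, Z, Z1, alpha1, alpha2, beta, theta, np.sin)
mu = 1.0 / (1.0 + np.exp(-eta))                # logit link: g(mu) = log(mu / (1 - mu))
Y = rng.binomial(1, mu)                        # binary response
```

Passing a function that returns zero for `S` removes the interaction term, which corresponds to the reduction to the ordinary functional generalized linear model noted in the list above.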