# Link-free vs. Semiparametric

##### Posted on Jan 08, 2019

This note is based on Li (1991) and Ma and Zhu (2012).

## SIR for Dimension Reduction

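Consider the model

$$
y = f(\beta_1\x,\ldots,\beta_K\x,\epsilon),
$$

where the $\beta$’s are unknown row vectors, $\epsilon$ is independent of $\x$, and $f$ is an arbitrary unknown function on $\IR^{K+1}$. We can view this model as projecting the $p$-dimensional explanatory variable $\x$ onto the $K$-dimensional subspace $(\beta_1\x,\ldots,\beta_K\x)$.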

When $K$ is small, we may achieve the goal of data reduction by estimating the $\beta$’s efficiently. Any linear combination of the $\beta$’s is called an effective dimension-reduction (e.d.r.) direction, and the linear space $B$ generated by the $\beta$’s is called the e.d.r. space.

As a toy example, consider $\beta_1=(1,2,0)$ and $\beta_2=(0,0,1)$. Then any linear combination $c_1\beta_1+c_2\beta_2$ is a point (direction) in the subspace (e.d.r. space) spanned by $(1,2,0)$ and $(0,0,1)$.

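Let $\Sigma_{\x\x}$ be the covariance matrix of $\x$, and consider the standardized version of $\x$, $\z=\Sigma_{\x\x}^{-1/2}[\x-\E\x]$. Then the model can be rewritten as

$$
y = f(\eta_1\z,\ldots,\eta_K\z,\epsilon),
$$

where $\eta_k=\beta_k\Sigma_{\x\x}^{1/2}$ and the constants $\beta_k\E\x$ are absorbed into the unknown function $f$. Any vector in the linear space generated by the $\eta_k$’s is a standardized e.d.r. direction.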
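The relation $\eta_k=\beta_k\Sigma_{\x\x}^{1/2}$ can be checked numerically: after standardizing, $\eta_k\z$ should equal $\beta_k(\x-\E\x)$. The sketch below is only an illustration under assumptions of mine: the simulated covariates and the mixing matrix `A` are made up, the toy $\beta$’s from above are reused, and the sample covariance stands in for $\Sigma_{\x\x}$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical correlated covariates; rows are observations.
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 2.0]])
X = rng.normal(size=(500, 3)) @ A

# Sample covariance and its symmetric square roots via an eigendecomposition.
Xc = X - X.mean(axis=0)
Sigma = np.cov(Xc, rowvar=False)
evals, evecs = np.linalg.eigh(Sigma)
Sigma_half = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
Sigma_inv_half = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T

# Row i of Z is the standardized observation z_i (Sigma^{-1/2} is symmetric).
Z = Xc @ Sigma_inv_half

beta = np.array([[1.0, 2.0, 0.0],
                 [0.0, 0.0, 1.0]])   # toy e.d.r. directions (rows)
eta = beta @ Sigma_half              # eta_k = beta_k Sigma^{1/2}

lhs = Xc @ beta.T                    # beta_k (x - mean)
rhs = Z @ eta.T                      # eta_k z
print(np.allclose(lhs, rhs), np.allclose(np.cov(Z, rowvar=False), np.eye(3)))
```

The symmetric square root is used here because it matches the definition of $\z$; a Cholesky factor would also whiten the data but would give a different $\eta_k$.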

One-component models ($K=1$)

1. the generalized linear model
2. the Box-Cox transformation model

Multicomponent model ($K>1$)

1. general form $g(\beta_1\x,\ldots,\beta_K\x)$
2. additive form: $g_1(\beta_1\x)+\cdots+g_K(\beta_K\x)$, as in projection pursuit regression (PPR).

To evaluate the effectiveness of an estimated e.d.r. direction, we use an affine-invariant criterion: the squared multiple correlation coefficient between the projected variable $b\x$ and the ideally reduced variables $\beta_1\x,\ldots,\beta_K\x$,

$$
R^2(b)=\max_{\beta\in B}\frac{(b\Sigma_{\x\x}\beta^T)^2}{b\Sigma_{\x\x}b^T\cdot\beta\Sigma_{\x\x}\beta^T},
$$

where $b$ is the estimated e.d.r. direction and $B$ is the true e.d.r. space.
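As a small illustration of this criterion, the squared multiple correlation between $b\x$ and $(\beta_1\x,\ldots,\beta_K\x)$ has the standard closed form $b\Sigma_{\x\x}\mathbf{B}^T(\mathbf{B}\Sigma_{\x\x}\mathbf{B}^T)^{-1}\mathbf{B}\Sigma_{\x\x}b^T/(b\Sigma_{\x\x}b^T)$, where the rows of $\mathbf{B}$ are the $\beta_k$’s. The sketch below uses that closed form on the toy example; the function name `r_squared` and the choice $\Sigma_{\x\x}=I$ are assumptions made for illustration.

```python
import numpy as np

def r_squared(b, B, Sigma):
    """Squared multiple correlation between b x and the rows of B applied to x."""
    b = np.asarray(b, dtype=float)
    B = np.atleast_2d(np.asarray(B, dtype=float))
    cross = B @ Sigma @ b        # cov(beta_k x, b x) for k = 1..K
    gram = B @ Sigma @ B.T       # K x K covariance of the reduced variables
    return float(cross @ np.linalg.solve(gram, cross) / (b @ Sigma @ b))

B = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])  # toy e.d.r. directions (rows)
Sigma = np.eye(3)                # assumed covariance of x

r_in = r_squared([1.0, 2.0, 3.0], B, Sigma)    # beta_1 + 3 beta_2: inside the space
r_out = r_squared([2.0, -1.0, 0.0], B, Sigma)  # orthogonal to the space
print(r_in, r_out)  # close to 1 and 0, respectively
```

A direction inside the e.d.r. space scores 1 and one uncorrelated with every $\beta_k\x$ scores 0, so values in between measure how much of $b\x$ is explained by the ideally reduced variables.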

### Theorem

Condition: For any $b$ in $\IR^p$, there exist constants $c_0,c_1,\ldots,c_K$ such that $\E(b\x\mid\beta_1\x,\ldots,\beta_K\x)=c_0+c_1\beta_1\x+\cdots+c_K\beta_K\x$.

Under the model and the condition above, the centered inverse regression curve $\E(\x\mid y)-\E(\x)$ is contained in the linear subspace spanned by $\beta_k\Sigma_{\x\x}$ $(k=1,\ldots,K)$.

#### consequence

• The eigenvectors, $\eta_k(k=1,\ldots,K)$, associated with the largest $K$ eigenvalues of $\cov[\E(\z\mid y)]$ are the standardized e.d.r. directions.

• One can quantify how far the inverse regression curve $\E(\z\mid y)$ is from the standardized e.d.r. space when the condition is violated.

The procedure is similar to the case $K=1$, which was covered in SIR and Its Implementation.
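The eigen-decomposition described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `sir`, the equal-size slicing of sorted $y$, the number of slices, and the simulated model (the toy directions padded to $p=5$) are all assumptions of mine.

```python
import numpy as np

def sir(X, y, n_slices=10, K=2):
    """Sliced inverse regression sketch: K estimated e.d.r. directions as rows."""
    n, p = X.shape
    # Standardize: z = Sigma^{-1/2} (x - mean), stored as rows of Z.
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    Sigma_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ Sigma_inv_half
    # Slice the sorted responses into groups of (nearly) equal size.
    slices = np.array_split(np.argsort(y), n_slices)
    # Weighted covariance of the slice means estimates cov[E(z | y)].
    V = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        V += (len(idx) / n) * np.outer(m, m)
    lam, vec = np.linalg.eigh(V)            # eigenvalues in ascending order
    top = np.argsort(lam)[::-1][:K]         # indices of the K largest
    eta = vec[:, top].T                     # rows: standardized directions eta_k
    return eta @ Sigma_inv_half, lam[top]   # beta_k = eta_k Sigma^{-1/2}

# Simulated data (assumed for illustration only).
rng = np.random.default_rng(1)
n, p = 5000, 5
B_true = np.array([[1.0, 2.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0, 0.0]])
X = rng.normal(size=(n, p))
u, v = X @ B_true[0], X @ B_true[1]
y = u / (0.5 + (v + 1.5) ** 2) + 0.5 * rng.normal(size=n)

beta_hat, lam_hat = sir(X, y, n_slices=10, K=2)

# With Sigma = I, R^2(b) is the squared cosine between b and the true e.d.r. space.
Q, _ = np.linalg.qr(B_true.T)               # orthonormal basis of the true space
r2 = [float(np.sum((Q.T @ b) ** 2) / (b @ b)) for b in beta_hat]
print(np.round(lam_hat, 3), np.round(r2, 3))
```

Because the covariates here are Gaussian, the linearity condition of the theorem holds, so the leading eigenvectors should fall close to the true e.d.r. space; with non-elliptical covariates that need not be the case.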

## A Semiparametric Approach to Dimension Reduction

### Literature

#### identifying the central space

• sliced average variance estimation
• directional regression
• kernel inverse regression
• CANCOR analysis

These methods rely on certain conditions:

• the linearity condition: $\E(\x\mid\x^T\beta)$ is a linear function of $\x$
• the constant variance condition: $\cov(\x\mid \x^T\beta)$ is a constant matrix

Other methods:

• The Fourier transformation approach requires one to estimate the joint pdf of $\x$, which is typically infeasible in a high-dimensional environment.
• dMAVE, which adapts minimum average variance estimation (MAVE).
• Sliced regression (SR).

Existing methods impose either the above two conditional moment conditions or distributional assumptions on the covariate vector in one form or another.

#### identifying the central mean space

• ordinary least squares (OLS), which assumes that $\x$ satisfies the linearity condition
• average derivative estimation, which requires $\x$ to be continuous
• nonlinear least squares
• minimum average variance estimation
• sliced regression
• principal Hessian directions, which requires $\x$ to satisfy both the linearity condition and the constant variance condition
• minimizing a Kullback-Leibler distance

### Proposal

Once the dimension-reduction problem is cast in the semiparametric framework, it becomes a semiparametric estimation problem, and powerful semiparametric estimation and inference tools become applicable.

Advantage:

• relaxation of the linearity condition and the constant variance condition.

### estimating the central subspace via semiparametric methods

Let $\x$ be a $p\times 1$ covariate vector and $Y$ a univariate response variable. The goal of sufficient dimension reduction is to seek a matrix $\bbeta$ such that

\begin{equation}
F(y\mid\x)=F(y\mid\x^T\bbeta),\qquad y\in\IR,
\label{eq:semiparam}
\end{equation}

where $F(y\mid\x)=\Pr(Y\le y\mid\x)$ is the conditional distribution function of $Y$ given $\x$.

The column space of $\bbeta$ satisfying $\eqref{eq:semiparam}$ is called a dimension-reduction subspace. Since the dimension-reduction subspace is not unique, the primary interest is the central subspace, which is defined as the intersection of all dimension-reduction subspaces, provided that the intersection itself is a dimension-reduction subspace.

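The likelihood of one random observation $(\x,Y)$ is

$$
\eta_1(\x)\,\eta_2(Y,\x^T\bbeta),
$$

where $\eta_1$ is the density of $\x$ and $\eta_2$ is the conditional density of $Y$ given $\x^T\bbeta$. Both $\eta_1$ and $\eta_2$ are infinite-dimensional nuisance parameters, while $\bbeta$ is the finite-dimensional parameter of interest, so estimating $\bbeta$ is a semiparametric estimation problem.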

Influence functions can be viewed as normalized elements of the orthogonal complement $\Lambda^\perp$ of the so-called nuisance tangent space. Deriving this orthogonal complement yields a general class of estimating equations, indexed by arbitrary functions $g(Y,\x^T\bbeta)$.
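To make the last statement concrete, here is the general class as I recall it from Ma and Zhu (2012), followed by a short check that each term has mean zero. The function $\alpha(\x)$ (an arbitrary function of the covariates) and the sample index $i$ are notation introduced here rather than taken from the text above, so the exact form should be checked against the paper.

```latex
% General class of estimating equations (g and \alpha are arbitrary functions):
\sum_{i=1}^n \{g(Y_i,\x_i^T\bbeta)-\E(g\mid\x_i^T\bbeta)\}
             \{\alpha(\x_i)-\E(\alpha\mid\x_i^T\bbeta)\} = 0.

% Mean-zero check: condition on \x first. Because Y depends on \x only through
% \x^T\bbeta, we have \E\{g(Y,\x^T\bbeta)\mid\x\} = \E(g\mid\x^T\bbeta), hence
\E\big[\{g-\E(g\mid\x^T\bbeta)\}\{\alpha-\E(\alpha\mid\x^T\bbeta)\}\big]
  = \E\big[\{\E(g\mid\x)-\E(g\mid\x^T\bbeta)\}\{\alpha-\E(\alpha\mid\x^T\bbeta)\}\big]
  = 0.
```

The check uses only the conditional independence of $Y$ and $\x$ given $\x^T\bbeta$, which is why neither the linearity condition nor the constant variance condition is needed.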