Lagrange Multiplier Test
This post is based on Peter BENTLER’s talk, S.-Y. Lee’s Lagrange Multiplier Test in Structural Modeling: Still Useful?, at the International Statistical Conference in Memory of Professor Sik-Yum Lee.
In structured linear models of multivariate random data, $\Sigma = \Sigma(\btheta)$ is a $p\times p$ covariance matrix whose elements are assumed to be differentiable real-valued functions of a true though unknown $q\times 1$ vector of parameters $\btheta$, and the primary statistical problems involve
- estimation of parameters of the model,
- establishing properties of the estimators,
- evaluating goodness of fit of competing models.
In the methods of maximum likelihood and generalized least squares, the parameter vector $\btheta$ is taken to be a vector of free elements that are functionally independent of each other.
Simple equality constraints among parameters can be implemented by a reparameterization, but models whose parameters are related by general functional constraints had not been studied at the time, and the paper provides a statistical basis for constrained generalized least squares estimators.
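As a generic illustration (not taken from the paper), an equality constraint such as $\theta_1=\theta_2$ can be absorbed by replacing both parameters with a single free parameter $\theta^*$,
\[\Sigma(\theta_1,\theta_2,\theta_3,\ldots,\theta_q)=\Sigma(\theta^*,\theta^*,\theta_3,\ldots,\theta_q)\,,\]so that the model is fitted with $q-1$ free parameters; a general functional constraint $\bfh(\btheta)=\0$ cannot be eliminated this way.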
- $\Sigma_o=\Sigma(\btheta_o)$: a $p\times p$ population covariance matrix
- $\sigma_{ij}(\btheta_o), i, j=1,\ldots,p$: elements of $\Sigma_o$, differentiable real-valued functions of a true though unknown $q\times 1$ vector of parameters $\btheta_o$
- $\Omega\subset \IR^q$ with elements $\btheta$: a closed and bounded parameter set
- $\omega$: a subset of $\Omega$ whose elements satisfy the functional relationship $\bfh(\btheta)=\0$
- $\bfh(\btheta)$: an $r\times 1$ real vector-valued continuous function of $\btheta$.
- $S$: the sample covariance matrix obtained from a random sample of size $N=n+1$ from a multivariate normal population with mean vector $\0$ and covariance matrix $\Sigma_o$.
Consider the generalized least squares function,
\[Q(\btheta) = \frac 12 \tr\{[(S-\Sigma)V]^2\}\]which comes from the residual quadratic form $(\bfs - \bsigma)'[\Cov(\bfs, \bfs')]^{-1}(\bfs - \bsigma)$ (Browne, 1974).
Define the constrained generalized least squares estimator $\tilde\btheta$ of $\btheta_o$ as the vector which satisfies $\bfh(\tilde\btheta)=\0$ and minimizes $Q(\btheta)$. It follows from the first-order necessary condition that there exists a vector $\tilde \blambda' = (\tilde \lambda_1,\ldots, \tilde \lambda_r)$ of Lagrange multipliers such that
\[\begin{align} \dot Q(\tilde \btheta) + \tilde L'\tilde \blambda &= \0\\ \bfh(\tilde \btheta) &=\0\,, \end{align}\]where $\dot Q= (\partial Q/\partial \theta_i)$ is the gradient vector of $Q(\btheta)$, and $L=(\partial h_i/\partial \theta_j)$ is an $r\times q$ matrix of partial derivatives with $\tilde L = L(\tilde\btheta)$.
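Below is a minimal numerical sketch of this constrained minimization, assuming user-supplied callables `sigma(theta)` returning $\Sigma(\btheta)$ and `h(theta)` returning $\bfh(\btheta)$, and a weight matrix `V` (e.g. $S^{-1}$); these names, and the use of SLSQP, are illustrative rather than from the paper. The Lagrange multipliers are recovered from the first-order condition above.

```python
import numpy as np
from scipy.optimize import minimize

def Q(theta, S, V, sigma):
    """GLS discrepancy Q(theta) = 1/2 * tr{[(S - Sigma(theta)) V]^2}."""
    R = (S - sigma(theta)) @ V
    return 0.5 * np.trace(R @ R)

def num_jac(f, x, eps=1e-6):
    """Forward-difference Jacobian of a (possibly vector-valued) function f at x."""
    f0 = np.atleast_1d(f(x))
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        xj = x.copy()
        xj[j] += eps
        J[:, j] = (np.atleast_1d(f(xj)) - f0) / eps
    return J

def constrained_gls(theta0, S, V, sigma, h):
    """Minimize Q subject to h(theta) = 0 and back out the Lagrange multipliers."""
    res = minimize(Q, np.asarray(theta0, float), args=(S, V, sigma),
                   constraints=[{"type": "eq", "fun": h}], method="SLSQP")
    theta_t = res.x
    grad = num_jac(lambda t: Q(t, S, V, sigma), theta_t).ravel()   # gradient of Q
    L = num_jac(h, theta_t)                                        # r x q Jacobian of h
    lam = np.linalg.lstsq(L.T, -grad, rcond=None)[0]               # solves L' lam = -grad
    return theta_t, lam, res
```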
The constrained maximum likelihood estimator $\hat\btheta$ of $\btheta_o$ is defined as the vector which satisfies $\bfh(\hat\btheta)=\0$ and minimizes the function
\[F(\btheta) = \log \vert \Sigma\vert +\tr(S\Sigma^{-1}) - \log \vert S\vert -p\,.\]Similarly, we have
\[\begin{align} \dot F(\hat\btheta) + \hat L'\hat\blambda &= \0\\ \bfh(\hat\btheta) &= \0\,. \end{align}\]
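For comparison, a sketch of the ML discrepancy $F(\btheta)$ under the same hypothetical `sigma(theta)`; the same constrained minimization as above can be applied to it.

```python
import numpy as np

def F(theta, S, sigma):
    """ML discrepancy F(theta) = log|Sigma| + tr(S Sigma^{-1}) - log|S| - p."""
    Sig = sigma(theta)
    p = S.shape[0]
    _, logdet_Sig = np.linalg.slogdet(Sig)   # numerically stable log-determinant
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sig + np.trace(S @ np.linalg.inv(Sig)) - logdet_S - p
```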
- The generalized least squares estimator $\tilde\btheta$ is consistent.
- The joint asymptotic distribution of $n^{1/2}(\tilde\btheta-\btheta_o)$ and $n^{1/2}\tilde\blambda$ is multivariate normal with zero mean vector and a covariance matrix given in the paper.
- The generalized least squares estimator $(\tilde \btheta, \tilde\blambda)$ is asymptotically equivalent to the maximum likelihood estimator $(\hat\btheta, \hat\blambda)$.
- The asymptotic distribution of $nQ(\tilde \btheta)$ is chi-square with degrees of freedom $p(p+1)/2-(q-r)$.
Let $\tilde \btheta^*$ be the generalized least squares estimator that is subject to $\bfh^*(\btheta)=\0$, where $\bfh^*(\btheta) = (h_1(\btheta), \ldots,h_j(\btheta))'$.
- The asymptotic distribution of $n[Q(\tilde \btheta) - Q(\tilde \btheta^*)]$ is chi-square with degrees of freedom $r-j$.
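A hedged sketch of both chi-square tests, with illustrative names: `Q_tilde` for $Q(\tilde\btheta)$, `Q_star` for $Q(\tilde\btheta^*)$, and $n=N-1$ as defined above.

```python
from scipy.stats import chi2

def overall_fit_test(Q_tilde, n, p, q, r):
    """n * Q(theta~) against chi-square with p(p+1)/2 - (q - r) degrees of freedom."""
    stat = n * Q_tilde
    df = p * (p + 1) // 2 - (q - r)
    return stat, df, chi2.sf(stat, df)

def difference_test(Q_tilde, Q_star, n, r, j):
    """n * [Q(theta~) - Q(theta~*)] against chi-square with r - j degrees of freedom."""
    stat = n * (Q_tilde - Q_star)
    return stat, r - j, chi2.sf(stat, r - j)
```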
Proposition 4 provides an asymptotic test statistic for testing the null hypothesis
\[H_0: \Sigma_o = \Sigma(\btheta_o),\ \btheta_o\in \omega\]against the general alternative that $\Sigma_o$ is any symmetric positive definite $p\times p$ matrix. Another asymptotic test statistic for the null hypothesis against the specific alternative
\[H_1: \Sigma_o = \Sigma(\btheta_o),\ \btheta_o\in \Omega\]is given by the next proposition. (how & why??)
- The asymptotic distribution of $-2^{-1}n\tilde\blambda'\tilde R^{-}\tilde\blambda$ under $H_0$ is chi-square with degrees of freedom equal to the rank of $R_o$, where $R$ is a matrix defined in the paper (with $\tilde R = R(\tilde\btheta)$ and $R_o = R(\btheta_o)$) and $R^{-}$ denotes a generalized inverse.
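A sketch of the LM statistic exactly as stated, assuming $\tilde\blambda$ and the matrix $\tilde R$ (whose definition from the paper is not reproduced in this post) are already available; `np.linalg.pinv` stands in for one choice of generalized inverse $R^{-}$.

```python
import numpy as np
from scipy.stats import chi2

def lm_test(lam_tilde, R_tilde, n):
    """LM statistic -(1/2) * n * lambda~' R~^- lambda~, df = rank(R~)."""
    stat = -0.5 * n * lam_tilde @ np.linalg.pinv(R_tilde) @ lam_tilde
    df = np.linalg.matrix_rank(R_tilde)
    return stat, df, chi2.sf(stat, df)
```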
It is well known that this test is asymptotically equivalent to Rao’s (1948) score test. These tests are also asymptotically equivalent to the Wald and likelihood ratio (LR) chi-square difference tests.
The LM (Lagrange Multiplier) test for several omitted parameters can be broken down into a series of 1-df tests. Bentler (1983, 1985) developed a forward stepwise LM procedure in which, at each step, the parameter that maximally increases the LM chi-square, given those already included, is added.
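A schematic sketch of such a forward stepwise search (not EQS’s actual implementation): `lm_chisq` is a hypothetical callable returning the joint LM chi-square for a set of candidate parameters, and 3.84 is the 5% critical value of $\chi^2_1$.

```python
def forward_stepwise_lm(candidates, lm_chisq, crit=3.84):
    """Greedily add the omitted parameter giving the largest 1-df LM increment."""
    included, current = [], 0.0
    remaining = list(candidates)
    while remaining:
        # increment in the joint LM chi-square from adding each remaining candidate
        gains = {c: lm_chisq(included + [c]) - current for c in remaining}
        best = max(gains, key=gains.get)
        if gains[best] < crit:          # stop when no single addition is "significant"
            break
        included.append(best)
        remaining.remove(best)
        current += gains[best]
    return included, current
```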
It seems that the most frequent applications of LM tests in SEM are the following:
- $\theta_i=0$: evaluate the necessity of an omitted parameter. This is often, perhaps almost always, post hoc.
- $\theta_i-\theta_j=0$: evaluate the appropriateness of an equality restriction. This can be a priori.
- in EQS (Structural Equation Modeling Software): evaluate constraints across multiple groups such as, for a given parameter, $\theta_i^{(1)}=\theta_i^{(2)}=\ldots=\theta_i^{(g)}$, i.e., differences are zero. This is typically a fully a priori test, e.g., of equal factor loadings across groups.
Simple nonlinear constraints such as $\theta_1 = \theta_2^2$ can be handled with phantom variables and do not require constrained optimization.
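A schematic of the phantom-variable trick (a generic illustration, not from the talk): to impose $\theta_1=\theta_2^2$ on a path from $X$ to $Y$, route the effect through a phantom variable $P$ that has no residual variance and constrain the two path coefficients to be equal,
\[X \xrightarrow{\;\theta_2\;} P \xrightarrow{\;\theta_2\;} Y\,,\]so the implied effect of $X$ on $Y$ is $\theta_2\cdot\theta_2=\theta_2^2$; only a simple equality constraint between the two coefficients is required.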