Lagrange Multiplier Test
This post is based on Peter Bentler's talk, "S.-Y. Lee's Lagrange Multiplier Test in Structural Modeling: Still Useful?", given at the International Statistical Conference in Memory of Professor Sik-Yum Lee.
In the structured linear models of multivariate random data, $\Sigma = \Sigma(\theta)$ is a $p \times p$ covariance matrix whose elements are assumed to be differentiable real-valued functions of a true though unknown $q \times 1$ vector of parameters $\theta$, and the primary statistical problems involve
- estimation of parameters of the model,
- establishing properties of the estimators,
- evaluating goodness of fit of competing models.
In the methods of maximum likelihood and generalized least squares, the parameter vector $\theta$ is taken to represent a vector of free elements that are functionally independent of each other.
Simple equality constraints among parameters can be implemented by a reparameterization (e.g., $\theta_1 = \theta_2$ is imposed by replacing both with a single free parameter), but models containing parameters related by general functional constraints had not been studied at the time of the paper; the paper provides a statistical basis for constrained generalized least squares estimators.
- $\Sigma_o = \Sigma(\theta_o)$: a $p \times p$ population covariance matrix
- $\sigma_{ij}(\theta_o),\ i, j = 1, \dots, p$: the elements of $\Sigma_o$, differentiable real-valued functions of a true though unknown $q \times 1$ vector of parameters $\theta_o$
- $\Omega \subset \mathbb{R}^q$: a closed and bounded parameter set with elements $\theta$
- $\omega$: the subset of $\Omega$ whose elements satisfy the functional relationship $\mathbf{h}(\theta) = 0$
- $\mathbf{h}(\theta)$: an $r \times 1$ real vector-valued continuous function of $\theta$
- $S$: the sample covariance matrix obtained from a random sample of size $N = n + 1$ from a multivariate normal population with mean vector $\mathbf{0}$ and covariance matrix $\Sigma_o$
Consider the generalized least squares function

$$Q(\theta) = \tfrac{1}{2} \operatorname{tr}\{[(S - \Sigma)V]^2\},$$

where $V$ is a positive definite weight matrix, which comes from the residual quadratic form $(\mathbf{s} - \sigma)'[\operatorname{Cov}(\mathbf{s}, \mathbf{s}')]^{-1}(\mathbf{s} - \sigma)$ (Browne, 1974).
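To make this concrete, here is a minimal numpy sketch of $Q(\theta)$. The two-parameter structure $\Sigma(\theta) = \theta_1 I + \theta_2 \mathbf{1}\mathbf{1}'$ and the weight choice $V = S^{-1}$ are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Toy covariance structure, chosen only for illustration:
# Sigma(theta) = theta_1 * I + theta_2 * 11'
def sigma(theta, p=4):
    return theta[0] * np.eye(p) + theta[1] * np.ones((p, p))

def Q(theta, S, V):
    """GLS discrepancy Q(theta) = tr{[(S - Sigma(theta)) V]^2} / 2."""
    R = (S - sigma(theta)) @ V
    return 0.5 * np.trace(R @ R)

rng = np.random.default_rng(0)
p, N = 4, 200
X = rng.multivariate_normal(np.zeros(p), sigma([1.0, 0.5]), size=N)
S = np.cov(X, rowvar=False)     # sample covariance from N = n + 1 draws
V = np.linalg.inv(S)            # a common weight choice: V = S^{-1}
print(Q([1.0, 0.5], S, V))
```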
Define the constrained generalized least squares estimator $\tilde\theta$ of $\theta_o$ as the vector that satisfies $\mathbf{h}(\tilde\theta) = 0$ and minimizes $Q(\theta)$. It follows from the first-order necessary conditions that there exists a vector $\tilde\lambda' = (\tilde\lambda_1, \dots, \tilde\lambda_r)$ of Lagrange multipliers such that

$$\dot{Q}(\tilde\theta) + \tilde{L}'\tilde\lambda = 0, \qquad \mathbf{h}(\tilde\theta) = 0,$$

where $\dot{Q} = (\partial Q / \partial \theta_i)$ is the gradient vector of $Q(\theta)$, and $L = (\partial h_i / \partial \theta_j)$ is an $r \times q$ matrix of partial derivatives with $\tilde{L} = L(\tilde\theta)$.
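Numerically, $\tilde\theta$ and $\tilde\lambda$ can be obtained together from any constrained optimizer that reports multipliers. A sketch with scipy's trust-constr method, reusing the toy $\Sigma(\theta)$ above and a hypothetical single constraint $h(\theta) = \theta_1 - 2\theta_2$:

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

def sigma(theta, p=4):
    return theta[0] * np.eye(p) + theta[1] * np.ones((p, p))

def Q(theta, S, V):
    R = (S - sigma(theta)) @ V
    return 0.5 * np.trace(R @ R)

rng = np.random.default_rng(0)
p, N = 4, 200
X = rng.multivariate_normal(np.zeros(p), sigma([1.0, 0.5]), size=N)
S = np.cov(X, rowvar=False)
V = np.linalg.inv(S)

h = lambda th: th[0] - 2.0 * th[1]            # hypothetical constraint, r = 1
res = minimize(Q, x0=[1.0, 0.4], args=(S, V), method="trust-constr",
               constraints=[NonlinearConstraint(h, 0.0, 0.0)])
theta_tilde, lambda_tilde = res.x, res.v[0]   # res.v holds the multipliers
print(theta_tilde, lambda_tilde)
```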
The constrained maximum likelihood estimator $\hat\theta$ of $\theta_o$ is defined as the vector that satisfies $\mathbf{h}(\hat\theta) = 0$ and minimizes the function

$$F(\theta) = \log|\Sigma| + \operatorname{tr}(S\Sigma^{-1}) - \log|S| - p.$$

Similarly, we have

$$\dot{F}(\hat\theta) + \hat{L}'\hat\lambda = 0, \qquad \mathbf{h}(\hat\theta) = 0.$$
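The ML discrepancy can be sketched the same way; `slogdet` handles the log-determinants, and the toy $\Sigma(\theta)$ and placeholder $S$ are again only illustrative:

```python
import numpy as np

def sigma(theta, p=4):
    return theta[0] * np.eye(p) + theta[1] * np.ones((p, p))

def F(theta, S):
    """ML discrepancy F = log|Sigma| + tr(S Sigma^{-1}) - log|S| - p."""
    Sig = sigma(theta)
    _, logdet_sig = np.linalg.slogdet(Sig)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sig + np.trace(S @ np.linalg.inv(Sig)) - logdet_s - S.shape[0]

S = np.diag([1.2, 0.9, 1.1, 1.0])   # placeholder sample covariance
print(F([1.0, 0.0], S))             # F >= 0, equal to 0 iff Sigma(theta) = S
```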
- The generalized least squares estimator $\tilde\theta$ is consistent.
- The joint asymptotic distribution of $n^{1/2}(\tilde\theta - \theta_o)$ and $n^{1/2}\tilde\lambda$ is multivariate normal with zero mean vector and a covariance matrix given in the paper.
- The generalized least squares estimator $(\tilde\theta, \tilde\lambda)$ is asymptotically equivalent to the maximum likelihood estimator $(\hat\theta, \hat\lambda)$.
- The asymptotic distribution of $nQ(\tilde\theta)$ is chi-square with $p(p+1)/2 - (q - r)$ degrees of freedom; a small numeric illustration follows this list.
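A numeric illustration of the test of fit implied by the last proposition; all numbers are placeholders, not results from the paper:

```python
from scipy.stats import chi2

n, p, q, r = 199, 4, 2, 1
df = p * (p + 1) // 2 - (q - r)   # here 10 - 1 = 9
stat = n * 0.031                  # hypothetical minimized Q(theta_tilde)
print(df, chi2.sf(stat, df))      # large p-value -> no evidence of misfit
```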
Let $\tilde\theta^*$ be the generalized least squares estimator subject only to $\mathbf{h}^*(\theta) = 0$, where $\mathbf{h}^*(\theta) = (h_1(\theta), \dots, h_j(\theta))'$ consists of the first $j$ of the $r$ constraints.

- The asymptotic distribution of $n[Q(\tilde\theta) - Q(\tilde\theta^*)]$ is chi-square with $r - j$ degrees of freedom (see the sketch below).
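A sketch of the resulting chi-square difference test with placeholder values. Since $\tilde\theta$ satisfies all $r$ constraints and $\tilde\theta^*$ only the first $j$, we have $Q(\tilde\theta) \ge Q(\tilde\theta^*)$:

```python
from scipy.stats import chi2

n, r, j = 199, 3, 1
Q_all, Q_first_j = 0.052, 0.031      # hypothetical minimized Q values
stat = n * (Q_all - Q_first_j)
print(chi2.sf(stat, r - j))          # tests the extra r - j constraints
```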
Proposition 4 provides an asymptotic test statistic for testing the null hypothesis

$$H_0: \Sigma_o = \Sigma(\theta_o), \quad \theta_o \in \omega$$

against the general alternative that $\Sigma_o$ is any symmetric positive definite $p \times p$ matrix. Another asymptotic test statistic for the null hypothesis against the specific alternative

$$H_1: \Sigma_o = \Sigma(\theta_o), \quad \theta_o \in \Omega$$

is given by the next proposition. (how & why??)
- Under $H_0$, the asymptotic distribution of $-2^{-1} n \tilde\lambda' \tilde{R}^{-} \tilde\lambda$ is chi-square with degrees of freedom equal to the rank of $R_o$ (a numeric transcription follows).
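A literal transcription of this statistic, reading $\tilde{R}^{-}$ as a generalized inverse and taking the degrees of freedom as the rank of $\tilde{R}$. In practice $\tilde\lambda$ and $\tilde{R}$ come from the constrained fit and the paper's definitions, so the values below are pure placeholders:

```python
import numpy as np
from scipy.stats import chi2

n = 199
lambda_tilde = np.array([0.8])       # placeholder multiplier from the fit
R_tilde = np.array([[-4.0]])         # placeholder; negative definite, so the
                                     # statistic below comes out nonnegative
stat = -0.5 * n * lambda_tilde @ np.linalg.pinv(R_tilde) @ lambda_tilde
df = np.linalg.matrix_rank(R_tilde)
print(stat, chi2.sf(stat, df))
```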
It is well known that this test is asymptotically equivalent to Rao's (1948) score test. These tests are also asymptotically equivalent to the Wald and likelihood ratio (LR) chi-square difference tests.
The LM (Lagrange multiplier) test for several omitted parameters can be broken down into a series of 1-df tests. Bentler (1983, 1985) developed a forward stepwise LM procedure in which, at each step, the parameter is chosen that maximally increases the LM chi-square, contingent on those already included; a sketch of the idea follows.
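A sketch of the stepwise idea under stated assumptions: `lm_chisq(included, candidate)` is a hypothetical callback into the fitting machinery that returns the 1-df LM chi-square for adding `candidate` given `included`; it is not a real library function.

```python
def stepwise_lm(candidates, lm_chisq, threshold=3.84):   # 3.84 ~ chi2(1), .05
    """Greedy forward selection over omitted parameters via 1-df LM tests."""
    included = []
    while candidates:
        scores = {c: lm_chisq(included, c) for c in candidates}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:    # stop once no 1-df test is significant
            break
        included.append(best)
        candidates.remove(best)
    return included
```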
It seems that the most frequent applications of LM tests in SEM are the following:

- $\theta_i = 0$: evaluate the necessity of an omitted parameter. This is often (perhaps almost always) post hoc.
- $\theta_i - \theta_j = 0$: evaluate the appropriateness of an equality restriction. This can be a priori.
- in EQS (structural equation modeling software): evaluate constraints across multiple groups such as, for a given parameter, $\theta_i^{(1)} = \theta_i^{(2)} = \dots = \theta_i^{(g)}$, i.e., the between-group differences are zero. This is typically a fully a priori test, e.g., of equal factor loadings across groups (see the sketch after this list).
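One way to write the cross-group equality constraint as $\mathbf{h}(\theta) = 0$ is via consecutive differences; `theta_by_group` is a hypothetical list of per-group parameter vectors:

```python
import numpy as np

def h_equal_across_groups(theta_by_group, i):
    """g - 1 constraint equations forcing parameter i equal across g groups."""
    vals = np.array([theta[i] for theta in theta_by_group])
    return vals[:-1] - vals[1:]

print(h_equal_across_groups([np.array([0.7, 1.0]),
                             np.array([0.7, 1.2]),
                             np.array([0.9, 0.8])], i=0))   # -> [ 0.  -0.2]
```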
Simple nonlinear constraints such as $\theta_1 = \theta_2^2$ can be handled with phantom variables and do not require constrained optimization; the sketch below shows the underlying reparameterization trick.
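A minimal sketch of the reparameterization that phantom variables implement: optimize over $\theta_2$ alone and construct $\theta_1 = \theta_2^2$ inside the objective. `objective` is a hypothetical stand-in for $Q$ or $F$:

```python
import numpy as np
from scipy.optimize import minimize

def objective(theta):                   # placeholder for Q or F
    return (theta[0] - 1.0) ** 2 + (theta[1] - 0.9) ** 2

def unconstrained(free):
    t2 = free[0]
    return objective(np.array([t2 ** 2, t2]))   # theta_1 = theta_2^2 built in

res = minimize(unconstrained, x0=[1.0])
print(res.x[0] ** 2, res.x[0])          # implied theta_1 and free theta_2
```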