Estimate Parameters in Logistic Regression

Posted on Jul 30, 2017
Tags: Logistic Regression, MLE

Background

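Assume $\boldsymbol y$ is an $n\times 1$ response vector with independent components $y_i\sim B(1, \pi_i)$, and let $\boldsymbol x_1,\ldots,\boldsymbol x_p$ be $p$ explanatory variables.

Then the likelihood of $\boldsymbol y$ is

$$
L = \prod_{i=1}^n \pi_i^{y_i}(1-\pi_i)^{1-y_i}\,,
$$

and the log-likelihood is

$$
l = \sum_{i=1}^n\left[y_i\log\pi_i + (1-y_i)\log(1-\pi_i)\right] = \sum_{i=1}^n\left[y_i\log\frac{\pi_i}{1-\pi_i} + \log(1-\pi_i)\right]\,.
$$

The logistic regression model states that

$$
\log\frac{\pi_i}{1-\pi_i} = \beta_0 + \beta_1x_{i1}+\cdots+\beta_px_{ip} = \boldsymbol x_{(i)}^T\boldsymbol\beta\,,
$$

where

$$
\boldsymbol x_{(i)} = (1, x_{i1},\ldots,x_{ip})^T\,,\qquad \boldsymbol\beta = (\beta_0,\beta_1,\ldots,\beta_p)^T\,,
$$

and $x_{ij}$ is the $i$-th element of $\boldsymbol x_j$. Then we have

$$
\pi_i = \frac{\exp(\boldsymbol x_{(i)}^T\boldsymbol\beta)}{1+\exp(\boldsymbol x_{(i)}^T\boldsymbol\beta)}\,,\qquad 1-\pi_i = \frac{1}{1+\exp(\boldsymbol x_{(i)}^T\boldsymbol\beta)}\,.
$$

Thus, the log-likelihood can be written as

$$
l(\boldsymbol\beta) = \sum_{i=1}^n\left[y_i\boldsymbol x_{(i)}^T\boldsymbol\beta - \log\left(1+\exp(\boldsymbol x_{(i)}^T\boldsymbol\beta)\right)\right]\,.
$$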
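As a minimal sketch of how the log-likelihood can be evaluated as a function of $\boldsymbol\beta$, the C++ snippet below computes $\sum_i [y_i\eta_i - \log(1+e^{\eta_i})]$ with $\eta_i$ the linear predictor of observation $i$. The names `Matrix`, `log1p_exp` and `log_likelihood` are illustrative assumptions, not taken from the repository mentioned later; each row of `X` is assumed to start with a 1 for the intercept.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major design matrix: X[i] holds (1, x_i1, ..., x_ip) for observation i.
using Matrix = std::vector<std::vector<double>>;

// Numerically stable evaluation of log(1 + exp(eta)).
double log1p_exp(double eta) {
    return eta > 0.0 ? eta + std::log1p(std::exp(-eta))
                     : std::log1p(std::exp(eta));
}

// l(beta) = sum_i [ y_i * eta_i - log(1 + exp(eta_i)) ],
// where eta_i is the inner product of row i of X with beta.
double log_likelihood(const Matrix& X, const std::vector<double>& y,
                      const std::vector<double>& beta) {
    assert(X.size() == y.size());
    double ll = 0.0;
    for (std::size_t i = 0; i < X.size(); ++i) {
        assert(X[i].size() == beta.size());
        double eta = 0.0;
        for (std::size_t j = 0; j < beta.size(); ++j) eta += X[i][j] * beta[j];
        ll += y[i] * eta - log1p_exp(eta);
    }
    return ll;
}
```

With $\boldsymbol\beta = \boldsymbol 0$ every $\pi_i$ equals $1/2$, so the function should return $-n\log 2$, which is a convenient sanity check.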

Maximum Likelihood Estimate

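We need to find the $\boldsymbol \beta$ that maximizes $l(\boldsymbol \beta)$, which means that

$$
\frac{\partial l}{\partial \boldsymbol\beta} = \boldsymbol 0\,.
$$

We use the multivariate Newton-Raphson algorithm to solve for $\boldsymbol \beta$ numerically.

Let

$$
\mathbf X = (\boldsymbol 1, \boldsymbol x_1,\ldots,\boldsymbol x_p)\,,\qquad \boldsymbol\pi = (\pi_1,\ldots,\pi_n)^T\,,
$$

where the $i$-th row of $\mathbf X$ is $\boldsymbol x_{(i)}^T = (1, x_{i1},\ldots,x_{ip})$, then

$$
\frac{\partial l}{\partial \boldsymbol\beta} = \sum_{i=1}^n (y_i-\pi_i)\boldsymbol x_{(i)} = \mathbf X^T(\boldsymbol y - \boldsymbol\pi)\,,
$$

so

$$
\mathbf H = \frac{\partial^2 l}{\partial \boldsymbol\beta\,\partial \boldsymbol\beta^T} = -\sum_{i=1}^n \pi_i(1-\pi_i)\boldsymbol x_{(i)}\boldsymbol x_{(i)}^T = -\mathbf X^T\mathbf W\mathbf X\,,
$$

where

$$
\mathbf W = \operatorname{diag}\left(\pi_1(1-\pi_1),\ldots,\pi_n(1-\pi_n)\right)\,.
$$

The Newton-Raphson formula for the multivariate problem is

$$
\boldsymbol\beta^{(t+1)} = \boldsymbol\beta^{(t)} - \mathbf H^{-1}\frac{\partial l}{\partial \boldsymbol\beta} = \boldsymbol\beta^{(t)} + (\mathbf X^T\mathbf W\mathbf X)^{-1}\mathbf X^T(\boldsymbol y - \boldsymbol\pi)\,,
$$

with $\mathbf W$ and $\boldsymbol\pi$ evaluated at $\boldsymbol\beta^{(t)}$.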
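The iteration can be sketched in plain C++ as follows. This is a sketch under stated assumptions rather than the repository's GSL-based code: the helper names `solve` and `fit_logistic`, the zero starting value, the iteration cap and the tolerance are all illustrative choices, and the linear system is solved with a hand-written Gaussian elimination instead of a library routine. Each row of `X` is assumed to start with a 1 for the intercept.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Solve A d = b by Gaussian elimination with partial pivoting.
// A and b are taken by value because they are overwritten.
Vector solve(Matrix A, Vector b) {
    const std::size_t m = b.size();
    for (std::size_t k = 0; k < m; ++k) {
        std::size_t piv = k;
        for (std::size_t r = k + 1; r < m; ++r)
            if (std::fabs(A[r][k]) > std::fabs(A[piv][k])) piv = r;
        std::swap(A[k], A[piv]);
        std::swap(b[k], b[piv]);
        assert(A[k][k] != 0.0);  // X^T W X must be nonsingular
        for (std::size_t r = k + 1; r < m; ++r) {
            const double f = A[r][k] / A[k][k];
            for (std::size_t c = k; c < m; ++c) A[r][c] -= f * A[k][c];
            b[r] -= f * b[k];
        }
    }
    Vector d(m);
    for (std::size_t k = m; k-- > 0;) {  // back substitution
        double s = b[k];
        for (std::size_t c = k + 1; c < m; ++c) s -= A[k][c] * d[c];
        d[k] = s / A[k][k];
    }
    return d;
}

// Newton-Raphson update: beta <- beta + (X^T W X)^{-1} X^T (y - pi).
// max_iter and tol are illustrative defaults, not values from the post.
Vector fit_logistic(const Matrix& X, const Vector& y,
                    int max_iter = 25, double tol = 1e-10) {
    assert(!X.empty() && X.size() == y.size());
    const std::size_t n = X.size(), m = X[0].size();
    Vector beta(m, 0.0);  // start from beta = 0
    for (int it = 0; it < max_iter; ++it) {
        Vector g(m, 0.0);             // gradient X^T (y - pi)
        Matrix A(m, Vector(m, 0.0));  // negative Hessian X^T W X
        for (std::size_t i = 0; i < n; ++i) {
            double eta = 0.0;
            for (std::size_t j = 0; j < m; ++j) eta += X[i][j] * beta[j];
            const double pi = 1.0 / (1.0 + std::exp(-eta));
            const double w = pi * (1.0 - pi);
            for (std::size_t j = 0; j < m; ++j) {
                g[j] += (y[i] - pi) * X[i][j];
                for (std::size_t k = 0; k < m; ++k)
                    A[j][k] += w * X[i][j] * X[i][k];
            }
        }
        const Vector d = solve(A, g);
        double step = 0.0;
        for (std::size_t j = 0; j < m; ++j) {
            beta[j] += d[j];
            step = std::max(step, std::fabs(d[j]));
        }
        if (step < tol) break;  // converged: the update is negligible
    }
    return beta;
}
```

For an intercept-only model the maximum likelihood estimate is the logit of the sample proportion, so calling `fit_logistic` on a column of ones with responses 1, 1, 1, 0 should return a value close to $\log 3$.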

Wald Test for Parameters

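In the univariate case, the Wald statistic is

$$
\frac{(\hat\theta-\theta_0)^2}{\operatorname{var}(\hat\theta)}\,,
$$

where $\theta_0$ is the parameter value under the null hypothesis. The statistic is compared against a chi-squared distribution with one degree of freedom.

Alternatively, the difference can be compared to a normal distribution. In this case, the test statistic is

$$
\frac{\hat\theta-\theta_0}{se(\hat\theta)}\,,
$$

where $se(\hat\theta)$ is the standard error of the maximum likelihood estimate. If $\mathbf H$ is the Hessian of the log-likelihood function $l$ evaluated at the maximum likelihood estimate, then the vector $\sqrt{\operatorname{diag}((-\mathbf H)^{-1})}$ estimates the standard error of each parameter.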
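The standard errors and the normal-form Wald statistics can be sketched in C++ as below. The sketch assumes the null hypothesis $\beta_j = 0$ for each coefficient and uses the fact that, for logistic regression, $-\mathbf H = \mathbf X^T\mathbf W\mathbf X$. The names `WaldResult`, `invert` and `wald_test` are hypothetical, and the matrix inverse is a hand-written Gauss-Jordan elimination rather than a library call.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

struct WaldResult {
    Vector se;  // sqrt(diag((-H)^{-1}))
    Vector z;   // beta_hat_j / se_j, testing beta_j = 0
    Vector p;   // two-sided p-values from the standard normal
};

// Invert a square matrix by Gauss-Jordan elimination with partial pivoting.
Matrix invert(Matrix A) {
    const std::size_t m = A.size();
    Matrix inv(m, Vector(m, 0.0));
    for (std::size_t j = 0; j < m; ++j) inv[j][j] = 1.0;
    for (std::size_t k = 0; k < m; ++k) {
        std::size_t piv = k;
        for (std::size_t r = k + 1; r < m; ++r)
            if (std::fabs(A[r][k]) > std::fabs(A[piv][k])) piv = r;
        std::swap(A[k], A[piv]);
        std::swap(inv[k], inv[piv]);
        assert(A[k][k] != 0.0);  // matrix must be nonsingular
        const double d = A[k][k];
        for (std::size_t c = 0; c < m; ++c) { A[k][c] /= d; inv[k][c] /= d; }
        for (std::size_t r = 0; r < m; ++r) {
            if (r == k) continue;
            const double f = A[r][k];
            for (std::size_t c = 0; c < m; ++c) {
                A[r][c] -= f * A[k][c];
                inv[r][c] -= f * inv[k][c];
            }
        }
    }
    return inv;
}

// X: rows (1, x_i1, ..., x_ip); beta_hat: the maximum likelihood estimate.
WaldResult wald_test(const Matrix& X, const Vector& beta_hat) {
    assert(!X.empty());
    const std::size_t m = beta_hat.size();
    Matrix A(m, Vector(m, 0.0));  // -H = X^T W X at beta_hat
    for (const Vector& row : X) {
        assert(row.size() == m);
        double eta = 0.0;
        for (std::size_t j = 0; j < m; ++j) eta += row[j] * beta_hat[j];
        const double pi = 1.0 / (1.0 + std::exp(-eta));
        const double w = pi * (1.0 - pi);
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t k = 0; k < m; ++k) A[j][k] += w * row[j] * row[k];
    }
    const Matrix cov = invert(A);
    WaldResult out;
    for (std::size_t j = 0; j < m; ++j) {
        const double se = std::sqrt(cov[j][j]);
        const double z = beta_hat[j] / se;
        out.se.push_back(se);
        out.z.push_back(z);
        out.p.push_back(std::erfc(std::fabs(z) / std::sqrt(2.0)));
    }
    return out;
}
```

Squaring each entry of `z` gives the chi-squared form of the statistic with one degree of freedom, so the two versions of the test lead to the same p-value.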

Implementation in C++ and R

The algorithm above is implemented in C++ and applied to an example, and the results are compared with those of glm(..., family = binomial()) in R.

For more details, see my GitHub repository gsl_lm/logit.

Published in categories Report