# Poisson Regression

##### Posted on Jul 31, 20170 Comments
Tags: Poisson Regression

## Introduction

The Poisson distribution $P(\mu)$ is often used to model count data. If $Y$ is the number of occurrences, its probability distribution can be written as

$f(y) = \frac{\mu^ye^{-\mu}}{y!},\; y=0,1,2,\ldots$

where $\mu$ is the average number of occurrences. And we can obtain

$E(Y)=\mu\qquad var(Y)=\mu$

## Poisson Regression

Let $Y_1,\ldots, Y_N$ be independent random variables with $Y_i$ denoting the number of events observed from exposure $n_i$ for the $i$-th covariate pattern. We have

$E(Y_i)=\mu_i = n_i\theta_i$

where $\theta_i$ is usually modeled by

$\theta_i=\exp(\boldsymbol x_i^T\boldsymbol \beta)$

Therefore, the generalized linear model is

$E(Y_i)=\mu_i=n_i\exp(\boldsymbol x_i^T\boldsymbol \beta)\qquad Y_i\sim P(\mu_i)$

The natural link function is the logarithmic function

$\log \mu_i = \log n_i + \boldsymbol x_i^T\boldsymbol \beta$

where $\log(n_i)$ called the offset, it is a known constant.

## Measure the goodness of fit

Hypotheses about the parameter $\beta_j$ can be tested using the wald, score or likelihood ration statistics. Approximately, we have

$\frac{b_j-\beta_j}{s.e.(b_j)}\sim N(0, 1)$

The fitted values are given by

$e_i=\hat Y_i=\hat \mu_i=n_i exp(\boldsymbol x_i^T\boldsymbol b), \; i=1,\ldots, N$

The pearson residuals are

$r_i = \frac{o_i-e_i}{\sqrt{e_i}}$

where $o_i$ is the observed value of $Y_i$. The chi-squared goodness of fit statistics are related by

$X^2 = \sum r_i^2 = \sum\frac{(o_i-e_i)^2}{e_i}$

The deviance for a Poisson model can be written in the form

$D=2\sum[o_i\log(o_i/e_i)-(o_i-e_i)]$

For most models, we have $\sum o_i=\sum e_i$, so

$D = 2\sum[o_i\log(o_i/e_i)]$

The deviance residuals are the components of $D$,

$d_i = sign(o_i-e_i)\sqrt{2\sum[o_i\log(o_i/e_i)-(o_i-e_i)]},\; i=1,\ldots, N$

so that $D = \sum d_i^2$.

Using the Taylor series expansion,

$\begin{array}{cc} D &= 2\sum\limits_{i=1}^N[(o_i-e_i)+\frac{1}{2}\frac{(o_i-e_i)^2}{e_i}-(o_i-e_i)]\\ &=\sum\limits_{i=1}^N\frac{(o_i-e_i)^2}{e_i}=X^2 \end{array}$

## Implement

The example which implement in C++ and R can be found on gsl_lm

You can see the code in my github repository gsl_lm/poisson.

## References

An Introduction to Generalized Linear Models, Third Edition (Chapman & Hall/CRC Texts in Statistical Science)

Published in categories Report