WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Bayesian Estimation for Linear Regression

Posted on August 13, 2017

Linear regression describes how a random variable $Y$ varies with a set of variables $\boldsymbol x=(x_1,\ldots, x_p)$. It assumes that the conditional expectation $E(Y\mid\boldsymbol x)$ has a form that is linear in a set of parameters:

$$
E(Y\mid\boldsymbol x) = \beta_1x_1+\cdots+\beta_px_p = \boldsymbol\beta^T\boldsymbol x.
$$

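In addition to the linearity assumption, linear regression also assumes that the variability around the mean is i.i.d. from a normal distribution:

$$
\epsilon_1,\ldots,\epsilon_n \overset{\text{i.i.d.}}{\sim} N(0,\sigma^2),\qquad Y_i = \boldsymbol\beta^T\boldsymbol x_i+\epsilon_i.
$$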

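So the regression model under the Gaussian assumption provides a complete specification of the joint probability density of the observed data $y_1,\ldots,y_n$ conditional upon $\boldsymbol x_1,\ldots,\boldsymbol x_n$ and the values of $\boldsymbol \beta$ and $\sigma^2$:

$$
p(y_1,\ldots,y_n\mid \boldsymbol x_1,\ldots,\boldsymbol x_n,\boldsymbol\beta,\sigma^2) = \prod_{i=1}^n p(y_i\mid \boldsymbol x_i,\boldsymbol\beta,\sigma^2) = (2\pi\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n\left(y_i-\boldsymbol\beta^T\boldsymbol x_i\right)^2\right\}.
$$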

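To rewrite this conditional density in terms of the multivariate normal distribution, let $\boldsymbol y$ be the $n$-dimensional column vector $(y_1,\ldots,y_n)^T$, and let $\mathbf X$ be the $n\times p$ matrix whose $i$-th row is $\boldsymbol x_i$. Then we obtain

$$
\boldsymbol y\mid \mathbf X,\boldsymbol\beta,\sigma^2 \sim N(\mathbf X\boldsymbol\beta,\sigma^2\mathbf I),
$$

where $\mathbf I$ is the $n\times n$ identity matrix.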

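Considering the sampling density of the data as a function of $\boldsymbol \beta$, we have

$$
p(\boldsymbol y\mid \mathbf X,\boldsymbol\beta,\sigma^2) \propto \exp\left\{-\frac{1}{2\sigma^2}\text{SSR}(\boldsymbol\beta)\right\} = \exp\left\{-\frac{1}{2\sigma^2}\left[\boldsymbol y^T\boldsymbol y - 2\boldsymbol\beta^T\mathbf X^T\boldsymbol y + \boldsymbol\beta^T\mathbf X^T\mathbf X\boldsymbol\beta\right]\right\},
$$

where $\text{SSR}(\boldsymbol\beta) = (\boldsymbol y-\mathbf X\boldsymbol\beta)^T(\boldsymbol y-\mathbf X\boldsymbol\beta)$ is the residual sum of squares.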
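As a quick numerical check, the residual sum of squares $(\boldsymbol y-\mathbf X\boldsymbol\beta)^T(\boldsymbol y-\mathbf X\boldsymbol\beta)$ and its expanded quadratic form in $\boldsymbol\beta$ can be compared directly. The sketch below uses NumPy on simulated data; the function names `ssr`, `ssr_expanded`, and `log_likelihood`, as well as the simulated design and coefficients, are illustrative assumptions and not part of the original notes.

```python
import numpy as np

def ssr(beta, X, y):
    """Residual sum of squares (y - X beta)^T (y - X beta)."""
    resid = y - X @ beta
    return float(resid @ resid)

def ssr_expanded(beta, X, y):
    """Same quantity written as y'y - 2 beta'X'y + beta'X'X beta."""
    return float(y @ y - 2 * beta @ (X.T @ y) + beta @ (X.T @ X) @ beta)

def log_likelihood(beta, sigma2, X, y):
    """Gaussian log sampling density of y given X, beta and sigma^2."""
    n = len(y)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - ssr(beta, X, y) / (2 * sigma2)

# Simulated data (assumption: 20 observations, 3 predictors, sigma = 0.7).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(scale=0.7, size=20)

print(ssr(beta, X, y), ssr_expanded(beta, X, y))  # the two values should agree
print(log_likelihood(beta, 0.49, X, y))
```

Because the log-likelihood depends on $\boldsymbol\beta$ only through the SSR, any $\boldsymbol\beta$ with a smaller SSR has a higher likelihood for a fixed $\sigma^2$.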

A semiconjugate prior distribution

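If $\boldsymbol \beta\sim N(\boldsymbol\beta_0,\Sigma_0)$, then

$$
\begin{aligned}
p(\boldsymbol\beta\mid \boldsymbol y,\mathbf X,\sigma^2) &\propto p(\boldsymbol y\mid \mathbf X,\boldsymbol\beta,\sigma^2)\, p(\boldsymbol\beta)\\
&\propto \exp\left\{-\frac12\left(-2\boldsymbol\beta^T\mathbf X^T\boldsymbol y/\sigma^2 + \boldsymbol\beta^T\mathbf X^T\mathbf X\boldsymbol\beta/\sigma^2\right) - \frac12\left(-2\boldsymbol\beta^T\Sigma_0^{-1}\boldsymbol\beta_0 + \boldsymbol\beta^T\Sigma_0^{-1}\boldsymbol\beta\right)\right\}\\
&= \exp\left\{\boldsymbol\beta^T\left(\Sigma_0^{-1}\boldsymbol\beta_0 + \mathbf X^T\boldsymbol y/\sigma^2\right) - \frac12\boldsymbol\beta^T\left(\Sigma_0^{-1} + \mathbf X^T\mathbf X/\sigma^2\right)\boldsymbol\beta\right\}.
\end{aligned}
$$

We recognize this as being proportional to a multivariate normal density, with

$$
\begin{aligned}
Var(\boldsymbol\beta\mid \boldsymbol y,\mathbf X,\sigma^2) &= \left(\Sigma_0^{-1} + \mathbf X^T\mathbf X/\sigma^2\right)^{-1},\\
E(\boldsymbol\beta\mid \boldsymbol y,\mathbf X,\sigma^2) &= \left(\Sigma_0^{-1} + \mathbf X^T\mathbf X/\sigma^2\right)^{-1}\left(\Sigma_0^{-1}\boldsymbol\beta_0 + \mathbf X^T\boldsymbol y/\sigma^2\right).
\end{aligned}
$$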
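The conditional mean and variance of $\boldsymbol\beta$ can be computed in a few lines. The following is a minimal NumPy sketch, assuming the standard semiconjugate result $\mathbf V = (\Sigma_0^{-1} + \mathbf X^T\mathbf X/\sigma^2)^{-1}$ and $\mathbf m = \mathbf V(\Sigma_0^{-1}\boldsymbol\beta_0 + \mathbf X^T\boldsymbol y/\sigma^2)$. The function name `beta_conditional` and the simulated data are illustrative assumptions.

```python
import numpy as np

def beta_conditional(X, y, sigma2, beta0, Sigma0):
    """Mean m and covariance V of beta | y, X, sigma^2 under a N(beta0, Sigma0) prior."""
    Sigma0_inv = np.linalg.inv(Sigma0)
    precision = Sigma0_inv + X.T @ X / sigma2       # posterior precision matrix
    V = np.linalg.inv(precision)                    # posterior covariance
    m = V @ (Sigma0_inv @ beta0 + X.T @ y / sigma2) # posterior mean
    return m, V

# Simulated data (assumption: 50 observations, 3 predictors, sigma^2 = 0.49).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.7, size=50)

# With a very diffuse prior, the conditional mean approaches the OLS estimate.
m, V = beta_conditional(X, y, 0.49, np.zeros(3), 1e8 * np.eye(3))
print(m)
print(np.linalg.lstsq(X, y, rcond=None)[0])
```

With a tighter prior (smaller $\Sigma_0$), the same function pulls the mean away from the least-squares estimate and toward $\boldsymbol\beta_0$.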

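As in most normal sampling problems, the semiconjugate prior distribution for $\sigma^2$ is an inverse-gamma distribution. Letting $\gamma = 1/\sigma^2$ be the measurement precision, if $\gamma\sim \Gamma(\nu_0/2,\nu_0\sigma_0^2/2)$, then

$$
\begin{aligned}
p(\gamma\mid \boldsymbol y,\mathbf X,\boldsymbol\beta) &\propto p(\gamma)\, p(\boldsymbol y\mid \mathbf X,\boldsymbol\beta,\gamma)\\
&\propto \left[\gamma^{\nu_0/2-1}\exp\left(-\gamma\,\nu_0\sigma_0^2/2\right)\right]\times\left[\gamma^{n/2}\exp\left(-\gamma\,\text{SSR}(\boldsymbol\beta)/2\right)\right]\\
&= \gamma^{(\nu_0+n)/2-1}\exp\left(-\gamma\left[\nu_0\sigma_0^2+\text{SSR}(\boldsymbol\beta)\right]/2\right),
\end{aligned}
$$

which we recognize as a gamma density, so that

$$
\{\sigma^2\mid \boldsymbol y,\mathbf X,\boldsymbol\beta\}\sim \text{inverse-gamma}\left([\nu_0+n]/2,\,[\nu_0\sigma_0^2+\text{SSR}(\boldsymbol\beta)]/2\right).
$$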
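Sampling $\sigma^2$ from its conditional distribution can be done by drawing the precision $\gamma$ from a gamma distribution and inverting it. The sketch below assumes the shape $(\nu_0+n)/2$ and rate $(\nu_0\sigma_0^2+\text{SSR}(\boldsymbol\beta))/2$ parameterization. Note that NumPy's `Generator.gamma` takes a scale argument, which is the reciprocal of the rate. The function name `sample_sigma2` and the simulated data are illustrative assumptions.

```python
import numpy as np

def sample_sigma2(beta, X, y, nu0, sigma02, rng):
    """Draw sigma^2 | y, X, beta from inverse-gamma((nu0+n)/2, (nu0*sigma02+SSR)/2)."""
    resid = y - X @ beta
    ssr = resid @ resid                       # SSR(beta)
    shape = (nu0 + len(y)) / 2
    rate = (nu0 * sigma02 + ssr) / 2
    gamma = rng.gamma(shape, 1.0 / rate)      # precision draw; scale = 1 / rate
    return 1.0 / gamma                        # invert to get a variance draw

# Simulated data (assumption: 50 observations, 3 predictors, sigma^2 = 0.49).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(scale=0.7, size=50)

draws = np.array([sample_sigma2(beta, X, y, 1.0, 1.0, rng) for _ in range(5000)])
print(draws.mean())  # Monte Carlo estimate of E(sigma^2 | y, X, beta)
```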

We can construct the following Gibbs sampler to approximate the joint posterior distribution $p(\boldsymbol \beta, \sigma^2\mid \boldsymbol y,\mathbf X)$.

Given current values $\{\boldsymbol\beta ^{(s)}, \sigma^{2(s)}\}$, new values can be generated by the following two steps.

Step 1: Updating $\boldsymbol \beta$

a) compute $\mathbf V = Var(\boldsymbol \beta\mid \boldsymbol y,\mathbf X, \sigma^{2(s)})$ and $\mathbf m = E(\boldsymbol \beta\mid\boldsymbol y,\mathbf X,\sigma^{2(s)})$

b) sample $\boldsymbol \beta^{(s+1)}\sim N(\mathbf m, \mathbf V)$

Step 2: Updating $\sigma^2$

a) compute SSR$(\boldsymbol \beta^{(s+1)})$

b) sample $\sigma^{2(s+1)}\sim \text{inverse-gamma}([\nu_0+n]/2,[\nu_0\sigma_0^2+SSR(\boldsymbol\beta^{(s+1)})]/2)$
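The two steps above can be sketched as a short NumPy routine. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `gibbs_linear_regression`, the starting value for $\sigma^2$, the prior settings, the burn-in length, and the simulated data are all choices made here for demonstration.

```python
import numpy as np

def gibbs_linear_regression(X, y, beta0, Sigma0, nu0, sigma02, n_iter=2000, seed=0):
    """Gibbs sampler for (beta, sigma^2) under the semiconjugate prior."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Sigma0_inv = np.linalg.inv(Sigma0)
    XtX, Xty = X.T @ X, X.T @ y
    sigma2 = sigma02                      # starting value (assumption)
    betas = np.empty((n_iter, p))
    sigma2s = np.empty(n_iter)
    for s in range(n_iter):
        # Step 1: update beta from its multivariate normal full conditional.
        V = np.linalg.inv(Sigma0_inv + XtX / sigma2)
        V = (V + V.T) / 2                 # symmetrize against round-off
        m = V @ (Sigma0_inv @ beta0 + Xty / sigma2)
        beta = rng.multivariate_normal(m, V)
        # Step 2: update sigma^2 from its inverse-gamma full conditional.
        resid = y - X @ beta
        ssr = resid @ resid
        rate = (nu0 * sigma02 + ssr) / 2
        sigma2 = 1.0 / rng.gamma((nu0 + n) / 2, 1.0 / rate)
        betas[s], sigma2s[s] = beta, sigma2
    return betas, sigma2s

# Simulated data (assumption: 100 observations, 3 predictors, sigma^2 = 0.49).
rng = np.random.default_rng(42)
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(100, 3))
y = X @ beta_true + rng.normal(scale=0.7, size=100)

betas, sigma2s = gibbs_linear_regression(
    X, y, beta0=np.zeros(3), Sigma0=100 * np.eye(3), nu0=1.0, sigma02=1.0
)
print(betas[500:].mean(axis=0))   # posterior mean of beta after burn-in
print(sigma2s[500:].mean())       # posterior mean of sigma^2 after burn-in
```

Discarding the first 500 draws as burn-in is a conventional but arbitrary choice. In practice one would also inspect trace plots or effective sample sizes before trusting the averages.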

Default and weakly informative prior distributions



P.D. Hoff, A First Course in Bayesian Statistical Methods, Springer Texts in Statistics, DOI 10.1007/978-0-387-92407-6_9, © Springer Science+Business Media, LLC 2009

Published in categories Regression