Sequential Monte Carlo samplers

Posted on Jun 11, 2017

This note is for Moral, P. D., Doucet, A., & Jasra, A. (2006). Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3), 411–436.

Introduction

MCMC’s drawbacks

it is difficult to assess when the Markov chain has reached its stationary regime and it can easily become trapped in local modes.
cannot be used in a sequential Bayesian estimation context.

Sequential importance sampling

Importance sampling

\[\pi_n(x) = \gamma_n(x)/Z_n\qquad (1)\]

where

$\gamma_n(x)$ is known pointwise;
the normalizing constant $Z_n$ is unknown

\[E_{\pi_n}(\phi) = Z_n^{-1}\int_E \phi(x)w_n(x)\eta_n(x)dx\qquad (2)\]

where $\eta_n(x)$ is a positive density with respect to dx, referred to as the importance distribution.

\[Z_n=\int_Ew_n(x)\eta_n(x)dx\qquad (3)\] \[w_n(x)=\gamma_n(x)/\eta_n(x)\qquad (4)\]

We can sample $N$ particles ${X_n^{(i)}}$ from $\eta_n$ and substituting the Monte Carlo approximation

\[\eta_n^N(dx)=\frac 1N \sum\limits_{i=1}^N\delta_{X_n^{(i)}}(dx) \qquad (5)\]

where $\delta$ is Dirac measure

\[\delta_x(A) = 1_A(x) \qquad (6)\]

then we can substitute the Monte Carlo approximation into equations (2) and (3), and we will obtain an approximation for $E_{\pi_n}(\phi)$ and $Z_n$.

Sequential importance sampling

at time $n=1$, start with $\pi_1$, simply, let $\eta_1=\pi_1$

at time $n-1$, we have $N$ particles ${X_{n-1}^{i}}$ distributed according to $\eta_{n-1}$. Move these particles by using a Markov kernel $K_n$, with associated density denoted $K_n(x,x’)$. The particles ${X_n^{(i)}}$ that are obtained this way are marginally distributed according to

\[\eta_n(x')=\int_E \eta_{n-1}(x)K_n(x,x')dx\]

Algorithm settings

sequence of distributions ${\pi_n}$

$\pi_n(x) = p(x\mid y_1,\ldots,y_n)$
$\pi_n(x)\approx \pi (x)^{\phi_n}\mu_1(x)^{1-\phi_n}$
$\pi_n(x)\approx \pi (x)^{\phi_n}$
$\pi_n(x)\approx \pi (x)I_{E_n}(x)$

sequence of transition kernels ${K_n}$

independent proposal.(gaussian)
local random-walk moves.
markov chain monte carlo moves
approximate gibbs moves

Sequential Monte Carlo samplers

Methodology and algorithm

the importance weight can be computed exactly at time 1
$n>1$, impossible to compute $\eta_n(x_n)$ pointwise, as it requires an integration with respect to $x_{1:n-1}$

propose Markov kernels $L_{n-1}: E\times \varepsilon\rightarrow [0,1]$ with density $L_{n-1}(x_n, x_{n-1})$

\[\tilde \pi_n (x_{1:n}) = \tilde \gamma_n(x_{1:n})/Z_n\]

where

\[\tilde \gamma_n(x_{1:n})=\gamma_n(x_n)\prod\limits_{k=1}^{n-1}L_k(x_{k+1},x_k)\]

at time $n-1$, assume that a set of weighted particles $\{W_{n-1}^{(i)}, X_{1:n-1}^{(i)}\}(i=1,\ldots,N)$ approximating $\tilde \pi_{n-1}$ is available

\[\tilde \pi_{n-1}^N(dx_{1:n-1})=\sum\limits_{i=1}^NW_{n-1}^{(i)}\delta_{X_{1:i-1}^{(i)}}(dx_{1:n-1})\] \[W_{(n-1)}^{(i)}=w_{n-1}(X_{1:n-1}^{(i)})/\sum_{j=1}^N w_{n-1}(X_{1:n-1}^{(j)})\]

at time $n$, extend the path of each particle with a markov kernel $K_n(x_{n-1},x_n)$

\[w_n(x_{1:n})=\tilde \gamma_n(x_{1:n})/\eta_n(x_{1:n}) = w_{n-1}(x_{1:n-1})\tilde w_n(x_{n-1},x_n)\]

where the so-called incremental weight $\tilde w_n(x_{n-1},x_n)$ is equal to

\[\tilde w_n(x_{n-1},w_n)=\frac{\gamma_n(x_n)L_{n-1}(x_n,x_{n-1})}{\gamma_{n-1}(x_{n-1}K_n(x_{n-1},x_n))}\]

Algorithm

notion:

$\eta_n(x)$: the importance distribution

$\pi_n$: a target density

step 1: initialization

set $n = 1$

for $i=1,\ldots, N$ draw $X_1^{(i)}\sim \eta_1$

\[w_1(X_1^{(i)})=\gamma_1(X_1^{(i)})/\eta_1(X_1^{(i)})\]

and normalize these weights to obtain ${W_1^{(i)}}$

Iterate steps 3 and 3

step 2: resampling

if ESS < T, resample the particles and set $W_n^{(i)}=1/N$

step 3: sampling

set n = n+1; if n = p+1 stop;

for $i = 1, \ldots, N$ draw $X_n^{(i)} \sim K_n(X_{n-1}^{(i)}, \cdot)$

\[\tilde w_n(x_{n-1}, x_n) = \frac{\gamma_n(x_nL_{n-1}(x_n, x_{n-1}))}{\gamma_{n-1}(x_{n-1}K_n(x_{n-1}, x_n))}\]

normalize the weights

\[W_n^{(i)} = \frac{W_{n-1}^{(i)}\tilde w_n(X_{n-1:n}^{(i)})}{\sum\limits_{j=1}^NW_{n-1}^{(j)}\tilde w_n(X_{n-1:n}^{(j)})}\]

Notes on algorithm

Estimates of target distributions and normalizing constants

\[\pi_n^N(dx) = \sum\limits_{i=1}^NW_n^{(i)}\delta_{X_n^{i}}(dx)\] \[\frac{Z_n}{Z_{n-1}}=\frac{\int \gamma_n(x_n)dx}{\int \gamma_{n-1}(x_{n-1})dx_{n-1}}\] \[\widehat{\frac{Z_n}{Z_{n-1}}}=\sum\limits_{i=1}^NW_{n-1}^{(i)}\tilde w_n(X_{n-1:n}^{(i)})\]

Mixture of Markov kernels

it is to SMC sampling what the MH algorithm is to MCMC sampling

Algorithm settings

optimal backward kernels

suboptimal backward kernels

From MCMC to SMC

Methodology

build a sequence of distributions ${\pi_n}, n=1,\ldots, p$
build a sequence of MCMC transition kernels
based on ${\pi_n}$ and ${K_n}$, build a sequence of artificial backward markov kernels ${L_n}$
use the SMC to approximate ${\pi_n}$ and to estimate ${Z_n}$

Reference

Del Moral, Pierre, Arnaud Doucet, and Ajay Jasra. “Sequential monte carlo samplers.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68.3 (2006): 411-436.

Published in categories Note

← previous next →

See all posts →

WeiYa's Work Yard

A traveler with endless curiosity, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Sequential Monte Carlo samplers

Posted on Jun 11, 2017

Introduction

MCMC’s drawbacks

Sequential importance sampling

Importance sampling

Sequential importance sampling

Algorithm settings

sequence of distributions ${\pi_n}$

sequence of transition kernels ${K_n}$

Sequential Monte Carlo samplers

Methodology and algorithm

Algorithm

step 1: initialization

step 2: resampling

step 3: sampling

Notes on algorithm

Estimates of target distributions and normalizing constants

Mixture of Markov kernels

Algorithm settings

optimal backward kernels

suboptimal backward kernels

From MCMC to SMC

Methodology

Reference