Change Points
Introduction
- Page’s (1954, 1955) classical formulation
- Shiryaev (1963) and Lorden (1971) then developed the theory further
One is concerned with sequential detection of a change-point, which represents a disruption in a continuous production process.
- problems in fixed samples
- return to sequential detection motivated by problems involving parallel streams of data subject to disruptions in some fraction of them.
- does not discuss applications to finance
Page’s problem
Suppose $X_1,\ldots,X_m$ are independent observations.
- for $j\le K$, have the distribution $F_0$
- for $j > K$, have the distribution $F_1$
where $F_i$ may be completely specified or may depend on unknown parameters.
Page’s solution and Barnard’s suggestion
For sequential detection, Page (1954) suggested the stopping rule
\[N_0 = \min\{t:S_t-\min_{0\le k\le t}S_k\ge b\}\,,\]
where $S_t$ is the $t$-th cumulative sum (CUSUM) of scores $Z(X_i)$.
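As a concrete illustration (my own sketch, not from Page), here is the rule $N_0$ for a mean shift in unit-variance normal data, using scores $Z(X_i)=\delta(X_i-\delta/2)$; the shift $\delta$, threshold $b$, and simulated data are illustrative choices. With these log-likelihood-ratio scores the same recursion also gives Lorden’s rule $N_2$ below.

```python
import numpy as np

def page_cusum_stopping_time(x, delta=1.0, b=5.0):
    """Page's rule N_0: stop at the first t with S_t - min_{0<=k<=t} S_k >= b.

    Scores are Z(x_i) = delta * (x_i - delta/2), the log-likelihood ratio
    for a shift from N(0, 1) to N(delta, 1) (an illustrative choice).
    Returns the stopping time (1-based) or None if no alarm is raised.
    """
    s = 0.0            # cumulative sum S_t of scores
    running_min = 0.0  # min_{0<=k<=t} S_k, with S_0 = 0
    for t, xi in enumerate(x, start=1):
        s += delta * (xi - delta / 2.0)
        running_min = min(running_min, s)
        if s - running_min >= b:
            return t
    return None

# illustrative use: change from N(0,1) to N(1,1) after observation 100
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
print(page_cusum_stopping_time(x, delta=1.0, b=5.0))
```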
Barnard (1959) discussed graphical methods for implementing Page’s sequential procedure and suggested a modified procedure for the case of normally distributed random variables with a mean value subject to change from an initial value of 0. Letting $S_t = \sum_{i=1}^t X_i$, Barnard suggested the stopping rule
\[N = \min\{t:\max_{0\le k < t}\vert S_t-S_k\vert / [\sigma(t-k)^{1/2}]\ge b\}\,.\]Note that if $F_1$ ($F_0$) denotes a normal distribution with unit variance and mean value $\mu_1=\delta$ ($\mu_0=0$), the log-likelihood ratio at $t>K$ is
\[\delta \sum_{i=K+1}^t(X_i-\delta/2)\,.\]Maximization with respect to $\delta$ and $k < t$ (replacing $K$ by a candidate value $k$; for fixed $k$ the maximum over $\delta$ is attained at $\hat\delta=(S_t-S_k)/(t-k)$) leads to $\max_{0\le k < t}(S_t-S_k)^2/[2(t-k)]$, so Barnard’s suggestion can be described as stopping as soon as the generalized likelihood ratio statistic exceeds a suitable threshold.
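To make Barnard’s statistic concrete, here is a minimal sketch (my own, not from Barnard’s paper) of the stopping rule $N$; the threshold $b$, $\sigma=1$, and the simulated data are illustrative.

```python
import numpy as np

def barnard_glr_stopping_time(x, sigma=1.0, b=3.5):
    """Barnard's rule: stop when max_{0<=k<t} |S_t - S_k| / (sigma*sqrt(t-k)) >= b.

    Squaring and halving the maximized quantity gives the generalized
    likelihood ratio statistic max_k (S_t - S_k)^2 / [2 sigma^2 (t - k)].
    Returns the stopping time (1-based) or None if no alarm is raised.
    """
    s = np.concatenate([[0.0], np.cumsum(x)])  # s[k] = S_k, with S_0 = 0
    for t in range(1, len(x) + 1):
        k = np.arange(t)                       # candidate change points k < t
        stat = np.abs(s[t] - s[k]) / (sigma * np.sqrt(t - k))
        if stat.max() >= b:
            return t
    return None

# illustrative use: mean shifts from 0 to 0.8 after observation 100
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(0.8, 1.0, 100)])
print(barnard_glr_stopping_time(x, b=3.5))
```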
Shiryaev’s and Lorden’s contributions
Shiryaev (1963) considered the case of completely specified $F_0$ and $F_1$. He assumed that $K$ is random and used optimal stopping theory to describe an exact solution to a well-formulated Bayesian version of the problem, and he computed the Bayes solution in a continuous time formulation involving Brownian motion.
- loss: $\mathbf{1}\{K>n\}+C(n-K)^+$
- approximation: a geometric prior in which the probability of a change in any bounded interval is vanishingly small (see the sketch below)
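To make this limiting approximation concrete (my own sketch, not from Shiryaev’s paper): under a geometric prior the posterior odds of a change having already occurred satisfy a one-step recursion, and letting the prior parameter tend to zero yields $R_t=(1+R_{t-1})\Lambda_t$ with $\Lambda_t=(dF_1/dF_0)(X_t)$, commonly known as the Shiryaev–Roberts statistic. The normal choices of $F_0$, $F_1$ and the threshold below are illustrative.

```python
import numpy as np
from scipy.stats import norm

def shiryaev_roberts_stopping_time(x, mu1=1.0, threshold=100.0):
    """Stop when R_t >= threshold, where R_t = (1 + R_{t-1}) * Lambda_t, R_0 = 0.

    Lambda_t is the likelihood ratio (dF_1/dF_0)(X_t); here F_0 = N(0, 1)
    and F_1 = N(mu1, 1) are illustrative choices of the two distributions.
    Returns the stopping time (1-based) or None if no alarm is raised.
    """
    r = 0.0
    for t, xi in enumerate(x, start=1):
        lam = norm.pdf(xi, loc=mu1) / norm.pdf(xi, loc=0.0)
        r = (1.0 + r) * lam
        if r >= threshold:
            return t
    return None
```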
Lorden (1971) took a maximum likelihood approach, in the case of two completely specified distributions, leading to the stopping rule
\[N_2 = \min\left\{t : \max_{0\le k\le t}\sum_{j=k+1}^t\log[(dF_1/dF_0)(X_j)]\ge b\right\}\,.\]
Some related fixed sample problems
Having observed $X_1,\ldots,X_m$, suppose we are interested in testing the hypothesis that there is no change-point. The statistic suggested by Page was
\[\max_{0\le k\le m}\sum_{j=k+1}^m [\delta(X_j-\delta/2)]\,,\]which is the log-likelihood ratio statistic when $\delta$ is specified.
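A minimal sketch of this fixed-sample statistic (my own illustration; $\delta$ is assumed known, and the null distribution of the maximum would still need to be calibrated, e.g. by simulation):

```python
import numpy as np

def fixed_sample_change_statistic(x, delta=1.0):
    """Compute max_{0<=k<=m} sum_{j=k+1}^m delta*(x_j - delta/2).

    The scores z_j = delta*(x_j - delta/2) are the log-likelihood ratio
    terms for a shift from N(0, 1) to N(delta, 1); the maximum over k is
    the maximized log-likelihood ratio for a single change point.
    """
    z = delta * (np.asarray(x, dtype=float) - delta / 2.0)
    tail_sums = np.concatenate([z[::-1].cumsum()[::-1], [0.0]])  # sum_{j>k} z_j for k = 0..m
    return tail_sums.max()
```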
Hypothesis Testing When a Nuisance Parameter Is Only Present under the Alternative
“semi-linear” regression example:
\[Y_i=\alpha+\beta f_i(\theta)+e_i\,,\]where $f$ is nonlinear and $\theta$ can be multidimensional.
The hypothesis to be tested is that $\beta = 0$; under this hypothesis the parameter $\theta$ has no meaning. The special case $f_i(\theta) = (x_i-\theta)^+$ is in the spirit of a change-point problem, where the change occurs in the slope of a linear regression.
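One way to make this concrete (my own sketch, not from the original text) is to profile out $\theta$ over a grid: for each candidate $\theta$ fit $(\alpha,\beta)$ by least squares, and compare the best fit to the null fit with $\beta=0$. The resulting maximum does not follow the usual $\chi^2_1$ calibration precisely because $\theta$ is unidentified under the null.

```python
import numpy as np

def profile_glr_broken_stick(x, y, theta_grid):
    """Profile statistic for testing beta = 0 in y = alpha + beta*(x - theta)^+ + e.

    For each theta on the grid, fit the two-parameter linear model by least
    squares and record the residual sum of squares; the returned statistic
    compares the best such fit with the intercept-only (null) fit.  Its null
    distribution is nonstandard because theta has no meaning when beta = 0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rss_null = np.sum((y - y.mean()) ** 2)              # beta = 0 fit
    best_rss = np.inf
    for theta in theta_grid:
        f = np.maximum(x - theta, 0.0)                  # f_i(theta) = (x_i - theta)^+
        design = np.column_stack([np.ones_like(x), f])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        best_rss = min(best_rss, np.sum((y - design @ coef) ** 2))
    return len(y) * np.log(rss_null / best_rss)         # GLR-type statistic for normal errors
```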