WeiYa's Work Yard

A traveler with endless curiosity, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Change Points

Tags: Change Points

Introduction

  1. Page’s (1954, 1955) classical formulation
  2. Shiryaev (1963) and Lorden (1971) then developed it further

One is concerned with sequential detection of a change-point, which represents a disruption in a continuous production process.

  1. problems in fixed samples
  2. a return to sequential detection, motivated by problems involving parallel streams of data subject to disruptions in some fraction of them
  3. applications to finance are not discussed

Page’s problem

Suppose $X_1,\ldots,X_m$ are independent observations.

  • for $j\le K$, the $X_j$ have the distribution $F_0$
  • for $j>K$, the $X_j$ have the distribution $F_1$

where $F_i$ may be completely specified or may depend on unknown parameters.

Page’s solution and Barnard’s Suggestion

For sequential detection, Page (1954) suggested the stopping rule

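$$N_0=\min\Big\{t: S_t-\min_{0\le k\le t}S_k\ge b\Big\},$$

where $S_t=\sum_{i=1}^{t}Z(X_i)$ is the $t$-th cumulative sum (CUSUM) of scores $Z(X_i)$. (The source writes the index of the outer minimum as $n$; it should be $t$.)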
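A minimal sketch of Page's rule, tracking the running sum and its running minimum in one pass. The function name, the example data, and the choice of score $Z(x)=\delta(x-\delta/2)$ with $\delta=1$ are assumptions made for illustration only; the notes do not fix a particular score.

```python
from typing import Callable, Iterable, Optional


def page_cusum_stopping_time(
    xs: Iterable[float],
    score: Callable[[float], float],
    b: float,
) -> Optional[int]:
    """First t (1-indexed) with S_t - min_{0<=k<=t} S_k >= b, or None if never reached."""
    s = 0.0      # S_t, cumulative sum of scores, with S_0 = 0
    s_min = 0.0  # running minimum of S_k over 0 <= k <= t
    for t, x in enumerate(xs, start=1):
        s += score(x)
        s_min = min(s_min, s)
        if s - s_min >= b:
            return t
    return None


# Made-up data whose level shifts upward after the third observation.
data = [0.1, -0.3, 0.2, 1.4, 1.1, 0.9, 1.6]
# Assumed score: Z(x) = delta * (x - delta / 2) with delta = 1.
stop = page_cusum_stopping_time(data, lambda x: x - 0.5, b=2.0)
print(stop)
```

Because only the current sum and its minimum are stored, the rule needs constant memory regardless of how long the stream runs.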

Barnard (1959) discussed graphical methods for implementing Page’s sequential procedure and suggested a modified procedure for the case of normally distributed random variables with a mean value subject to change from an initial value of 0. With $S_t=\sum_{i=1}^{t}X_i$, Barnard suggested the stopping rule

$$N=\min\Big\{t:\max_{0\le k<t}\frac{|S_t-S_k|}{\sigma(t-k)^{1/2}}\ge b\Big\}.$$

Note that if $F_1$ ($F_0$) denotes a normal distribution with unit variance and mean value equal to $\mu_1=\delta$ ($\mu_0=0$), the log-likelihood ratio at $n>K$ is

$$\delta\sum_{i=K+1}^{n}(X_i-\delta/2).$$

Maximizing with respect to $\delta$ and the unknown change-point $k<n$, and writing $t$ for the current time $n$, leads to $\max_{0\le k<t}(S_t-S_k)^2/[2(t-k)]$, so Barnard’s suggestion can be described as stopping as soon as the generalized likelihood ratio statistic exceeds a suitable threshold.
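To spell out the maximization, fix a candidate change-point $k<t$ and write $S_t=\sum_{i=1}^{t}X_i$ (unit variance assumed, as in the note above):

$$\delta\sum_{i=k+1}^{t}(X_i-\delta/2)=\delta(S_t-S_k)-\frac{(t-k)\delta^2}{2}.$$

This is a concave quadratic in $\delta$, maximized at $\hat\delta=(S_t-S_k)/(t-k)$, where it equals

$$\frac{(S_t-S_k)^2}{2(t-k)}.$$

Taking the maximum over $0\le k<t$ gives the generalized likelihood ratio statistic, and with $\sigma=1$ it exceeds $b^2/2$ exactly when $\max_{0\le k<t}|S_t-S_k|/(t-k)^{1/2}\ge b$.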

Shiryaev’s and Lorden’s contributions

Shiryaev (1963) considered the case of completely specified F0 and F1. He assumed that K is random and used optimal stopping theory to describe an exact solution to a well-formulated Bayesian version of the problem, and he computed the Bayes solution in a continuous time formulation involving Brownian motion.

  • loss: $\mathbf{1}_{\{K>n\}}+C(n-K)^{+}$
  • approximation (under a geometric prior, with the probability of a change in any bounded interval vanishingly small):

$$N_1=\min\Big\{t:\sum_{k=0}^{t}\prod_{j=k}^{t}\frac{dF_1}{dF_0}(X_j)\ge B\Big\}$$
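The sum of products in $N_1$ does not need to be recomputed from scratch at each step. Writing $R_t$ for the statistic and $L_t=(dF_1/dF_0)(X_t)$ for the likelihood ratio of the newest observation, it satisfies $R_t=(1+R_{t-1})L_t$ starting from an empty sum of 0. The sketch below uses that recursion; the function names are hypothetical, the sum is taken over the observations seen so far, and the normal-shift likelihood ratio is an assumed example rather than something these notes prescribe.

```python
import math
from typing import Callable, Iterable, Optional


def shiryaev_roberts_stopping_time(
    xs: Iterable[float],
    lr: Callable[[float], float],
    B: float,
) -> Optional[int]:
    """Number of observations seen when the sum of products first reaches B, else None."""
    r = 0.0  # statistic before any data: empty sum
    for t, x in enumerate(xs, start=1):
        # R_t = (1 + R_{t-1}) * L_t equals sum over k of prod_{j=k..t} L_j
        r = (1.0 + r) * lr(x)
        if r >= B:
            return t
    return None


def normal_shift_lr(delta: float) -> Callable[[float], float]:
    """Assumed example: likelihood ratio of N(delta, 1) against N(0, 1)."""
    return lambda x: math.exp(delta * (x - delta / 2.0))


# With a constant likelihood ratio of 2 the statistic runs 2, 6, 14, 30, ...
stop_sr = shiryaev_roberts_stopping_time([0.0] * 4, lambda x: 2.0, B=10.0)
print(stop_sr)
```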

Lorden (1971) took a maximum likelihood approach, in the case of two completely specified distributions, leading to the stopping rule

$$N_2=\min\Big\{t:\max_{0\le k\le t}\sum_{j=k+1}^{t}\log\big[(dF_1/dF_0)(X_j)\big]\ge b\Big\}.$$
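Lorden’s rule is Page’s rule with log-likelihood-ratio scores. Taking $Z(x)=\log[(dF_1/dF_0)(x)]$ and $S_t=\sum_{j=1}^{t}Z(X_j)$ with $S_0=0$,

$$\max_{0\le k\le t}\sum_{j=k+1}^{t}Z(X_j)=\max_{0\le k\le t}(S_t-S_k)=S_t-\min_{0\le k\le t}S_k,$$

so $N_2$ coincides with $N_0$ for this choice of score.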

Having observed $X_1,\ldots,X_m$, suppose we are interested in testing the hypothesis that there is no change-point. The statistic suggested by Page was

$$\max_{0\le k\le m}\sum_{j=k+1}^{m}\big[\delta(X_j-\delta/2)\big],$$

which is the likelihood ratio statistic.
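The fixed-sample statistic can be computed in one backward pass over the data, accumulating the tail sums $\sum_{j=k+1}^{m}\delta(X_j-\delta/2)$. The sketch below is illustrative: the function name and data are made up, and returning the maximizing $k$ alongside the value is an added convenience, not something the notes specify.

```python
from typing import Sequence, Tuple


def page_fixed_sample_statistic(xs: Sequence[float], delta: float) -> Tuple[float, int]:
    """Return (max over 0<=k<=m of sum_{j=k+1}^m delta*(x_j - delta/2), maximizing k)."""
    m = len(xs)
    best, best_k = 0.0, m  # k = m corresponds to the empty sum, which is 0
    tail = 0.0
    for k in range(m - 1, -1, -1):
        # xs[k] holds X_{k+1}, so after this update tail = sum_{j=k+1}^m of the scores
        tail += delta * (xs[k] - delta / 2.0)
        if tail > best:
            best, best_k = tail, k
    return best, best_k


# Made-up sample: the mean moves from 0 to 2 after the second observation.
value, k_hat = page_fixed_sample_statistic([0.0, 0.0, 2.0, 2.0], delta=1.0)
print(value, k_hat)
```

A large value is evidence against the no-change hypothesis; the threshold to compare it with is not given in these notes.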

Hypothesis Testing When a Nuisance Parameter Is Only Present under the Alternative

“semi-linear” regression example:

$$Y_i=\alpha+\beta f_i(\theta)+e_i,$$

where $f$ is nonlinear and $\theta$ can be multidimensional.

The hypothesis to be tested is that $\beta=0$; under this hypothesis the parameter $\theta$ has no meaning. The special case $f_i(\theta)=(x_i-\theta)^+$ is in the spirit of a change-point problem, where the change occurs in the slope of a linear regression.
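A small sketch of the special case, showing why $\theta$ carries no information under $\beta=0$: the mean response is the same for every $\theta$, whereas for $\beta\neq 0$ the slope changes at $x=\theta$. The function names and numbers are hypothetical, and the error term $e_i$ is left out so that only the mean is shown.

```python
from typing import List, Sequence


def hinge(x: float, theta: float) -> float:
    """f(theta) = (x - theta)^+, the positive part of x - theta."""
    return max(x - theta, 0.0)


def mean_response(xs: Sequence[float], alpha: float, beta: float, theta: float) -> List[float]:
    """Noise-free mean alpha + beta * (x_i - theta)^+ for each x_i."""
    return [alpha + beta * hinge(x, theta) for x in xs]


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
# Under beta = 0 the two values of theta give identical means.
null_a = mean_response(xs, alpha=1.0, beta=0.0, theta=1.0)
null_b = mean_response(xs, alpha=1.0, beta=0.0, theta=3.0)
# Under beta = 2 the mean is flat up to theta = 2 and then rises with slope 2.
alt = mean_response(xs, alpha=1.0, beta=2.0, theta=2.0)
print(null_a == null_b, alt)
```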

DNA/Protein Sequence Analysis


Published in categories Note