# Studentized U-statistic

##### Posted on Feb 15, 2019

In Prof. Shao’s wonderful talk, Wandering around the Asymptotic Theory, he mentioned Studentized U-statistics. I am interested in how the variance estimates in the denominators are derived. Consider the U-statistic

$$
U_n = \binom{n}{m}^{-1}\sum_{C_n} h(X_{\alpha_1},\ldots,X_{\alpha_m})\,,\label{eq:udef}
$$

where $h$ is a symmetric kernel, $X_1,\ldots,X_n$ are i.i.d., and $\sum_{C_n}$ denotes the summation over the $\binom{n}{m}$ combinations of $m$ distinct elements $\{\alpha_1,\ldots,\alpha_m\}$ from $\{1,\ldots,n\}$.
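To make the definition concrete, here is a minimal sketch that evaluates a U-statistic by brute force over all $\binom{n}{m}$ index subsets. The helper names `u_statistic` and `var_kernel` are my own, and the kernel $h(x,y)=(x-y)^2/2$ is only an illustrative choice (it is not taken from the talk); for this kernel $U_n$ coincides with the unbiased sample variance.

```python
from itertools import combinations
from math import comb

import numpy as np


def u_statistic(x, h, m):
    """Average the symmetric kernel h over all size-m subsets of x."""
    n = len(x)
    total = sum(h(*(x[i] for i in idx)) for idx in combinations(range(n), m))
    return total / comb(n, m)


def var_kernel(a, b):
    # Illustrative kernel (an assumption, not from the talk):
    # E[h(X1, X2)] = Var(X1), so U_n is the unbiased sample variance.
    return 0.5 * (a - b) ** 2


rng = np.random.default_rng(0)
x = rng.normal(size=20)
u_n = u_statistic(x, var_kernel, m=2)
```

The brute-force sum costs $O(n^m)$ kernel evaluations, which is fine for a small check like this but not for large $n$.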

## Standardized Form

Hoeffding’s theorem can be used to calculate the variance of $U_n$:

(Hoeffding's theorem) For a $U$-statistic $U_n$ given by $\eqref{eq:udef}$ with $\E[h(X_1,\ldots,X_m)]^2<\infty$, $$\Var(U_n) = \binom{n}{m}^{-1}\sum_{k=1}^m\binom{m}{k}\binom{n-m}{m-k}\zeta_k\,,$$ where $$\zeta_k = \Var(h_k(X_1,\ldots,X_k))\,,$$ and $$h_k(x_1,\ldots,x_k)=\E[h(X_1,\ldots,X_m)\mid X_1=x_1,\ldots,X_k=x_k]\,.$$

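For $m=2$,

$$
\Var(U_n) = \binom{n}{2}^{-1}\left[\binom{2}{1}\binom{n-2}{1}\zeta_1+\binom{2}{2}\binom{n-2}{0}\zeta_2\right] = \frac{2}{n(n-1)}\left[2(n-2)\zeta_1+\zeta_2\right]\,,
$$

where $\zeta_1=\Var(h_1(X_1))$ and $\zeta_2 = \Var(h_2(X_1,X_2))$. Note that

$$
h_1(x_1) = \E[h(x_1,X_2)] = g(x_1)\,,\qquad h_2(x_1,x_2)=h(x_1,x_2)\,,
$$

thus,

$$
\Var(U_n) = \frac{n-2}{n-1}\cdot\frac{4\Var(g(X_1))}{n} + \frac{2\zeta_2}{n(n-1)}\,,
$$

which is similar to the standardized term $\frac{4\Var(g(X_1))}{n}$ in the slide, except for the extra term involving $\zeta_2$ and the additional coefficient $\frac{n-2}{n-1}$ on $\zeta_1$.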
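As a sanity check on the $m=2$ case, the sketch below evaluates Hoeffding's formula for general $m$ and compares it with the closed form $\frac{4(n-2)}{n(n-1)}\zeta_1+\frac{2}{n(n-1)}\zeta_2$. The function names are hypothetical, and the numbers $\zeta_1=1/2$, $\zeta_2=2$ come from an assumed example of my own (kernel $h(x,y)=(x-y)^2/2$ with standard normal data), not from the talk.

```python
from math import comb, isclose


def hoeffding_variance(n, zetas):
    """Var(U_n) from Hoeffding's theorem; zetas = [zeta_1, ..., zeta_m]."""
    m = len(zetas)
    total = sum(
        comb(m, k) * comb(n - m, m - k) * zetas[k - 1] for k in range(1, m + 1)
    )
    return total / comb(n, m)


def var_m2(n, zeta1, zeta2):
    """Closed form of Hoeffding's formula for m = 2."""
    return 4 * (n - 2) / (n * (n - 1)) * zeta1 + 2 / (n * (n - 1)) * zeta2


# Assumed example: h(x, y) = (x - y)^2 / 2 with X_i ~ N(0, 1),
# for which zeta_1 = 1/2 and zeta_2 = 2.
n = 10
general = hoeffding_variance(n, [0.5, 2.0])
closed = var_m2(n, 0.5, 2.0)
```

In this example both expressions reduce to $2/(n-1)$, the familiar variance of the sample variance under normality with unit variance.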

Let us resort to the asymptotic result (Jun Shao, 2003):

Let $U_n$ be given by $\eqref{eq:udef}$ with $\E[h(X_1,\ldots,X_m)]^2<\infty$. If $\zeta_1>0$, then $$\sqrt n[U_n-\E(U_n)]\rightarrow_d N(0,m^2\zeta_1)\,.$$

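Now for the special case $m=2$, we have

$$
\sqrt n[U_n-\E(U_n)]\rightarrow_d N(0,4\zeta_1)\,,
$$

so the standardized form is

$$
\frac{\sqrt n[U_n-\E(U_n)]}{2\sqrt{\zeta_1}}\rightarrow_d N(0,1)\,.
$$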
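A small Monte Carlo sketch can illustrate the limit for $m=2$, i.e. that $\sqrt n[U_n-\E(U_n)]/(2\sqrt{\zeta_1})$ is approximately standard normal. It relies on assumptions of my own rather than anything in the talk: the kernel $h(x,y)=(x-y)^2/2$ and standard normal data, for which $U_n$ is the unbiased sample variance, $\E(U_n)=1$, and $\zeta_1=1/2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 2000
zeta1 = 0.5  # Var(h_1(X_1)) for h(x, y) = (x - y)^2 / 2 and N(0, 1) data

samples = rng.normal(size=(reps, n))
# For this kernel U_n equals the unbiased sample variance, and E(U_n) = 1.
u_n = samples.var(axis=1, ddof=1)
z = np.sqrt(n) * (u_n - 1.0) / (2.0 * np.sqrt(zeta1))

# If the limit holds, these should be close to 0 and 1 respectively.
z_mean, z_std = z.mean(), z.std(ddof=1)
```

This only checks the first two moments of the standardized statistic; a Q-Q plot against the normal quantiles would be a natural next step.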

## Studentized Form

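Arvesen (1969) considers the jackknife estimate. Define

$$
U_{n-1}^{\backslash k} = \binom{n-1}{m}^{-1}\sum_{C^{\backslash k}_{n-1}} h(X_{\beta_1^{\backslash k}},\ldots,X_{\beta_m^{\backslash k}})\,,
$$

where $C^{\backslash k}_{n-1}$ indicates that the summation is over all combinations $(\beta_1^{\backslash k},\ldots, \beta_m^{\backslash k})$ of $m$ integers chosen from $(1,\ldots,k-1,k+1,\ldots, n)$.

Arvesen (1969) proves that

$$
(n-1)\sum_{k=1}^n\left(U_{n-1}^{\backslash k}-U_n\right)^2\rightarrow_p m^2\zeta_1\,.\label{eq:approx}
$$

For $m=2$, we have

$$
s_*^2 := (n-1)\sum_{k=1}^n\left(U_{n-1}^{\backslash k}-U_n\right)^2\rightarrow_p 4\zeta_1\,,\label{eq:sstar}
$$

which implies that

$$
\frac{\sqrt n[U_n-\E(U_n)]}{s_*}\rightarrow_d N(0,1)\,.
$$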

But in the following slide, $s_1^2$ looks quite different from $s_*^2$; the two only share the same coefficient.

I was confused and asked Prof. Shao for help. He explained that there are various definitions of the jackknife estimate, so it is also possible to construct approximations different from $\eqref{eq:approx}$. The Studentized form in the slide can be viewed as keeping only $X_i$ in each “jackknife”, i.e., the summation is over pairs $(i, j)$, where $i$ is fixed and $j$ is chosen from $(1,\ldots,i-1,i+1,\ldots,n)$.
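The two constructions can be compared numerically for $m=2$. The sketch below computes the leave-one-out quantity $(n-1)\sum_k(U_{n-1}^{\backslash k}-U_n)^2$, where $U_{n-1}^{\backslash k}$ is the U-statistic recomputed without $X_k$, and a "keep $X_i$" quantity built from $q_i=\frac{1}{n-1}\sum_{j\neq i}h(X_i,X_j)$. Note the assumptions: the slide is not reproduced here, so the notation $q_i$ and the form $\frac{n-1}{(n-2)^2}\sum_i(q_i-U_n)^2$ are my guess at what $s_1^2$ looks like, and the kernel is an illustrative choice. For a symmetric kernel, a short calculation gives $U_{n-1}^{\backslash k}-U_n=-\frac{2}{n-2}(q_k-U_n)$, so under these assumptions the first quantity should be exactly four times the second.

```python
from itertools import combinations

import numpy as np


def h(a, b):
    # Illustrative symmetric kernel (an assumption): h(x, y) = (x - y)^2 / 2.
    return 0.5 * (a - b) ** 2


def u_stat(x):
    """U-statistic of order m = 2: average of h over all pairs."""
    return np.mean([h(a, b) for a, b in combinations(x, 2)])


rng = np.random.default_rng(2)
x = rng.normal(size=15)
n = len(x)
u_n = u_stat(x)

# Leave-one-out jackknife: recompute the U-statistic without X_k.
u_loo = np.array([u_stat(np.delete(x, k)) for k in range(n)])
jackknife = (n - 1) * np.sum((u_loo - u_n) ** 2)

# "Keep X_i" version: q_i averages h(X_i, X_j) over j != i.
q = np.array(
    [np.mean([h(x[i], x[j]) for j in range(n) if j != i]) for i in range(n)]
)
s1_sq = (n - 1) / (n - 2) ** 2 * np.sum((q - u_n) ** 2)
```

If the assumed form of $s_1^2$ matches the slide, this suggests the two denominators differ only by the factor $m^2=4$, which is absorbed by the $2$ in the standardized form.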

However, I haven’t discussed the details further with Prof. Shao. I am still curious about the derivation of the variance in the jackknife form, and want to know whether it is possible to work further with $\eqref{eq:sstar}$.

Jun Shao (2003). Mathematical Statistics.

Arvesen, J. N. (1969). Jackknifing U-Statistics. The Annals of Mathematical Statistics, 40(6), 2076–2100.