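The Wilcoxon signed-rank statistic is a nonparametric test statistic used to test whether a symmetric distribution is centered at 0 (equivalently, whether its mean is 0 when the mean exists). Given i.i.d. data $Y_1,\cdots,Y_n$, let $Z_j = \operatorname{sign}(Y_j)$ and let $R_j$ be the rank of $|Y_j|$ among $|Y_1|,\cdots,|Y_n|$. The Wilcoxon signed-rank statistic is defined as: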
$$W=\sum_{j} Z_{j} R_{j}$$
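As a quick illustration of the definition, here is a minimal R sketch that computes $W$ for a made-up sample and compares it with the statistic `V` reported by base R's `wilcox.test()`. The sample (20 standard normal draws), the seed, and the variable names are assumptions chosen for illustration only. The comparison uses the identity $W = 2V - n(n+1)/2$, which holds when there are no ties and no zeros, since $V$ is the sum of the ranks attached to positive observations and all ranks add up to $n(n+1)/2$.

```r
# Hypothetical sample: 20 draws from a distribution symmetric about 0
set.seed(1)
n <- 20
y <- rnorm(n)

z <- sign(y)          # Z_j: sign of each observation
r <- rank(abs(y))     # R_j: rank of |Y_j| among |Y_1|, ..., |Y_n|
W <- sum(z * r)       # signed-rank statistic W = sum_j Z_j R_j

# wilcox.test() reports V, the sum of the ranks attached to positive observations
V <- unname(wilcox.test(y)$statistic)

# With no ties and no zeros, W = 2V - n(n+1)/2; the two entries should agree
print(c(W = W, from_V = 2 * V - n * (n + 1) / 2))
```

The two printed values should coincide, so either form of the statistic can be used for the test.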
Under the null hypothesis that the distribution is continuous and symmetric about 0, $W$ has mean 0 and variance $n(n+1)(2n+1)/6$.
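The mean and variance can be checked with a short calculation. The sketch below assumes the distribution is continuous (so there are no ties or zeros) and symmetric about 0, in which case the signs are i.i.d. fair $\pm 1$ variables and are independent of the ranks of the absolute values. The symbols $\varepsilon_k$ are introduced here only for this derivation.

```latex
% Under H_0, W has the same distribution as a sum of independent signed integers:
W \overset{d}{=} \sum_{k=1}^{n} k\,\varepsilon_k,
\qquad \varepsilon_1,\ldots,\varepsilon_n \ \text{i.i.d.},\quad
P(\varepsilon_k = 1) = P(\varepsilon_k = -1) = \tfrac12 .

% Mean: each term has expectation zero.
\mathrm{E}\,W = \sum_{k=1}^{n} k\,\mathrm{E}\,\varepsilon_k = 0 .

% Variance: the terms are independent and \operatorname{Var}(\varepsilon_k) = 1.
\operatorname{Var}(W) = \sum_{k=1}^{n} k^{2}\operatorname{Var}(\varepsilon_k)
= \sum_{k=1}^{n} k^{2} = \frac{n(n+1)(2n+1)}{6} .
```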
In fact, $W$ is directly connected to $U$-statistics, and $U$-statistics have very nice properties:
If $\mathrm{E}h^{2}\left(X_{1}, \ldots, X_{r}\right)<\infty$, then $\sqrt{n}(U-\theta-\hat{U}) \stackrel{P}{\rightarrow} 0$. Consequently, the sequence $\sqrt{n}(U-\theta)$ is asymptotically normal with mean 0 and variance $r^{2} \zeta_{1}$, where, with $X_{1}, \ldots, X_{r}, X_{1}^{\prime}, \ldots, X_{r}^{\prime}$ denoting i.i.d. variables,
$$\zeta_{1}=\operatorname{cov}\left(h\left(X_{1}, X_{2}, \ldots, X_{r}\right), h\left(X_{1}, X_{2}^{\prime}, \ldots, X_{r}^{\prime}\right)\right)$$
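Here $\hat{U}$ is a projection (the Hájek projection of $U-\theta$), which we do not go into here. From this result we can conclude that $W$ is asymptotically normal. Since the variance computed above satisfies $n(n+1)(2n+1)/6 \sim n^{3}/3$ as $n\to\infty$, we get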
$$\sqrt{\frac{3}{n^3}}\,W_n\overset{D}{\to}N(0,1)$$
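To make the link between $W$ and $U$-statistics concrete, here is a sketch of the standard argument, assuming a continuous distribution symmetric about 0. The kernel $h(x_1,x_2)=1\{x_1+x_2>0\}$ and the symbols $V$ and $S$ are notation introduced for this sketch; they do not appear in the text above.

```latex
% V = sum of the ranks of the positive observations; it counts positive pairwise sums:
V = \sum_{i \le j} 1\{Y_i + Y_j > 0\}
  = \binom{n}{2} U + S,
\qquad
U = \binom{n}{2}^{-1} \sum_{i<j} 1\{Y_i + Y_j > 0\},
\quad
S = \sum_{i=1}^{n} 1\{Y_i > 0\}.

% Since W = 2V - n(n+1)/2 and \theta = \mathrm{E}\,h = 1/2 under H_0:
W = n(n-1)\left(U - \tfrac12\right) + (2S - n),
\qquad 2S - n = O_P(\sqrt{n}).

% Kernel of order r = 2; -X_1, X_2, X_2' are i.i.d. by symmetry, so
\zeta_1 = P\left(X_1 + X_2 > 0,\; X_1 + X_2' > 0\right) - \tfrac14
        = \tfrac13 - \tfrac14 = \tfrac1{12},
\qquad r^{2}\zeta_1 = \tfrac13 .

% Hence
n^{-3/2}\, W = \frac{n-1}{n}\,\sqrt{n}\left(U - \tfrac12\right) + O_P\!\left(n^{-1}\right)
\overset{D}{\to} N\!\left(0, \tfrac13\right),
```

which agrees with the normalization $\sqrt{3/n^{3}}\,W_n$ obtained from the exact variance.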
Below is a simple simulation in R. The histogram and the normal Q-Q plot show that the asymptotic normality holds up well.
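library(ggplot2)
set.seed(1234)
N <- 10000  # number of simulated replications
n <- 1000   # sample size in each replication
W <- rep(0, N)
# Under the null, the sign attached to each rank 1..n is an independent fair +/-1
for (j in 1:N) {
  W[j] <- sum((1:n) * (2 * rbinom(n, 1, 0.5) - 1))
}
Z <- W / sqrt(n^3 / 3)  # normalize by the asymptotic standard deviation
ggplot() + geom_histogram(aes(Z), bins = 30)
qqnorm(Z)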