导航
- Slice sampler
-
- 2D slice sample
- General Slice Sampler
- Hierarchical models
- python Code download
- References
Slice sampler
另一种MCMC
算法是切片采样,和Gibbs sampling
类似,采样过程中没有tunning processes
并且所有提取的样本都被接受.
In slice sampling, the Markov chain is constructed by using an auxiliary variable representing slices through the (unnormalized) posterior that is constructed using only the current parameter value. Like Gibbs sampling, there is no tunning process and all proposals are accepted. For slice sampling, you either need the inverse distribution function or some way to estimate it.
假设要从后验分布N(0,1)\mathcal{N}(0, 1)N(0,1)中采集随机样本
Start with some value xxx-sample yyy from U(0,f(x))\mathcal{U}(0, f(x))U(0,f(x)) this is the horizontal
slice
that gives the method its name - sample the next xxx from f?1(x)f^{-1}(x)f?1(x) - this is typically done numerically.
在两个方向上交替采样,在均匀分布U(0,f(x))\mathcal{U}(0, f(x))U(0,f(x))上采集样本yyy,再从f?1(y)f^{-1}(y)f?1(y)上采集xxx
def slice_sampler():dist = ss.norm(5, 3) # Gaussian分布w = 0.5x = dist.rvs()niters = 1000 # 采集1000个样本xs = []while len(xs)<niters:y = np.random.uniform(0, dist.pdf(x)) # y为U[0, f(x)]上的样本lb = xrb = xwhile y<dist.pdf(lb): lb-=wwhile y<dist.pdf(rb): rb+=wx = np.random.uniform(lb, rb) # x样本if y > dist.pdf(x):if np.abs(x-lb)<np.abs(x-rb): lb=xelse: lb=yelse: xs.append(x)plt.hist(xs, bins=20)plt.grid(True)plt.legend()plt.show()
slice_sampler()
Notes:
- the slice may consist of disjoint pieces for multimodal distributions.
- the slice can be rectangular hyperslab for multivariable psterior distributions.
- sampling from the slice is non-trivial and may involve iterative rejection steps.
2D slice sample
在两个方向上交替迭代取值,即在第i
次迭代中
u(t+1)?U[0,f(xt)]x(t+1)?UA(t+1)A(t+1)={x:f(x)≥u(t+1)}\begin{aligned} &u^{(t+1)}\sim U_{[0, f(x^t)]}\\ &x^{(t+1)}\sim U_{A^{(t+1)}}\\ &A^{(t+1)}=\{x: f(x)\geq u^{(t+1)}\} \end{aligned} ?u(t+1)?U[0,f(xt)]?x(t+1)?UA(t+1)?A(t+1)={
x:f(x)≥u(t+1)}?
给定pdf
f(x)=1Zexp?(??(x+1)22),x∈(0,1)f(x)=\frac{1}{Z}\exp(-\frac{-(x+1)^2}{2}), x\in(0, 1) f(x)=Z1?exp(?2?(x+1)2?),x∈(0,1)
设置起点值为x=0.10x=0.10x=0.10,采集前30个样本点的轨迹如下(Matlab Code)
% 2D slice sampler
T = 0:10000;
T = T/10000;
% N(-1,1)
y = exp(-(T+1).^2/2);
plot(T,y);
hold on;
x = 0.10; % 设置初始点
u = rand *(exp(-(x+1).^2/2));
x_s = [x];
u_s = [u];
for k = 1:30;limit = -1 + sqrt(-2*log(u));limit = min([limit 1]);x = rand * limit;x_s = [x_s x];u_s = [u_s u];u = rand *(exp(-(x+1).^2/2));x_s = [x_s x];u_s = [u_s u];
end
plot(x_s,u_s,'-*');
hold off;
设置不同起始点对比采样结果
%%
x = 0.01;
u = 0.01;
x_s = [x];
u_s = [u];
for k = 1:50;limit = -1 + sqrt(-2*log(u));limit = min([limit 1]);x = rand * limit;x_s = [x_s x];u_s = [u_s u];u = rand *(exp(-(x+1).^2/2));x_s = [x_s x];u_s = [u_s u];
end
figure;
subplot(1,3,1);
plot(x_s,u_s,'*');hold on;plot(T,y);
x = 0.99;
u = 0.0001;
x_s = [x];
u_s = [u];
for k = 1:50;limit = -1 + sqrt(-2*log(u));limit = min([limit 1]);x = rand * limit;x_s = [x_s x];u_s = [u_s u];u = rand *(exp(-(x+1).^2/2));x_s = [x_s x];u_s = [u_s u];
end
subplot(1,3,2);
plot(x_s,u_s,'*');hold on;plot(T,y);
x = 0.25;
u = 0.0025;
x_s = [x];
u_s = [u];
for k = 1:50;limit = -1 + sqrt(-2*log(u));limit = min([limit 1]);x = rand * limit;x_s = [x_s x];u_s = [u_s u];u = rand *(exp(-(x+1).^2/2));x_s = [x_s x];u_s = [u_s u];
end
subplot(1,3,3);
plot(x_s,u_s,'*');hold on;plot(T,y);
General Slice Sampler
在一般slice sampler
中,由于pdf函数的复杂性,集合A(t+1)A^{(t+1)}A(t+1)的范围比较难以确定,可以将pdf函数分解为多个较为简单函数积处理
f(x)∝∏i=1kfi(x)(x,ω1,…,ωk)?p(x,ω1,…,ωk)∝∏i=1kπ0≤ωi≤fi(x)fi(x)=∫0≤wi≤fi(x)dωi\begin{aligned} &f(x)\propto \prod_{i=1}^k f_i(x)\\ &(x, \omega_1, \dots, \omega_k)\sim p(x, \omega_1, \dots, \omega_k)\propto \prod_{i=1}^k \pi_{0\leq \omega_i\leq f_i(x)}\\ &f_i(x)=\int 0\leq w_i\leq f_i(x)d\omega_i \end{aligned} ?f(x)∝i=1∏k?fi?(x)(x,ω1?,…,ωk?)?p(x,ω1?,…,ωk?)∝i=1∏k?π0≤ωi?≤fi?(x)?fi?(x)=∫0≤wi?≤fi?(x)dωi??
slice sampler algorithm
{ω1t+1?U[0,f1(x(t))]?wkt+1?U[0,fk(x(t))]x(t+1)?UA(t+1)A(t+1)={y;fi(y)≥wi(t+1),i=1,2,…,k}\left\{ \begin{aligned} &\omega_1^{t+1}\sim U_{[0, f_1(x(t))]}\\ &\vdots\\ &w_k^{t+1}\sim U_{[0, f_k(x(t))]}\\ &x^{(t+1)}\sim U_{A^{(t+1)}}\\ &A^{(t+1)}=\{y; f_i(y)\geq w_i^{(t+1)}, i=1, 2, \dots, k\} \end{aligned} \right. ?????????????????????ω1t+1??U[0,f1?(x(t))]??wkt+1??U[0,fk?(x(t))]?x(t+1)?UA(t+1)?A(t+1)={
y;fi?(y)≥wi(t+1)?,i=1,2,…,k}?
设
f(x)=(1+sin?2(3x))(1+cos?4(5x))exp?(?x2/2)=f1(x)f2(x)f3(x)f(x)=(1+\sin^2(3x))(1+\cos^4(5x))\exp(-x^2/2)=f_1(x)f_2(x)f_3(x) f(x)=(1+sin2(3x))(1+cos4(5x))exp(?x2/2)=f1?(x)f2?(x)f3?(x)
% general slice sample
x=0;
u1 = rand*(1+sin(3*x)^2);
u2 = rand*(1+cos(5*x)^4);
u3 = rand*(exp(-x^2/2));
x_s = zeros(1, 10000);
for k=1:10000limit = sqrt(-2*log(u3));x=-limit+2*limit*rand;while ((sin(3*x))^2<u1-1 || (cos(5*x))^4<u2-1)x=-limit+2*limit*rand;endu1 = rand*(1+sin(3*x)^2);u2 = rand*(1+cos(5*x)^4);u3 = rand*(exp(-x^2/2));x_s(k)=x;
end
histogram(x_s, 150, 'FaceColor', [0.55, 0.55, 0.55], 'FaceAlpha', 0.90, 'EdgeColor', [0.9, 0.9, 0.9], 'Normalization', 'probability'); % 采样直方图
hold on;
yyaxis right; % 开启右轴
t = -4:0.1:4;
yt = (1+sin(3.*t).^2).*(1+cos(5.*t).^4).*exp(-t.^2/2);
plot(t, yt, 'Color', 'red', 'LineStyle', '-.');
hold off
Hierarchical models
层次模型的结构如下,首先指定数据来自参数为θ\thetaθ的分布
X?f(X∣θ)X\sim f(X\mid\theta) X?f(X∣θ)
参数本身也来自另一个以λ\lambdaλ为参数的分布
θ?g(θ∣λ)\theta\sim g(\theta\mid\lambda) θ?g(θ∣λ)
λ\lambdaλ来自先验分布
λ?h(λ)\lambda\sim h(\lambda) λ?h(λ)
The essential idea of the hierarchical model is because the θ\thetaθs are not independent but rather are drawn from a common distribution with parameter λ\lambdaλ,
we can share information across the
θ\thetaθsby also estimating
λ\lambdaλ at the same time.
例:对一批来自相同铸币厂的缺陷硬币进行检测,可以通过多次投掷结果使用统计推断的方法找出bias
,一般有两种方法解决该问题
- estimate the bias of each coin from its coin toss data independently of all the others.(孤立样本点估计)
- pool the results together and estimate the same bias for all coins.(所有样本点聚集估计)
由于层次模型的条件独立性,一般使用Gibbs sampling
作为MCMC
的采样策略.
Note that because of the conditionally independent structure of hierarchial models, Gibbs sampling is often a natural choice for the MCMC sampling strategy.
假设核电站的泵每10个有yiy_iyi?个是次品,每次观测泵的时间点为tit_iti?,使用Poisson likelihood
对失败的次数进行建模,其中强度参数λi\lambda_iλi?(表示失败的次数)在每个泵中都不相同,由于每个泵的观测时间点是不同的,需要将λi\lambda_iλi?放缩至λiti\lambda_it_iλi?ti?.
似然函数fff为
∏i=110Poisson(λiti)\prod_{i=1}^{10}Poisson(\lambda_it_i) i=1∏10?Poisson(λi?ti?)
其中参数λ\lambdaλ服从先验概率密度函数ggg
Γ(α,β)\Gamma(\alpha, \beta) Γ(α,β)
设置α=1.8\alpha=1.8α=1.8,超参数β\betaβ的分布为
Γ(γ,δ)\Gamma(\gamma, \delta) Γ(γ,δ)
其中γ=0.01;δ=1\gamma=0.01;\delta=1γ=0.01;δ=1
层次模型中有11个未知参数(10个λ\lambdaλ和β\betaβ),后验分布为
p(λ,β∣y,t)=∏i=110Poisson(λiti)×Γ(α,β)×Γ(γ,δ)p(\lambda, \beta\mid y, t)=\prod_{i=1}^{10}Poisson(\lambda_it_i)\times \Gamma(\alpha, \beta)\times \Gamma(\gamma, \delta) p(λ,β∣y,t)=i=1∏10?Poisson(λi?ti?)×Γ(α,β)×Γ(γ,δ)
条件分布为
{p(λi∣λ?i,β,y,t)=Γ(yi+α,ti+β)p(β∣λ,y,t)=Γ(10α+γ,δ+∑i=110λi)\left\{ \begin{aligned} &p(\lambda_i\mid \lambda_{-i}, \beta, y, t)=\Gamma(y_i+\alpha, t_i+\beta)\\ &p(\beta\mid \lambda, y, t)=\Gamma(10\alpha+\gamma, \delta+\sum_{i=1}^{10}\lambda_i) \end{aligned} \right. ?????????p(λi?∣λ?i?,β,y,t)=Γ(yi?+α,ti?+β)p(β∣λ,y,t)=Γ(10α+γ,δ+i=1∑10?λi?)?
# 根据关于lambda_i的条件概率公式进行Gamma抽样
def lambda_update(alpha, beta, y, t):return G(size=len(y), shape=y+alpha, scale=1.0/(t+beta))# 根据关于beta的条件概率公式进行Gamma抽样
def beta_update(alpha, gamma, delta, lam, y):return G(size=1, shape=len(y)*alpha+gamma, scale=1.0/(delta+lam.sum()))def gibbs(niter, y, t, alpha, gamma, delta):lambdas_ = np.zeros((niter, len(y)), np.float)betas_ = np.zeros((niter, 1), np.float)lambda_ = y/t # 设置初始值for i in range(niter):beta_ = beta_update(alpha, gamma, delta, lambda_, y)lambda_ = lambda_update(alpha, beta_, y, t)betas_[i] = beta_lambdas_[i, :]=lambda_return betas_, lambdas_def hierarchical_model_demo():alpha = 1.8gamma = 0.01delta = 1.0beta0 = 1y = np.array([5, 1, 5, 14, 3, 19, 1, 1, 4, 22], np.int)t = np.array([94.32, 15.72, 62.88, 125.76, 5.24, 31.44, 1.05, 1.05, 2.10, 10.48], np.float)niter = 1000betas, lambdas = gibbs(niter, y, t, alpha, gamma, delta)print('betas mean: {:.3f}'.format(betas.mean()))print('betas std: {:.3f}'.format(betas.std(ddof=1)))print('lambdas mean:')print(lambdas.mean(axis=0))print('lambdas std:')print(lambdas.std(ddof=1, axis=0))plt.figure(figsize=(10, 20))plt.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=None, hspace=0.6)for i in range(lambdas.shape[1]):plt.subplot(5,2,i+1)plt.plot(lambdas[::10, i], ':', linewidth=1)plt.title('Trace for $\lambda_{}$'.format(i))plt.grid(True)plt.show()
python Code download
Hierarchical model demo code
References
Computational Statistics in Python
马尔可夫链蒙特卡洛算法 (三) slice sampling
Slice sampler