第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
用R和WinBUGS实现贝叶斯分级模型
Bayesian hierarchical modeling using R and WinBUGS
李欣海
(中国科学院动物研究所)
2013年5月19日
lixh@ioz.ac.cn
http://people.gucas.ac.cn/~LiXinhai
http://blog.sciencenet.cn/u/lixinhai
http://weibo.com/lixinhaiblog
1
/ 32
第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
贝叶斯方法 Bayesian method
(Reverend Thomas Bayes 1763)
Bayesian:
Frequentist: Probability (data, given parameter)
Probability (parameter, given data)
Bayes’ Theorem
Likelihood
Posterior probability
yP
(
)
yf
(
P
)( )
ym
)(
Prior probability
Marginal probability of y
ym
)(
yf
(
p
d)()
2
/ 32
第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
Bayesian method is appropriate for dynamics,
nonlinear, and full-of-noise ecological processes
Systems invariably driven by endogenous dynamic processes plus demographic
and environmental process noise, and are only observable with error.
The inability to make well-founded statistical inferences about biological dynamic
models in the chaotic and near-chaotic regimes, …, leaves dynamic theory without
the methods of quantitative validation that are essential tools in the rest of biological
science.
Here I show that this impasse can be resolved in a simple and general manner, …,
using a straightforward Markov chain Monte Carlo (MCMC) sampler (Wood 2010).
Wood SN (2010). Statistical inference for noisy nonlinear ecological dynamic systems.
Nature 466: 1102-1113.
3
/ 32
第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
A simple example: mortality of moths exposed to cypermethrin
(Royle and Dorazio 2008 Page 66)
The experiment was designed to test whether males and
females moths suffered the same mortality when exposed
to identical doses of cypermethrin (氯氰菊酯).
Data observed in a dose-response experiment involving
adults of the tobacco budworm (Heliothis virescens, 烟青
虫), a moth species whose larvae are responsible for
damage to cotton crops in the United States and Central
and South America (Collett, 1991, Example 3.7).
In the experiment, batches of 20 moths of each sex were
exposed to a pesticide called cypermethrin for a period of
72 hours, beginning two days after the adults had
emerged from pupation. Both sexes were exposed to the
same range of pesticide doses: 1, 2, 4, 8, 16, and 32 g
cypermethrin.
At the end of the experiment the number of dead moths
in each batch were recorded.
y
1
4
9
13
18
20
0
2
6
10
12
16
sex
male
male
male
male
male
male
female
female
female
female
female
female
sexcode dose
1
1
1
1
1
1
0
0
0
0
0
0
1
2
4
8
16
32
1
2
4
8
16
32
4
/ 32
第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
Models
Let xi denote the log2(dose) of cypermethrin administered to the ith
batch of moths that contained either males (zi = 1) or females (zi = 0).
Each batch has N = 20 moths. A logistic-regression model containing 3
parameters is:
pNy
,
i
i
p
(
logit
i
~
(Bin
)
pN
)
,
i
xβ
i
zγ
i
y
1
4
9
13
18
20
0
2
6
10
12
16
sex
male
male
male
male
male
male
female
female
female
female
female
female
sexcode dose
1
1
1
1
1
1
0
0
0
0
0
0
1
2
4
8
16
32
1
2
4
8
16
32
where alpha is the intercept, beta is the effect of cypermethrin
and gamma is the effect of sex.
5
/ 32
第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
# -------------data------------------------------------
N = 20 # 每组烟青虫的数量,雌雄各6组
y = c(1,4,9,13,18,20, 0,2,6,10,12,16) # 每组死亡的烟青虫数量
sex = c(rep('male',6), rep('female',6))
dose = rep(c(1,2,4,8,16,32), 2) # 农药剂量
ldose = log(dose)/log(2) # 对数转换
sexcode = rep(0,length(sex)) # 定义性别代码为0
i = sex==‘male’ # 区分雌雄(FALSE-TRUE)
sexcode[i] = 1 # 定义雄性代码为1,剩下的雌性依旧为0
as.data.frame(cbind(y, sex, sexcode, dose)) # 显示数据
# -------------arguments for R2WinBUGS---------------------
data
params = list('alpha', 'beta', 'gamma', 'w')
inits
= list(n=length(y), N=N, y=y, x=ldose, z=sexcode)
= function() {
list(alpha=rnorm(1), beta=rnorm(1),
gamma=rnorm(1), w=rbinom(1,1,0.5))
}
# -------------native WinBUGS code---------------------------
modelFilename = 'd:/code/bugs/model.txt'
cat('
model {
alpha ~ dnorm(0, 0.01)
beta ~ dnorm(0, 0.01)
w ~ dbin(0.5,1) #确定雌雄
gamma ~ dnorm(0, 0.01)
for (i in 1:n) {
y[i] ~ dbin(p[i], N)
logit(p[i]) <- alpha + beta*x[i] + w*gamma*z[i]
}
}
', fill = TRUE, file = modelFilename)
Codes
# ------------call bugs() to fit model-----------------------
library(R2WinBUGS)
modelFilename = 'd:/code/bugs/model.txt'
fit = bugs(data, inits, params, model.file = modelFilename,
n.chains = 1, n.iter = 10000, n.burnin = 5000, n.thin = 5,
bugs.seed = sample(1:9999, size=1), debug = TRUE,
DIC = FALSE,
bugs.directory = "d:/softwares/WinBUGS14/")
fit
fit$sims.matrix
# Key WinBUGS code
alpha ~ dnorm(0, 0.01)
beta ~ dnorm(0, 0.01)
w ~ dbin(0.5, 1)
gamma ~ dnorm(0, 0.01)
#确定雌雄
for (i in 1:n) {
}
y[i] ~ dbin(p[i], N)
logit(p[i]) <- alpha + beta*x[i] + w*gamma*z[i]
6
/ 32
第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
Results
-1.0
-2.0
-3.0
-4.0
-5.0
-6.0
1.75
1.5
1.25
1.0
0.75
0.5
40.0
20.0
0.0
-20.0
-40.0
1.0
0.5
0.0
alpha
beta
1001
1250
gamma
1001
1250
w
1001
1250
1001
1250
1500
iteration
1500
iteration
1500
iteration
1500
iteration
1750
2000
1750
2000
1750
2000
1750
2000
node
alpha
beta
gamma
w
mean
-3.423
1.077
1.143
0.822
sd MC error
0.5463
0.03888
0.1354 0.007451
1.885
0.06088
0.0342
0.3825
2.50%
-4.494
0.827
0.3462
0
median
-3.406
1.07
1.187
1
97.50%
-2.397
1.364
2.05
1
start
1001
1001
1001
1001
sample
1000
1000
1000
1000
7
/ 32
第六届中国 R语言会议 (北京)
贝叶斯分级模型 Bayesian hierarchical modeling
李欣海
分级模型 Hierarchical modeling
一个案例(an example)
物种在地点 i 的数量
(Abundance of a species at site i):
N
i
Poisson
( i
)
每个物种个体在地点 i 被发现的概率
(Detection rate of an individual at site i):
ir
需要估计的参数
(Parameters to be estimated):
i ir
8
/ 32