http://www.paper.edu.cn
广义估计方程中工作相关结构选择
付利亚1,王友乾2
1 西安交通大学数学与统计学院,西安 710049
2 昆士兰科技大学数学科学学院,布里斯班 4000
摘要:在纵向数据分析中选择正确的相关结构可以改进参数估计,从而得到更可靠的统计推
断。近年来,研究者提出了许多不同的选择相关结构准则,但是这些准则之间的差别还有待比
较。本文通过大量的数值模拟对已有的准则进行比较评估。模拟结果表明基于高斯工作似然函
数和经验似然函数的 AIC 和 BIC 准则要优于其它准则。
关键词:经验似然;高斯分布;模型选择
中图分类号: O212
Working Correlation Structure Selection in
Generalized Estimating Equations
FU Liya1, WANG You-Gan2
1 School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049
2 School of Mathematical Sciences, Queensland University of Technology, Brisbane 4000
Abstract: Selecting an appropriate correlation structure in analysis of longitudinal data can
greatly improve parameter estimation efficiency and hence leads to more reliable statistical
inferences. There are a number of criteria proposed in recent literature derived from different
perspectives and little is known on their relative performance. To this end, this paper
reviewed and evaluated these criteria by carrying out extensive simulation studies including
Gaussian, binary and Poisson responses. Surprisingly, the AIC and BIC based on either
Gaussian working likelihood or empirical likelihood were found that they outperformed the
others.
Key words: empirical likelihood;Gaussian distribution;model selection
0 Introduction
The characteristic of longitudinal data is that the measurements collected from the same
subject are correlated [1]. The correlation matrix of the repeated measurements from the same
subject is usually difficult to specify. To this end, Liang and Zeger [2] developed the generalized
Foundations: Foundations: the National Science Foundation of China (Nos.11201365 and 11301408), Specialized Research
Fund for the Doctoral Program of Higher Education (20120201120053).
Author Introduction: Correspondence author: Fu Liya (1982-), female, lecturer, major research direction:longitudinal
data analysis.
Wang You-Gan (1965-), male, professor, major research direction:longitudinal data analysis.
- 1 -
http://www.paper.edu.cn
estimating equations (GEE) approach by incorporating a working correlation matrix for the
repeated measurements to account for within-subject correlation. Let Yi = (yi1, . . . , yini)T be
the underlying outcome for the ith subject, where i = 1, . . . , m. Let Xi = (xi1, . . . , xini)T be
the corresponding covariate vector, and xik is a p 1 vector. Let µik be the expectation of yik,
and µik = h(xikβ), where h is a known link function and β is a p 1 parameter vector. The
variance of yik, σik = ϕν(µik), where ϕ is a scale parameter, and ν is a given function of µik.
Covariance matrix of Yi, Vi = ϕA1/2
, where Ai = diag(σik) and Ri is the correlation
matrix. The GEE can be expressed as
i RiA1/2
i
U (β) =
DT
i A
1/2
i R
1/2
1
i A
i
Si,
(1)
m∑
i=1
where Si = Yi µi with µi = (µi1, . . . , µini)T and Di = ∂µi/∂β. The estimating function U (β)
is unbiased and let ˆβ(Ri) be the estimators derived from (1). Suppose that Ri is governed by
a q dimension vector α. The covariance matrix of ˆβ(Ri) is given by
(
m∑
)1(
m∑
)(
m∑
)1
VR =
DT
i V
1
i Di
1
DT
i V
i
cov(Si)V
1
i Di
DT
i V
1
i Di
.
i=1
i=1
i=1
The variance estimate ˆVR of ˆβ(Ri) can be obtained by replacing cov(Si) by SiST
by their estimates.
i and β, α, ϕ
The estimator ˆβ(Ri) is consistent even when the structure of the correlation matrix Ri
is misspecified. However, the efficiency of ˆβ(Ri) depends on the proximity of ϕA1/2
i RiA1/2
to the true covariance matrix Vi. When the working correlation structure is misspecified, the
asymptotic relative efficiency of ˆβ could be compromised low [3, 4, 5]. Therefore, it is important
to seek better modeling of Ri to hence enhance the efficiency of ˆβ(Ri). The most commonly
used working structures are: (i) independence, q = 0, Ri = Ini for all i; (ii) exchangeable, q = 1,
1) < α < 1, where e = (1, . . . , 1)T; (iii) the
Ri = (1 α)Ini + αeTe with 1/(maxfnigm
) with 1 < α < 1, and (iv) stationary
first order autoregression (AR(1)), q = 1, Ri = (α
correlation structure, q = ni 1, Ri = R(α) in which each element descending diagonal from
left to right is a constant, and α = (α1, . . . , αni1)T. Working independence, exchangeable, and
AR(1) correlation structures are all special cases of the stationary correlation structure.
jklj
i=1
i
Traditional model selection criteria such as Akaike information criteria (AIC) and Bayesian
information criterion (BIC) are not useful for selection a correlation structure. Pan [6] modified
AIC and developed a quasi-likelihood information criterion (QIC) for model selection in the
GEE framework. This criterion can be used to select covariates in mean function modeling
and a working correlation structure modeling. Wang and Carey [7] first utilized the method
proposed by Rotnitzky and Jewell [8] to choose a working correlation matrix in GEE. Shults
and Chaganty [9] proposed a simple selection criterion (SC) to choose a correlation structure,
which corresponds to the minimum value of the weighted error sum of squares [10]. Shults et
- 2 -
http://www.paper.edu.cn
al. [11] evaluated this criterion in GEE analysis of longitudinal binary data. Hin and Wang
[12] proposed a correlation information criterion (CIC) as a modification of QIC to improve
its performance. Gosho et al. [13] provided a criterion (C(R)) based on a statistic designed
to test the hypothesis that the covariance matrix equals a given matrix. Carey and Wang
[14] evaluated Gaussian pseudolikelihood and geodesic distance criteria for variance model and
correlation structure selection for GEE. Chen and Lazar [15] proposed two criteria EAIC and
EBIC, by substituting empirical likelihood for parametric likelihood in AIC and BIC, which
are more powerful than QIC and CIC.
Since these criteria can help to identify the proper correlation structure, it is important to
compare how their performance changes under different data setups, including type of responses,
such as normal, binary and poisson responses, and the true correlation structure, such as,
independence, exchangeable, AR(1), or stationary correlation structures. The knowledge of
their performance profiles can reveal their strengths and limitations, which can help shed more
light on the choice of a more suitable working correlation structure among competing structures
in longitudinal data analysis. In this paper, we describe and evaluate the performance of these
criteria based on extensive simulation studies.
The rest paper is organized as follows: In Section 2, we review the methods for a working
correlation structure selection. In Section 3, we report the results of extensive simulation studies
designed to evaluate the performance of these methods. In Section 4, we analyze a real dataset.
In Section 5, we conclude by suggesting strategy for a working correlation structure selection
in GEE analysis of longitudinal data.
1 Criteria for selecting a working correlation structure
In this section, we describe the criteria we consider for a working correlation structure
selection in GEE analysis of longitudinal data.
1.1 Quasi-likelihood criterion
Classical model criteria such as AIC and BIC assume the observations are independent
and hence they are not useful for selecting a correlation structure. Pan [6] proposed a criterion
which is a modification of AIC based on quasi-likelihood, named QIC, to choose the proper
mean model or the working correlation structure. The QIC for a given working correlation
matrix can be expressed as
QIC = 2
Q( ˆβ(Ri), ˆϕ(Ri), (yik, xik)) + 2tr( ˆQI
m∑
ni∑
ˆVR),
(2)
∑
i=1
k=1
where the first term is the sum of the quasi-log-likelihood for all the observations, and ˆQI =
1
β= ^β(Ri),ϕ= ^ϕ(Ri). The quasi-likelihood in (2) is the quasi-log-likelihood as-
ϕ
i=1 DT
i A
i Dij
1
m
- 3 -
http://www.paper.edu.cn
suming a independence working model (Table 9.1 in [16]).
1.2 Correlation information criterion
Hin and Wang [12] noted that the expectation of the fist term in QIC is free from both
Ri and RT , RT being the true correlation matrix. Therefore, they proposed a correlation
information criterion (CIC) as a simple modification of QIC:
CIC = tr( ˆQI
ˆVR).
Without the effect of the random error from the first term in (2), CIC is more powerful than
QIC. The theoretical underpinning for the biased first term in (2) has been outlined in Wang
and Hin [17].
1.3 Rotnitzky-Jewell’s criterion
∑
∑
1
m
i=1 DT
i V
Define Q0 = m
1
0 Q1.
When Ri is correctly specified, Q should be close to an identify matrix, and hence both C1 =
tr(Q)/p and C2 = tr(Q2)/p should be close to one in value [7]. Hin et al. [18] defined a
Rotnitzky-Jewell’s criterion (RJ) based on the following measure
1
i Di, and Q = Q
1
i Di, Q1 = m
1
i SiST
i V
m
i=1 DT
i V
1
√
(1 C1)2 + (1 C2)2.
RJ =
The value of RJ should be close to zero when Ri is accurately specified.
Carey and Wang [14] considered two criteria by geodesic distance: ∆1 =
C2 2C1 + 1 and ∆2 =
the true correlation matrix, λi will close to one and values of ∆1 and ∆2 are closed to zero.
=1(λi 1)2/p =
=1(log λi)2, where λi are eigenvalues of Q. When Ri approximate
p
p
∑
∑
1.4 Shults-Chaganty’s criterion
Shults et al. [11] proposed the Shults-Chaganty’s criterion (SC) based on the approach by
Shults and Chaganty [9] to select a working correlation structure for correlated binary data as
m∑
SC =
ST
i V
i Sij
1
β= ^β(Ri),ϕ= ^ϕ(Ri)
This criterion is motivated by the hypothesis that the proper correlation matrix should minimize
the error sum of squares weighted by the inverse of the working correlation matrix [11].
i=1
- 4 -
1.5 C(R) statistic criterion
http://www.paper.edu.cn
Gosho et al. [13] proposed for balanced-cluster data a criterion to choose a working corre-
lation structure by minimizing C(R) where
(
m∑
i=1
1
m
)(
)1 In
2
.
m∑
i=1
1
m
Vi
C(R) = tr
SiST
i
(3)
From a theoretical point of view, C(R) measures the discrepancy between the covariance matrix
estimator and the specified working covariance matrix.
1.6 Gaussian pseudolikelihood criterion
Carey and Wang [14] used the following Gaussian pseudolikelihood criterion to choose a
working correlation structure and a variance model:
(Yi µi)TV
1
i
(Yi µi) + log(jVij)
β= ^β(Ri),ϕ= ^ϕ(Ri) .
} j
m∑
{
i=1
LG = 1
2
We choose the correlation structure yielding the largest LG value. On grounds that compet-
ing working correlation structures may have different numbers of correlation parameters, it
is necessary to consider the model dimension q. Zhu and Zhu [19] substituted the Gaussian
pseudolikelihood for the parametric likelihood in AIC and BIC and obtain two criteria:
GAIC = 2LG + 2dim(θ),
GBIC = 2LG + log(m)dim(θ),
where θ = (β, α), β 2 Rp, α 2 Rq. Based on the values of GAIC or GBIC, we select a correlation
structure.
1.7 Empirical likelihood method
Empirical likelihoods provide a valuable approach for parameter estimation and hypoth-
esis tests in nonparametric or distribution-free contexts [20, 21]. Chen and Lazar [15] pro-
posed a novelty criterion for selecting a correlation structure via empirical likelihood. Let
α = (α1, . . . , αni1)T. Define estimating functions for the ith subject:
DT
i A
1/2
1/2
1
F (α)A
i R
i
k=1 eikeik+1 α1
ˆϕ(β)(ni 1 p/m)
ni1
...
∑
∑
k=1 eikeik+ni1 αni1
ˆϕ(β)(1 p/m)
Si
1
,
g(Yi, Xi, β, α, RF ) =
- 5 -
{
m∏
http://www.paper.edu.cn
}
√
where RF (α) is a correlation matrix in which the jth off diagonal element is αj, and eik =
(Yik µik)/
ν(µik) and ˆϕ = (M p)
ik, where M is the total numbers of
observations. The empirical likelihood ratio is
∑
ni
k=1 e2
1
m
i=1
∑
m∑
m∑
RF (θ) = max
mpi; 0 pi 1,
pi = 1,
pig(Yi, Xi, β, α, RF ) = 0
.
i=1
i=1
i=1
Chen and Lazar [15] modified AIC and BIC by substituting empirical likelihood for parametric
likelihood and gave empirical likelihood versions of AIC and BIC:
EAIC = 2 log RF (ˆθ) + 2dim(θ),
EBIC = 2 log RF (ˆθ) + log(m)dim(θ).
When calculating EAIC and EBIC, it is worth mentioning that ˆθ is the GEE estimator and
RF (α) is embed by the working correlation matrix.
2
Simulation studies
2.1 Data generation designs
To assess and compare the performance of the aforementioned criteria, we report the results
of extensive simulation studies in this section. The following three settings are considered:
i = 1, . . . , m, where β1 = β2 = 1.
(1) Gaussian response, µik = xik1β1 + xik2β2, k = 1, . . . , n,
Covariate xi11 = . . . = xin1 Bernoulli(0.5) and xi12, . . . , xin2 are generated independently
from N (0, 1).
(2) Binary response, logit(µik) = β0 + xik1β1 + (k 1)β2, k = 1, . . . , n,
β1 = 0.5, β2 = β3 = 0.2, and xik1 Bernoulli(0.5). This model was used in [13].
(3) Poisson response, log(µik) = β0 + xik1β1 + (k 1)β2, k = 1, . . . , n,
xik1 and β = (β0, β1, β2) are the same as those in case (2).
i = 1, . . . , m, where
i = 1, . . . , m, where
In each case, we assume n = 4 and 8 observations per subject and simulation data for
m = 50 and 100 subjects and conduct our simulations with 1000 independent replications. All
computations are performed using statistical software R 3.0.2 version and parameter estimates
( ˆβ, ˆα, ˆϕ) are obtained using the gee library. Gaussian random variables are generated using
function rmvnorm in mvtnorm library [22], binary random variables generated using function
rmvbin in bindata library [23], and poisson random variable generated using function rcounts
in corcounts library [24]. R codes for our simulation studies and real data analysis in Section
4 can be obtained by contacting the authors.
- 6 -
http://www.paper.edu.cn
2.2 Stationary correlation structure excluded
For both the true correlation matrix RT and the working correlation matrix R, we consider
three structures:
independence (IN), exchangeable (EX), and AR(1) (AR) structures, which
were also considered by [15]. When RT is exchangeable or AR(1) with parameter α, the
correlation coefficient parameter α takes a value of 0.3. The simulation results are presented
in Tables 1-3.
It can be seen that, for any choice of RT , GAIC, GBIC, EAIC, and EBIC perform well for
Gaussian, binary, and Poisson responses. When RT is exchangeable or AR(1), GAIC, GBIC,
EAIC, EBIC and C(R) have better performance than other criteria. For Gaussian response,
because the quasi-likelihood in equation (2) is a constant, the frequencies of CIC for choosing a
true correlation structure is the same as those of QIC. For binary or Poisson responses, CIC has
better performance than QIC. While when RT is independence, GAIC, GBIC, EAIC, and EBIC
are much more effective than other criteria. Furthermore, if RT is independence structure, the
GAIC is consistently better than EAIC, and GBIC is consistently better than EBIC as well.
Criteria ∆1 and ∆2 perform better when the true correlation structure is exchangeable. For a
given sample size, when RT is exchangeable or AR(1), the frequencies of selecting the correct
structure for most of criteria is higher with n = 10 than with n = 4, which implies these
criteria easily recognize the correlation pattern for larger cluster size. According to above
analysis, GAIC, GBIC, EAIC, and EBIC outperform, and do not particularly depend on the
true correlation structure, the type of the response variable. Therefore, they are more reliable
and robust.
3 Epileptic seizure data
In this section, we applied the aforementioned methods to the epileptic seizure data from
a clinical trial [1, 25], which has been analyzed by Carey and Wang [14]. For each patient,
the number of epileptic seizures was recorded during a baseline period of eight weeks. Patients
were randomized to treatment with progabide or placebo. The number of seizures was recorded
in four consecutive two-week intervals. What is of interest is whether he progabide reduces the
rate of epileptic seizures. Covariates considered in original analysis of Thall and Vail [25] are
progabide treatment, the logarithm of 1/4 the eight-week baseline counts, logarithm of age of
patient, the interaction between treatment and the logarithm of 1/4 the eight-week baseline
rate, and an indicator of period four.
Following Carey and Wang [14], we consider variance function ν(µik) = µ2
ik. We fit the
GEE model assuming four competing working correlation structures: IN, EX, AR(1) and ST,
with and without the 49th patient who had unusual pre- and post- randomization seizure counts
[1]. The parameter estimates, their standard errors and values of criteria are presented in Table
- 7 -
http://www.paper.edu.cn
4. When the 49th patient is included, we can see that Gaussian pseudolikelihood criteria (GAIC
and GBIC), ∆1, ∆2, SC and C(R) select exchangeable structure in agreement with Carey and
Wang [14], whereas the empirical likelihood criteria (EAIC and EBIC), RJ, QIC and CIC
select AR(1) instead. When the 49th patient is excluded, the choices of GAIC, GBIC, EAIC,
EBIC, CIC, ∆1, ∆2, and C(R) remain unchanged, while QIC and RJ choose the exchangeable
structure in this case. This indicates that the choices of QIC and RJ are sensitive to the
outlier. The parameter estimates under exchangeable correlation structure assumption have
smaller standard errors than those obtained under AR(1) correlation structure assumption
except for the log(base) effect. Taking into consideration the standard errors of parameter
estimates, exchangeable correlation structure may be preferred over AR(1).
4 Conclusions and Future work
It is important to select an optimal working correlation structure in GEE framework in
terms of estimation precision. We have reviewed and evaluated several criteria by simulation
studies. The results indicate that GAIC, GBIC, EAIC, and EBIC are more powerful than oth-
er criteria when the true correlation structure is independence, exchangeable, or AR(1). The
modified version of AIC or BIC via Gaussian pseudolikelihood is better than that via empirical
likelihood in most of cases except RT is stationary. GBIC is better than GAIC, and EBIC per-
forms better than EAIC except RT is stationary, which seems to their parametric counterparts.
When correlation structure candidates are exchangeable and AR(1), C(R) performs very well.
When RT is stationary structure but can be approximated by an exchangeable or AR(1), GA-
IC, GBIC, EAIC and EBIC tend to select a parsimonious structure, but they will not result in
significant efficiency loss [15]. According to the simulation results analysis, GAIC, GBIC, EAIC
and EBIC are seem to be reasonably reliable tools for analysts away from efficiency-diminishing
misspecified correlation structures in general. Our suggestion is to compute GAIC, GBIC, EA-
IC, and EBIC for all the plausible candidate working correlation structures, and then choose
a working correlation structure based on the values of these criteria. If correlation structures
selected using these criteria are different, the one with small standard errors can be trusted.
In our simulation studies, we only consider balanced data where the number of observations is
the same for each subject. However, unbalanced case is common in longitudinal data because
subjects are lost or drop out in the follow-up study. All the criteria can be directly used in
unbalanced case except C(R). It can if we incorporate C(R) with the indicator function Ji in
eqn (7) of Gosho et al. [26].
In this study, we focus on correlation structure selection. While covariates selection in
regression is quite important. CIC, C(R), RJ, ∆1, and ∆2 cannot used to select covariates in
the mean regression, since these criteria depend upon the regression mean model specification.
- 8 -