
r - lme4 warning: Model failed to converge with max|grad|

Reposted. Author: 行者123. Last updated: 2023-12-03 17:09:12

I need to fit an lmer model with a log-transformed response variable, a continuous variable as a fixed effect, and nested random effects:

first <- lmer(logterrisize ~ spm + (1|studyarea/teriid),
              data = Data_table_for_analysis_Character_studyarea,
              control = lmerControl(optimizer = "Nelder_Mead",
                                    optCtrl = list(maxfun = 1e4)))

I get this error message:
Error in length(value <- as.numeric(value)) == 1L :
  Downdated VtV is not positive definite

I tried it with bobyqa as the optimizer instead and received the following warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.753065 (tol = 0.002, component 1)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?

My summary looks like this:
Linear mixed model fit by REML ['lmerMod']
Formula: logterrisize ~ spm + (1 | studyarea/teriid)
   Data: Data_table_for_analysis_Character_studyarea
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 10000))
REML criterion at convergence: -6079.6
Scaled residuals:
       Min         1Q     Median         3Q        Max
-3.639e-07 -4.962e-08  3.310e-09  5.293e-08  9.725e-07
Random effects:
 Groups           Name        Variance  Std.Dev.
 teriid:studyarea (Intercept) 1.291e-01 3.593e-01
 studyarea        (Intercept) 1.944e-02 1.394e-01
 Residual                     4.506e-15 6.712e-08
Number of obs: 273, groups:  teriid:studyarea, 66; studyarea, 22
Fixed effects:
              Estimate Std. Error t value
(Intercept)  1.480e+00  5.631e-02   26.28
spm         -5.785e-16  8.507e-10    0.00
Correlation of Fixed Effects:
    (Intr)
spm 0.000
convergence code: 0
Model failed to converge with max|grad| = 0.753065 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?

My data look like this:
show(logterrisize) [1] 1.3317643 1.3317643 1.3317643 0.1295798 0.1295798 1.5051368 1.5051368 1.5051368 1.5051368 [10] 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 [19] 1.5051368 1.5051368 1.5051368 1.4665993 1.4665993 1.4665993 1.8282328 1.8282328 1.9252934 [28] 1.9252934 1.9252934 2.3006582 2.3006582 2.5160920 2.7774040 2.7774040 3.3398623 3.3398623 [37] 3.4759297 1.2563594 1.6061204 1.6061204 1.7835139 1.7835139 2.1669498 2.1669498 2.1669498 [46] 2.1669498 0.7264997 0.7458155 0.8380524 0.8380524 0.8380524 0.8380524 0.8380524 0.8380524

show(spm) [1] 18.461538 22.641509 35.172414 10.418006 15.611285 3.482143 3.692308 4.483986 4.821429 [10] 6.000000 6.122449 6.176471 6.220736 6.260870 6.593407 7.010309 9.200000 9.473684 [19] 9.600000 12.600000 14.200000 16.146179 28.125000 30.099010 13.731343 14.432990 11.089109 [28] 17.960526 32.903226 8.955224 33.311688 8.800000 11.578947 20.000000 14.455446 18.181818 [37] 28.064516 25.684211 17.866667 23.142857 18.208955 20.536913 11.419355 11.593220 12.703583 [46] 20.000000 3.600000 11.320755 6.200000 6.575342 12.800000 19.109589 20.124224 22.941176 [55] 4.600000 6.600000 6.771160 8.000000 19.200000 19.400000 22.773723 3.333333 4.214047

Studyarea is a character name, and teriID is a sequential number identifying the study site.

My data frame looks like this: [screenshot of the data frame, image not available]

Have I forgotten to include anything in the formula when using a log-transformed variable?
Thanks!

Edit:
I used ?convergence to look into the convergence warnings. I tried this:

## 3. recompute gradient and Hessian with Richardson extrapolation
devfun <- update(first, devFunOnly=TRUE)
if (isLMM(first)) {
    pars <- getME(first,"theta")
} else {
    ## GLMM: requires both random and fixed parameters
    pars <- getME(first, c("theta","fixef"))
}
if (require("numDeriv")) {
    cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
    cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
    cat("scaled gradient:\n")
    print(scgrad <- solve(chol(hess), grad))
}

and got this output:
hess:
[,1] [,2]
[1,] 147.59157 -14.37956
[2,] -14.37956 120.85329
grad:
[1] -222.1020 -108.1038
scaled gradient:
[1] -19.245584 -9.891077

Unfortunately, I don't know what this output is supposed to tell me.
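As a rough guide to reading these numbers: the scaled gradient is the gradient rescaled by the Cholesky factor of the Hessian, and at a well-converged optimum it should be close to zero. The sketch below reproduces the calculation from the printed values in base R and compares it with the tolerance quoted in the warning. It assumes the quantity being checked is the largest element-wise minimum of the absolute scaled and unscaled gradients, as described in ?convergence; the variable names max_grad and tol are illustrative only.

```r
## Values copied from the printed Hessian and gradient
hess <- matrix(c(147.59157, -14.37956,
                 -14.37956, 120.85329), nrow = 2, byrow = TRUE)
grad <- c(-222.1020, -108.1038)

## Scaled gradient, computed the same way as in the ?convergence snippet
scgrad <- solve(chol(hess), grad)
scgrad

## Assumed criterion: largest element-wise minimum of |scaled| and |unscaled| gradient
max_grad <- max(pmin(abs(scgrad), abs(grad)))
tol <- 0.002   # tolerance quoted in the warning message

## TRUE means the gradient is still far from zero at the reported solution
max_grad > tol
```

Here the scaled gradients are several orders of magnitude larger than the tolerance, which suggests the optimizer stopped somewhere that is not a proper optimum rather than narrowly missing the threshold.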

Second edit:

I tried many optimizers, and when using this one:
first<-lmer(logterrisize~spm + (1|studyarea/teriid),REML=FALSE,
data = Data_table_for_analysis_Character_studyarea,
control=lmerControl(optimizer="optimx",
optCtrl=list(method='nlminb')))

I get only one warning:
In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp), :
  convergence code 1 from optimx

Now my summary looks like this:
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: logterrisize ~ spm + (1 | studyarea/teriid)
Data: Data_table_for_analysis_Character_studyarea
Control: lmerControl(optimizer = "optimx", optCtrl = list(method ="nlminb"))
AIC BIC logLik deviance df.resid
-3772.4 -3754.3 1891.2 -3782.4 268
Scaled residuals:
Min 1Q Median 3Q Max
-1.523e-04 -1.693e-05 1.480e-06 1.436e-05 3.332e-04
Random effects:
Groups Name Variance Std.Dev.
teriid:studyarea (Intercept) 8.219e-02 0.2866882
studyarea (Intercept) 7.478e-02 0.2734675
Residual 3.843e-10 0.0000196
Number of obs: 273, groups: teriid:studyarea, 66; studyarea, 22
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.551e+00 7.189e-02 21.58
spm 3.210e-11 2.485e-07 0.00
Correlation of Fixed Effects:
    (Intr)
spm 0.000
convergence code: 1

So can I turn a blind eye to this warning message, or would that be a big mistake?

Best answer

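tl;dr every observation in a territory shares the same territory size, so the random effect of territory ID explains essentially everything, and there is no variation in log(terrsize) left over for either the fixed effect of spm or the residual variance. Leaving the random effect of territory ID out of the model seems to give reasonable answers; a simulated data set replicates this pathology well, but suggests that you will end up underestimating the spm effect ...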

Read the data and plot

library(readxl)
library(dplyr)

dd <- (read_excel("lme4_terr_dataset.xlsx")
%>% rename(spm="scans per min",
studyarea="Study areaID",
teriid="TerritoryID",
terrsize="Territory_Size")
)

library(ggplot2); theme_set(theme_bw())
library(ggalt)
## store the plot as gg1 so it can be reused with the simulated data
(gg1 <- ggplot(dd, aes(spm,terrsize,colour=studyarea))
    + geom_point()
    + geom_encircle(aes(group=teriid))
    + theme(legend.position="none")
    + scale_y_log10()
)

[Figure: terrsize (log scale) against spm, points coloured by study area, observations from the same territory encircled]

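The plot, with its horizontal lines of values from the same territory ID, helped me diagnose the problem. To confirm that every territory ID has a single territory size across all of its observations: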
tt <- with(dd,table(terrsize,teriid))
all(rowSums(tt>0)==1) ## TRUE
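The table-based check looks at how many territories each size value occurs in. A complementary way to look at the same question is to count the distinct sizes recorded for each territory directly. The sketch below does this in base R on a small hypothetical data frame (toy, with made-up values, not the real data) in which one territory deliberately has two sizes, so the check has something to find.

```r
## Hypothetical toy data: territory "t3" has two different sizes
toy <- data.frame(
  teriid   = c("t1", "t1", "t2", "t2", "t3", "t3"),
  terrsize = c(3.8,  3.8,  1.1,  1.1,  6.0,  6.5)
)

## Number of distinct sizes recorded for each territory
n_sizes <- tapply(toy$terrsize, toy$teriid, function(x) length(unique(x)))
n_sizes

## TRUE only if every territory has exactly one size (FALSE for this toy data)
all(n_sizes == 1)

## Territories that break the rule
names(n_sizes)[n_sizes > 1]
```

With the real data, replacing toy by dd should return TRUE from the all() call if every territory really has one size.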

Model fitting
library(lme4)
m1 <- lmer(log(terrsize) ~ spm + (1|studyarea/teriid), dd)
## replicate warnings
m2 <- lmer(log(terrsize) ~ spm + (1|studyarea), dd)
## no warnings
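To see why the territory-level term leaves nothing for the residual, it can help to look at a fixed-effects analogue. The sketch below is not the lmer fit itself: it uses lm() with territory as a plain factor on hypothetical toy data (all names and values are made up) where the response is constant within each territory. The territory means then reproduce the response exactly, so the residuals are zero and spm has nothing left to explain.

```r
set.seed(1)
## Hypothetical toy data: 6 territories, 5 observations each
toy <- data.frame(teriid = factor(rep(1:6, each = 5)),
                  spm    = runif(30, 3, 35))
size_per_terr <- c(1.3, 0.13, 1.5, 1.8, 2.3, 3.3)  # one log-size per territory
toy$logsize <- size_per_terr[toy$teriid]           # constant within territory

## Territory as a fixed factor: group means reproduce the response exactly
fit <- lm(logsize ~ teriid, data = toy)
max_resid <- max(abs(residuals(fit)))
max_resid                      # zero, up to rounding error

## Within-territory variance of the response is zero (up to rounding) everywhere
within_var <- tapply(toy$logsize, toy$teriid, var)
within_var
```

In the mixed model the same thing happens through the teriid random intercept, which is consistent with the near-zero residual variance in the summaries in the question.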

Now simulate data that look similar
set.seed(101)
## experimental design: rep within f2 (terr_id) within f1 (study area)
ddx <- expand.grid(studyarea=factor(letters[1:10]),
teriid=factor(1:4),rep=1:5)
## study-area, terr_id effects, and spm
b_studyarea <- rnorm(10)
b_teriid <- rnorm(40)
ddx <- within(ddx, {
int <- interaction(studyarea,teriid)
spm <- rlnorm(nrow(ddx), meanlog=1,sdlog=1)
})
## compute average spm per terr/id
## (because response will be identical across id)
spm_terr <- aggregate(spm~int, data=ddx, FUN=mean)[,"spm"]
ddx <- within(ddx, {
mu <- 1+0.2*spm_terr[int]+b_studyarea[studyarea] + b_teriid[int]
tsize <- rlnorm(length(levels(int)), meanlog=mu, sdlog=1)
terrsize <- tsize[int]
})
gg1 %+% ddx

[Figure: the same plot drawn for the simulated data]

Fitting the simulated data

This gives behaviour similar to that seen with the real data:
lmer(log(terrsize) ~ spm + (1|studyarea/teriid), ddx)

We can avoid the warnings by dropping teriid:
m1 <- lmer(log(terrsize) ~ spm + (1|studyarea), ddx)

But the true effect of spm (0.2) will be underestimated (because the variation due to teriid is being ignored ...)
round(confint(m1, parm="beta_"),3)
## 2.5 % 97.5 %
## (Intercept) 1.045 2.026
## spm 0.000 0.070

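Aggregation

On the basis of this single simulation, it looks as though aggregating to the level of territories (as recommended by Murtaugh 2007, "Simplicity and complexity in ecological data analysis", Ecology) and weighting by the number of samples per territory gives a reasonable estimate of the true spm effect ...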
ddx_agg <- (ddx
%>% group_by(studyarea,terrsize,teriid)
%>% summarise(spm=mean(spm),
n=n())
)
library(nlme)
m3x <- lme(log(terrsize) ~ spm, random=~1|studyarea, data=ddx_agg,
weights=varFixed(~I(1/n)))
round(summary(m3x)$tTab,3)
##             Value Std.Error DF t-value p-value
## (Intercept) 0.934     0.465 29   2.010   0.054
## spm         0.177     0.095 29   1.863   0.073
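The aggregation step can also be sketched in base R alone. The example below is a simplified stand-in, not the lme fit above: it uses hypothetical toy data, drops the study-area random effect, and fits a weighted lm() instead. Supplying weights = n to lm() corresponds to assuming a variance proportional to 1/n, which is the same assumption varFixed(~I(1/n)) expresses in the nlme call.

```r
set.seed(2)
## Hypothetical toy data: 8 territories with unequal numbers of observations
n_obs <- c(2, 3, 5, 4, 6, 3, 2, 5)
toy <- data.frame(teriid = factor(rep(seq_along(n_obs), times = n_obs)))
toy$spm <- rlnorm(nrow(toy), meanlog = 1, sdlog = 1)
toy$terrsize <- exp(rnorm(length(n_obs)))[toy$teriid]  # constant within territory

## One row per territory: mean spm and the (single) territory size
agg <- aggregate(cbind(spm, terrsize) ~ teriid, data = toy, FUN = mean)
## Number of observations behind each territory-level row
agg$n <- as.vector(table(toy$teriid))

## Weighted regression on territory-level data (weight = observations per territory)
fit <- lm(log(terrsize) ~ spm, data = agg, weights = n)
coef(fit)
```

Because the toy sizes are generated independently of spm, the slope here carries no meaning; the point is only the shape of the aggregate-then-weight workflow.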

Regarding "r - lme4 warning: Model failed to converge with max|grad|", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53034261/
