gpt4 book ai didi

r - R 中多重插补数据集的多级回归模型(Amelia、zelig、lme4)

转载 作者:行者123 更新时间:2023-12-04 15:00:28 26 4
gpt4 key购买 nike

我正在尝试对多重插补数据(由 Amelia 创建)运行多级模型;该样本基于组 = 24,N = 150 的聚类样本。

library("ZeligMultilevel")
ML.model.0 <- zelig(dv~1 + tag(1|group), model="ls.mixed",
data=a.out$imputations)
summary(ML.model.0)

此代码产生以下错误代码:
Error in object[[1]]$result$call : 
$ operator not defined for this S4 class

如果我运行 OLS 回归,它会起作用:
model.0 <- zelig(dv~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0))
print(m.0, digits = 2)

Value Std. Error t-stat p-value
[1,] 45 0.34 130 2.6e-285

我很高兴提供 工作示例 .
require(Zelig)
require(Amelia)
require(ZeligMultilevel)

data(freetrade)
length(freetrade$country) #grouping variable

#Imputation of missing data

a.out <- amelia(freetrade, m=5, ts="year", cs="country")

# Models: (1) OLS; (2) multi-level

model.0 <- zelig(polity~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0))
print(m.0, digits = 2)

ML.model.0 <- zelig(polity~1 + tag(1|country), model="ls.mixed", data=a.out$imputations)
summary(ML.model.0)

我认为问题可能在于 Zelig 如何与 Amelia 的 mi 类交互。因此,我转向了另一种 R 包:lme4。
require(lme4)
write.amelia(obj=a.out, file.stem="inmi", format="csv", na="NA")
diff <-list(5) # a list to store each model, 5 is the number of the imputed datasets

for (i in 1:5) {
file.name <- paste("inmi", 5 ,".csv",sep="")
data.to.use <- read.csv(file.name)
diff[[5]] <- lmer(polity ~ 1 + (1 | country),
data = data.to.use)}
diff

结果如下:
[[1]]
[1] 5

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
Linear mixed model fit by REML
Formula: polity ~ 1 + (1 | country)
Data: data.to.use
AIC BIC logLik deviance REMLdev
1006 1015 -499.9 1002 999.9
Random effects:
Groups Name Variance Std.Dev.
country (Intercept) 14.609 3.8222
Residual 17.839 4.2236
Number of obs: 171, groups: country, 9

Fixed effects:
Estimate Std. Error t value
(Intercept) 2.878 1.314 2.19

当我替换 diff[[5]] 时,结果保持不变来自 diff[[4]] , diff[[3]]等等。不过,我想知道这实际上是组合数据集的结果还是单个估算数据集的结果。有什么想法吗?谢谢!

最佳答案

我修改了这个对象的汇总函数(获取源代码并打开 ./R/summary.R 文件)。我添加了一些花括号以使代码流畅并更改了 getcoefcoef .这应该适用于这种特殊情况,但我不确定它是否通用。功能 getcoef搜索插槽 coef3 ,我从来没有见过这个。也许@BenBolker 可以在这里看一眼?我不能保证这是结果的样子,但输出对我来说看起来是合法的。也许您可以联系软件包作者以在 future 版本中更正此问题。

summary(ML.model.0)


  Model: ls.mixed
Number of multiply imputed data sets: 5

Combined results:

Call:
zelig(formula = polity ~ 1 + tag(1 | country), model = "ls.mixed",
data = a.out$imputations)

Coefficients:
Value Std. Error t-stat p-value
[1,] 2.902863 1.311427 2.213515 0.02686218

For combined results from datasets i to j, use summary(x, subset = i:j).
For separate results, use print(summary(x), subset = i:j).

修改功能:
summary.MI <- function (object, subset = NULL, ...) {
if (length(object) == 0) {
stop('Invalid input for "subset"')
} else {
if (length(object) == 1) {
return(summary(object[[1]]))
}
}

# Roman: This function isn't fecthing coefficients robustly. Something goes wrong. Contact package author.
getcoef <- function(obj) {
# S4
if (!isS4(obj)) {
coef(obj)
} else {
if ("coef3" %in% slotNames(obj)) {
obj@coef3
} else {
obj@coef
}
}
}

#
res <- list()

# Get indices
subset <- if (is.null(subset)) {
1:length(object)
} else {
c(subset)
}

# Compute the summary of all objects
for (k in subset) {
res[[k]] <- summary(object[[k]])
}


# Answer
ans <- list(
zelig = object[[1]]$name,
call = object[[1]]$result@call,
all = res
)

#
coef1 <- se1 <- NULL

#
for (k in subset) {
# tmp <- getcoef(res[[k]]) # Roman: I changed this to coef, not 100% sure if the output is the same
tmp <- coef(res[[k]])
coef1 <- cbind(coef1, tmp[, 1])
se1 <- cbind(se1, tmp[, 2])
}

rows <- nrow(coef1)
Q <- apply(coef1, 1, mean)
U <- apply(se1^2, 1, mean)
B <- apply((coef1-Q)^2, 1, sum)/(length(subset)-1)
var <- U+(1+1/length(subset))*B
nu <- (length(subset)-1)*(1+U/((1+1/length(subset))*B))^2

coef.table <- matrix(NA, nrow = rows, ncol = 4)
dimnames(coef.table) <- list(rownames(coef1),
c("Value", "Std. Error", "t-stat", "p-value"))
coef.table[,1] <- Q
coef.table[,2] <- sqrt(var)
coef.table[,3] <- Q/sqrt(var)
coef.table[,4] <- pt(abs(Q/sqrt(var)), df=nu, lower.tail=F)*2
ans$coefficients <- coef.table
ans$cov.scaled <- ans$cov.unscaled <- NULL

for (i in 1:length(ans)) {
if (is.numeric(ans[[i]]) && !names(ans)[i] %in% c("coefficients")) {
tmp <- NULL
for (j in subset) {
r <- res[[j]]
tmp <- cbind(tmp, r[[pmatch(names(ans)[i], names(res[[j]]))]])
}
ans[[i]] <- apply(tmp, 1, mean)
}
}

class(ans) <- "summaryMI"
ans
}

关于r - R 中多重插补数据集的多级回归模型(Amelia、zelig、lme4),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16571580/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com