gpt4 book ai didi

r - 自动比较来自小鼠 glm.mids 的嵌套模型

转载 作者:行者123 更新时间:2023-12-01 09:07:01 25 4
gpt4 key购买 nike

我有一个来自 R 的 mice 包的多重估算模型,其中有很多因子变量。例如:

library(mice)
library(Hmisc)

# turn all the variables into factors
fake = nhanes
fake$age = as.factor(nhanes$age)
fake$bmi = cut2(nhanes$bmi, g=3)
fake$chl = cut2(nhanes$chl, g=3)

head(fake)
age bmi hyp chl
1 1 <NA> NA <NA>
2 2 [20.4,25.5) 1 [187,206)
3 1 <NA> 1 [187,206)
4 3 <NA> NA <NA>
5 1 [20.4,25.5) 1 [113,187)
6 3 <NA> NA [113,187)

imput = mice(nhanes)

# big model
fit1 = glm.mids((hyp==2) ~ age + bmi + chl, data=imput, family = binomial)

我想通过针对每次删除一个变量的每个可能的嵌套模型测试完整模型来测试模型中每个整个因子变量(不是每个级别的指示变量)的重要性.手动,我可以这样做:

# small model (no chl)
fit2 = glm.mids((hyp==2) ~ age + bmi, data=imput, family = binomial)

# extract p-value from pool.compare
pool.compare(fit1, fit2)$pvalue

我如何为我的模型中的所有因子变量自动执行此操作?非常有用的函数 drop1 是为 a previous question 向我推荐的-- 现在我想做一些与 mice 情况完全相同的事情。

可能有帮助的注意事项:pool.compare 的一个烦人的特性是它似乎希望将较大模型中的“额外”变量放在那些与较小的模型共享。

最佳答案

在按照 pool.compare 所需的顺序排列预测变量后,您可以使用循环遍历预测变量的不同组合。

因此使用上面的数据 - 调整类别数

library(mice)
library(Hmisc)
# turn all the variables into factors
# turn all the variables into factors
fake <- nhanes
fake$age <- as.factor(nhanes$age)
fake$bmi <- cut2(nhanes$bmi, g=2)
fake$chl <- cut2(nhanes$chl, g=2)

# Impute
imput <- mice(fake, seed=1)

# Create models
# - reduced models with one variable removed
# - full models with extra variables at end of expression
vars <- c("age", "bmi", "chl")

red <- combn(vars, length(vars)-1 , simplify=FALSE)
diffs <- lapply(red, function(i) setdiff(vars, i) )
(full <- lapply(1:length(red), function(i)
paste(c(red[[i]], diffs[[i]]), collapse=" + ")))
#[[1]]
#[1] "age + bmi + chl"

#[[2]]
#[1] "age + chl + bmi"

#[[3]]
#[1] "bmi + chl + age"

(red <- combn(vars, length(vars)-1 , FUN=paste, collapse=" + "))
#[1] "age + bmi" "age + chl" "bmi + chl"

模型现在以正确的顺序传递给 glm 调用。我还替换了 glm.mids 方法,因为它已被替换为 with.mids - 请参阅 ?glm.mids

out <- vector("list", length(red))

for( i in 1:length(red)) {

redMod <- with(imput,
glm(formula(paste("(hyp==2) ~ ", red[[i]])), family = binomial))

fullMod <- with(imput,
glm(formula(paste("(hyp==2) ~ ", full[[i]])), family = binomial))

out[[i]] <- list(predictors = diffs[[i]],
pval = c(pool.compare(fullMod, redMod)$pvalue))
}

do.call(rbind.data.frame, out)
# predictors pval
#2 chl 0.9976629
#21 bmi 0.9985028
#3 age 0.9815831

# Check manually by leaving out chl
mod1 <- with(imput, glm((hyp==2) ~ age + bmi + chl , family = binomial))
mod2 <- with(imput, glm((hyp==2) ~ age + bmi , family = binomial))
pool.compare(mod1, mod2)$pvalue
# [,1]
#[1,] 0.9976629

使用这个数据集你会得到很多警告

编辑

你可以把它包装在一个函数里

impGlmDrop1 <- function(vars, outcome, Data=imput,  Family="binomial") 
{

red <- combn(vars, length(vars)-1 , simplify=FALSE)
diffs <- lapply(red, function(i) setdiff(vars, i))
full <- lapply(1:length(red), function(i)
paste(c(red[[i]], diffs[[i]]), collapse=" + "))
red <- combn(vars, length(vars)-1 , FUN=paste, collapse=" + ")

out <- vector("list", length(red))
for( i in 1:length(red)) {

redMod <- with(Data,
glm(formula(paste(outcome, red[[i]], sep="~")), family = Family))
fullMod <- with(Data,
glm(formula(paste(outcome, full[[i]], sep="~")), family = Family))
out[[i]] <- list(predictors = diffs[[i]],
pval = c(pool.compare(fullMod, redMod)$pvalue) )
}
do.call(rbind.data.frame, out)
}

# Run
impGlmDrop1(c("age", "bmi", "chl"), "(hyp==2)")

关于r - 自动比较来自小鼠 glm.mids 的嵌套模型,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/26621092/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com