gpt4 book ai didi

r - 从逻辑回归列表中提取系数的所有标准误差

转载 作者:行者123 更新时间:2023-12-04 11:04:11 25 4
gpt4 key购买 nike

我想从逻辑回归模型列表中提取标准误差。

这是逻辑回归函数,以这种方式设计,因此我可以一次运行多个分析:

glmfunk <- function(x) glm(  ldata$DFREE ~  x , family=binomial)

我在数据框 ldata 中的变量子集上运行它:
glmkort <- lapply(ldata[,c(2,3,5,6,7,8)],glmfunk)

我可以像这样提取系数:
sapply(glmkørt, "[[", "coefficients")

但是我如何提取系数的标准误差?我似乎在 str(glmkort) 中找不到它?

这是我正在寻找标准错误的 AGE 的 str(glmkort):
str(glmkort)
List of 6
$ AGE :List of 30
..$ coefficients : Named num [1:2] -1.17201 -0.00199
.. ..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
..$ residuals : Named num [1:40] -1.29 -1.29 -1.29 -1.29 4.39 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ fitted.values : Named num [1:40] 0.223 0.225 0.225 0.225 0.228 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ effects : Named num [1:40] 3.2662 -0.0282 -0.4595 -0.4464 2.042 ...
.. ..- attr(*, "names")= chr [1:40] "(Intercept)" "x" "" "" ...
..$ R : num [1:2, 1:2] -2.64 0 -86.01 14.18
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "(Intercept)" "x"
.. .. ..$ : chr [1:2] "(Intercept)" "x"
..$ rank : int 2
..$ qr :List of 5
.. ..$ qr : num [1:40, 1:2] -2.641 0.158 0.158 0.158 0.159 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:40] "1" "2" "3" "4" ...
.. .. .. ..$ : chr [1:2] "(Intercept)" "x"
.. ..$ rank : int 2
.. ..$ qraux: num [1:2] 1.16 1.01
.. ..$ pivot: int [1:2] 1 2
.. ..$ tol : num 1e-11
.. ..- attr(*, "class")= chr "qr"
..$ family :List of 12
.. ..$ family : chr "binomial"
.. ..$ link : chr "logit"
.. ..$ linkfun :function (mu)
.. ..$ linkinv :function (eta)
.. ..$ variance :function (mu)
.. ..$ dev.resids:function (y, mu, wt)
.. ..$ aic :function (y, n, mu, wt, dev)
.. ..$ mu.eta :function (eta)
.. ..$ initialize: expression({ if (NCOL(y) == 1) { if (is.factor(y)) y <- y != levels(y)[1L] n <- rep.int(1, nobs) y[weights == 0] <- 0 if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1") mustart <- (weights * y + 0.5)/(weights + 1) m <- weights * y if (any(abs(m - round(m)) > 0.001)) warning("non-integer #successes in a binomial glm!") } else if (NCOL(y) == 2) { if (any(abs(y - round(y)) > 0.001)) warning("non-integer counts in a binomial glm!") n <- y[, 1] + y[, 2] y <- ifelse(n == 0, 0, y[, 1]/n) weights <- weights * n mustart <- (n * y + 0.5)/(n + 1) } else stop("for the binomial family, y must be a vector of 0 and 1's\n", "or a 2 column matrix where col 1 is no. successes and col 2 is no. failures") })
.. ..$ validmu :function (mu)
.. ..$ valideta :function (eta)
.. ..$ simulate :function (object, nsim)
.. ..- attr(*, "class")= chr "family"
..$ linear.predictors: Named num [1:40] -1.25 -1.24 -1.24 -1.24 -1.22 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ deviance : num 42.7
..$ aic : num 46.7
..$ null.deviance : num 42.7
..$ iter : int 4
..$ weights : Named num [1:40] 0.173 0.174 0.174 0.174 0.176 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ prior.weights : Named num [1:40] 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ df.residual : int 38
..$ df.null : int 39
..$ y : Named num [1:40] 0 0 0 0 1 0 1 0 0 0 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ converged : logi TRUE
..$ boundary : logi FALSE
..$ model :'data.frame': 40 obs. of 2 variables:
.. ..$ ldata$DFREE: int [1:40] 0 0 0 0 1 0 1 0 0 0 ...
.. ..$ x : int [1:40] 39 33 33 32 24 30 39 27 40 36 ...
.. ..- attr(*, "terms")=Classes 'terms', 'formula' length 3 ldata$DFREE ~ x
.. .. .. ..- attr(*, "variables")= language list(ldata$DFREE, x)
.. .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:2] "ldata$DFREE" "x"
.. .. .. .. .. ..$ : chr "x"
.. .. .. ..- attr(*, "term.labels")= chr "x"
.. .. .. ..- attr(*, "order")= int 1
.. .. .. ..- attr(*, "intercept")= int 1
.. .. .. ..- attr(*, "response")= int 1
.. .. .. ..- attr(*, ".Environment")=<environment: 0x017a5674>
.. .. .. ..- attr(*, "predvars")= language list(ldata$DFREE, x)
.. .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. .. ..- attr(*, "names")= chr [1:2] "ldata$DFREE" "x"
..$ call : language glm(formula = ldata$DFREE ~ x, family = binomial)
..$ formula :Class 'formula' length 3 ldata$DFREE ~ x
.. .. ..- attr(*, ".Environment")=<environment: 0x017a5674>
..$ terms :Classes 'terms', 'formula' length 3 ldata$DFREE ~ x
.. .. ..- attr(*, "variables")= language list(ldata$DFREE, x)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "ldata$DFREE" "x"
.. .. .. .. ..$ : chr "x"
.. .. ..- attr(*, "term.labels")= chr "x"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: 0x017a5674>
.. .. ..- attr(*, "predvars")= language list(ldata$DFREE, x)
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:2] "ldata$DFREE" "x"
..$ data :<environment: 0x017a5674>
..$ offset : NULL
..$ control :List of 3
.. ..$ epsilon: num 1e-08
.. ..$ maxit : num 25
.. ..$ trace : logi FALSE
..$ method : chr "glm.fit"
..$ contrasts : NULL
..$ xlevels : Named list()
..- attr(*, "class")= chr [1:2] "glm" "lm"
$ BECK :List of 30

这是我用于示例的数据。为了这个问题,我缩短了它:
> ldata
ID AGE BECK IVHX NDRUGTX RACE TREAT SITE DFREE
1 1 39 9.000 3 1 0 1 0 0
2 2 33 34.000 2 8 0 1 0 0
3 3 33 10.000 3 3 0 1 0 0
4 4 32 20.000 3 1 0 0 0 0
5 5 24 5.000 1 5 1 1 0 1
6 6 30 32.550 3 1 0 1 0 0
7 7 39 19.000 3 34 0 1 0 1
8 8 27 10.000 3 2 0 1 0 0
9 9 40 29.000 3 3 0 1 0 0
10 10 36 25.000 3 7 0 1 0 0
11 11 38 18.900 3 8 0 1 0 0
12 12 29 16.000 1 1 0 1 0 0
13 13 32 36.000 3 2 1 1 0 1
14 14 41 19.000 3 8 0 1 0 0
15 15 31 18.000 3 1 0 1 0 0
16 16 27 12.000 3 3 0 1 0 0
17 17 28 34.000 3 6 0 1 0 0
18 18 28 23.000 2 1 0 1 0 0
19 19 36 26.000 1 15 1 1 0 1
20 20 32 18.900 3 5 0 1 0 1
21 21 33 15.000 1 1 0 0 0 1
22 22 28 25.200 3 8 0 0 0 0
23 23 29 6.632 2 0 0 0 0 0
24 24 35 2.100 3 9 0 0 0 0
25 25 45 26.000 3 6 0 0 0 0
26 26 35 39.789 3 5 0 0 0 0
27 27 24 20.000 1 3 0 0 0 0
28 28 36 16.000 3 7 0 0 0 0
29 29 39 22.000 3 9 0 0 0 1
30 30 36 9.947 2 10 0 0 0 0
31 31 37 9.450 3 1 0 0 0 0
32 32 30 39.000 3 1 0 0 0 0
33 33 44 41.000 3 5 0 0 0 0
34 34 28 31.000 1 6 1 0 0 1
35 35 25 20.000 1 3 1 0 0 0
36 36 30 8.000 3 7 0 1 0 0
37 37 24 9.000 1 1 0 0 0 0
38 38 27 20.000 1 1 0 0 0 0
39 39 30 8.000 1 2 1 0 0 1
40 40 34 8.000 3 0 0 1 0 0

最佳答案

使用 ?glm 中的示例

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())

## copy twice to a list to illustrate
lmod <- list(mod1 = glm.D93, mod2 = glm.D93)

然后我们可以将它们计算为 summary()将,或在调用 summary() 后提取它们.前者效率更高,因为你只计算你想要的。后者不依赖于了解标准误差是如何得出的。

直接计算标准误

标准误差可以从模型的方差-协方差矩阵中计算出来。该矩阵的对角线包含系数的方差,标准误差只是这些方差的平方根。 vcov()提取器函数为我们获取方差 - 协方差矩阵,我们将对角线与 sqrt(diag()) 平方根:
> lapply(lmod, function(x) sqrt(diag(vcov(x))))
$mod1
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000

$mod2
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000

从对 summary() 的调用中提取它们

或者我们可以让 summary()计算标准误差(以及更多),然后使用 lapply()sapply()应用提取 coef(summary(x)) 的匿名函数并取第二列(其中存储标准错误)。
lapply(lmod, function(x) coef(summary(x))[,2])

这使
> lapply(lmod, function(x) coef(summary(x))[,2])
$mod1
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000

$mod2
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000

sapply()会给:
> sapply(lmod, function(x) coef(summary(x))[,2])
mod1 mod2
(Intercept) 0.1708987 0.1708987
outcome2 0.2021708 0.2021708
outcome3 0.1927423 0.1927423
treatment2 0.2000000 0.2000000
treatment3 0.2000000 0.2000000

根据您想要做的事情,您可以通过一次调用提取系数和标准误差:
> lapply(lmod, function(x) coef(summary(x))[,1:2])
$mod1
Estimate Std. Error
(Intercept) 3.044522e+00 0.1708987
outcome2 -4.542553e-01 0.2021708
outcome3 -2.929871e-01 0.1927423
treatment2 1.337909e-15 0.2000000
treatment3 1.421085e-15 0.2000000

$mod2
Estimate Std. Error
(Intercept) 3.044522e+00 0.1708987
outcome2 -4.542553e-01 0.2021708
outcome3 -2.929871e-01 0.1927423
treatment2 1.337909e-15 0.2000000
treatment3 1.421085e-15 0.2000000

但是您可能更喜欢单独使用它们?

关于r - 从逻辑回归列表中提取系数的所有标准误差,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/13045032/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com