gpt4 book ai didi

r - ggplot2:如何在 geom_smooth 中获得可靠的预测置信区间?

转载 作者:行者123 更新时间:2023-12-04 11:56:26 29 4
gpt4 key购买 nike

考虑这个简单的例子

dataframe <- data_frame(x = c(1,2,3,4,5,6),
y = c(12,24,24,34,12,15))
> dataframe
# A tibble: 6 x 2
x y
<dbl> <dbl>
1 1 12
2 2 24
3 3 24
4 4 34
5 5 12
6 6 15

dataframe %>% ggplot(., aes(x = x, y = y)) +
geom_point() +
geom_smooth(method = 'lm', formula = y~x)

这里的标准误差是使用默认选项计算的。但是,我想使用 坚固sandwich中提供的方差-协方差矩阵和 lmtest
也就是说,使用 vcovHC(mymodel, "HC3")
有没有办法使用 geom_smooth() 以简单的方式获得它功能?

enter image description here

最佳答案

更新:2021-03-17 最近有人向我指出 ggeffects包自动处理不同的 VCOV,包括我最初在下面演示的更棘手的 HAC 案例。后者的快速示例:

library(ggeffects)
library(sandwich) ## For HAC and other robust VCOVs

d <- data.frame(x = c(1,2,3,4,5,6),
y = c(12,24,24,34,12,15))

reg1 <- lm(y ~ x, data = d)

plot(ggpredict(reg1, "x", vcov.fun = "vcovHAC"))
#> Loading required namespace: ggplot2

## This gives you a regular ggplot2 object. So you can add layers as you
## normally would. E.g. If you'd like to compare with the original data...
library(ggplot2)
last_plot() +
geom_point(data = d, aes(x, y)) +
labs(caption = 'Shaded region indicates HAC 95% CI.')

创建于 2021-03-17 由 reprex package (v1.0.0)
我的原始答案如下...
HC 稳健 SE(简单)
多亏了 estimatr,这现在很容易做到。包装及其系列 lm_robust职能。例如。
library(tidyverse)
library(estimatr)

d <- data.frame(x = c(1,2,3,4,5,6),
y = c(12,24,24,34,12,15))

d %>%
ggplot(aes(x = x, y = y)) +
geom_point() +
geom_smooth(method = 'lm_robust', formula = y~x, fill="#E41A1C") + ## Robust (HC) SEs
geom_smooth(method = 'lm', formula = y~x, col = "grey50") + ## Just for comparison
labs(
title = "Plotting HC robust SEs in ggplot2",
subtitle = "Regular SEs in grey for comparison"
) +
theme_minimal()

创建于 2020-03-08 由 reprex package (v0.3.0)
HAC 强大的 SE(更多跑腿工作)
一个警告是 估算 does not还提供对 HAC(即异方差性和自相关一致)SE 的支持,如 Newey-West。但是,可以使用 手动获取这些信息。三明治 包......无论如何,这就是最初的问题所问的。然后您可以使用 geom_ribbon() 绘制它们.
我要郑重声明,HAC SE 对这个特定的数据集没有多大意义。但这里有一个如何做到这一点的例子,即兴演奏 this excellent所以回答相关主题。
library(tidyverse)
library(sandwich)

d <- data.frame(x = c(1,2,3,4,5,6),
y = c(12,24,24,34,12,15))

reg1 <- lm(y~x, data = d)

## Generate a prediction DF
pred_df <- data.frame(fit = predict(reg1))

## Get the design matrix
X_mat <- model.matrix(reg1)

## Get HAC VCOV matrix and calculate SEs
v_hac <- NeweyWest(reg1, prewhite = FALSE, adjust = TRUE) ## HAC VCOV (adjusted for small data sample)
#> Warning in meatHAC(x, order.by = order.by, prewhite = prewhite, weights =
#> weights, : more weights than observations, only first n used
var_fit_hac <- rowSums((X_mat %*% v_hac) * X_mat) ## Point-wise variance for predicted mean
se_fit_hac <- sqrt(var_fit_hac) ## SEs

## Add these to pred_df and calculate the 95% CI
pred_df <-
pred_df %>%
mutate(se_fit_hac = se_fit_hac) %>%
mutate(
lwr_hac = fit - qt(0.975, df=reg1$df.residual)*se_fit_hac,
upr_hac = fit + qt(0.975, df=reg1$df.residual)*se_fit_hac
)

pred_df
#> fit se_fit_hac lwr_hac upr_hac
#> 1 20.95238 4.250961 9.149822 32.75494
#> 2 20.63810 2.945392 12.460377 28.81581
#> 3 20.32381 1.986900 14.807291 25.84033
#> 4 20.00952 1.971797 14.534936 25.48411
#> 5 19.69524 2.914785 11.602497 27.78798
#> 6 19.38095 4.215654 7.676421 31.08548

## Plot it
bind_cols(
d,
pred_df
) %>%
ggplot(aes(x = x, y = y, ymin=lwr_hac, ymax=upr_hac)) +
geom_point() +
geom_ribbon(fill="#E41A1C", alpha=0.3, col=NA) + ## Robust (HAC) SEs
geom_smooth(method = 'lm', formula = y~x, col = "grey50") + ## Just for comparison
labs(
title = "Plotting HAC SEs in ggplot2",
subtitle = "Regular SEs in grey for comparison",
caption = "Note: Do HAC SEs make sense for this dataset? Definitely not!"
) +
theme_minimal()

创建于 2020-03-08 由 reprex package (v0.3.0)
请注意,如果您愿意,您也可以使用此方法手动计算和绘制其他稳健的 SE 预测(例如 HC1、HC2 等)。您需要做的就是使用相关的三明治估算器。例如,使用 vcovHC(reg1, type = "HC2")而不是 NeweyWest(reg1, prewhite = FALSE, adjust = TRUE)将为您提供与使用 的第一个示例相同的 HC-robust CI。估算包裹。

关于r - ggplot2:如何在 geom_smooth 中获得可靠的预测置信区间?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/45313482/

29 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com