r - ggplot2:如何在 geom_smooth 中获得可靠的预测置信区间？-6ren

r - ggplot2:如何在 geom_smooth 中获得可靠的预测置信区间？

转载作者：行者123 更新时间：2023-12-04 11:56:26

29

4

考虑这个简单的例子

dataframe <- data_frame(x = c(1,2,3,4,5,6),
                        y = c(12,24,24,34,12,15))
> dataframe
# A tibble: 6 x 2
      x     y
  <dbl> <dbl>
1     1    12
2     2    24
3     3    24
4     4    34
5     5    12
6     6    15    

dataframe %>% ggplot(., aes(x = x, y = y)) + 
geom_point() + 
geom_smooth(method = 'lm', formula = y~x)

这里的标准误差是使用默认选项计算的。但是，我想使用坚固包 sandwich中提供的方差-协方差矩阵和 lmtest
也就是说，使用 vcovHC(mymodel, "HC3")
有没有办法使用 geom_smooth() 以简单的方式获得它功能？

最佳答案

更新:2021-03-17 最近有人向我指出 ggeffects包自动处理不同的 VCOV，包括我最初在下面演示的更棘手的 HAC 案例。后者的快速示例:

library(ggeffects)
library(sandwich)  ## For HAC and other robust VCOVs

d <- data.frame(x = c(1,2,3,4,5,6),
                                y = c(12,24,24,34,12,15))

reg1 <- lm(y ~ x, data = d)

plot(ggpredict(reg1, "x", vcov.fun = "vcovHAC"))
#> Loading required namespace: ggplot2

## This gives you a regular ggplot2 object. So you can add layers as you
## normally would. E.g. If you'd like to compare with the original data...
library(ggplot2)
last_plot() +
    geom_point(data = d, aes(x, y)) +
    labs(caption = 'Shaded region indicates HAC 95% CI.')

创建于 2021-03-17 由 reprex package (v1.0.0)
我的原始答案如下...
HC 稳健 SE(简单)
多亏了 estimatr，这现在很容易做到。包装及其系列 lm_robust职能。例如。

library(tidyverse)
library(estimatr)

d <- data.frame(x = c(1,2,3,4,5,6),
                y = c(12,24,24,34,12,15))

d %>% 
  ggplot(aes(x = x, y = y)) + 
  geom_point() + 
  geom_smooth(method = 'lm_robust', formula = y~x, fill="#E41A1C") + ## Robust (HC) SEs
  geom_smooth(method = 'lm', formula = y~x, col = "grey50") + ## Just for comparison
  labs(
    title = "Plotting HC robust SEs in ggplot2",
    subtitle = "Regular SEs in grey for comparison"
    ) +
  theme_minimal()

创建于 2020-03-08 由 reprex package (v0.3.0)
HAC 强大的 SE(更多跑腿工作)
一个警告是估算 does not还提供对 HAC(即异方差性和自相关一致)SE 的支持，如 Newey-West。但是，可以使用 手动获取这些信息。三明治 包......无论如何，这就是最初的问题所问的。然后您可以使用 geom_ribbon() 绘制它们.
我要郑重声明，HAC SE 对这个特定的数据集没有多大意义。但这里有一个如何做到这一点的例子，即兴演奏 this excellent所以回答相关主题。

library(tidyverse)
library(sandwich)

d <- data.frame(x = c(1,2,3,4,5,6),
                y = c(12,24,24,34,12,15))

reg1 <- lm(y~x, data = d)

## Generate a prediction DF
pred_df <- data.frame(fit = predict(reg1))

## Get the design matrix
X_mat <- model.matrix(reg1)

## Get HAC VCOV matrix and calculate SEs
v_hac <- NeweyWest(reg1, prewhite = FALSE, adjust = TRUE) ## HAC VCOV (adjusted for small data sample)
#> Warning in meatHAC(x, order.by = order.by, prewhite = prewhite, weights =
#> weights, : more weights than observations, only first n used
var_fit_hac <- rowSums((X_mat %*% v_hac) * X_mat)  ## Point-wise variance for predicted mean
se_fit_hac <- sqrt(var_fit_hac) ## SEs

## Add these to pred_df and calculate the 95% CI
pred_df <-
  pred_df %>%
  mutate(se_fit_hac = se_fit_hac) %>%
  mutate(
    lwr_hac = fit - qt(0.975, df=reg1$df.residual)*se_fit_hac,
    upr_hac = fit + qt(0.975, df=reg1$df.residual)*se_fit_hac
    )

pred_df
#>        fit se_fit_hac   lwr_hac  upr_hac
#> 1 20.95238   4.250961  9.149822 32.75494
#> 2 20.63810   2.945392 12.460377 28.81581
#> 3 20.32381   1.986900 14.807291 25.84033
#> 4 20.00952   1.971797 14.534936 25.48411
#> 5 19.69524   2.914785 11.602497 27.78798
#> 6 19.38095   4.215654  7.676421 31.08548

## Plot it
bind_cols(
  d,
  pred_df
  ) %>%
  ggplot(aes(x = x, y = y, ymin=lwr_hac, ymax=upr_hac)) + 
  geom_point() + 
  geom_ribbon(fill="#E41A1C", alpha=0.3, col=NA) + ## Robust (HAC) SEs
  geom_smooth(method = 'lm', formula = y~x, col = "grey50") + ## Just for comparison
  labs(
    title = "Plotting HAC SEs in ggplot2",
    subtitle = "Regular SEs in grey for comparison",
    caption = "Note: Do HAC SEs make sense for this dataset? Definitely not!"
    ) +
  theme_minimal()

创建于 2020-03-08 由 reprex package (v0.3.0)
请注意，如果您愿意，您也可以使用此方法手动计算和绘制其他稳健的 SE 预测(例如 HC1、HC2 等)。您需要做的就是使用相关的三明治估算器。例如，使用 vcovHC(reg1, type = "HC2")而不是 NeweyWest(reg1, prewhite = FALSE, adjust = TRUE)将为您提供与使用 的第一个示例相同的 HC-robust CI。估算包裹。

关于r - ggplot2:如何在 geom_smooth 中获得可靠的预测置信区间？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/45313482/

29

4

0

文章推荐： Cassandra:使用包含大于或小于(< 和 >)的 where 子句进行查询

文章推荐： scala - Play 2.0 : Optional list in query

文章推荐： r - 一个热编码向量列表

文章推荐： qt4 - Ubuntu 12.04LTS 中的 QtMultimedia 模块

Python ggplot 和 ggplotly
前 R 用户，我曾经通过 ggplotly() 函数广泛地结合 ggplot 和 plot_ly 库来显示数据。刚到 Python 时，我看到 ggplot 库可用，但在与 plotly 的简单组合
r - ggplotly 从 ggplot 中删除图例
ggplotly 使用 ggplot 删除 geom_line 图的图例。见例如以下: library(plotly) g % ggplotly() 关于r - ggplotly 从 gg
r - 设置带有端点的 ggplot 网格线/ggplot 的中断计算
我有一个 ggplot我试图以非常简约的外观制作线图的问题。我已经摆脱了图例，转而使用每行右侧的文本标签。如果标签不是那么长，它可能不会那么明显，但如果网格线停在最大 x 值(在这种情况下，在 201
r - 在一个 ggplot() 中生成多个 ggplot 图形
我想使用相同的 ggplot 代码以我的数据框中的数字为条件生成 8 个不同的数字。通常我会使用 facet_grid，但在这种情况下，我希望最终得到每个单独数字的 pdf。例如，我想要这里的每一行一
r - ggplot : conflict between geom_text and ggplot(fill)
当我在 ggplot 上使用 geom_text 时，与 ggplot 的“填充”选项发生冲突。这是问题的一个明显例子: library(ggplot2) a=ChickWeight str(a)
r - 将 ggplotly 和 ggplot 与拼凑而成？
是否可以结合使用 ggplot ly 和拼凑而成的ggplot？例子这将并排显示两个图 library(ggplot2) library(plotly) library(patchwork) a
r - ggplot、ggplotly、scale_y_连续、ylim 和百分比
我想绘制一个图表，其中 y 轴以百分比表示: p = ggplot(test, aes(x=creation_date, y=value, color=type)) + geom_line(aes
R ggplot，删除 ggsave/ggplot 中的白边
如何去除ggsave中的白边距？我的问题和Remove white space (i.e., margins) ggplot2 in R一模一样。然而，那里的答案对我来说并不理想。我不想对固定但未知
r - 文本层在 ggplot 中工作，但用 ggplotly 删除
我有一个带有一些文本层的条形图，在 ggplot 库中一切正常，但现在我想添加一些与 ggplotly 的交互性，但它无法显示文本层我更新了所有软件包但问题仍然存在 df = read.table(
r - ggplot 到 ggplotly 不适用于自定义的 geom_boxplot 宽度
当我尝试在 ggplot 中为我的箱线图设置自定义宽度时，它工作正常: p=ggplot(iris, aes(x = Species,y=Sepal.Length )) + geom_boxplot(
r - 如何通过从 ggplot 中的不同数据帧映射 aes_string 在 ggplot 中生成图例？
我正在尝试为 ggplot 密度创建一个图例，将一个组与所有组进行比较。使用此示例 - R: Custom Legend for Multiple Layer ggplot - 我可以使用下面的代码成
r - ggplot 在多面图上有一些错误。尝试使用多面 ggplot 协调 y 值
所以我试图在一个多面的 ggplot 上编辑 y 值，因为我在编织时在情节上有几个不准确之处。我对 R 和 R Markdown 很陌生，所以我不太明白为什么，例如，美国的 GDP PPP 在美元金额
python-ggplot - 如何在 Python Ggplot 上格式化 x 轴？
我需要在 python 条形图的 x 轴 ggplot 上格式化日期。我该怎么做？最佳答案使用 scale_x_date() 格式化 x 轴上的日期。 p = ggplot(aes(x='dat
r - 为什么 ggplotly 在 rmarkdown 中不能像 ggplot 一样工作
我想使用 ggplotly因为它的副作用相同ggplot甚至graphics做。我的意思是当我 knitr::knit或 rmarkdown::render我期望的 Rmd 文档 print(obj)
r - 在 Shiny 的应用程序中显示 ggplot 时，如何捕获控制台中出现的 ggplot 警告并显示在应用程序中？
我在下面有一个简单的应用程序，它显示了一个 ggplot。 ggplot 在控制台中生成警告(见底部图片)。我想捕获警告，并将其显示在应用程序的情节下方。这是我的代码: library(shiny)
r - 在 Shiny 的应用程序中缓存基本 ggplot 并允许动态修改图层(与 ggplot 等效的leafletProxy)
如果显示的基本数据集很大(下面的示例工作代码)，则在 Shiny 的应用程序中向/从 ggplot 添加/删除图层可能需要一段时间。问题是: 有没有办法缓存 ggplot(基本图)并添加/删除/修改
r - ggplot 和网格 : Find the relative x and y positions of a point in a ggplot grob
我正在组合 ggplot 的多个绘图，使用网格视口(viewport)，这是必要的(我相信)，因为我想旋转绘图，这在标准 ggplot 中是不可能的，甚至可能是 gridExtra 包。我想在两个图
R中的相对频率直方图，ggplot
我可以使用 lattice 在 R 中绘制相对频率直方图包裹: a <- runif(100) library(lattice) histogram(a) 我想在 ggplot 中获得相同的图形.我试
ggplot geom_area的R堆叠区域顺序
我需要重新安装 R，但我现在遇到了 ggplot 的一个小问题。我确信有一个简单的解决方案，我感谢所有提示! 我经常使用堆叠面积图，通常我通过定义因子水平并以相反的顺序绘制来获得所需的堆叠和图例顺序。
ggplot 中的数据重新排序
新的并且坚持使用ggplot: 我有以下数据: tribe rho preference_watermass 1 Luna2 -1.000 hypolimnic 2 OP10I-A1

首页

博学

6Ren·AI

商城

r - ggplot2:如何在 geom_smooth 中获得可靠的预测置信区间？