r - Forest plot in ggplot with reference levels from regression model

Reposted by: 行者123 · Updated: 2023-12-05 07:35:10

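I want to make a forest plot with the ggplot2 package, and I am happy with my output (see the forest plot below).

The plot shows the levels of the given variables in a regression model (odds ratios and confidence intervals) together with the reference levels.

The problem is that producing the plot takes a lot of manual work.

First, I want each reference level to appear next to the other levels of its variable in the plot, so I typed in each reference level by hand (see the forest table below). To make ggplot2 cooperate, I entered arbitrary negative odds ratio and confidence interval values for the reference levels, then set the plot limits to run from zero to a large positive number.

Second, because my original variables are in a single column, I entered the colours by hand, which is time-consuming.

Is there a more direct way to produce such a plot? Any help would be much appreciated.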

# DATA 
mtcars
mtcars$gear <- as.factor(mtcars$gear)
mtcars$carb <- as.factor(mtcars$carb)

# PREPARE ODDS RATIO & CONFIDENCE INTERVALS DATA FRAME 
model = lm(mpg ~ gear + carb + disp, data = mtcars ) # make regression model
forest_table = data.frame(
  or= round(exp(coef(model)),2), 
  round(exp(confint(model, level = 0.95)),2), 
  check.names = FALSE) # make a table with odds ratio and confidence intervals
names(forest_table) = c("or", "ci_lb", "ci_ub") # give columns clear names
library(data.table)
setDT(forest_table, keep.rownames = TRUE)[] # turn row names into a column
forest_table <- as.data.frame(forest_table) # turn table into a data frame
forest_table <- forest_table[-1, ] # get rid of the intercept row

# ADD ROWS WITH REFERENCE LEVELS TO PREPARED DATA FRAME
r <- 2 # row after which new row is to be inserted
newrow <- c("3 reference", -10.00, -9.00, -11.00) # row to be inserted 
forest_table <- rbind(forest_table[1:r, ], newrow, forest_table[-(1:r), ]) # insert row
r <- 8 # row after which new row is to be inserted
newrow <- c("1 reference", -10.00, -9.00, -11.00) # row to be inserted 
forest_table <- rbind(forest_table[1:r, ], newrow, forest_table[-(1:r), ]) # insert row

# FIX CLASSES IN PREPARED DATA FRAME 
forest_table$or <- as.numeric(forest_table$or)
forest_table$ci_lb <- as.numeric(forest_table$ci_lb)
forest_table$ci_ub <- as.numeric(forest_table$ci_ub)

# ADD DUMMY VARIABLE TO CONTROL ORDER IN PLOT 
forest_table$order <- as.factor(rep(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))) # create dummy variable 
forest_table$order <- factor(forest_table$order, 
                             levels = rev(levels(forest_table$order))) 
# use dummy variable to counteract ggplot2 default of reversing the order of levels in 
# the prepared data frame when plotting  

# PLOT
library(ggplot2)
forestplot <- ggplot(forest_table, aes(or, order)) + 
  geom_point(size = 5, shape = 18, aes(colour = order)) + # data points
  geom_errorbarh(aes(xmax = ci_ub, xmin = ci_lb, colour = order), 
                 height = 0.15) + # error bars
  geom_vline(xintercept = 1, linetype = "longdash") + # reference line at x = 1 (no effect)
  scale_x_continuous(breaks = seq(0, 40000, 10000), 
                     labels = seq(0, 40000, 10000),
                     limits = c(0, 50000)) + # x axis scale and labels
 scale_colour_manual(values = c("blue", "red", "red", "red", "red", "red", "red", 
                                "green", "green", "green")) # manually set one colour per variable

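Best answer

You can include an extra column in the table that indicates the actual term in the model, and assign colours based on that column. Here is an example:

Step 1. Create a data frame from the model coefficients: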

new.table <- data.frame(
  coef = names(coef(model)),
  or = round(exp(coef(model)), 2),
  ci_lb = round(exp(confint(model, level = 0.95)), 2)[, 1],
  ci_ub = round(exp(confint(model, level = 0.95)), 2)[, 2],
  stringsAsFactors = FALSE, row.names = NULL
)

> new.table
         coef           or        ci_lb        ci_ub
1 (Intercept) 1.226831e+11 249767693.18 6.026058e+13
2       gear4 5.396000e+01         0.31 9.403190e+03
3       gear5 2.193800e+02         1.03 4.662360e+04
4       carb2 1.400000e-01         0.00 4.340000e+00
5       carb3 2.000000e-02         0.00 1.280000e+00
6       carb4 0.000000e+00         0.00 2.000000e-01
7       carb6 0.000000e+00         0.00 3.700000e-01
8       carb8 0.000000e+00         0.00 2.100000e-01
9        disp 9.800000e-01         0.96 1.000000e+00
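
The data frame above evaluates `confint()` twice. One possible variation computes the intervals once and reuses them. This is a minimal sketch, assuming the same `mpg ~ gear + carb + disp` model as in the question; the names `d`, `ci` and `coef.table` are introduced here for illustration only and are not part of the original answer:

```r
# Refit the question's model so this block runs on its own
d <- mtcars
d$gear <- as.factor(d$gear)
d$carb <- as.factor(d$carb)
model <- lm(mpg ~ gear + carb + disp, data = d)

# Compute the confidence intervals once (2-column matrix: lower, upper)
ci <- confint(model, level = 0.95)

coef.table <- data.frame(
  coef  = names(coef(model)),
  or    = round(exp(coef(model)), 2),
  ci_lb = round(exp(ci[, 1]), 2),
  ci_ub = round(exp(ci[, 2]), 2),
  stringsAsFactors = FALSE, row.names = NULL
)
coef.table
```

Strictly speaking, exponentiated coefficients are odds ratios only for a logistic model such as `glm(..., family = binomial)`; the column name `or` is kept here to match the question.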

Step 2. Note that in an "lm" model, model$xlevels holds information about the terms that have multiple factor levels.

> model$xlevels
$gear
[1] "3" "4" "5"

$carb
[1] "1" "2" "3" "4" "6" "8"

This can be used to build a lookup data frame of all factor levels:

library(dplyr)
library(data.table)

terms.with.levels <- names(model$xlevels)
df.with.levels <- lapply(terms.with.levels, 
       function(x) data.frame(term = x,
                              coef = paste0(x, model$xlevels[[x]]),
                              stringsAsFactors = FALSE)) %>%
  rbindlist()

> df.with.levels
   term  coef
1: gear gear3
2: gear gear4
3: gear gear5
4: carb carb1
5: carb carb2
6: carb carb3
7: carb carb4
8: carb carb6
9: carb carb8
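
Under R's default treatment contrasts, the reference level of each factor is its first level, which is why `gear3` and `carb1` appear in this lookup table but have no coefficient of their own. The following sketch reads the reference levels straight from `model$xlevels`, assuming the default contrasts have not been changed; `ref.levels` and `ref.coef` are names made up for this example:

```r
d <- mtcars
d$gear <- as.factor(d$gear)
d$carb <- as.factor(d$carb)
model <- lm(mpg ~ gear + carb + disp, data = d)

# First level of each factor = baseline under contr.treatment (the default)
ref.levels <- vapply(model$xlevels, function(lv) lv[1], character(1))

# Coefficient-style names of the reference levels
ref.coef <- paste0(names(ref.levels), ref.levels)

# Reference levels that are missing from the fitted coefficients
setdiff(ref.coef, names(coef(model)))
```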

Step 3. Merge the two data frames. Now all the reference factor levels are present, and we have a column that identifies the term:

new.table <- merge(new.table, df.with.levels, all = TRUE)

> new.table
          coef           or        ci_lb        ci_ub term
1  (Intercept) 1.226831e+11 249767693.18 6.026058e+13 <NA>
2        carb1           NA           NA           NA carb
3        carb2 1.400000e-01         0.00 4.340000e+00 carb
4        carb3 2.000000e-02         0.00 1.280000e+00 carb
5        carb4 0.000000e+00         0.00 2.000000e-01 carb
6        carb6 0.000000e+00         0.00 3.700000e-01 carb
7        carb8 0.000000e+00         0.00 2.100000e-01 carb
8         disp 9.800000e-01         0.96 1.000000e+00 <NA>
9        gear3           NA           NA           NA gear
10       gear4 5.396000e+01         0.31 9.403190e+03 gear
11       gear5 2.193800e+02         1.03 4.662360e+04 gear
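
The reference rows show up with NA values because `all = TRUE` makes `merge()` a full outer join. A toy illustration of that behaviour, using two small hand-made data frames (`left` and `right` are hypothetical names, and the values are copied from the output above):

```r
left  <- data.frame(coef = c("gear4", "gear5"),
                    or = c(53.96, 219.38),
                    stringsAsFactors = FALSE)
right <- data.frame(term = "gear",
                    coef = c("gear3", "gear4", "gear5"),
                    stringsAsFactors = FALSE)

# all = TRUE keeps rows from both sides; cells with no match become NA
merged <- merge(left, right, by = "coef", all = TRUE)
merged
```

Here `gear3` exists only in `right`, so its `or` value is NA after the merge, which is what Step 4 uses to flag reference levels.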

Step 4. Modify the data frame further:

new.table <- new.table %>%

  # drop intercept
  filter(coef != "(Intercept)") %>%

  # indicate whether each row is for a reference level
  mutate(is.reference = is.na(or)) %>%

  # for non-factor term in the model (e.g. disp) which
  # have a single coefficient, term == coef
  mutate(term = ifelse(is.na(term), coef, term)) %>%

  # set reference levels' x values to 0
  # (across() replaces the deprecated mutate_at()/funs() idiom; needs dplyr >= 1.0.0)
  mutate(across(c(or, ci_lb, ci_ub),
                ~ ifelse(is.reference, 0, .x))) %>%

  # order terms according to the model specifications
  mutate(term = factor(term,
                       levels = attr(model$terms, "term.labels")))

> new.table
    coef     or ci_lb    ci_ub term is.reference
1  carb1   0.00  0.00     0.00 carb         TRUE
2  carb2   0.14  0.00     4.34 carb        FALSE
3  carb3   0.02  0.00     1.28 carb        FALSE
4  carb4   0.00  0.00     0.20 carb        FALSE
5  carb6   0.00  0.00     0.37 carb        FALSE
6  carb8   0.00  0.00     0.21 carb        FALSE
7   disp   0.98  0.96     1.00 disp        FALSE
8  gear3   0.00  0.00     0.00 gear         TRUE
9  gear4  53.96  0.31  9403.19 gear        FALSE
10 gear5 219.38  1.03 46623.60 gear        FALSE
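
Since the question asks for a more direct route, Steps 1 to 4 could be wrapped in a single function. The sketch below does this in base R only. `forest_data()` is a hypothetical helper, not part of the original answer, and it assumes a model with at least one factor predictor and no interaction terms:

```r
# Hypothetical helper (not from the original answer): Steps 1-4 in base R
forest_data <- function(model, level = 0.95) {
  ci <- confint(model, level = level)
  est <- data.frame(coef  = names(coef(model)),
                    or    = round(exp(coef(model)), 2),
                    ci_lb = round(exp(ci[, 1]), 2),
                    ci_ub = round(exp(ci[, 2]), 2),
                    stringsAsFactors = FALSE, row.names = NULL)

  # One row per factor level, including the reference levels
  lv <- do.call(rbind, lapply(names(model$xlevels), function(x) {
    data.frame(term = x, coef = paste0(x, model$xlevels[[x]]),
               stringsAsFactors = FALSE)
  }))

  out <- merge(est, lv, by = "coef", all = TRUE)
  out <- out[out$coef != "(Intercept)", ]                  # drop intercept
  out$is.reference <- is.na(out$or)                        # flag reference rows
  out$term <- ifelse(is.na(out$term), out$coef, out$term)  # e.g. disp
  out[out$is.reference, c("or", "ci_lb", "ci_ub")] <- 0    # placeholder values
  out$term <- factor(out$term, levels = attr(terms(model), "term.labels"))
  rownames(out) <- NULL
  out
}

d <- transform(mtcars, gear = factor(gear), carb = factor(carb))
forest_data(lm(mpg ~ gear + carb + disp, data = d))
```

The result has the same columns as `new.table` above, so it can be passed to the plotting code in Step 5 unchanged.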

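Step 5. Create the plot. Reference levels can be hidden by setting their alpha to 0 (i.e. 100% transparency), and the order of the coefficients is controlled by the order of the facets: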

p <- ggplot(new.table,
       aes(x = or, xmin = ci_lb, xmax = ci_ub,
           y = coef, color = term, alpha = !is.reference)) +
  geom_vline(xintercept = 1, linetype = "longdash") +
  geom_errorbarh(height = 0.15) +
  geom_point(size = 5, shape = 18) +
  facet_grid(term~., scales = "free_y", space = "free_y") +
  scale_alpha_identity()
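
The hiding trick relies on `scale_alpha_identity()` using the mapped values directly as alpha. A small self-contained sketch that makes the 0/1 conversion explicit with `as.numeric()` and then inspects the alpha values ggplot2 computes for the point layer; the data frame `toy` and the plot `p.toy` are made up for illustration, with values copied from the table in Step 4:

```r
library(ggplot2)

toy <- data.frame(
  coef  = c("gear3", "gear4", "gear5", "disp"),
  term  = factor(c("gear", "gear", "gear", "disp"),
                 levels = c("gear", "disp")),
  or    = c(0, 53.96, 219.38, 0.98),
  is.reference = c(TRUE, FALSE, FALSE, FALSE)
)

p.toy <- ggplot(toy, aes(x = or, y = coef, colour = term,
                         alpha = as.numeric(!is.reference))) +
  geom_point(size = 5, shape = 18) +
  facet_grid(term ~ ., scales = "free_y", space = "free_y") +
  scale_alpha_identity()

# Alpha actually used per point: 0 for the reference row, 1 otherwise
layer_data(p.toy, 1)$alpha
```

The reference row keeps its place on the y axis but is drawn fully transparent, so `gear3` still appears as a label alongside the other `gear` levels.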

Step 6. Adjust the plot further if needed:

p +
  # specify colour for each term in the model
  scale_color_manual(values = c("gear" = "green",
                                "carb" = "red",
                                "disp" = "blue")) +

  # hide facet labels
  theme(strip.background = element_blank(),
        strip.text = element_blank()) +

  # remove spacing between facets
  theme(panel.spacing = unit(0, "pt"))

Regarding "r - Forest plot in ggplot with reference levels from regression model", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49690498/
