r - Summary table (mean + standard error) with p-values from a 2-way ANOVA

Reposted by: 行者123  Updated: 2023-12-04 03:36:16

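I'm trying to build a table of summary statistics for a large study that we usually analyze with a 2-way ANOVA, looking at the main effects of two variables and their interaction.
I'd like a quick way to run the statistics and output them in an easy-to-read format; nicely formatted output would be a bonus.
I've been able to get the 2-way ANOVA output, and I've also used the gtsummary package and tbl_summary to make a table. However, I can't figure out how to group by more than one variable. My workaround was to create a new variable that combines the two independent variables, just to split the data into the right groups.
Reproducible example below.
I'd like to know whether there is a way to make the table with the mean (SEM) output I already have, but with the results of my 2-way ANOVA (also pasted below). In this Titanic example, I'd want one column with the p-value for the main effect of "Sex", the next column with the p-value for the main effect of "Embarked", and then the p-value for the interaction.
Any ideas?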

library(titanic)
library(tidyverse)
library(gtsummary)
library(plotrix) #has a std.error function


##I really want to look at a 2-way anova, looking for the p-value for Sex, Embarked, and their interaction.
#This code just allows me to make a table with the 4 columns I want, but of course it now won't do the correct stats.
df <- titanic_train %>%
  filter(Embarked != "C" &  Embarked != "") %>%
  mutate(grp = paste(Sex, Embarked)) # add a new column that combines Sex & Embarked

#code to make my table 
  
table1 <- df %>%  
  select(grp, Age, Fare, Survived) %>%
  tbl_summary(
    by = grp,  ##can't figure out a way to put 2 variables here (Sex & Embarked)
    missing = "ifany", 
    statistic = all_continuous() ~ "{mean} ({std.error})",
    digits = all_continuous() ~ 1) %>% #this puts 1 decimal place for all values
   modify_header(stat_by = md("**{level}**<br>N =  {n}")) %>%
  bold_labels() %>%
  modify_spanning_header(all_stat_cols() ~ "**These are the Columns I Want**") %>%
  add_p(test = everything() ~ "aov",  ##This is a 1-way ANOVA, but I need 2 variables
  )

table1

#these are the p-values I want in my table:
two_way_anova_age <- aov(Age ~ Sex * Embarked, data = df)
summary(two_way_anova_age)

two_way_anova_fare <- aov(Fare ~ Sex * Embarked, data = df)
summary(two_way_anova_fare)

two_way_anova_surv <- aov(Survived ~ Sex * Embarked, data = df)
summary(two_way_anova_surv)
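
As a cross-check on the mean (SEM) cells the table is meant to show, the per-group statistics can be computed outside gtsummary. The following is a minimal base R sketch, not part of the original post: `sem` and `cell_stats` are hypothetical names, and the built-in `warpbreaks` data (two factors, `wool` and `tension`) stands in for the Titanic data so the example runs without extra packages.

```r
# Standard error of the mean, ignoring missing values
sem <- function(x) {
  x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

# Mean and SEM of `breaks` for every wool x tension combination.
# The formula interface groups by two variables at once.
cell_stats <- aggregate(
  breaks ~ wool + tension,
  data = warpbreaks,
  FUN = function(x) c(mean = mean(x), sem = sem(x))
)
cell_stats
```

With the question's data, the analogous call would be `aggregate(Age ~ Sex + Embarked, data = df, FUN = ...)`; the formula interface drops rows with a missing `Age` by default.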

Best answer

Here is how to merge the results into a gtsummary table.

library(gtsummary)
library(titanic)
library(tidyverse)
library(plotrix) #has a std.error function

packageVersion("gtsummary")
#> [1] '1.4.0'

# create smaller version of the dataset
df <- 
  titanic_train %>%
  select(Sex, Embarked, Age, Fare) %>%
  filter(Embarked != "") # deleting empty Embarked status

# first, write a small function that returns the 2-way ANOVA p-values as a one-row tibble
twoway_p <- function(variable) {
  paste(variable, "~ Sex * Embarked") %>%
    as.formula() %>%
    aov(data = df) %>% 
    broom::tidy() %>%
    select(term, p.value) %>%
    filter(complete.cases(.)) %>%
    pivot_wider(names_from = term, values_from = p.value) %>%
    mutate(
      variable = .env$variable,
      row_type = "label"
    )
}

# add all results to a single table (will be merged with gtsummary table in next step)
twoway_results <-
  bind_rows(
    twoway_p("Age"),
    twoway_p("Fare")
  )
twoway_results
#> # A tibble: 2 x 5
#>            Sex Embarked `Sex:Embarked` variable row_type
#>          <dbl>    <dbl>          <dbl> <chr>    <chr>   
#> 1 0.00823      3.97e- 1         0.611  Age      label   
#> 2 0.0000000191 4.27e-16         0.0958 Fare     label


tbl <-
  # first build a stratified `tbl_summary()` table to get summary stats by two variables
  df %>%
  tbl_strata(
    strata =  Sex,
    .tbl_fun =
      ~.x %>%
      tbl_summary(
        by = Embarked,
        missing = "no",
        statistic = all_continuous() ~ "{mean} ({std.error})",
        digits = everything() ~ 1
      ) %>%
      modify_header(all_stat_cols() ~ "**{level}**")
  ) %>%
  # merge the 2way ANOVA results into tbl_summary table
  modify_table_body(
    ~.x %>%
      left_join(
        twoway_results,
        by = c("variable", "row_type")
      )
  ) %>%
  # by default the new columns are hidden, add a header to unhide them
  modify_header(list(
    Sex ~ "**Sex**", 
    Embarked ~ "**Embarked**", 
    `Sex:Embarked` ~ "**Sex * Embarked**"
  )) %>%
  # adding spanning header to analysis results
  modify_spanning_header(c(Sex, Embarked, `Sex:Embarked`) ~ "**Two-way ANOVA p-values**") %>%
  # format the p-values with a pvalue formatting function
  modify_fmt_fun(c(Sex, Embarked, `Sex:Embarked`) ~ style_pvalue) %>%
  # update the footnote to be nicer looking
  modify_footnote(all_stat_cols() ~ "Mean (SE)")

Created on 2021-03-27 by the reprex package (v1.0.0)
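
The `twoway_p()` helper in the answer reads `df` from the global environment and depends on broom and tidyr. As a hedged alternative, the sketch below does the same extraction in base R and takes the data and factor names as arguments. `twoway_p_base` is a hypothetical name, not part of the answer, and the built-in `warpbreaks` data is used as a stand-in so the block runs on its own.

```r
# Hypothetical helper: one row of two-way ANOVA p-values for `variable`,
# shaped like the rows of `twoway_results` in the answer.
twoway_p_base <- function(data, variable, factors = c("Sex", "Embarked")) {
  # Build a formula such as Age ~ Sex * Embarked
  fml <- reformulate(paste(factors, collapse = " * "), response = variable)
  tab <- summary(aov(fml, data = data))[[1]]

  # Row names are padded with trailing spaces; drop the Residuals row,
  # which has no p-value
  terms <- trimws(rownames(tab))
  keep <- terms != "Residuals"
  p <- setNames(tab[["Pr(>F)"]][keep], terms[keep])

  # Going through a one-row matrix keeps names such as "wool:tension" intact
  out <- as.data.frame(t(p))
  out$variable <- variable
  out$row_type <- "label"
  out
}

res <- twoway_p_base(warpbreaks, "breaks", factors = c("wool", "tension"))
res
```

With the answer's data, `do.call(rbind, lapply(c("Age", "Fare"), function(v) twoway_p_base(df, v)))` should give a data frame that can be joined in `modify_table_body()` in the same way as `twoway_results`.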

Regarding "r - Summary table (mean + standard error) with p-values from a 2-way ANOVA", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66835663/
