
r - Sorting tabyl output with Ns and proportions

Reposted. Author: 行者123. Updated: 2023-12-02 02:03:04

My first question!

I am trying to sort the result of a tabyl call from the janitor package. I can't figure out how to sort the counts that adorn_ns() appends.

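Using tabyl, I managed to create a table with frequencies, proportions, and totals using the code below. What I want is to sort the table in descending order of the frequencies in the "Total" column. Ultimately, I want to pass the table to knitr's kable() for reporting.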

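After calling arrange on the tabyl, adorn_ns() pastes the Ns in their incorrect, original positions rather than the sorted ones. This has already been noted on GitHub and (as far as I can tell) happens because the tabyl's 'core' is not changed when the tabyl is sorted. See: https://github.com/sfirke/janitor/issues/352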

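A comment on the GitHub issue notes that this is not a critical problem, because custom Ns can be supplied to the adorn_ns() call and the sorting can be done there as well. Unfortunately, I can't figure out how to supply these custom Ns.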

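Alternatively, I have considered using a factor to change the order, but I am hoping for a more robust solution: the variable has many categories in my real data, and I would like to be able to apply this (or an alternative) method to table rendering for other variables in the future without having to laboriously type out the levels in frequency order.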

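So I would be very grateful for any help with custom Ns, alternative sorting methods or, if necessary, alternative tabulation methods.

Here is some toy data and where I got stuck.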

library(dplyr)
library(janitor)

# some toy data
var1 <- c("aaa", "bbb", "ccc", "ccc", "ddd", "ddd", "ddd", "ddd", "aaa", "ddd", "ddd", "bbb", "bbb", "ddd")
sex <- c("f", "f", "m", "f", "m", "m", "f", "f", "m", "m", "f", "m", "f", "f")
df <- data.frame(var1,sex)


# First a tabyl with proportions, Ns and totals
tabyl(df, var1, sex) %>%
  adorn_totals(where = c("col", "row")) %>%
  adorn_percentages("col") %>%
  adorn_pct_formatting(digits = 0) %>%
  adorn_ns(position = "front")

# Results in (as expected)

|var1 |f |m |Total |
|:-----|:--------|:--------|:---------|
|aaa |1 (12%) |1 (17%) |2 (14%) |
|bbb |2 (25%) |1 (17%) |3 (21%) |
|ccc |1 (12%) |1 (17%) |2 (14%) |
|ddd |4 (50%) |3 (50%) |7 (50%) |
|Total |8 (100%) |6 (100%) |14 (100%) |

What I want to achieve:

# descending order of frequency
|var1 |f |m |Total |
|:-----|:--------|:--------|:---------|
|ddd |4 (50%) |3 (50%) |7 (50%) |
|bbb |2 (25%) |1 (17%) |3 (21%) |
|aaa |1 (12%) |1 (17%) |2 (14%) |
|ccc |1 (12%) |1 (17%) |2 (14%) |
|Total |8 (100%) |6 (100%) |14 (100%) |

What I have tried:

# Order by the Total column in descending frequency

df %>% tabyl(var1, sex) %>%
  adorn_totals(where = "col") %>% # split col and row totals
  arrange(desc(Total)) %>%
  adorn_totals(where = "row") %>% # prevents the total row from appearing at the top
  adorn_percentages("col") %>%
  adorn_pct_formatting(digits = 0) %>%
  adorn_ns(position = "front")

# Results in (not what I expected)
|var1 |f |m |Total |
|:-----|:--------|:--------|:---------|
|ddd |1 (50%) |1 (50%) |2 (50%) |
|bbb |2 (25%) |1 (17%) |3 (21%) |
|aaa |1 (12%) |1 (17%) |2 (14%) |
|ccc |4 (12%) |3 (17%) |7 (14%) |
|Total |8 (100%) |6 (100%) |14 (100%) |

# The categories have changed order, the N's have not (are in original position in table),
# and the % have been recalculated...
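
The mismatch above can be inspected directly. This is a minimal sketch, assuming the janitor behaviour described in the linked GitHub issue, that compares the row order of the sorted tabyl with the row order of its "core" attribute, which is where adorn_ns() takes its Ns from by default. The object names sorted_counts and core_counts are made up for this illustration.

```r
library(dplyr)
library(janitor)

# same toy data as in the question
var1 <- c("aaa", "bbb", "ccc", "ccc", "ddd", "ddd", "ddd", "ddd", "aaa", "ddd", "ddd", "bbb", "bbb", "ddd")
sex <- c("f", "f", "m", "f", "m", "m", "f", "f", "m", "m", "f", "m", "f", "f")
df <- data.frame(var1, sex)

# sort the counts by the Total column, as in the attempt above
sorted_counts <- df %>%
  tabyl(var1, sex) %>%
  adorn_totals(where = "col") %>%
  arrange(desc(Total))

# adorn_ns() uses this attribute unless its `ns` argument is supplied
core_counts <- attr(sorted_counts, "core")

sorted_counts$var1  # row order after arrange()
core_counts$var1    # row order stored in the "core" attribute
```

If the two orders differ, adorn_ns() pastes counts from one category next to percentages from another, which is the pattern in the table above.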

Best answer

Update at the OP's request (see the comments):

This is not that elegant, but it gives you the output you want:

df1 <- df %>% tabyl(var1, sex) %>%
  adorn_totals(where = "col") %>% # split col and row totals
  adorn_totals(where = "row") %>%
  adorn_percentages("col") %>%
  adorn_pct_formatting(digits = 0) %>%
  adorn_ns(position = "front") %>%
  arrange(desc(Total)) # sorts the formatted Total column

df2 <- df1[1, ]  # first row after sorting
df3 <- df1[-1, ] # all remaining rows

# move that first row to the bottom of the table
bind_rows(df3, df2)

Output:

  var1        f        m     Total
   ddd  4 (50%)  3 (50%)   7 (50%)
   bbb  2 (25%)  1 (17%)   3 (21%)
   aaa  1 (12%)  1 (17%)   2 (14%)
   ccc  1 (12%)  1 (17%)   2 (14%)
 Total 8 (100%) 6 (100%) 14 (100%)

First answer: use sort = TRUE (as the output that follows shows, the rows stay in their original order)

df %>% tabyl(var1, sex, sort = TRUE) %>%
  adorn_totals(where = "col") %>% # split col and row totals
  # arrange(desc(Total)) %>%
  adorn_totals(where = "row") %>% # prevents the total row from appearing at the top
  adorn_percentages("col") %>%
  adorn_pct_formatting(digits = 0) %>%
  adorn_ns(position = "front")

Output:

  var1        f        m     Total
   aaa  1 (12%)  1 (17%)   2 (14%)
   bbb  2 (25%)  1 (17%)   3 (21%)
   ccc  1 (12%)  1 (17%)   2 (14%)
   ddd  4 (50%)  3 (50%)   7 (50%)
 Total 8 (100%) 6 (100%) 14 (100%)
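
Another option, not part of the original answers, is to fix the row order before tabyl() is called, so that the displayed rows and the counts stored by tabyl are in the same order from the start. The sketch below derives the factor levels from the data with dplyr::count(sort = TRUE), so no levels have to be typed by hand. It assumes that tabyl() lists rows in factor-level order; level_order and sorted_tab are illustrative names.

```r
library(dplyr)
library(janitor)

# same toy data as in the question
var1 <- c("aaa", "bbb", "ccc", "ccc", "ddd", "ddd", "ddd", "ddd", "aaa", "ddd", "ddd", "bbb", "bbb", "ddd")
sex <- c("f", "f", "m", "f", "m", "m", "f", "f", "m", "m", "f", "m", "f", "f")
df <- data.frame(var1, sex)

# category names in descending order of frequency, computed from the data
level_order <- df %>%
  count(var1, sort = TRUE) %>%
  pull(var1) %>%
  as.character()

# turn var1 into a factor with those levels, then build the table as in the question
sorted_tab <- df %>%
  mutate(var1 = factor(var1, levels = level_order)) %>%
  tabyl(var1, sex) %>%
  adorn_totals(where = c("col", "row")) %>%
  adorn_percentages("col") %>%
  adorn_pct_formatting(digits = 0) %>%
  adorn_ns(position = "front")

sorted_tab
```

Because nothing is re-sorted after the adorn_* calls, this avoids sorting the formatted text in the Total column. The same two steps can be reused for another variable by swapping the column name, and sorted_tab can then be passed to knitr::kable().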

Regarding "r - Sorting tabyl output with Ns and proportions", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68750035/
