
r - Aggregating string variables with the summarize and across functions

Reposted. Author: 行者123. Updated: 2023-12-05 01:49:22

df_input is the input data frame, and the desired output is df_output.

df_input <- data.frame(id     = c(1,2,3,4,4,5,5,5,6,7,8,9,10),
                       party  = c("A","B","C","D","E","F","G","H","I","J","K","L","M"),
                       winner = c(1,1,1,1,1,1,1,1,1,1,1,1,1))


df_output <- data.frame(id         = c(1,2,3,4,5,6,7,8,9,10),
                        party      = c("A","B","C","D,E","F,G,H","I","J","K","L","M"),
                        winner_sum = c(1,1,1,2,3,1,1,1,1,1))

Previously, this code worked using the summarise_at function, as shown below:

df_output <- df_input %>%
  dplyr::group_by_at(.vars = vars(id)) %>%
  {left_join(
    dplyr::summarise_at(., vars(party), ~ str_c(., collapse = ",")),
    dplyr::summarise_at(., vars(winner), funs(sum))
  )}

But it no longer works, apparently because both summarise_at and funs have been deprecated.

I am trying to reproduce this with across in dplyr (1.0.10), but I get an error. Here is my attempt:

df_output <- df_input %>%
  group_by(id) %>%
  summarise(across(winner, sum, na.rm=T)) %>%
  summarise(across(party, str_c(., collapse = ",")))

My real data has several numeric and character variables, not just one of each as in this example. Many thanks.

Best answer

If we need to apply a different function to each individual column, we don't need across:

library(dplyr)
library(stringr)
df_input %>%
  group_by(id) %>%
  summarise(party = str_c(party, collapse = ","),
            winner_sum = sum(winner))

-output

# A tibble: 10 × 3
      id party winner_sum
   <dbl> <chr>      <dbl>
 1     1 A              1
 2     2 B              1
 3     3 C              1
 4     4 D,E            2
 5     5 F,G,H          3
 6     6 I              1
 7     7 J              1
 8     8 K              1
 9     9 L              1
10    10 M              1
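
As a quick sanity check, the result can be compared column by column with the desired values. This is only a sketch: the `expected` data frame below is typed in by hand for illustration (assuming the same "," separator for every group), and `res` is just a name chosen here for the summarised result.

```r
library(dplyr)
library(stringr)

# Input data from the question
df_input <- data.frame(
  id     = c(1, 2, 3, 4, 4, 5, 5, 5, 6, 7, 8, 9, 10),
  party  = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M"),
  winner = rep(1, 13),
  stringsAsFactors = FALSE
)

# Hand-written expected values (assumption: "," separates every group)
expected <- data.frame(
  id         = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  party      = c("A", "B", "C", "D,E", "F,G,H", "I", "J", "K", "L", "M"),
  winner_sum = c(1, 1, 1, 2, 3, 1, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# One summarise call, one expression per output column
res <- df_input %>%
  group_by(id) %>%
  summarise(party = str_c(party, collapse = ","),
            winner_sum = sum(winner))

# Compare each column with the expected values
identical(res$id, expected$id)
identical(res$party, expected$party)
identical(res$winner_sum, expected$winner_sum)
```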

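If there are multiple "party" and "winner" columns, loop over them with across inside a single summarise. This has to happen in one call because after the first summarise only the summarised columns and the grouping column remain: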

df_input %>%
  group_by(id) %>%
  summarise(across(winner, sum, na.rm=TRUE),
            across(party, ~ str_c(.x, collapse = ",")), .groups = "drop")

-output

# A tibble: 10 × 3
      id winner party
   <dbl>  <dbl> <chr>
 1     1      1 A
 2     2      1 B
 3     3      1 C
 4     4      2 D,E
 5     5      3 F,G,H
 6     6      1 I
 7     7      1 J
 8     8      1 K
 9     9      1 L
10    10      1 M
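
To see why the chained attempt in the question cannot work, it may help to look at what is left after the first summarise. The sketch below uses the question's data; `step1` is simply a name picked here for the intermediate result.

```r
library(dplyr)

# Input data from the question
df_input <- data.frame(
  id     = c(1, 2, 3, 4, 4, 5, 5, 5, 6, 7, 8, 9, 10),
  party  = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M"),
  winner = rep(1, 13),
  stringsAsFactors = FALSE
)

# First step of the chained attempt: only `winner` is summarised
step1 <- df_input %>%
  group_by(id) %>%
  summarise(across(winner, ~ sum(.x, na.rm = TRUE)))

# Columns that survive the first summarise
names(step1)

# `party` was not summarised, so it has been dropped
"party" %in% names(step1)
```

Because `party` is gone at this point, a second summarise has nothing to collapse, which is why both across calls are placed in the same summarise.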

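Note: if the columns share a simple prefix, use starts_with to select all of them, i.e. across(starts_with("party"), ...); if the column names differ, list them with across(c(party, othercol), ...); or, if the function to apply depends on the column type, select by type, e.g. across(where(is.numeric), sum, na.rm = TRUE):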

df_input %>%
  group_by(id) %>%
  summarise(across(where(is.numeric), sum, na.rm = TRUE),
            across(where(is.character), str_c, collapse = ","),
            .groups = 'drop')
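
For the case with several numeric and character columns, a small sketch along the same lines is shown here. The data frame `df_multi` and its extra columns `region` and `votes` are made up for illustration and are not part of the question. The extra arguments are written inside lambdas rather than passed through across's `...`, since passing them through `...` works in dplyr 1.0.10 but is deprecated as of dplyr 1.1.0.

```r
library(dplyr)
library(stringr)

# Hypothetical data with two character and two numeric columns
df_multi <- data.frame(
  id     = c(1, 2, 2, 3, 3, 3),
  party  = c("A", "B", "C", "D", "E", "F"),
  region = c("N", "S", "S", "E", "W", "W"),  # made-up column
  winner = c(1, 1, 1, 1, 1, 1),
  votes  = c(10, 20, NA, 5, 5, 5),           # made-up column with a missing value
  stringsAsFactors = FALSE
)

res <- df_multi %>%
  group_by(id) %>%
  summarise(
    # Sum every numeric column, ignoring NA (the grouping column is skipped)
    across(where(is.numeric), ~ sum(.x, na.rm = TRUE)),
    # Collapse every character column into one comma-separated string
    across(where(is.character), ~ str_c(.x, collapse = ",")),
    .groups = "drop"
  )

res
```

The grouping column `id` is numeric too, but across leaves grouping columns out, so it is not summed.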

Regarding "r - Aggregating string variables with the summarize and across functions", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/74165416/
