
r - dplyr: using pivot_longer and pivot_wider on a subset of variables

Reposted. Author: 行者123. Updated: 2023-12-03 20:26:08

Is there a way to use pivot_longer and pivot_wider on a subset of variables? Here is an example. First, I'll create a data frame with the desired starting structure.

library(tidyverse)

# Assume this as starting df
arrests <- USArrests %>%
  as_tibble(rownames = "State") %>%
  pivot_longer(-State, names_to = "Crime", values_to = "Value") %>%
  group_by(State) %>%
  mutate(Total = sum(Value)) %>%
  ungroup()

arrests
# A tibble: 200 x 4
State Crime Value Total
<chr> <chr> <dbl> <dbl>
1 Alabama Murder 13.2 328.
2 Alabama Assault 236 328.
3 Alabama UrbanPop 58 328.
4 Alabama Rape 21.2 328.
5 Alaska Murder 10 366.
6 Alaska Assault 263 366.
7 Alaska UrbanPop 48 366.
8 Alaska Rape 44.5 366.
9 Arizona Murder 8.1 413.
10 Arizona Assault 294 413.
# ... with 190 more rows

So we start from the arrests data frame. Now I want to fold "Total" into "Crime", so that "Total" becomes one of the values of Crime, just like "Murder".

I'd also like to go the other way. Once "Total" has been folded into "Crime", I want to use pivot_wider on "Crime", but only for the rows where Crime == "Total".

Are these operations possible?

Best answer

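One option is add_row. Split the data by "State" with group_split, loop over the resulting list with map_dfr, add a row to each piece (add_row from tibble) that holds the first value of the "Total" column, and then drop the "Total" column.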

library(dplyr)
library(purrr)
library(tibble)
arrests2 <- arrests %>%
  group_split(State) %>%
  map_dfr(~ .x %>%
            add_row(State = .$State[1], Crime = 'Total',
                    Value = .$Total[1]) %>%
            select(-Total))
arrests2
# A tibble: 250 x 3
# State Crime Value
# * <chr> <chr> <dbl>
# 1 Alabama Murder 13.2
# 2 Alabama Assault 236
# 3 Alabama UrbanPop 58
# 4 Alabama Rape 21.2
# 5 Alabama Total 328.
# 6 Alaska Murder 10
# 7 Alaska Assault 263
# 8 Alaska UrbanPop 48
# 9 Alaska Rape 44.5
#10 Alaska Total 366.
# … with 240 more rows

Another option is to summarise each state down to its "Total" value and then append those rows with bind_rows.
arrests %>%
  group_by(State) %>%
  summarise(Crime = 'Total', Value = first(Total)) %>%
  bind_rows(arrests %>% select(-Total), .) %>%
  arrange(State)

Or use pivot_longer:
library(tidyr)
arrests %>%
  pivot_longer(cols = Value:Total) %>%
  mutate(Crime = replace(Crime, name == 'Total', 'Total')) %>%
  select(-name) %>%
  distinct()
# A tibble: 250 x 3
# State Crime value
# <chr> <chr> <dbl>
# 1 Alabama Murder 13.2
# 2 Alabama Total 328.
# 3 Alabama Assault 236
# 4 Alabama UrbanPop 58
# 5 Alabama Rape 21.2
# 6 Alaska Murder 10
# 7 Alaska Total 366.
# 8 Alaska Assault 263
# 9 Alaska UrbanPop 48
#10 Alaska Rape 44.5
# … with 240 more rows
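
To make the pivot_longer route easier to follow, here is a minimal sketch that runs the same steps on a tiny table and keeps each intermediate result. The `toy` tibble and the names `stacked`, `relabelled` and `result` are hypothetical and exist only for illustration. They are not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical two-state table with the same columns as `arrests`
toy <- tribble(
  ~State, ~Crime,    ~Value, ~Total,
  "A",    "Murder",       1,      4,
  "A",    "Assault",      3,      4,
  "B",    "Murder",       2,      7,
  "B",    "Assault",      5,      7
)

# Step 1: stack Value and Total, so every input row becomes two rows
# (pivot_longer's default output columns are `name` and `value`)
stacked <- toy %>%
  pivot_longer(cols = Value:Total)

# Step 2: relabel the rows that came from Total, then drop the helper column
relabelled <- stacked %>%
  mutate(Crime = replace(Crime, name == "Total", "Total")) %>%
  select(-name)

# Step 3: each state now carries one identical "Total" row per original crime;
# distinct() keeps a single copy of each
result <- distinct(relabelled)

result
```

Note that distinct() would also collapse any rows that were already exact duplicates in the input, so this route assumes each State/Crime pair appears only once.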

To do the reverse, group by "State", create the "Total" column by extracting the "Value" of the row where Crime is "Total", and then filter out the rows where Crime is "Total".
arrests2 %>%
  group_by(State) %>%
  mutate(Total = Value[Crime == 'Total']) %>%
  filter(Crime != 'Total')
# A tibble: 200 x 4
# Groups: State [50]
# State Crime Value Total
# <chr> <chr> <dbl> <dbl>
# 1 Alabama Murder 13.2 328.
# 2 Alabama Assault 236 328.
# 3 Alabama UrbanPop 58 328.
# 4 Alabama Rape 21.2 328.
# 5 Alaska Murder 10 366.
# 6 Alaska Assault 263 366.
# 7 Alaska UrbanPop 48 366.
# 8 Alaska Rape 44.5 366.
# 9 Arizona Murder 8.1 413.
#10 Arizona Assault 294 413.
# … with 190 more rows
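
The question asked specifically about using pivot_wider on only the "Total" rows. One possible sketch, which is not from the original answer, is to filter those rows, widen them, and join the resulting column back onto the remaining rows. The names `long`, `totals` and `restored` are hypothetical, and `long` is rebuilt here from USArrests only so that the example runs on its own. It is assumed to have the same shape as `arrests2`.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Assumed stand-in for `arrests2`: a long table where "Total" is a Crime value
long <- USArrests %>%
  as_tibble(rownames = "State") %>%
  mutate(Total = Murder + Assault + UrbanPop + Rape) %>%
  pivot_longer(-State, names_to = "Crime", values_to = "Value")

# Widen only the "Total" rows: one row per State with a `Total` column
totals <- long %>%
  filter(Crime == "Total") %>%
  pivot_wider(names_from = Crime, values_from = Value)

# Attach that column to the rows that were not widened
restored <- long %>%
  filter(Crime != "Total") %>%
  left_join(totals, by = "State")

restored
```

Unlike the group_by/mutate version, this result is not grouped, so no ungroup() is needed before further processing.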

For "r - dplyr: using pivot_longer and pivot_wider on a subset of variables", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60895646/
