Reshaping a table in R with dplyr

Reposted. Author: 行者123. Updated: 2023-12-04 02:00:45
I would welcome some advice on how to apply dplyr correctly in R. We have the following data:

  City         Amount  Category
1 Los Angeles     100  Film
2 Los Angeles     200  Film
3 Los Angeles     400  Music
4 Seattle         300  Coffee
5 Boston          600  Books
...

The final result should look like this:

                  Film   Coffee   Books   ...
City
Los Angeles, CA   Sum    Sum      Sum     Sum
Seattle, WA       Sum    Sum      Sum     Sum
Boston, MA        Sum    Sum      Sum     Sum

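I want a pivot table that sums the total "Amount" for each category within each city, with the cities listed in a column down the left and all the categories across the top as a header row.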

What I tried:

data %>%
  group_by(City, Category) %>%
  summarise(Amount = sum(Amount))

which gives something more like:

  City         Amount  Category
1 Los Angeles     300  Film
3 Los Angeles     400  Music
4 Seattle         300  Coffee
5 Boston          600  Books

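The calculation is correct, but as described above, we need City and Category laid out as a matrix, with each cell holding the sum of Amount for that city and category.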

Thanks for your help!

Best answer

What you are looking for is tidyr::spread, which reshapes your data.frame from long format to wide format:

library(tidyverse)

# recreate the data
data <- tribble(
  ~City,         ~Amount, ~Category,
  "Los Angeles",     100, "Film",
  "Los Angeles",     200, "Film",
  "Los Angeles",     400, "Music",
  "Seattle",         300, "Coffee",
  "Boston",          600, "Books"
)

# use your code to get the summed data in long format
data_long <- data %>%
  group_by(City, Category) %>%
  summarise(Amount = sum(Amount))

data_long
#> # A tibble: 4 x 3
#> # Groups:   City [?]
#>   City        Category Amount
#>   <chr>       <chr>     <dbl>
#> 1 Boston      Books       600
#> 2 Los Angeles Film        300
#> 3 Los Angeles Music       400
#> 4 Seattle     Coffee      300

# spread to wide format using the tidyr package (part of the tidyverse)
data_wide <- spread(data_long, key = "Category", value = "Amount", fill = 0)

data_wide
#> # A tibble: 3 x 5
#> # Groups:   City [3]
#>   City        Books Coffee  Film Music
#> * <chr>       <dbl>  <dbl> <dbl> <dbl>
#> 1 Boston        600      0     0     0
#> 2 Los Angeles     0      0   300   400
#> 3 Seattle         0    300     0     0
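
In newer tidyr releases, spread is marked as superseded in favour of pivot_wider. As a rough sketch, assuming tidyr 1.1.0 or later (where values_fn accepts a single function and values_fill a single value), the summing and the reshaping can be done in one step straight from the raw data. The name data_wide2 is only illustrative and not part of the original answer:

```r
library(dplyr)
library(tidyr)

# same example data as in the answer above
data <- tribble(
  ~City,         ~Amount, ~Category,
  "Los Angeles",     100, "Film",
  "Los Angeles",     200, "Film",
  "Los Angeles",     400, "Music",
  "Seattle",         300, "Coffee",
  "Boston",          600, "Books"
)

# one column per Category; duplicate City/Category pairs are summed,
# and combinations that never occur are filled with 0
data_wide2 <- data %>%
  pivot_wider(
    names_from  = Category,
    values_from = Amount,
    values_fn   = sum,
    values_fill = 0
  )

data_wide2
```

Unlike spread, pivot_wider keeps the category columns in order of first appearance rather than sorting them alphabetically, and because no group_by is involved the result is not a grouped tibble.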

Going to a matrix

# drop the grouping and the City column, then use City as row names
mat <- as.matrix(data_wide %>% ungroup() %>% select(-City))
rownames(mat) <- data_wide$City

mat
#>             Books Coffee Film Music
#> Boston        600      0    0     0
#> Los Angeles     0      0  300   400
#> Seattle         0    300    0     0

str(mat)
#> num [1:3, 1:4] 600 0 0 0 0 300 0 300 0 0 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:3] "Boston" "Los Angeles" "Seattle"
#> ..$ : chr [1:4] "Books" "Coffee" "Film" "Music"
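
If a matrix is the only goal, base R can produce one without the tidyverse. The following is a minimal sketch rather than part of the original answer: it rebuilds the same example data as a plain data.frame, and the names tab and mat2 are made up for illustration. xtabs sums the left-hand side of the formula over each City/Category combination, and tapply with default = 0 (available from R 3.4.0) returns an ordinary numeric matrix:

```r
# same example data, as a base R data.frame
data <- data.frame(
  City     = c("Los Angeles", "Los Angeles", "Los Angeles", "Seattle", "Boston"),
  Amount   = c(100, 200, 400, 300, 600),
  Category = c("Film", "Film", "Music", "Coffee", "Books")
)

# contingency table of summed Amount: cities as rows, categories as columns
tab <- xtabs(Amount ~ City + Category, data = data)
tab

# plain numeric matrix; City/Category combinations with no rows get 0
mat2 <- tapply(data$Amount, list(data$City, data$Category), sum, default = 0)
mat2
```

Both results can be indexed by name, for example tab["Los Angeles", "Film"] or mat2["Seattle", "Coffee"]. The xtabs result carries the classes "xtabs" and "table", so the tapply version may be the simpler choice when a bare matrix is needed.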

This question about reshaping a table in R with dplyr corresponds to a similar question found on Stack Overflow: https://stackoverflow.com/questions/47462199/
