r - Count occurrences across multiple columns and group by year

Reposted. Author: 行者123. Updated: 2023-12-02 17:57:03

I have a movie dataset with a year column and three genre columns.

Here is an example:

genre_structure<-structure(
list(
year = c(
"2008",
"2003",
"2010",
"2001",
"2002",
"1999",
"1980",
"2020",
"1977",
"1991",
"1954",
"2022",
"1962",
"2000",
"1994",
"2019",
"2019",
"1981",
"2012",
"2003"
),
genre1 = c(
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action",
"Action"
),
genre2 = c(
"Crime",
"Adventure",
"Adventure",
"Adventure",
"Adventure",
"SciFi",
"Adventure",
"Drama",
"Adventure",
"SciFi",
"Drama",
"Drama",
"Drama",
"Adventure",
"Crime",
"Adventure",
"Adventure",
"Adventure",
"Drama",
"Drama"
),
genre3 = c(
"Drama",
"Drama",
"SciFi",
"Drama",
"Drama",
"",
"Fantasy",
"",
"Fantasy",
"",
"",
"Mystery",
"Mystery",
"Drama",
"Drama",
"Crime",
"Drama",
"",
"",
"Mystery"
)
),
row.names = c(NA,-20L),
class = "data.frame"
)

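I am trying to count, for each year, how often each genre occurs across all three genre columns. The expected result looks like this (illustrative):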

genre  | year | count
Action | 2008 | 1
Comedy | 2008 | 3
Drama  | 2008 | 4
...

I tried:

genre_years_test<-genre_structure %>% 
group_by(genre1, genre2, genre3, year) %>%
summarise(total=n(), .groups = "drop")

But the year is repeated whenever a new genre appears in that year.
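The repetition probably comes from grouping by all three genre columns at once: each distinct (genre1, genre2, genre3, year) combination becomes its own group, so a year shows up once per combination rather than once per genre. A minimal sketch of that effect, using only the two 2003 rows from the example data (the names `movies_2003` and `by_combo` are made up for illustration):

```r
library(dplyr)

# The two 2003 films from the example data (rows 2 and 20 of genre_structure)
movies_2003 <- data.frame(
  year   = c("2003", "2003"),
  genre1 = c("Action", "Action"),
  genre2 = c("Adventure", "Drama"),
  genre3 = c("Drama", "Mystery")
)

# Same grouping as in the attempt: one group per genre combination and year
by_combo <- movies_2003 %>%
  group_by(genre1, genre2, genre3, year) %>%
  summarise(total = n(), .groups = "drop")

# 2003 appears in two rows, each with total = 1, because the two films
# have different genre combinations
print(by_combo)
```

Under this grouping, "Drama" is never counted twice for 2003 even though it occurs in both films, because it sits in a different column in each row.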

Best answer

We can reshape the data to "long" format and then get the counts:

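library(dplyr)
library(tidyr)

# Stack genre1, genre2 and genre3 into a single 'genre' column
# (one row per film and genre slot), then count rows per year/genre pair.
# Note: empty strings in genre3 are counted as a genre of their own here.
genre_structure %>%
  pivot_longer(cols = -year, values_to = 'genre') %>%
  count(year, genre, name = 'count')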
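Because some films in the example have an empty string in genre3, the reshaped data contains blank genre values that get counted like any other genre. If that is not wanted, one option is to drop them before counting. The following is a sketch under that assumption, run on four rows taken from the example data (rows 1, 2, 6 and 20) so that it is self-contained; `movies`, `slot` and `genre_counts` are illustrative names, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Four rows from the example data (rows 1, 2, 6 and 20 of genre_structure)
movies <- data.frame(
  year   = c("2008", "2003", "1999", "2003"),
  genre1 = c("Action", "Action", "Action", "Action"),
  genre2 = c("Crime", "Adventure", "SciFi", "Drama"),
  genre3 = c("Drama", "Drama", "", "Mystery")
)

genre_counts <- movies %>%
  # One row per film and genre slot; 'slot' records the source column name
  pivot_longer(cols = -year, names_to = "slot", values_to = "genre") %>%
  # Drop the blank entries left by films with fewer than three genres
  filter(genre != "") %>%
  # Count occurrences of each genre within each year
  count(year, genre, name = "count")

print(genre_counts)
```

Here "Drama" in 2003 gets a count of 2, since it is picked up from genre3 of one film and genre2 of the other. If the missing genres were stored as NA instead of "", passing `values_drop_na = TRUE` to `pivot_longer()` would serve the same purpose as the filter.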

Regarding "r - Count occurrences across multiple columns and group by year", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/75366193/
