
r - How to group data by more than two factors in R

Reposted. Author: 行者123. Updated: 2023-12-04 10:55:52

I have a dataset like the one shown below.
The actual dataset has 8619 rows.

Athlete           Competing Country  Year  Total Medals
Michael Phelps    United States      2012  6
Alicia Coutts     Australia          2012  5
Missy Franklin    United States      2012  5
Brian Leetch      United States      2002  1
Mario Lemieux     Canada             2002  1
Ylva Lindberg     Sweden             2002  1
Eric Lindros      Canada             2002  1
Ulrica Lindström  Sweden             2002  1
Shelley Looney    United States      2002  1

I want to rearrange this data by country, year, and the sum of medals.

I want a result like this:
Country        Year  SumOfMedals
United States  2012  11
United States  2002  2
...

by(newmd$Total.Medals, newmd$Year, FUN=sum)
by(md$Total.Medals, md$Competing.Country, FUN=sum)

I tried using by(), which only groups by one factor at a time here, but I am still stuck.
Can anyone help me?

Best answer

Or use data.table: we convert the 'data.frame' to a 'data.table' (setDT(df1)), group by 'Competing_Country' and 'Year', get the sum of 'Total_Medals', and then order by the variables of interest.

library(data.table)
# setDT() converts df1 to a data.table by reference (no copy is made)
setDT(df1)[, list(SumOfMedals = sum(Total_Medals)),
           by = .(Competing_Country, Year)
           ][order(-Competing_Country, -Year, -SumOfMedals)]

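Or with dplyr, using the same approach.
library(dplyr)
df1 %>%
  # one group per country/year combination
  group_by(Competing_Country, Year) %>%
  # summarise() collapses each group to one row; summary() would not work here
  summarise(SumOfMedals = sum(Total_Medals)) %>%
  arrange(desc(Competing_Country), desc(Year), desc(SumOfMedals))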
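
If neither package is available, the same grouping can be sketched in base R with aggregate(). This is not part of the original answer, only a minimal sketch of the same idea. It rebuilds the sample data without the Athlete column (which the sums do not need), and the name res is just an illustrative choice.

```r
# Sample values copied from the question; Athlete is omitted because
# it plays no role in the grouped sums.
df1 <- data.frame(
  Competing_Country = c("United States", "Australia", "United States",
                        "United States", "Canada", "Sweden",
                        "Canada", "Sweden", "United States"),
  Year = c(2012L, 2012L, 2012L, 2002L, 2002L, 2002L, 2002L, 2002L, 2002L),
  Total_Medals = c(6L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Sum Total_Medals within each Competing_Country/Year combination.
res <- aggregate(Total_Medals ~ Competing_Country + Year,
                 data = df1, FUN = sum)
names(res)[names(res) == "Total_Medals"] <- "SumOfMedals"

# Sort all three columns in descending order, as in the answers above.
res <- res[order(res$Competing_Country, res$Year, res$SumOfMedals,
                 decreasing = TRUE), ]
rownames(res) <- NULL
print(res)
```

The formula interface puts the column to be summed on the left of ~ and the grouping columns on the right, so grouping by a further factor only means adding another term after the +.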

Data
 df1 <- structure(list(Athlete = c("Michael Phelps", "Alicia Coutts", 
"Missy Franklin", "Brian Leetch", "Mario Lemieux", "Ylva Lindberg",
"Eric Lindros", "Ulrica Lindström", "Shelley Looney"), Competing_Country = c("United States",
"Australia", "United States", "United States", "Canada", "Sweden",
"Canada", "Sweden", "United States"), Year = c(2012L, 2012L,
2012L, 2002L, 2002L, 2002L, 2002L, 2002L, 2002L), Total_Medals = c(6L,
5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("Athlete", "Competing_Country",
"Year", "Total_Medals"), class = "data.frame", row.names = c(NA,
-9L))

Regarding "r - How to group data by more than two factors in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33991272/
