
R data.table: How to sum variables by group based on a condition?

Reposted. Author: 行者123. Updated: 2023-12-01 22:16:13

Suppose I have the following R data.table (though I'm also happy to work with base R and a data.frame):

library(data.table)

dt = data.table(Category=c("First","First","First","Second","Third", "Third", "Second"), Frequency=c(10,15,5,2,14,20,3), times = c(0, 0, 0, 3, 3, 1, 0))

> dt
   Category Frequency times
1:    First        10     0
2:    First        15     0
3:    First         5     0
4:   Second         2     3
5:    Third        14     3
6:    Third        20     1
7:   Second         3     0

If I wanted to sum Frequency by Category, I would use the following:

dt[, sum(Frequency), by = Category]

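However, suppose I would like to sum Frequency by Category if and only if times is non-zero and not NA?

How do I make this sum conditional on the values of a separate column?

EDIT: Apologies for the obvious question. A quick addition: what if the elements of a column are strings?

For example: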

> dt
   Category Frequency times
1:    First       ten     0
2:    First       ten     0
3:    First      five     0
4:   Second      five     3
5:    Third      five     3
6:    Third      five     1
7:   Second       ten     0

sum() will not calculate the frequency of "ten" versus "five".

Best answer

Remember the logic of data.table: dt[i, j, by]. That is, take dt, subset rows using i, then compute j grouped by by.

dt[times != 0 & !is.na(times), sum(Frequency), by = Category]
   Category V1
1:   Second  2
2:    Third 34
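
The answer above does not cover the follow-up about string values. A minimal sketch of one possible approach is shown here: it reuses the same dt[i, j, by] pattern, but counts rows with data.table's .N symbol instead of calling sum(), and groups on both Category and Frequency. The column name count and the lookup vector word_to_num are hypothetical names introduced only for this illustration; they do not come from the original post.

```r
library(data.table)

# Same data as in the question, with Frequency stored as strings
dt <- data.table(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c("ten", "ten", "five", "five", "five", "five", "ten"),
  times     = c(0, 0, 0, 3, 3, 1, 0)
)

# i keeps rows where times is non-zero and not NA,
# j counts the rows in each group with .N,
# by groups on Category and on the string value itself
counts <- dt[times != 0 & !is.na(times),
             .(count = .N),
             by = .(Category, Frequency)]
print(counts)

# Alternative (assumption: every string maps to a known number).
# word_to_num is a hypothetical lookup vector, not part of the original post.
word_to_num <- c(five = 5, ten = 10)
dt[, value := unname(word_to_num[Frequency])]
totals <- dt[times != 0 & !is.na(times), .(total = sum(value)), by = Category]
print(totals)
```

With the sample data, only rows 4 to 6 pass the filter, so counts has one row for Second/five and one for Third/five. Whether to count the strings or convert them to numbers first depends on what "frequency" is meant to represent in the real data.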

Regarding "R data.table: How to sum variables by group based on a condition?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45679883/
