
r - Cumulative sequence of occurrences of values

Reposted. Author: 行者123. Updated: 2023-12-04 01:40:03

This question already has answers here:

Numbering rows within groups in a data frame (9 answers)

Closed 3 years ago.




I have a dataset that looks like this, where one column can take four different values:

dataset <- data.frame(out = c("a","b","c","a","d","b","c","a","d","b","c","a"))

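In R, I would like to create a second column that records, in row order, the running count of rows seen so far that contain that row's value. The output column would therefore look like this: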
out
1
1
1
2
1
2
2
3
2
3
3
4

Best answer

Try this:

dataset <- data.frame(out = c("a","b","c","a","d","b","c","a","d","b","c","a"))
with(dataset, ave(as.character(out), out, FUN = seq_along))
# [1] "1" "1" "1" "2" "1" "2" "2" "3" "2" "3" "3" "4"

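Of course, you can assign the output to a column in your data.frame, using something like dataset$asNumbers <- as.numeric(with(dataset, ave(as.character(out), out, FUN = seq_along))). The as.numeric wrapper is needed because ave returns a character vector when its first argument is character.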
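As a variation on the ave call above, here is a minimal base R sketch that produces an integer column directly by passing an integer vector, rather than a character vector, as the first argument. The column name count is chosen here only for illustration.

```r
dataset <- data.frame(out = c("a","b","c","a","d","b","c","a","d","b","c","a"))

# ave() writes the per-group results back into its first argument, so
# starting from an integer vector keeps the result integer here and no
# as.numeric() conversion is needed afterwards.
dataset$count <- ave(seq_along(dataset$out), dataset$out, FUN = seq_along)
dataset$count
# [1] 1 1 1 2 1 2 2 3 2 3 3 4
```

The values passed as the first argument are irrelevant; only its length and type matter, because FUN = seq_along replaces each group's values with 1, 2, 3, and so on.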

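Update

The "dplyr" approach is also nice. The logic is very similar to the "data.table" approach. One advantage is that you do not need to wrap the output in as.numeric, which is required with the ave approach shown above.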
library(dplyr)
dataset %>% group_by(out) %>% mutate(count = sequence(n()))
# Source: local data frame [12 x 2]
# Groups: out
#
# out count
# 1 a 1
# 2 b 1
# 3 c 1
# 4 a 2
# 5 d 1
# 6 b 2
# 7 c 2
# 8 a 3
# 9 d 2
# 10 b 3
# 11 c 3
# 12 a 4
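The "data.table" approach mentioned above is not shown in this answer. A minimal sketch of what it might look like follows, assuming the data.table package is installed; the object name dt and the column name count are chosen here for illustration and do not come from the original post.

```r
library(data.table)

dt <- data.table(out = c("a","b","c","a","d","b","c","a","d","b","c","a"))

# Within each group defined by `out`, .N is the number of rows in that group,
# so seq_len(.N) numbers the rows 1..N. `:=` adds the column by reference
# and keeps the original row order.
dt[, count := seq_len(.N), by = out]
dt$count
# [1] 1 1 1 2 1 2 2 3 2 3 3 4
```

As with the dplyr version, the result is already an integer column, so no conversion step is needed.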

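A third option is to use getanID from my "splitstackshape" package. For this particular example, you only need to specify the data.frame name (since it has a single column). In general, though, you would be more specific and name the column or columns currently serving as "ids"; the function then checks whether they are unique or whether a cumulative sequence is needed to make them unique.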
library(splitstackshape)
# getanID(dataset, "out") ## Example of being specific about column to use
getanID(dataset)
# out .id
# 1: a 1
# 2: b 1
# 3: c 1
# 4: a 2
# 5: d 1
# 6: b 2
# 7: c 2
# 8: a 3
# 9: d 2
# 10: b 3
# 11: c 3
# 12: a 4

Regarding "r - Cumulative sequence of occurrences of values", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15230446/
