
r - How to use data.table to determine values by cumulative comparison

Reposted. Author: 行者123. Updated: 2023-12-01 23:46:52

I am looking for a way to determine the parent group of each subgroup in the data.table below.

    Group SubGroup Level Parent
 1:     A       A1     0     NA
 2:     A       A2     1     A1
 3:     A       A3     1     A1
 4:     A       A4     2     A3
 5:     A       A5     3     A4
 6:     A       A6     3     A4
 7:     A       A7     3     A4
 8:     A       A8     2     A3
 9:     A       A9     2     A3
10:     A      A10     2     A3

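This is the calculation I am currently using, but I wonder whether there is a better way. My actual data set contains multiple groups, so I would also like to add a by= argument to the calculation. It can be assumed that the parent is the subgroup with the largest row index that is smaller than the current row index and whose Level is lower than the current Level.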

library(data.table)

tmp = data.table(Group = "A", SubGroup = paste0("A", 1:10),
                 Level = c(0, 1, 1, 2, 3, 3, 3, 2, 2, 2))
# for each row x, take the last earlier row whose Level is below Level[x]
tmp[, Parent := sapply(1:nrow(tmp), function(x)
  tmp[, SubGroup[(suppressWarnings(max(which(Level[1:x] < Level[x]))))]])]
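
The rule stated in the question (nearest earlier row with a strictly lower Level) can also be written as a single pass over the rows with a stack of candidate parents. The following is only a minimal sketch in base R, not the asker's or the answerer's method; the helper name `find_parent` and the variables `subgroup`, `level`, and `parent` are hypothetical names chosen for illustration.

```r
# Hypothetical helper (not from the original post): for each row, return the
# SubGroup of the nearest earlier row whose Level is strictly lower.
find_parent <- function(subgroup, level) {
  n <- length(level)
  parent <- rep(NA_character_, n)
  stack <- integer(0)  # row indices that can still be a parent of later rows
  for (i in seq_len(n)) {
    # drop candidates whose Level is not strictly lower than the current Level
    while (length(stack) > 0 && level[stack[length(stack)]] >= level[i]) {
      stack <- stack[-length(stack)]
    }
    # the remaining top of the stack, if any, is the parent of row i
    if (length(stack) > 0) parent[i] <- subgroup[stack[length(stack)]]
    stack <- c(stack, i)
  }
  parent
}

# Same example values as in the question
subgroup <- paste0("A", 1:10)
level <- c(0, 1, 1, 2, 3, 3, 3, 2, 2, 2)
parent <- find_parent(subgroup, level)
print(parent)
```

Because the helper only takes two vectors, it could be applied per group in data.table with something like `tmp[, Parent := find_parent(SubGroup, Level), by = Group]`, which covers the by= requirement from the question.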

Best answer

dt = data.table(Group = "A", SubGroup = paste0("A", 1:11),
                Level = c(0, 1, 1, 2, 3, 3, 3, 2, 2, 2, 3))

# need another grouping layer, to satisfy the row requirements
dt[, rowGroup := cumsum(c(0, diff(Level) != 0)), by = Group]

# get the parent for each Level and rowGroup
parents = dt[, .(Level = Level[.N] + 1, Parent = SubGroup[.N]), by = .(Group, rowGroup)]

setkey(parents, Group, Level, rowGroup)
setkey(dt, Group, Level, rowGroup)

# rolling merge: exact match on Group and Level, rolling to the previous rowGroup
parents[dt, roll = TRUE][order(Group, rowGroup)]
#     Group rowGroup Level Parent SubGroup
#  1:     A        0     0     NA       A1
#  2:     A        1     1     A1       A2
#  3:     A        1     1     A1       A3
#  4:     A        2     2     A3       A4
#  5:     A        3     3     A4       A5
#  6:     A        3     3     A4       A6
#  7:     A        3     3     A4       A7
#  8:     A        4     2     A3       A8
#  9:     A        4     2     A3       A9
# 10:     A        4     2     A3      A10
# 11:     A        5     3    A10      A11
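
The join above relies on two things: a run index that increases whenever Level changes, and an exact match on `Level[.N] + 1`, meaning a parent is only found when its Level is exactly one less than the child's. The sketch below reproduces the run index in base R and adds a check of that second point. The check is an assumption of this note, not part of the original answer, and `level`, `row_group`, and `steps_ok` are hypothetical names.

```r
# Level vector from the answer's example
level <- c(0, 1, 1, 2, 3, 3, 3, 2, 2, 2, 3)

# Run index: increases by one every time Level differs from the previous row
row_group <- cumsum(c(0, diff(level) != 0))
print(row_group)

# Assumption check (not in the original answer): the join matches a parent
# whose Level is exactly one less than the child's, so Level should never
# rise by more than 1 between consecutive rows.
steps_ok <- all(diff(level) <= 1)
print(steps_ok)
```

If Level can jump by more than one step (for example from 0 straight to 2), the exact match on Level would probably not find the parent described in the question, so a check like this one, done per Group, seems worth running before relying on the join.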

Regarding "r - How to use data.table to determine values by cumulative comparison", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28681811/
