
r - Computing quantiles for multiple probabilities in an R data.table, across several columns at once

Reposted. Author: 行者123. Updated: 2023-12-05 02:12:44

DT = data.table(x=rep(c("b","a","c"),each=3), y=c(1,3,6), v=1:9)

# Desired output
rbind(
  cbind(id = "v", DT[x == "a", as.list(quantile(.SD, probs = c(0.05, 0.5, 0.95), na.rm = TRUE)), by = x, .SDcols = c("v")]),
  cbind(id = "y", DT[x == "a", as.list(quantile(.SD, probs = c(0.05, 0.5, 0.95), na.rm = TRUE)), by = x, .SDcols = c("y")]),
  cbind(id = "v", DT[x == "b", as.list(quantile(.SD, probs = c(0.05, 0.5, 0.95), na.rm = TRUE)), by = x, .SDcols = c("v")]),
  cbind(id = "y", DT[x == "b", as.list(quantile(.SD, probs = c(0.05, 0.5, 0.95), na.rm = TRUE)), by = x, .SDcols = c("y")]),
  cbind(id = "v", DT[x == "c", as.list(quantile(.SD, probs = c(0.05, 0.5, 0.95), na.rm = TRUE)), by = x, .SDcols = c("v")]),
  cbind(id = "y", DT[x == "c", as.list(quantile(.SD, probs = c(0.05, 0.5, 0.95), na.rm = TRUE)), by = x, .SDcols = c("y")])
)
#    id x  5% 50% 95%
# 1:  v a 4.1   5 5.9
# 2:  y a 1.2   3 5.7
# 3:  v b 1.1   2 2.9
# 4:  y b 1.2   3 5.7
# 5:  v c 7.1   8 8.9
# 6:  y c 1.2   3 5.7

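How can I produce the output above efficiently with data.table on a very large dataset (several GB in memory)? I tried the following, but it is not what I want: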

# not right: I want all 3 percentiles on the same row, for v and then y
out <- DT[, lapply(.SD, quantile, probs = c(0.05, 0.5, 0.95), na.rm = TRUE), .SDcols = c("v", "y"), keyby = "x"]
out

Also, how can I get the desired output above but with id spread across the columns, so that it becomes a 3 x 6 data.table? For example, columns v5% v50% v95% y5% y50% y95%, with 3 rows.

Best answer

You can use melt/dcast to achieve this:

dcast(melt(out[, p := rep(paste0(c(5, 50, 95), "%"), 3)],
           c("p", "x"),
           variable.name = "id"),
      id + x ~ ...)[order(x, id)]
#    id x  5% 50% 95%
# 1:  v a 4.1   5 5.9
# 2:  y a 1.2   3 5.7
# 3:  v b 1.1   2 2.9
# 4:  y b 1.2   3 5.7
# 5:  v c 7.1   8 8.9
# 6:  y c 1.2   3 5.7
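
The one-liner above packs several steps together. As a rough sketch, it can be unpacked as follows; the intermediate names (`probs`, `long`, `res`) are illustrative and not part of the original answer, and the probability label is assigned per group rather than with `rep(..., 3)` so that it does not depend on the number of groups.

```r
library(data.table)

DT <- data.table(x = rep(c("b", "a", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
probs <- c(0.05, 0.5, 0.95)  # illustrative name for the probabilities

# Step 1: the attempt from the question. Each group yields 3 rows,
# one per probability, with the quantiles of v and y side by side.
out <- DT[, lapply(.SD, quantile, probs = probs, na.rm = TRUE),
          .SDcols = c("v", "y"), keyby = "x"]

# Step 2: label each row with its probability. Within every x the rows
# come out in the order of probs, so the labels line up.
out[, p := c("5%", "50%", "95%"), by = x]

# Step 3: stack v and y into one value column, remembering the source
# column in `id`.
long <- melt(out, id.vars = c("x", "p"), measure.vars = c("v", "y"),
             variable.name = "id")

# Step 4: spread the probability labels back out into columns.
res <- dcast(long, id + x ~ p, value.var = "value")[order(x, id)]
res
```

Writing the formula as `id + x ~ p` is equivalent here to `id + x ~ ...`, because `p` is the only remaining non-value column.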

Another option that avoids the intermediate result:

melt(DT[, v := as.numeric(v)],
     "x",
     c("v", "y"),
     variable.name = "id")[, as.list(quantile(value, probs = c(.05, .5, .95))),
                           .(x, id)][order(x, id)]
#    x id  5% 50% 95%
# 1: a  v 4.1   5 5.9
# 2: a  y 1.2   3 5.7
# 3: b  v 1.1   2 2.9
# 4: b  y 1.2   3 5.7
# 5: c  v 7.1   8 8.9
# 6: c  y 1.2   3 5.7

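Note: I converted the v column to numeric (from integer) to avoid a warning from melt, which is raised when the measured columns are not all of the same type.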
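
The question also asks for a wide layout with one row per x and one column per column/probability pair. Neither snippet above produces that shape directly, so here is a minimal sketch of one way to get it. This is an assumption about what is wanted, not part of the original answer; the names `probs` and `wide` are illustrative, and the resulting column names (`v.5%`, `y.95%`, ...) come from how `unlist()` joins names.

```r
library(data.table)

DT <- data.table(x = rep(c("b", "a", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
probs <- c(0.05, 0.5, 0.95)  # illustrative name for the probabilities

# lapply() returns one named quantile vector per column; unlist() flattens
# them into a single named vector ("v.5%", "v.50%", ...), and as.list()
# turns each element into its own column of the grouped result.
wide <- DT[, as.list(unlist(lapply(.SD, quantile, probs = probs, na.rm = TRUE))),
           keyby = x, .SDcols = c("v", "y")]
wide
```

This gives 3 rows (one per x) and six quantile columns next to x. It skips the melt step, so no long copy of the data is created, which may matter on a table of several GB.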

Regarding "r - Computing quantiles for multiple probabilities in an R data.table, across several columns at once", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55511518/
