
r - Function wrapper around an R data.table ad-hoc join (aggregation in a chain)

Reposted. Author: 行者123. Updated: 2023-12-04 23:17:45

[data.table_1.9.6]
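The background of the question is that I am trying to build OLAP-like query functionality on a star-schema-like data layout, i.e. a large fact table and several meta tables. I am building function wrappers around data.table joins followed by an aggregation in a chain, like this: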

# dummy data
library(data.table)
dt1 = data.table(id = 1:5, x = letters[1:5], a = 11:15, b = 21:25)
dt2 = data.table(k = 11:15, z = letters[11:15])

# standard data.table query with ad-hoc key -> works fine
dt1[dt2, c("z") := .(i.z), with = FALSE,
    on = c(a = "k")][, .(m = sum(a, na.rm = TRUE),
                         count = .N), by = c("z")]

# wrapper function with setkey -> works fine
agg_foo <- function(x, meta_tbl, x_key, meta_key, agg_var) {
  setkeyv(x, x_key)
  setkeyv(meta_tbl, meta_key)
  x[meta_tbl, (agg_var) := get(agg_var)][, .(a_sum = sum(a, na.rm = TRUE),
                                             count = .N),
                                         by = c(agg_var)]
  x[, (agg_var) := .(NULL)]
}

# call function (works fine)
agg_foo(x=dt1, meta_tbl=dt2, x_key="a", meta_key="k",agg_var="z")

# wrapper function with ad-hoc key -> does not work
agg_foo_ad_hoc <- function(x, meta_tbl, x_key, meta_key, agg_var) {
  x[meta_tbl, (agg_var) := get(agg_var),
    on = c(x_key = meta_key)][, .(a_sum = sum(a, na.rm = TRUE),
                                  count = .N), by = c(agg_var)]
  x[, (agg_var) := .(NULL)]
}

# call function (causes error)
agg_foo_ad_hoc(x=dt1, meta_tbl=dt2, x_key="a", meta_key="k",agg_var="z")

Error in forderv(x, by = rightcols) :
'by' value -2147483648 out of range [1,4]


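My guess is that I have to supply the ad-hoc "on" argument differently. I tried on = c(get(x_key) = meta_key), but then R complains about an unexpected bracket. I could use the setkey version of the function, but I wonder whether that is efficient: the function will work on different meta tables depending on which attribute is used for the aggregation, so the key would be reset constantly. Or is setkey always preferred? The actual fact table (x here) has more than 30 million rows.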

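Best answer

All you need to do is build the vector with the correct names. In c(x_key = meta_key) the name is taken literally as "x_key" rather than as the value of the variable x_key, so the join looks for a column that does not exist in x. Here is one way: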

agg_foo_ad_hoc <- function(x, meta_tbl, x_key, meta_key, agg_var) {
  x[meta_tbl, (agg_var) := get(agg_var),
    on = setNames(meta_key, x_key)][, .(a_sum = sum(a, na.rm = TRUE),
                                        count = .N), by = c(agg_var)]
  x[, (agg_var) := .(NULL)]
}
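
To see why the two forms of "on" differ, the following minimal base-R sketch compares the names each one produces. The variable names literal_on and dynamic_on are made up for this illustration and are not part of the original post; the values "a" and "k" are the ones used in the question's function call.

```r
# Values as passed to the wrapper in the question
x_key <- "a"      # join column in the fact table
meta_key <- "k"   # join column in the meta table

# c() uses the argument name as written, so the element is named "x_key"
literal_on <- c(x_key = meta_key)
print(names(literal_on))

# setNames() evaluates x_key, so the element is named "a"
dynamic_on <- setNames(meta_key, x_key)
print(names(dynamic_on))

# dynamic_on is the same named character vector as the hand-written c(a = "k")
print(identical(dynamic_on, c(a = "k")))
```

The second vector has the same shape as the on = c(a = "k") used in the standalone query from the question, which is why the join can resolve both columns when it is built this way.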

Regarding "r - Function wrapper around an R data.table ad-hoc join (aggregation in a chain)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37706385/
