r - How do I pass fun.aggregate as an argument to dcast.data.table?

Reposted by: 行者123. Updated: 2023-12-01 00:59:42

TL;DR: How can I pass fun.aggregate to dcast.data.table when the call to dcast.data.table is made inside a function (to which I pass fun.aggregate)?

I have a table like this:

library(data.table)
t <- data.table(id=rep(1:2, c(3,4)), k=c(rep(letters[1:3], 2), 'c'), v=1:7)
t
   id k v
1:  1 a 1
2:  1 b 2
3:  1 c 3
4:  2 a 4
5:  2 b 5
6:  2 c 6
7:  2 c 7  # note the duplicate (2, c)

I reshape it to wide format, keeping the last occurrence of any duplicate:

dcast.data.table(t, id ~ k, value.var='v', fun.aggregate=last) # last is in data.table
   id a b c
1:  1 1 2 3
2:  2 4 5 7

However, if I wrap my dcast.data.table call in a function:

f <- function (tbl, fun.aggregate) {
    dcast.data.table(tbl, id ~ k, value.var='v', fun.aggregate=fun.aggregate)
}
f(t, last)
Error in `[.data.table`(data, , eval(fun.aggregate), by = c(ff_)) : 
  could not find function "fun.aggregate"

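It looks like the symbol fun.aggregate is being evaluated (eval(fun.aggregate)) but not found, because no function named "fun.aggregate" exists.

How should I pass the fun.aggregate I want into f?

(I'm sure it has something to do with quote, substitute, etc., but I really struggle with those functions and usually just chain them together at random until something works.)

Edit: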

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

...

other attached packages:
[1] data.table_1.9.3

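Oops, I just realized that this error occurs in 1.9.3 (the development version, which I upgraded to in order to avoid an unrelated bug) and not in 1.9.2 (the current CRAN release).

I'd rather not downgrade to 1.9.2 (I'm avoiding the bug mentioned above), so, in general, is there a way to protect a function's argument from an eval() call?

Best answer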

This is now fixed in commit 1303 of v1.9.3, the current development version. From NEWS:

dcast.data.table handles fun.aggregate argument properly when called from within a function that accepts fun.aggregate argument and passes to dcast.data.table(). Closes #713. Thanks to mathematicalcoffee for reporting here on SO.
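The failure mode described in the question can be reproduced in base R without data.table. What follows is a minimal sketch, not data.table's actual internals: apply_nse, apply_std and the two wrappers are hypothetical names made up for illustration. It shows why capturing an argument with substitute() and evaluating it somewhere else breaks once a wrapper passes its own parameter through, and why working with the argument's value avoids the problem.

```r
# Hypothetical helper mimicking the bug: it captures the *expression* the
# caller wrote for `fun` and evaluates it outside the caller's frame.
apply_nse <- function(x, fun) {
  fun_expr <- substitute(fun)        # e.g. the symbol `agg_fun`
  cl <- as.call(list(fun_expr, x))   # builds the call agg_fun(<x>)
  eval(cl, envir = baseenv())        # wrapper-local names are not visible here
}

# Hypothetical helper using ordinary evaluation: `fun` is resolved to a
# function object through R's normal argument passing.
apply_std <- function(x, fun) {
  fun <- match.fun(fun)
  fun(x)
}

# Wrappers that forward their own parameter, like f() in the question.
wrapper_nse <- function(x, agg_fun) apply_nse(x, agg_fun)
wrapper_std <- function(x, agg_fun) apply_std(x, agg_fun)

apply_nse(1:3, max)                  # works: `max` is visible from baseenv()
res <- try(wrapper_nse(1:3, max), silent = TRUE)
inherits(res, "try-error")           # TRUE: could not find function "agg_fun"
wrapper_std(1:3, max)                # works: the function object is passed along
```

In the sketch, the wrapper-safe version never looks at the name the caller used, only at the value, which is the general way to keep an argument away from eval().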

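Note that there was another small oversight in dcast.data.table, which is now fixed as well: #715.

The issue is that the last function does not return a length-1 value for every input, which is a requirement for fun.aggregate.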

last(integer(0))
# integer(0)

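When the fill argument is not set, this value (the result of calling fun.aggregate on a zero-length input) is what gets used to fill missing combinations. This case was not caught before, but it is handled now.

Here is an example of the current (correct) behavior: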

tt <- t[1:5] # t is from your example
dcast.data.table(tt, id ~ k, fun.aggregate=last)
# Error in dcast.data.table(tt, id ~ k, fun.aggregate = last) : 
#   Aggregating function provided to argument 'fun.aggregate' should always return 
#   a length 1 vector, but returns 0-length value for fun.aggregate(integer(0)). 
#   This value will have to be used to fill missing combinations, if any, and 
#   therefore can not be of length 0. Either override by setting the 'fill' argument 
#   explicitly or modify your function to handle this case appropriately.

dcast.data.table(tt, id ~ k, fun.aggregate=last, fill=NA)
#    id a b  c
# 1:  1 1 2  3
# 2:  2 4 5 NA
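As the error message suggests, the alternative to setting fill is to make the aggregating function itself return a length-1 value for empty input. Below is a minimal base-R sketch of that idea. last_or_na is a hypothetical name, not part of data.table, and it has not been checked against any particular data.table version.

```r
# Hypothetical helper: last element of x, or an NA of the same type when x
# is empty, so the result always has length 1.
last_or_na <- function(x) {
  if (length(x) > 0L) x[length(x)] else x[NA_integer_]
}

last_or_na(c(4L, 5L, 7L))   # 7
last_or_na(integer(0))      # NA (integer), length 1
```

Under that assumption it could be supplied as fun.aggregate=last_or_na in place of last.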

Source: the original question, "How do I pass fun.aggregate as an argument to dcast.data.table?", on Stack Overflow: https://stackoverflow.com/questions/24542976/