
R: reshaping a melted data.table that has a list column

Reposted. Author: 行者123. Updated: 2023-12-04 21:19:45

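I have a large (several million rows) melted data.table in the usual melt style, with the unrolled values in the variable and value columns. I need to cast the table to wide form (rolling the variables back up). The problem is that the table also has a list column called data, which I need to preserve. That rules out reshape2, because dcast cannot handle non-atomic columns, so I have to do the roll-up myself.

The answers to a previous question about working with melted data.tables do not apply here because of the list column.

I am not happy with the solution I came up with. I am looking for suggestions for a simpler or faster implementation.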

library(data.table)

x <- LETTERS[1:3]
dt <- data.table(
x=rep(x, each=2),
y='d',
data=list(list(), list(), list(), list(), list(), list()),
variable=rep(c('var.1', 'var.2'), 3),
value=seq(1,6)
)

# Column template set up
list_template <- Reduce(
function(l, col) { l[[col]] <- col; l },
unique(dt$variable),
list())

# Expression set up
q <- substitute({
l <- lapply(
list_template,
function(col) .SD[variable==as.character(col)]$value)
l$data = .SD[1,]$data
l
}, list(list_template=list_template))

# Roll up
dt[, eval(q), by=list(x, y)]

x y var.1 var.2 data
1: A d 1 2 <list>
2: B d 3 4 <list>
3: C d 5 6 <list>
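
The roll-up above amounts to building one named list per (x, y) group. A minimal sketch of the same idea without substitute() and eval() follows. It assumes that every group contains the same variables in the same order and that the first data element of a group is the one to keep; the sample data and the name wide are illustrative, not taken from the question.

```r
library(data.table)

# Illustrative melted table with a list column (sample values are assumptions)
dt <- data.table(
  x = rep(LETTERS[1:3], each = 2),
  y = "d",
  data = rep(list(list("a", "b"), list("c", "d"), list("e", "f")), each = 2),
  variable = rep(c("var.1", "var.2"), 3),
  value = 1:6
)

# Per group: turn value into a list named by variable, then append the
# first list-column element (data[1L] stays a length-1 list, i.e. one cell)
wide <- dt[
  ,
  c(setNames(as.list(value), variable), list(data = data[1L])),
  by = .(x, y)
]

print(wide)
```

If groups can have missing or reordered variables, this sketch would misalign columns, so the ordering assumption needs to hold or be enforced first.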

Best answer

I have a bit of a cheat that may work. Importantly, I am assuming that each x, y, list combination is unique! If it is not, please disregard this.
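
The uniqueness assumption can be checked before relying on it. The following is a minimal sketch, assuming the column names from the question (x, y, data); the sample data and the name check are illustrative only.

```r
library(data.table)

# Illustrative data: each (x, y) group carries the same list element on every row
dt <- data.table(
  x = rep(LETTERS[1:3], each = 2),
  y = "d",
  data = rep(list(list("a", "b"), list("c", "d"), list("e", "f")), each = 2),
  variable = rep(c("var.1", "var.2"), 3),
  value = 1:6
)

# For every (x, y) group, compare each list element with the group's first one
check <- dt[
  ,
  .(consistent = all(vapply(data, identical, logical(1), data[[1L]]))),
  by = .(x, y)
]

print(check)

# Groups where the assumption fails (none for this sample data)
bad_groups <- check[consistent == FALSE]
```

Any rows in bad_groups would mean that keeping only the first data element per group silently drops information.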

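I am going to create two separate data.tables: the first is dcast without the data list objects, and the second holds only the unique data list objects plus a key. Then simply merge them together to get the desired result.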

library(data.table)  # also provides dcast() for data.tables, so reshape2 is not needed
library(stringr)     # for str_c()

x <- LETTERS[1:3]
dt <- data.table(
x=rep(x, each=2),
y='d',
  # two list elements repeated three times to fill all six rows explicitly
  data=rep(list(list("a","b"), list("c","d")), 3),
variable=rep(c('var.1', 'var.2'), 3),
value=seq(1,6)
)


# First create the dcasted datatable without the pesky list objects:
dt_nolist <- dt[,list(x,y,variable,value)]
dt_dcast <- data.table::dcast(dt_nolist, x + y ~ variable, value.var = "value")
setkeyv(dt_dcast, c("x", "y"))


# Second: create a datatable with only unique "groups" of x,y, list
dt_list <- dt[,list(x,y,data)]

# Rows are duplicated, so I'd like to use unique() to get rid of them, but
# unique() doesn't work when there are list objects in the data.table.
# Instead I cheat by assigning each row within an x,y "group" a value
# that is unique within EACH group, but present within EVERY group.
# Then simply subset on that value.
# I've chosen rank(), but no doubt there are other options.

dt_list <- dt_list[,rank:=rank(str_c(x,y),ties.method="first"),by=str_c(x,y)]

# now keep only one row per x,y "group"
dt_list <- dt_list[rank==1]
setkeyv(dt_list,c("x","y"))

# drop the rank since we no longer need it
dt_list[,rank:=NULL]

# Finally just merge back together
dt_final <- merge(dt_dcast,dt_list)
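
The two-table approach above can be condensed somewhat. The following is a sketch under the same assumption (one meaningful data element per x, y group), using only data.table; it swaps the rank() trick for a grouped data[1L], and the sample data and the names wide_values and first_data are illustrative, not from the original answer.

```r
library(data.table)

# Illustrative melted table with a list column (sample values are assumptions)
dt <- data.table(
  x = rep(LETTERS[1:3], each = 2),
  y = "d",
  data = rep(list(list("a", "b"), list("c", "d"), list("e", "f")), each = 2),
  variable = rep(c("var.1", "var.2"), 3),
  value = 1:6
)

# Step 1: cast only the atomic columns to wide form
wide_values <- data.table::dcast(
  dt[, .(x, y, variable, value)],
  x + y ~ variable,
  value.var = "value"
)

# Step 2: keep the first list element per (x, y) group
first_data <- dt[, .(data = data[1L]), by = .(x, y)]

# Step 3: merge the two pieces back together on the grouping columns
dt_final <- merge(wide_values, first_data, by = c("x", "y"))

print(dt_final)
```

Grouping on the columns directly avoids building a string key with str_c() for every row, which may matter on a table with millions of rows, though that has not been benchmarked here.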

On reshaping a melted data.table with a list column in R, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/14556593/
