
r - Merging multiple data.tables with duplicate column names

Reposted. Author: 行者123. Updated: 2023-12-04 01:37:48

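I am trying to merge (join) multiple data.tables (read with fread from 5 csv files) into a single data.table. I get an error when I try to merge all 5 data.tables, but merging only 4 works fine. An MWE follows: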

library(data.table)

# example data
DT1 <- data.table(x = letters[1:6], y = 10:15)
DT2 <- data.table(x = letters[1:6], y = 11:16)
DT3 <- data.table(x = letters[1:6], y = 12:17)
DT4 <- data.table(x = letters[1:6], y = 13:18)
DT5 <- data.table(x = letters[1:6], y = 14:19)

# this gives an error
Reduce(function(...) merge(..., all = TRUE, by = "x"), list(DT1, DT2, DT3, DT4, DT5))

Error in merge.data.table(..., all = TRUE, by = "x") : x has some duplicated column name(s): y.x,y.y. Please remove or rename the duplicate(s) and try again.


# whereas this works fine
Reduce(function(...) merge(..., all = TRUE, by = "x"), list(DT1, DT2, DT3, DT4))

   x y.x y.y y.x y.y
1: a  10  11  12  13
2: b  11  12  13  14
3: c  12  13  14  15
4: d  13  14  15  16
5: e  14  15  16  17
6: f  15  16  17  18

I have a workaround: it works if I change the name of DT1's second column:
setnames(DT1, "y", "new_y")

# this works now
Reduce(function(...) merge(..., all = TRUE, by = "x"), list(DT1, DT2, DT3, DT4, DT5))

Why does this happen, and is there a way to merge any number of data.tables that share the same column names without changing any of those names?

Best answer
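The error message points at the cause: by the time the fifth table is merged, the accumulated left-hand table already contains duplicated names (y.x and y.y each appear twice, as the four-table output in the question shows). The following is a minimal sketch of how those names arise. The helper merged_names() is hypothetical and only models the default ".x"/".y" suffix rule for clashing non-key columns; it is a simplification for illustration, not data.table's actual implementation.

```r
# Hypothetical helper: models only how merge() names the result columns
# when non-key columns clash (default suffixes ".x" and ".y").
merged_names <- function(nx, ny, by = "x", suffixes = c(".x", ".y")) {
  vx <- nx[!nx %in% by]          # non-key columns of the left table
  vy <- ny[!ny %in% by]          # non-key columns of the right table
  clash <- intersect(vx, vy)     # names present on both sides
  vx[vx %in% clash] <- paste0(vx[vx %in% clash], suffixes[1])
  vy[vy %in% clash] <- paste0(vy[vy %in% clash], suffixes[2])
  c(by, vx, vy)
}

step1 <- merged_names(c("x", "y"), c("x", "y"))  # DT1 + DT2: x, y.x, y.y
step2 <- merged_names(step1, c("x", "y"))        # + DT3: no clash, so "y" is kept as-is
step3 <- merged_names(step2, c("x", "y"))        # + DT4: "y" clashes again

print(step3)
# A non-zero value means the names are no longer unique.
print(anyDuplicated(step3))
```

Under this model the third merge reuses the suffixes and produces y.x and y.y a second time, which is consistent with the error reported when the fifth table is merged into that result.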

Here is a way to keep a counter inside Reduce, if you want to rename during the merge:

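# Wrap the merge function in a closure so it can keep a counter across Reduce steps.
Reduce((function() {
  counter <- 0
  function(x, y) {
    counter <<- counter + 1
    d <- merge(x, y, all = TRUE, by = "x")
    # Rename the last column (the one just merged in) to y.<counter>.
    # setnames() modifies d by reference and returns it.
    setnames(d, c(head(names(d), -1), paste0("y.", counter)))
  }
})(), list(DT1, DT2, DT3, DT4, DT5))
#    x y.x y.1 y.2 y.3 y.4
# 1: a  10  11  12  13  14
# 2: b  11  12  13  14  15
# 3: c  12  13  14  15  16
# 4: d  13  14  15  16  17
# 5: e  14  15  16  17  18
# 6: f  15  16  17  18  19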
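
If the names do not have to be fixed up during the merge itself, another possible approach is to give each value column a unique name before merging, working on copies so the original tables keep their column names. This is only a sketch under the assumptions of the example (one key column x and one value column y per table); the names tables, renamed, and merged are illustrative and do not appear in the original answer.

```r
library(data.table)

# Same example data as in the question.
DT1 <- data.table(x = letters[1:6], y = 10:15)
DT2 <- data.table(x = letters[1:6], y = 11:16)
DT3 <- data.table(x = letters[1:6], y = 12:17)
DT4 <- data.table(x = letters[1:6], y = 13:18)
DT5 <- data.table(x = letters[1:6], y = 14:19)
tables <- list(DT1, DT2, DT3, DT4, DT5)

# Rename "y" to "y.<i>" on a copy of each table, so no two value
# columns share a name and the originals are left untouched.
renamed <- lapply(seq_along(tables), function(i) {
  d <- copy(tables[[i]])
  setnames(d, "y", paste0("y.", i))
  d
})

# With unique value-column names, merge() never needs to add suffixes.
merged <- Reduce(function(a, b) merge(a, b, all = TRUE, by = "x"), renamed)
print(names(merged))
```

Because the renaming happens up front, the same pattern should extend to any number of tables without relying on a counter inside the reducing function.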

Regarding "r - Merging multiple data.tables with duplicate column names", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32526889/
