r - "Dictionary is full!" error message using dplyr

Reposted. Author: 行者123. Updated: 2023-12-04 13:31:55

Hello, I am trying to summarise a large table.
Here is its head:

           V1 V2 V3  scaf_name
1: scaffold_0 1 1 scaffold_0
2: scaffold_0 2 1 scaffold_0
3: scaffold_0 3 1 scaffold_0
4: scaffold_0 4 1 scaffold_0
5: scaffold_0 5 1 scaffold_0
6: scaffold_0 6 1 scaffold_0
Here is the code I tried:
tab3<-tab %>% 
group_by(scaf_name) %>%
summarise(Avg_group=mean(V3),Length=last(V2))
Here is the error message I received:
Error: Internal error: Dictionary is full!
Here are the dimensions of tab:
> dim(tab)
[1] 852355422 4
So the data frame seems to be too large for dplyr. Does anyone know how I can work around this?
Thank you very much.
Here is a small part of the data:
> dput(tab_bis)
structure(list(V1 = c("scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0"), V2 = 1:30, V3 = c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), scaf_name = c("scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0",
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0")), row.names = c(NA,
-30L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x556f4666b340>)
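The dput output above includes an .internal.selfref pointer, so it will not parse if pasted back into R as is. A minimal sketch that rebuilds the same 30-row sample, assuming data.table is installed (the helper vector v3 is a name introduced here for illustration):

```r
library(data.table)

# V3 values from the dput output: eight 1s, three 2s, one 3, three 4s, fifteen 5s
v3 <- c(rep(1L, 8), rep(2L, 3), 3L, rep(4L, 3), rep(5L, 15))

# Rebuild the sample without the .internal.selfref pointer
tab_bis <- data.table(
  V1        = rep("scaffold_0", 30),
  V2        = 1:30,
  V3        = v3,
  scaf_name = rep("scaffold_0", 30)
)

print(dim(tab_bis))
```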

Best answer

This is an issue the tidyverse team already knows about.
https://github.com/r-lib/vctrs/issues/1133
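You have run past the limit of an internal value, and they will have to fix it: "... uint32_t. I thought about just making sure that we store this instead as a uint64_t ..." Here is another report with an example: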
https://github.com/tidyverse/dplyr/issues/5291
My solution is to use data.table.

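library(data.table)
# Build a data.table from tab (this makes a copy)
dt = data.table(tab)
# Per scaffold: mean of V3 and the last value of V2
dt[,.(Avg_group=mean(V3),Length=last(V2)),by = .(scaf_name)]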
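For a table with hundreds of millions of rows, the extra copy made by data.table() may matter. A minimal sketch of the same aggregation using setDT(), which converts a data.frame to a data.table by reference. The toy table tab_small and its values are hypothetical stand-ins for the real tab:

```r
library(data.table)

# Hypothetical stand-in for the real `tab` (two scaffolds, five rows)
tab_small <- data.frame(
  scaf_name = c("scaffold_0", "scaffold_0", "scaffold_0",
                "scaffold_1", "scaffold_1"),
  V2 = c(1L, 2L, 3L, 1L, 2L),
  V3 = c(1L, 1L, 4L, 2L, 4L)
)

# Convert in place instead of copying
setDT(tab_small)

# Per scaffold: mean of V3 and the last value of V2
res <- tab_small[, .(Avg_group = mean(V3), Length = last(V2)), by = scaf_name]
print(res)
```

If tab is already a data.table, as the class in the dput output suggests, the grouped expression can probably be run on tab directly without any conversion.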

Regarding r - "Dictionary is full!" error message using dplyr, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64563951/
