
r - fread does not recognize a bare + or - as a character field

Reposted. Author: 行者123. Updated: 2023-12-01 10:00:31

I am trying to fread some files that look like this and are about 2 GB each:

head file.bed
chr1 19922471 19924471 + NM_001204088 tss 1 0
chr1 19922471 19924471 + NM_001204088 tss 2 0
chr1 19922471 19924471 + NM_001204088 tss 3 0
chr1 19922471 19924471 + NM_001204088 tss 4 0
chr1 19922471 19924471 + NM_001204088 tss 5 0
chr1 19922471 19924471 + NM_001204088 tss 6 0
chr1 19922471 19924471 + NM_001204088 tss 7 0
chr1 19922471 19924471 + NM_001204088 tss 8 0
chr1 19922471 19924471 + NM_001204088 tss 9 0
chr1 19922471 19924471 + NM_001204088 tss 10 0

Column 4 also contains a similar number of "-" values. When the file is read into R, the +/- values become 0:

cov.data <- fread(file)
head(cov.data)
V1 V2 V3 V4 V5 V6 V7 V8
1: chr1 19922471 19924471 0 NM_001204088 tss 1 1
2: chr1 19922471 19924471 0 NM_001204088 tss 2 1
3: chr1 19922471 19924471 0 NM_001204088 tss 3 1
4: chr1 19922471 19924471 0 NM_001204088 tss 4 1
5: chr1 19922471 19924471 0 NM_001204088 tss 5 1
6: chr1 19922471 19924471 0 NM_001204088 tss 6 0

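I looked through the documentation but could not work out why this happens. Any suggestions? Since fread is still under development, could this be a bug?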

Best answer

Two things:

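First, if the fields in your file are quoted, that is, if your strand column contains "+" and "-" with the quotes included, then fread in data.table version 1.8.8 will read it correctly.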

Second, this has been rectified in data.table version 1.8.9, which you can install with:

install.packages("data.table",repos="http://R-Forge.R-project.org", type="source")

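If you prefer, you can install devtools and call dev_mode(TRUE) to enter development mode before installing data.table, so that your regular data.table 1.8.8 installation is not affected.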

The relevant changelog entry, copied from 1.8.9:

NEW FEATURES
o fread :
* If some column names are blank they are now given default names rather than causing
the header row to be read as a data row. Thanks to Simon Judes for suggesting.

* "+" and "-" are now read as character rather than integer 0. Thanks to Alvaro Gonzalez for reporting.
https://stackoverflow.com/questions/15388714/reading-strand-column-with-fread-data-table-package

....
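
To check how an installed data.table handles the strand column, a small reproduction along these lines may help. It is only a sketch: the two sample rows and the temporary file are made up for illustration, the file is assumed to be tab-separated with no header (as BED files usually are), and the `colClasses` argument is assumed to be available in the installed version. Requesting character for column 4 explicitly is a defensive choice so that the result does not depend on type detection.

```r
library(data.table)

# Hypothetical two-row sample mirroring the layout of file.bed
# (assumed tab-separated, no header row).
sample_rows <- c(
  "chr1\t19922471\t19924471\t+\tNM_001204088\ttss\t1\t0",
  "chr1\t19922471\t19924471\t-\tNM_001204088\ttss\t2\t0"
)
tmp <- tempfile(fileext = ".bed")
writeLines(sample_rows, tmp)

# Ask for character explicitly on column 4 (the strand column)
# instead of relying on fread's automatic type detection.
cov.data <- fread(tmp, header = FALSE, sep = "\t",
                  colClasses = list(character = 4))

# Inspect the strand column; it should hold "+" and "-" as strings.
str(cov.data$V4)
```

If `str()` reports an integer column of zeros instead, the installed version most likely predates the fix described in the changelog above.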

Regarding "r - fread does not recognize a bare + or - as a character field", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16884613/
