r - Wrong column modes when reading data using the 'na.strings' and 'colClasses' arguments of the 'fread' function in R

Reposted by: 行者123 · Updated: 2023-12-01 22:32:30


Windows 8.1, R version 3.1.1 (2014-07-10), platform x86_64, mingw32

I have a file containing a large number of observations (here). Below are a few lines from the file:

Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
28/4/2007;00:20:00;0.492;0.208;236.240;2.200;0.000;0.000;0.000
28/4/2007;00:21:00;?;?;?;?;?;?;
21/12/2006;11:25:00;0.246;0.000;241.740;1.000;0.000;0.000;0.000
21/12/2006;11:26:00;0.246;0.000;241.830;1.000;0.000;0.000;0.000

Missing values (NA) are represented by "?". I am trying to read the file with:

library(data.table)

# dataFile holds the path to the semicolon-separated file shown above
epcData <- fread(dataFile,
                 sep = ";",
                 header = TRUE,
                 na.strings = "?",
                 colClasses = c("character", "character", rep("numeric", 7)),
                 stringsAsFactors = FALSE)

I get the following warning:

Bumped column 3 to type character on data row 10, field contains '?'. Coercing previously read values in this column from integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.

Row 10 is

   28/4/2007;00:21:00;?;?;?;?;?;?;

epcData[10]

prints

         Date     Time Global_active_power Global_reactive_power Voltage
1: 28/4/2007 00:21:00                  NA                    NA      NA
   Global_intensity Sub_metering_1 Sub_metering_2 Sub_metering_3
1:               NA             NA             NA             NA

However, the mode of every column is "character", even for columns 3:9 (despite colClasses = c("character", "character", rep("numeric", 7))).

What went wrong?
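On a data.table version that still shows this behaviour, one possible workaround follows the warning's own suggestion: read the columns as character, then convert them by hand. The sketch below is only an illustration. It uses a few rows copied from the question through the `text` argument, which assumes a reasonably recent data.table; with the real file, pass the file path instead. The names `sampleLines`, `rawData` and `numCols` are made up for this example.

```r
library(data.table)

# A few rows copied from the question, including the row with "?" values
sampleLines <- c(
  "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3",
  "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000",
  "16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000",
  "28/4/2007;00:21:00;?;?;?;?;?;?;",
  "21/12/2006;11:25:00;0.246;0.000;241.740;1.000;0.000;0.000;0.000"
)

# Read every column as character so that no type bump can happen
rawData <- fread(text = sampleLines, sep = ";", header = TRUE,
                 colClasses = "character")

# Columns 3:9 are the measurement columns
numCols <- names(rawData)[3:9]

# Replace "?" and empty strings with NA, then convert to numeric
rawData[, (numCols) := lapply(.SD, function(x) {
  x[x %in% c("?", "")] <- NA_character_
  as.numeric(x)
}), .SDcols = numCols]
```

Converting after the read keeps the Date and Time columns as character while the measurement columns become numeric, at the cost of one extra pass over the data.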

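Best answer

As of today, with version 1.12.2 of the data.table package, this is no longer an issue: importing the CSV data above works perfectly, and all question marks are replaced with NA.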
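A quick way to check this on your own installation is to read a few of the rows from the question and inspect the column classes. This is a minimal sketch, assuming data.table 1.12.2 or later; the `text` argument and the names `sampleLines` and `colModes` are used here only for illustration, and the real file would be read by path as in the question.

```r
library(data.table)

# A few rows copied from the question, including the row with "?" values
sampleLines <- c(
  "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3",
  "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000",
  "16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000",
  "28/4/2007;00:21:00;?;?;?;?;?;?;",
  "21/12/2006;11:25:00;0.246;0.000;241.740;1.000;0.000;0.000;0.000"
)

# Same arguments as in the question, applied to the inline sample
epcData <- fread(text = sampleLines,
                 sep = ";",
                 header = TRUE,
                 na.strings = "?",
                 colClasses = c("character", "character", rep("numeric", 7)))

# Class of every column; columns 3:9 should come back as numeric
colModes <- sapply(epcData, class)
print(colModes)

# The row that contained "?" should now hold NA in the numeric columns
print(epcData[3])
```

If `colModes` still reports "character" for columns 3:9, the installed data.table is probably older than the version mentioned in the answer.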

Regarding "r - Wrong column modes when reading data using the 'na.strings' and 'colClasses' arguments of the 'fread' function in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25724918/
