r - skip and autostart in fread

Reposted. Author: 行者123. Updated: 2023-12-02 08:41:09

I am using the following code to read a file with the data.table library:

fread(myfile, header=FALSE, sep=",", skip=100, colClasses=c("character","numeric","NULL","numeric"))

But I get the following error:

The supplied 'sep' was not found on line 80. To read the file as a single character column set sep='\n'.

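It says the sep was not found on line 80, but I set skip=100, so it should not be looking at the first 100 lines at all.

Update: I tried skip=101 and it works, but it skips the first line where the data starts.

I am using version 1.9.2 of the data.table package and R version 3.0.2 (64-bit) on Windows 7.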

Best answer

We don't know the version number you are using, but in this case I can guess.

Try setting autostart=101.

Note the first paragraph of Details in ?fread:

Once the separator is found on line autostart, the number of columns is determined. Then the file is searched backwards from autostart until a row is found that doesn't have that number of columns. Thus, the first data row is found and any human readable banners are automatically skipped. This feature can be particularly useful for loading a set of files which may not all have consistently sized banners. Setting skip>0 overrides this feature by setting autostart=skip+1 and turning off the search upwards step.
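The backward search described in that paragraph can be sketched in base R. This is only a simplified illustration, not fread's actual implementation: the function name find_first_data_row and the sample lines are hypothetical, and the sketch counts separators naively, ignoring quoted fields.

```r
# Simplified illustration of the search described in ?fread (data.table 1.9.x).
# NOT fread's real code: find_first_data_row is a hypothetical helper, and
# fields are counted naively (quoted separators are not handled).
find_first_data_row <- function(lines, autostart, sep = ",") {
  stopifnot(autostart >= 1L, autostart <= length(lines))
  # Fields per line = number of separators + 1
  n_fields <- lengths(regmatches(lines, gregexpr(sep, lines, fixed = TRUE))) + 1L
  # The column count is taken from the line at autostart
  target <- n_fields[autostart]
  first <- autostart
  # Walk upwards while the previous line has the same number of columns
  while (first > 1L && n_fields[first - 1L] == target) {
    first <- first - 1L
  }
  first
}

# Hypothetical file: a 100-line banner followed by 50 rows of 4 columns
banner <- sprintf("Banner line %d with no separators", 1:100)
data_rows <- sprintf("id%d,%d,x,%d", 1:50, 1:50, 51:100)
lines <- c(banner, data_rows)

first_row <- find_first_data_row(lines, autostart = 120)  # returns 101
```

Any autostart between 101 and 150 gives the same answer here, which is why the exact line number does not matter as long as it falls inside the table.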

The skip argument has:

If -1 (default) use the procedure described below starting on line autostart to find the first data row. skip>=0 means ignore autostart and take line skip+1 as the first data row (or column names according to header="auto"|TRUE|FALSE as usual). skip="string" searches for "string" in the file (e.g. a substring of the column names row) and starts on that line (inspired by read.xls in package gdata).
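As a rough sketch of the two forms of skip described above, the following builds a hypothetical file with a 100-line banner and reads it both ways. The file contents and variable names are assumptions made for illustration, and behaviour may differ between data.table versions.

```r
library(data.table)

# Hypothetical file: 100 banner lines, then 5 comma-separated data rows
tmp <- tempfile(fileext = ".csv")
banner <- sprintf("Banner line %d", 1:100)
rows <- sprintf("id%d,%d,x,%d", 1:5, 1:5, 6:10)
writeLines(c(banner, rows), tmp)

# Numeric skip: line skip+1 (here line 101) is taken as the first data row
dt_num <- fread(tmp, header = FALSE, sep = ",", skip = 100)

# String skip: start on the first line that contains the given substring
dt_str <- fread(tmp, header = FALSE, sep = ",", skip = "id1,")

unlink(tmp)
```

The string form avoids counting lines by hand, which helps when the banner length varies between files.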

And the autostart argument has:

Any line number within the region of machine readable delimited text, by default 30. If the file is shorter or this line is empty (e.g. short files with trailing blank lines) then the last non empty line (with a non empty line above that) is used. This line and the lines above it are used to auto detect sep, sep2 and the number of fields. It's extremely unlikely that autostart should ever need to be changed, we hope.

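In your case, the human-readable header is probably much longer than 30 lines, which is why I think setting autostart=101 might work. There is no need to use skip.

One motivation is convenience when a file contains multiple tables. By setting autostart to any line inside the table you want to extract from the file, it automatically finds the first data row and the header row for you, then reads just that table. You don't have to worry about getting the exact line number of the start of the data, as you do with skip. fread can currently read only one table. It could return a list of tables from a single file, but that gets a bit complicated and nobody has asked for it.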

Regarding r - skip and autostart in fread, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22086780/
