
r - Preserving numbers as character fields with write.csv (base R)

Reposted. Author: 行者123. Updated: 2023-12-01 10:45:01

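I have data.frames with character columns that contain numbers (such as '0123', '1234', etc.). When I write them to a csv and read them back, they end up as numeric columns. The write.csv and read.csv functions each have a quote parameter, which by default should quote strings on output and respect them on input, so this behavior is unexpected.

How can I avoid this without having to specify colClasses manually when I read the file back?

Reproducible example: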

# dummy data
fake_data <- data.frame(num=1:25, char=letters[1:25], charnum=as.character(1:25),
                        stringsAsFactors=FALSE)

# check out col classes - all good
sapply(fake_data, class)

#         num        char     charnum
#   "integer" "character" "character"

# write it to a file and read it back
fpath <- '~/Desktop/fake_data.csv'
write.csv(fake_data, fpath, row.names=FALSE)
fake_data2 <- read.csv(fpath, stringsAsFactors=FALSE)

# but now look, different classes!
sapply(fake_data2, class)

#         num        char     charnum
#   "integer" "character"   "integer"

The fault seems to be on the reading side, because the file is written with quotes:
> cat(readLines(fpath))
"num","char","charnum" 1,"a","1" 2,"b","2" 3,"c","3" 4,"d","4" 5,"e","5" 6,"f","6" 7,"g","7" 8,"h","8" 9,"i","9" 10,"j","10" 11,"k","11" 12,"l","12" 13,"m","13" 14,"n","14" 15,"o","15" 16,"p","16" 17,"q","17" 18,"r","18" 19,"s","19" 20,"t","20" 21,"u","21" 22,"v","22" 23,"w","23" 24,"x","24" 25,"y","25"
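
The reading-side behavior can be reproduced without touching the disk. The sketch below is illustrative only: the names csv_text, inferred and forced are made up for this example. It feeds a small quoted CSV to read.csv through its text argument and compares the inferred classes with those obtained when colClasses is given explicitly.

```r
# A two-column CSV held in memory; the second column is quoted in the text.
csv_text <- 'num,charnum\n1,"0123"\n2,"1234"'

# Default behavior: the quotes delimit the fields, but type inference
# still runs on the quoted column.
inferred <- read.csv(text = csv_text, stringsAsFactors = FALSE)
sapply(inferred, class)

# Explicit colClasses: NA means "infer", "character" keeps the text as-is,
# so the leading zero in "0123" survives.
forced <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                   colClasses = c(NA, "character"))
forced$charnum
```

This suggests the quotes only help read.csv find the field boundaries. They do not stop it from converting the column type afterwards.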

Session info:

R version 3.1.1 (2014-07-10) | Platform: x86_64-apple-darwin13.1.0 (64-bit)

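Best answer

Thanks for the answers. After looking into this further, I have the following to add.

Option 1: just use data.table::fread, which works the way I expected.

Option 2: do the following to construct the colClasses vector: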

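# read the header and the first data line, then split the data line on commas
# (this assumes no quoted field contains a comma itself)
first_data_line <- strsplit(readLines(fpath, n=2L)[2], ',')[[1]]

# find which fields are wrapped in double quotes
char_fields <- grep('"', first_data_line)

# construct the colClasses vector: NA lets read.csv infer the type
cc <- rep(NA_character_, length(first_data_line))
cc[char_fields] <- 'character'

# read the file back with the inferred classes
fake_data2 <- read.csv(fpath, colClasses=cc, stringsAsFactors=FALSE)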
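
For reuse, Option 2 can be wrapped in a small function. This is a minimal sketch, not part of the original answer: the helper name read_csv_keep_quoted is hypothetical, and it makes two assumptions. No quoted field may contain a comma, and the first data row must hold no missing character values, since write.csv writes NA unquoted.

```r
# Hypothetical helper: infer colClasses from the quoting on the first data row.
read_csv_keep_quoted <- function(path) {
  # Line 1 is the header, line 2 is the first data row.
  first_row <- strsplit(readLines(path, n = 2L)[2], ",", fixed = TRUE)[[1]]
  # NA lets read.csv infer the type; quoted fields are pinned to character.
  cc <- rep(NA_character_, length(first_row))
  cc[grepl('"', first_row, fixed = TRUE)] <- "character"
  read.csv(path, colClasses = cc, stringsAsFactors = FALSE)
}

# Round trip through a temporary file.
df <- data.frame(num = 1:3, charnum = c("0123", "1234", "0001"),
                 stringsAsFactors = FALSE)
tmp <- tempfile(fileext = ".csv")
write.csv(df, tmp, row.names = FALSE)
back <- read_csv_keep_quoted(tmp)
sapply(back, class)
unlink(tmp)
```

Here num should come back as integer and charnum as character with its leading zeros intact. If quoted fields may contain commas, splitting the first data line on ',' is no longer reliable and a proper CSV parse of that line would be needed.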

I'm a data.table fan anyway, so #1 is probably what I'll do.

Regarding "r - Preserving numbers as character fields with write.csv (base R)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/27432339/
