
ruby-on-rails - Ruby: CSV parser tripping over double quotes in my data

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:46:36

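I'm working on a daily scheduled rake task that downloads a CSV which is automatically sent to Dropbox every day, parses it, and saves it to the database. I have no control over how data is entered into the program that generates this CSV report, so I can't avoid double quotes appearing in some of the data. However, I'd like to know whether there is a way, within the rake task, to strip them or replace them with single quotes, or to somehow tell the parser about them so that it doesn't throw this error.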

Rake task code:

require 'net/http'
require 'csv'
require 'open-uri'

namespace :fp_import do
  desc "download abc_relations from dropbox, save as csv, create or update record in db"
  task :fp => :environment do
    data = URI.parse("<<file's dropbox link>>").read

    File.open(Rails.root.join('lib/assets', 'fp_relation.csv'), 'w') do |file|
      file.write(data)
    end

    file = Rails.root.join('lib/assets', 'fp_relation.csv')

    CSV.foreach(file) do |row|
      div, fg_style, fg_color, factory, part_style, part_color, comp_code, vendor, design_no, comp_type = row
      fg_sku = fg_style + "-" + fg_color
      part_sku = part_style + "-" + part_color

      relation = FgPart.where('part_sku LIKE ? AND fg_sku LIKE ?', "%#{part_sku}%", "%#{fg_sku}%").exists?
      if relation == false
        FgPart.create(fg_style: fg_style, fg_color: fg_color, fg_sku: fg_sku, factory: factory, part_style: part_style, part_color: part_color, part_sku: part_sku, comp_code: comp_code, comp_type: comp_type, design_no: design_no)
      end
    end
  end
end

There are about 35,000 rows in this CSV. Below is a sample. You can see the double quotes on line 4 of the sample.

Sample data:

"01","502210","018","ZH","5931","001","M","","UPHOLSTERED GLIDER A","RM"
"01","502310","053","ZH","25332","NO","O","","UPHOLSTERED GLIDER","BAG"
"01","502310","065","ZH","25332","NO","O","","UPHOLSTERED GLIDER","BAG"
"01","502312","424","ZH","25332","NO","O","","UPHOLSTERED GLIDER"AUS"","BAG"
"01","503210","277","ZH","25332","NO","O","","UPHOLSTERED GLIDER","BAG"
"01","503310","076","ZH","25332","NO","O","","UPHOLSTERED GLIDER","BAG"
"01","506210","018","ZH","25332","NO","O","","UPHOLSTERED GLIDER","BAG"
"01","506210","467","ZH","25332","NO","O","","UPHOLSTERED GLIDER","BAG"
"01","507610","932","AZ","25332","NO","O","","GLIDER","BAG"
"01","507610","932","AZ","5936","001","M","","GLIDER","RM"

Best answer

The source CSV is malformed; the quotes inside the data should have been escaped when the file was written.
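To make the diagnosis concrete, here is a minimal sketch using Ruby's standard `csv` library. It reduces row 4 of the sample to its last two fields, shows that the strict parser rejects it, and then shows what the same data looks like when a writer escapes it properly by doubling the embedded quotes. The variable names are illustrative only and not part of the original question.

```ruby
require 'csv'

# Row 4 of the sample, reduced to its last two fields.
malformed = '"UPHOLSTERED GLIDER"AUS"","BAG"'

# The strict parser rejects the unescaped quotes around AUS.
begin
  CSV.parse_line(malformed)
  rejected = false
rescue CSV::MalformedCSVError => e
  rejected = true
  warn "rejected: #{e.message}"
end

# A well-formed writer doubles embedded quotes inside a quoted field.
fields = ['UPHOLSTERED GLIDER"AUS"', 'BAG']
escaped = CSV.generate_line(fields, force_quotes: true)
puts escaped  # "UPHOLSTERED GLIDER""AUS""","BAG"

# The escaped line parses back into the original two fields.
round_trip = CSV.parse_line(escaped)
```

If the program that generates the report could be changed to emit the doubled form, no cleanup step would be needed at all.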

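Before parsing the file with CSV, I would preprocess it: remove the quotes that surround the commas, and replace any remaining double quotes with single quotes. If you don't want to edit the original file, you can write the result to a new file.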

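def fix_csv(file)
  path = file.to_s # also accepts a Pathname such as Rails.root.join(...)
  fixed = File.join(File.dirname(path), "fixed_" + File.basename(path))
  File.open(fixed, 'w') do |out|
    File.foreach(path) do |line|
      # remove the line ending, then the beginning and end quotes
      line = line.chomp.delete_prefix('"').delete_suffix('"')
      line = line.gsub('","', ',') # remove all quotes between commas
      line = line.tr('"', "'")     # replace double quotes with single quotes
      out << line + "\n"           # add the line plus a newline to the output
    end
  end
  fixed # path of the cleaned copy
end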
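As a quick check of what this cleanup produces, the sketch below applies the same three steps to row 4 of the sample and feeds the result to the CSV parser. `clean_line` is a hypothetical helper name introduced only for this illustration, and `delete_prefix`/`delete_suffix` assume Ruby 2.5 or later.

```ruby
require 'csv'

# Hypothetical helper: the same three steps as fix_csv, applied to one line.
def clean_line(line)
  line.chomp
      .delete_prefix('"').delete_suffix('"') # outer quotes
      .gsub('","', ',')                      # quotes around the separators
      .tr('"', "'")                          # remaining quotes become single
end

raw = '"01","502312","424","ZH","25332","NO","O","","UPHOLSTERED GLIDER"AUS"","BAG"' + "\n"

cleaned = clean_line(raw)
puts cleaned  # 01,502312,424,ZH,25332,NO,O,,UPHOLSTERED GLIDER'AUS',BAG

# The cleaned line no longer contains double quotes, so the strict parser accepts it.
row = CSV.parse_line(cleaned)
```

Two side effects are worth keeping in mind. Because the output is unquoted, an empty field such as the vendor column comes back as `nil` rather than `""`, and a field that itself contains a comma would be split in two.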

If you want to modify the same CSV file in place, you can do it like this:

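require 'tempfile'
require 'fileutils'

def modify_csv(file)
  temp_file = Tempfile.new('temp')
  begin
    File.foreach(file) do |line|
      # remove the line ending, then the beginning and end quotes
      line = line.chomp.delete_prefix('"').delete_suffix('"')
      line = line.gsub('","', ',') # remove all quotes between commas
      line = line.tr('"', "'")     # replace double quotes with single quotes
      temp_file << line + "\n"
    end
    temp_file.close
    FileUtils.mv(temp_file.path, file) # replace the original with the cleaned copy
  ensure
    temp_file.close
    temp_file.unlink
  end
end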

This is explained here, if you want to take a look. It will fix (clean up) your original CSV file.
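If the embedded double quotes should be preserved rather than turned into single quotes, one alternative (not part of the answer above) is to split each line on the `","` boundaries directly and skip the CSV parser for reading. This is only a sketch, and it assumes that every field in the file is quoted and that no field contains the sequence `","`. `split_quoted_line` is a hypothetical helper name.

```ruby
require 'csv'

# Hypothetical helper: split one fully quoted line on its "," boundaries.
def split_quoted_line(line)
  line.chomp
      .delete_prefix('"').delete_suffix('"') # outer quotes
      .split('","', -1)                      # -1 keeps trailing empty fields
end

raw = '"01","502312","424","ZH","25332","NO","O","","UPHOLSTERED GLIDER"AUS"","BAG"' + "\n"

fields = split_quoted_line(raw)

# Optionally re-emit the row as well-formed CSV with the quotes escaped.
repaired = CSV.generate_line(fields, force_quotes: true)
reparsed = CSV.parse_line(repaired)
```

Here the empty vendor column stays an empty string, and the design number keeps its original double quotes.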

Regarding "ruby-on-rails - Ruby: CSV parser tripping over double quotes in my data", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35733934/
