gpt4 book ai didi

Ruby 无法解析 CSV 文件:CSV::MalformedCSVError(第 1 行中的非法引用。)

转载 作者:数据小太阳 更新时间:2023-10-29 06:30:00 26 4
gpt4 key购买 nike

Ubuntu 12.04 LTS

Ruby ruby​​ 1.9.3dev(2011-09-23 修订版 33323)[i686-linux]

轨道 3.2.9

以下是我收到的 CSV 文件的内容:

"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total"
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80"

但是,当我尝试解析 CSV 文件时出现错误:

1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' }
=> {:col_sep=>",", :quote_char=>"\""}

1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):22
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

然后我尝试简化数据,即

"name","age","email"
"jignesh","30","jignesh@example.com"

但是我仍然遇到同样的错误:

      1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):23
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

我再次尝试像这样简化数据:

name,age,email
jignesh,30,jignesh@example.com

它起作用了。请看下面的输出:

  1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row }
name
age
email
jignesh
30
jignesh@example.com
=> nil

但我将收到包含引用数据的 CSV 文件,因此删除引号解决方案实际上并不是我正在寻找的。我无法找出导致错误的原因:CSV::MalformedCSVError: Illegal quoting in line 1. 在我之前的示例中。

我已经通过在我的文本编辑器中启用“显示空白字符”和“显示行尾”来验证 CSV 中没有前导/尾随空格。我还使用以下方法验证了编码。

  1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding
=> #<Encoding:UTF-8>

注意:我也尝试使用 CSV.read,但该方法出现同样的错误。

谁能帮我解决这个问题,让我明白哪里出了问题?

=====================

我刚刚在以下位置找到以下帖子:http://www.ruby-forum.com/topic/448070并尝试以下操作:

  file_data = file.read
file_data.gsub!('"', "'")
arr_of_arrs = CSV.parse(file_data)

arr_of_arrs.each do |arr|
Rails.logger.debug "=======#{arr}"
end

得到如下输出:

   =======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"]
=======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"]

由于使用的默认 col_sep 是逗号字符,因此无法正确读取数据。但是我尝试像这样使用 quote_char 选项:

  arr_of_arrs = CSV.parse(file_data, :quote_char => "'")

但最终出现以下错误:

   CSV::MalformedCSVError (Illegal quoting in line 1.):

谢谢,吉涅什

最佳答案

quote_chars = %w(" | ~ ^ & *)
begin
@report = CSV.read(csv_file, headers: :first_row, quote_char: quote_chars.shift)
rescue CSV::MalformedCSVError
quote_chars.empty? ? raise : retry
end

它并不完美,但大部分时间都有效。

注意CSV.parse 采用与 CSV.read 相同的参数,因此可以使用文件或内存中的数据

关于Ruby 无法解析 CSV 文件:CSV::MalformedCSVError(第 1 行中的非法引用。),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16772830/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com