rubyXL (Errno::ENOENT)

Reposted. Author: 数据小太阳. Updated: 2023-10-29 08:05:00
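I'm having a problem with a crawler I built with rubyXL. It traverses my file system correctly, but I get an (Errno::ENOENT) error. I've gone through all the rubyXL code and everything seems to check out. My output and code are attached below. Any suggestions?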

/Users/.../testdata.xlsx
/Users/.../moretestdata.xlsx
/Users/.../Lab 1 Data.xlsx
/Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:404:in `initialize': No such file or directory - /Users/Dylan/.../sheet6.xml (Errno::ENOENT)
from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:404:in `open'
from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:404:in `block in decompress'
from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:402:in `upto'
from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:402:in `decompress'
from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:47:in `parse'
from xlcrawler.rb:9:in `block in xlcrawler'
from /Users/Dylan/.rvm/rubies/ruby-1.9.3-p327/lib/ruby/1.9.1/find.rb:41:in `block in find'
from /Users/Dylan/.rvm/rubies/ruby-1.9.3-p327/lib/ruby/1.9.1/find.rb:40:in `catch'
from /Users/Dylan/.rvm/rubies/ruby-1.9.3-p327/lib/ruby/1.9.1/find.rb:40:in `find'
from xlcrawler.rb:6:in `xlcrawler'
from xlcrawler.rb:22:in `<main>'

require 'find'
require 'rubyXL'

def xlcrawler(path)
  count = 0
  Find.find(path) do |file| # iterate over every file under the given directory
    if file =~ /\.xlsx\z/ # check whether the file name ends in .xlsx
      puts file # confirm the crawler is traversing the file system
      worksheets = RubyXL::Parser.parse(file).worksheets # all worksheets of the Excel workbook
      worksheets.each do |worksheet| # iterate over each worksheet
        data = worksheet.extract_data.to_s # extract the worksheet data as a string so it can be matched against a regex
        if data =~ /regex/
          puts file
          count += 1
        end
      end
    end
  end
  puts "#{count} files were found"
end

xlcrawler('/Users/')

Best answer

I did some digging through the rubyXL code on GitHub, and it looks like there is a bug in the decompress method.

files['styles'] = Nokogiri::XML.parse(File.open(File.join(dir_path,'xl','styles.xml'),'r'))
@num_sheets = files['workbook'].css('sheets').children.size
@num_sheets = Integer(@num_sheets)

#adds all worksheet xml files to files hash
i=1
1.upto(@num_sheets) do
  filename = 'sheet'+i.to_s # <----- BUG IS HERE
  files[i] = Nokogiri::XML.parse(File.open(File.join(dir_path,'xl','worksheets',filename+'.xml'),'r'))
  i=i+1
end

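This block makes an assumption about how Excel numbers worksheets, and that assumption is incorrect. The code simply counts the sheets and then opens sheet1.xml through sheetN.xml in order. However, if you delete a worksheet and then create a new one, the number sequence is broken, so a file in that range may not exist.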
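To illustrate the numbering assumption described above, here is a minimal sketch of the alternative: list the worksheet files that actually exist instead of counting from 1 to N. The helper `worksheet_files` and the demo method are hypothetical and are not part of rubyXL; the only thing taken from the quoted code is the `xl/worksheets/sheetN.xml` layout.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of rubyXL): return the worksheet XML paths
# that actually exist under <dir_path>/xl/worksheets, ordered by sheet number.
def worksheet_files(dir_path)
  pattern = File.join(dir_path, 'xl', 'worksheets', 'sheet*.xml')
  Dir.glob(pattern).sort_by { |path| File.basename(path)[/\d+/].to_i }
end

# Demo: an unpacked workbook with three sheets whose files are numbered
# 1, 2 and 7, as can happen after sheets are deleted and re-created.
def demo_sheet_names
  Dir.mktmpdir do |dir|
    sheets_dir = File.join(dir, 'xl', 'worksheets')
    FileUtils.mkdir_p(sheets_dir)
    [1, 2, 7].each do |n|
      File.write(File.join(sheets_dir, "sheet#{n}.xml"), '<worksheet/>')
    end
    worksheet_files(dir).map { |path| File.basename(path) }
  end
end

p demo_sheet_names # => ["sheet1.xml", "sheet2.xml", "sheet7.xml"]
```

A loop that counts three sheets and opens sheet1.xml, sheet2.xml and sheet3.xml would fail on the third file here, which is the same failure mode as the missing sheet6.xml in the traceback.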

If you check your Lab 1 Data.xlsx file, you will see that there is no sheet6. If you open the VBA developer window (by pressing Alt + F11), you should see something like this:

[Image: sheet list]

As you can see, this arrangement breaks the loop and raises the exception when i = 6.
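Since the exception comes from inside the library, one possible workaround on the crawler side is to rescue Errno::ENOENT per file so that a single workbook with a gap in its sheet numbering does not stop the whole traversal. The sketch below is only an assumption about how that might look: `parse_all_xlsx` is a hypothetical helper, and the parser is passed in as a callable so the example runs without rubyXL installed.

```ruby
require 'find'

# Hypothetical helper: walk `root`, hand every .xlsx path to `parser`, and
# record the paths whose parse raised Errno::ENOENT instead of aborting.
# `parser` is any callable; with rubyXL it could be
#   ->(path) { RubyXL::Parser.parse(path) }
def parse_all_xlsx(root, parser)
  parsed = {}
  skipped = []
  Find.find(root) do |path|
    next unless File.file?(path) && path =~ /\.xlsx\z/
    begin
      parsed[path] = parser.call(path)
    rescue Errno::ENOENT => e
      warn "skipping #{path}: #{e.message}"
      skipped << path
    end
  end
  [parsed, skipped]
end
```

The method returns the successfully parsed workbooks keyed by path together with the list of skipped paths, so the affected files can be inspected afterwards. This hides the symptom only; the workbook with the missing sheet file is still not read.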

Regarding rubyXL (Errno::ENOENT), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14174451/
