ruby-on-rails - Attribute does not update in the background, but works fine when run inline

Reposted. Author: 可可西里. Updated: 2023-11-01 11:34:34

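I'm writing a screen scraper that takes a list of URLs from a post, then visits each URL and collects every link on the page. It then visits all of the links (the original ones and the scraped ones) and collects a list of images. When I run the job inline everything works (except that it takes 30 seconds to finish, which is a problem because the API call takes forever to respond). For some reason, when I run the same code in a background worker, 2 URLs never get updated to complete. It is always the same 2 URLs.

The odd thing is that I get this error message: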

3 TID-ov9t89ido WARN: NoMethodError: undefined method `search' for #<Mechanize::File:0x007f9d86e77a40>

3 TID-ov9t89ido WARN: /app/app/models/scraper.rb:16:in `scrape_images'
/app/app/workers/image_worker.rb:5:in `perform'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/processor.rb:151:in `execute_job'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/processor.rb:133:in `block (2 levels) in process'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/chain.rb:127:in `block in invoke'
/app/vendor/bundle/ruby/2.2.0/gems/newrelic_rpm-3.12.1.298/lib/new_relic/agent/instrumentation/sidekiq.rb:33:in `block in call'
/app/vendor/bundle/ruby/2.2.0/gems/newrelic_rpm-3.12.1.298/lib/new_relic/agent/instrumentation/controller_instrumentation.rb:361:in `perform_action_with_newrelic_trace'
/app/vendor/bundle/ruby/2.2.0/gems/newrelic_rpm-3.12.1.298/lib/new_relic/agent/instrumentation/sidekiq.rb:29:in `call'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/server/active_record.rb:6:in `call'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/server/retry_jobs.rb:74:in `call'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/server/logging.rb:11:in `block in call'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/logging.rb:31:in `with_context'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/server/logging.rb:7:in `call'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/chain.rb:132:in `call'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/middleware/chain.rb:132:in `invoke'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/processor.rb:128:in `block in process'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/processor.rb:167:in `stats'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/processor.rb:127:in `process'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/processor.rb:79:in `process_one'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/processor.rb:67:in `run'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/util.rb:16:in `watchdog'
/app/vendor/bundle/ruby/2.2.0/gems/sidekiq-4.1.1/lib/sidekiq/util.rb:24:in `block in safe_thread'

It comes from this code:

def self.scrape_images(uri)
  page = get_page(uri)
  base_url = page.uri.to_s
  images = page.search('//img') || []
  qualify_images(uri, images).push(base_url)
end

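I read that Mechanize is not thread-safe and thought that might be my problem, but I don't understand how it could produce this error when everything else works. Any help would be glorious; thanks for reading.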

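Best answer

I'm adding an answer because I couldn't find one on SO when I searched. If Mechanize visits a URL whose content type is plain text (for example a .txt file), it returns a Mechanize::File object instead of a Mechanize::Page. File has no search method, which matches the NoMethodError in the question. In my case I solved it with a guard clause: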

return [] unless page.class == Mechanize::Page
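
To show where that guard sits, here is a minimal sketch that runs without the mechanize gem. StubPage and StubFile are hypothetical stand-ins for Mechanize::Page and Mechanize::File, and this Scraper is a simplified illustration, not the scraper from the question (it takes the fetched object directly rather than a URI).

```ruby
# Stand-ins for Mechanize::Page and Mechanize::File so the sketch runs
# without the mechanize gem. They are NOT the real classes.
StubImage = Struct.new(:src)

class StubFile
  attr_reader :uri, :body

  def initialize(uri, body)
    @uri = uri
    @body = body
  end
end

class StubPage < StubFile
  # Pretend every HTML page contains exactly one image.
  def search(_xpath)
    [StubImage.new('/logo.png')]
  end
end

# Hypothetical scraper with the same overall shape as scrape_images.
class Scraper
  def self.scrape_images(page)
    # Guard clause: a non-HTML response has no DOM to search, so return
    # an empty list instead of letting NoMethodError abort the job.
    return [] unless page.is_a?(StubPage)

    page.search('//img').map(&:src).push(page.uri)
  end
end

html = StubPage.new('http://example.com/', '<img src="/logo.png">')
text = StubFile.new('http://example.com/robots.txt', 'User-agent: *')

html_result = Scraper.scrape_images(html) # image source plus the page URL
text_result = Scraper.scrape_images(text) # empty list, no exception raised
```

With the real library the check would read page.is_a?(Mechanize::Page), which also accepts subclasses of Page, whereas page.class == Mechanize::Page matches only that exact class. A duck-typed variant, page.respond_to?(:search), is another option if the exact class is not important.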

Regarding "ruby-on-rails - Attribute does not update in the background, but works fine when run inline", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36903254/
