Ruby Mechanize 404 => Net::HTTPNotFound

Reposted. Author: 数据小太阳. Updated: 2023-10-29 07:31:46
I have a URL that I cannot access with Mechanize, and I don't know why:

# Use ruby 2.1.6
require 'mechanize'
require 'axlsx' # 2.0.1
require 'roo' # 1.13.2

mechanize = Mechanize.new
mechanize.request_headers = { "Accept-Encoding" => "" }
mechanize.ignore_bad_chunking = true
mechanize.follow_meta_refresh = true

xlsx = Roo::Excelx.new("./base_list.xlsx")

xlsx.each_with_pagename do |name, sheet|
  sheet.each do |row|
    page = mechanize.get(row[0]) # row[0] holds the URL to fetch
  end
end

When I iterate over my list, I hit a URL like https://angel.co/_helencousins. I can open it in my browser, but not with Mechanize, which raises this error:

/.rvm/gems/ruby-2.1.6/gems/mechanize-2.7.4/lib/mechanize/http/agent.rb:316:in `fetch': 404 => Net::HTTPNotFound for https://angel.co/_helencousins -- unhandled response (Mechanize::ResponseCodeError)
from /Users/xxx/.rvm/gems/ruby-2.1.6/gems/mechanize-2.7.4/lib/mechanize.rb:464:in `get'
from scraper.rb:15:in `block (2 levels) in <main>'
from /Users/xxx/.rvm/gems/ruby-2.1.6/gems/roo-1.13.2/lib/roo/base.rb:428:in `block in each'
from /Users/xxx/.rvm/gems/ruby-2.1.6/gems/roo-1.13.2/lib/roo/base.rb:427:in `upto'
from /Users/xxx/.rvm/gems/ruby-2.1.6/gems/roo-1.13.2/lib/roo/base.rb:427:in `each'
from scraper.rb:14:in `block in <main>'
from /Users/xxx/.rvm/gems/ruby-2.1.6/gems/roo-1.13.2/lib/roo/base.rb:398:in `block in each_with_pagename'
from /Users/xxx/.rvm/gems/ruby-2.1.6/gems/roo-1.13.2/lib/roo/base.rb:397:in `each'
from /Users/xxx/.rvm/gems/ruby-2.1.6/gems/roo-1.13.2/lib/roo/base.rb:397:in `each_with_pagename'
from scraper.rb:13:in `<main>'

Best answer

OK,

The problem was that the site blocks the Mechanize user agent.

I simply changed it to: mechanize.user_agent_alias = 'Windows Chrome'

Regarding Ruby Mechanize 404 => Net::HTTPNotFound, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34749136/
