
Ruby Mechanize scraping: ResponseCodeError

Reposted. Author: 行者123. Updated: 2023-12-04 16:21:13

I am trying to scrape all the result pages of a website. It works, but sometimes the script stops with this error:

502 => Net::HTTPBadGateway for https://website.com/id/12/ -- unhandled response (Mechanize::ResponseCodeError)

I would like the script to keep running even when this error occurs.

My script:
require 'mechanize'
require 'csv'

a = Mechanize.new
CSV.open('datas.csv', "wb") do |csv|
  page = a.get("https://website.com/?page=1-200") #498
  number = 0
  page.links_with(:class => "btn btn-default").each do |link|
    post_link = link.href
    inside_page = a.get("https://website.com#{post_link}")
    title = inside_page.at("h1.serviceTitle").text.strip
    author = inside_page.at(".name").text.strip
    number += 1
    csv << [title, author]
  end
end

Any ideas?

Best answer

This can be solved easily with proper exception handling. You can check this page for a better explanation.
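Where the rescue goes decides whether the loop continues. A rescue around the whole loop ends the loop at the first failure, while a rescue inside the loop body skips only the failing item. Here is a minimal sketch of the second form using only the standard library. `FetchError`, `fetch` and `collect` are hypothetical stand-ins for a failing HTTP request and are not part of Mechanize.

```ruby
# Hypothetical error class standing in for an HTTP failure.
class FetchError < StandardError; end

# Hypothetical fetch that fails for one particular id.
def fetch(id)
  raise FetchError, "502 for item #{id}" if id == 2
  "item #{id}"
end

# Processes every id; a failure on one id does not stop the others.
def collect(ids)
  results = []
  failed = []
  ids.each do |id|
    begin
      results << fetch(id)
    rescue FetchError => e
      # Record the failure and let the loop continue with the next id.
      failed << e.message
    end
  end
  [results, failed]
end

results, failed = collect([1, 2, 3])
puts "fetched #{results.size}, failed #{failed.size}"
```

Rescuing a specific error class rather than everything keeps unrelated bugs, such as a typo in a method name, visible.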

As for your code, you can handle the exception like this:

require 'mechanize'
require 'csv'

a = Mechanize.new
CSV.open('datas.csv', "wb") do |csv|
  page = a.get("https://website.com/?page=1-200") #498
  number = 0
  page.links_with(:class => "btn btn-default").each do |link|
    # Rescue inside the loop so that one failed page does not stop the others.
    begin
      post_link = link.href
      inside_page = a.get("https://website.com#{post_link}")
      title = inside_page.at("h1.serviceTitle").text.strip
      author = inside_page.at(".name").text.strip
      number += 1
      csv << [title, author]
    rescue Mechanize::ResponseCodeError => e
      # Log the failed page and move on to the next link.
      warn "Skipping #{post_link}: #{e.response_code}"
    end
  end
end
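A 502 is often transient, so retrying a few times before giving up may recover pages that would otherwise be skipped. The sketch below is one possible approach, not something the answer prescribes. `with_retries` and its parameters (`attempts`, `base_delay`, `on`) are hypothetical names, and it uses only the standard library.

```ruby
# Hypothetical helper: runs the block and retries it when it raises one of
# the classes listed in `on`, waiting longer after each failed attempt.
def with_retries(attempts: 3, base_delay: 1.0, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield
  rescue *on
    # Out of attempts: re-raise the last error to the caller.
    raise if tries >= attempts
    # Exponential backoff: base_delay, then 2x, then 4x, and so on.
    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Usage with a block that fails twice and then succeeds.
calls = 0
value = with_retries(attempts: 3, base_delay: 0.01, on: [RuntimeError]) do
  calls += 1
  raise "temporary failure" if calls < 3
  "ok after #{calls} calls"
end
puts value
```

With Mechanize, the block would presumably wrap the request, for example `with_retries(on: [Mechanize::ResponseCodeError]) { a.get(url) }`, and an error that survives all attempts can still be rescued by the caller to skip that page.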

Regarding Ruby Mechanize scraping and ResponseCodeError, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46832980/
