python - How to continue when a request raises an error in a Beautiful Soup crawler

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:51:26

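I built a web crawler that reads thousands of URLs from a text file and then scrapes data from each page.
The list is large, and some of the URLs are broken.
So the crawler fails with this error: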

Traceback (most recent call last):
  File "C:/Users/khize_000/PycharmProjects/untitled3/new.py", line 57, in <module>
    crawl_data("http://www.foasdasdasdasdodily.com/r/126e7649cc-sweetssssie-pies-mac-and-cheese-recipe-by-the-dr-oz-show")
  File "C:/Users/khize_000/PycharmProjects/untitled3/new.py", line 18, in crawl_data
    data = requests.get(url)
  File "C:\Python27\lib\site-packages\requests\api.py", line 67, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python27\lib\site-packages\requests\api.py", line 53, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "C:\Python27\lib\site-packages\requests\adapters.py", line 437, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.foasdasdasdasdodily.com', port=80): Max retries exceeded with url: /r/126e7649cc-sweetssssie-pies-mac-and-cheese-recipe-by-the-dr-oz-show (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x0310FCB0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed',))

Here is my code:

import requests
from bs4 import BeautifulSoup

def crawl_data(url):
    global connectString
    data = requests.get(url)  # the ConnectionError in the traceback is raised here
    response = str(data)
    if response != "<Response [200]>":
        return
    soup = BeautifulSoup(data.text, "lxml")
    titledb = soup.h1.string

But it still gives me the same exception.

I simply want it to ignore the URLs that give no response and move on to the next URL.

Best answer

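You need to learn about exception handling. The simplest way to ignore these errors is to wrap the code that processes a single URL in a try-except construct, so that your code reads like this: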

try:
    <process a single URL>
except requests.exceptions.ConnectionError:
    pass

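This means that if the specified exception occurs, your program simply executes the pass statement (which does nothing) and moves on to the next URL.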
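As a rough illustration of how this could fit into the crawler from the question, here is a minimal sketch. It assumes the status check should be kept, uses `response.status_code` instead of comparing against the string `"<Response [200]>"`, and introduces a hypothetical helper `crawl_all` that is not part of the original code. The parsing step with Beautiful Soup is left out so the example stays focused on the exception handling.

```python
import requests


def crawl_data(url):
    """Fetch one URL; return the page text, or None if the URL should be skipped."""
    try:
        response = requests.get(url)
    except requests.exceptions.ConnectionError as exc:
        # The exception is raised inside requests.get(), before any response
        # object exists, so a status check placed after the call never runs.
        # Catching requests.exceptions.RequestException (the base class of
        # requests' own exceptions) would cover more failure types.
        print("Skipping %s: %s" % (url, exc))
        return None
    if response.status_code != 200:
        return None
    return response.text


def crawl_all(urls, crawl=crawl_data):
    """Hypothetical helper: crawl every URL, keeping only pages that were fetched."""
    pages = {}
    for url in urls:
        text = crawl(url)
        if text is not None:
            pages[url] = text
    return pages
```

With this shape, a broken URL only produces a "Skipping ..." line and the loop carries on. The list passed to `crawl_all` could be built by reading the text file line by line and stripping whitespace from each entry.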

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34837333/
