
python - urllib2.URLError: Reading Server Response Codes (Python)

Reposted. Author: 行者123. Updated: 2023-11-28 21:54:19

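I have a list of URLs. I want to check each server's response code to see whether any of the links are broken. I can read server errors (500) and broken links (404) without trouble, but as soon as the code reaches a non-website (e.g. "notawebsite_broken.com"), it crashes. I have searched around but haven't found an answer... I hope you can help.

Here is the code: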

import urllib2

#List of URLs. The third URL is not a website
urls = ["http://www.google.com","http://www.ebay.com/broken-link",
"http://notawebsite_broken"]

#Empty list to store the output
response_codes = []

# Run "for" loop: get server response code and save results to response_codes
for url in urls:
    try:
        connection = urllib2.urlopen(url)
        response_codes.append(connection.getcode())
        connection.close()
        print url, ' - ', connection.getcode()
    except urllib2.HTTPError, e:
        response_codes.append(e.getcode())
        print url, ' - ', e.getcode()

print response_codes

This gives the following output:

http://www.google.com  -  200
http://www.ebay.com/broken-link - 404
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    connection = urllib2.urlopen(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

Does anyone know a fix for this, or can anyone point me in the right direction?

Best answer

You can use the requests library:

import requests

urls = ["http://www.google.com","http://www.ebay.com/broken-link",
"http://notawebsite_broken"]

for u in urls:
    try:
        r = requests.get(u)
        print "{} {}".format(u,r.status_code)
    except Exception,e:
        print "{} {}".format(u,e)

http://www.google.com 200
http://www.ebay.com/broken-link 404
http://notawebsite_broken HTTPConnectionPool(host='notawebsite_broken', port=80): Max retries exceeded with url: /
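
If you would rather stay with the standard library, the crash in the question comes from the fact that only HTTPError is caught. A host that cannot be resolved never returns an HTTP response, so urlopen raises the more general URLError instead, and HTTPError is a subclass of URLError. The following is a minimal sketch of that idea, not a definitive implementation. It assumes Python 3, where urllib2 was split into urllib.request and urllib.error. The names check_url and check_urls, the opener parameter (there only so the function can be exercised without network access), and the 10 second timeout are hypothetical choices made for illustration.

```python
import urllib.error
import urllib.request


def check_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return (status_code, error_reason) for a single URL.

    Exactly one of the two values is None. `opener` and `timeout` are
    illustrative parameters, not part of the original question.
    """
    try:
        response = opener(url, timeout=timeout)
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        # This clause must come first: HTTPError is a subclass of URLError.
        return e.code, None
    except urllib.error.URLError as e:
        # No HTTP response at all, e.g. the host name could not be resolved.
        return None, str(e.reason)
    try:
        return response.getcode(), None
    finally:
        response.close()


def check_urls(urls, opener=urllib.request.urlopen):
    """Check every URL and return a list of (url, status_code, error_reason)."""
    return [(url,) + check_url(url, opener=opener) for url in urls]
```

Calling check_urls(urls) with the list from the question should then yield one entry per URL instead of stopping at the third one. Exceptions other than these two, such as a ValueError for a malformed URL, are not handled in this sketch.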

Regarding python - urllib2.URLError: Reading Server Response Codes (Python), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24419961/
