
python - Urllib bad request problem

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:24:31

I tried every 'User-Agent' listed here, and I still get urllib.error.HTTPError: HTTP Error 400: Bad Request. I also tried this, but then I get urllib.error.URLError: File Not Found. I don't know what to do. My current code is:

from bs4 import BeautifulSoup
import urllib.request, json, ast

with open("urller.json") as f:
    cc = json.load(f)  # the file I get the links from; you can try this link instead
    #cc = ../games/index.php?g_id=23521&game=0RBITALIS

for x in ast.literal_eval(cc):  # cc is a str(list), so I have to convert it
    if x.startswith("../"):

        r = urllib.request.Request("http://www.game-debate.com{}".format(x[2::]), headers={'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'})
        # x[2::] because I removed the '..' part from the URLs

        rr = urllib.request.urlopen(r).read()
        soup = BeautifulSoup(rr)

        for y in soup.find_all("ul", attrs={'class': ['devDefSysReqList']}):
            print(y.text)

Edit: if you try only one link, you may not see any error; I get the error at the 6th link every time.

Best answer

The quick fix is to replace the spaces with +:

url = "http://www.game-debate.com"
# Replace each space with "+" before building the request
r = urllib.request.Request(url + x[2:].replace(" ", "+"), headers={'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'})

A better option may be to let urllib quote the parameters:

from bs4 import BeautifulSoup
import urllib.request, json, ast
from urllib.parse import quote, urljoin

with open("urller.json") as f:
    cc = json.load(f)  # the file I get the links from

url = "http://www.game-debate.com"

for x in ast.literal_eval(cc):  # cc is a str(list), so it has to be converted
    if x.startswith("../"):
        # Percent-encode the spaces; keep "?", "=" and "&" safe so the
        # query string is not encoded along with them
        path = quote(x.lstrip("."), safe="/?=&")
        r = urllib.request.Request(urljoin(url, path), headers={
            'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'})

        rr = urllib.request.urlopen(r).read()
        soup = BeautifulSoup(rr, "html.parser")
        print(rr.decode("utf-8"))

        for y in soup.find_all("ul", attrs={'class': ['devDefSysReqList']}):
            print(y.text)

Spaces are not valid in a URL; they have to be percent-encoded as %20 or replaced with +.
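
To see the two encodings side by side, here is a minimal sketch using only urllib.parse. The helper name to_absolute_url and the sample link containing "Some Game" are made up for illustration and are not from the question's data. The sketch assumes the relative links are not already percent-encoded, since quote would encode an existing "%" a second time.

```python
from urllib.parse import quote, quote_plus, urljoin, urlsplit, urlunsplit

BASE = "http://www.game-debate.com"  # base URL used in the question


def to_absolute_url(relative, base=BASE):
    """Hypothetical helper: encode a '../path?query' link and make it absolute."""
    # Drop the leading dots, then split so path and query are encoded separately
    parts = urlsplit(relative.lstrip("."))
    path = quote(parts.path)                # space -> %20, "/" is kept
    query = quote(parts.query, safe="=&")   # keep the key=value&key=value separators
    return urljoin(base, urlunsplit(("", "", path, query, "")))


# quote percent-encodes a space, quote_plus turns it into "+"
print(quote("a b"))       # a%20b
print(quote_plus("a b"))  # a+b

# Made-up link with a space in the game name
print(to_absolute_url("../games/index.php?g_id=1&game=Some Game"))
```

Encoding the path and the query separately avoids turning "?", "=" and "&" into %3F, %3D and %26, which is what quote does by default because only "/" is treated as safe.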

Regarding python - Urllib bad request problem, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38112806/
