
python - HTTP Error 403: Forbidden

Reposted. Author: 太空狗. Updated: 2023-10-29 17:21:49

I wrote a Python script for personal use, but it does not work for Wikipedia...

This works:

import urllib2, sys
from bs4 import BeautifulSoup

site = "http://youtube.com"
page = urllib2.urlopen(site)
soup = BeautifulSoup(page)
print soup

This does not work:

import urllib2, sys
from bs4 import BeautifulSoup

site= "http://en.wikipedia.org/wiki/StackOverflow"
page = urllib2.urlopen(site)
soup = BeautifulSoup(page)
print soup

This is the error:

Traceback (most recent call last):
  File "C:\Python27\wiki.py", line 5, in <module>
    page = urllib2.urlopen(site)
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 406, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden

Best answer

Within the current code:

Python 2.X

import urllib2
from bs4 import BeautifulSoup  # same package the question imports

site = "http://en.wikipedia.org/wiki/StackOverflow"
# Send a browser-like User-Agent instead of the default Python-urllib one
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(site, headers=hdr)
page = urllib2.urlopen(req)
soup = BeautifulSoup(page, "html.parser")
print soup

Python 3.X

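from bs4 import BeautifulSoup
from urllib.request import Request, urlopen

site = "http://en.wikipedia.org/wiki/StackOverflow"
# Send a browser-like User-Agent instead of the default Python-urllib one
hdr = {'User-Agent': 'Mozilla/5.0'}
req = Request(site, headers=hdr)
page = urlopen(req)
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(page, "html.parser")
print(soup)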
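
As a minimal sketch, the Python 3 request above could be wrapped in a helper that reports a 403 instead of ending in a traceback. The name fetch_html, the timeout value, and the choice to return None on a 403 are assumptions made for illustration; they are not part of the original answer.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_html(url, user_agent="Mozilla/5.0", timeout=10):
    """Return the response body as bytes, or None if the server answers 403.

    fetch_html is a hypothetical helper name used only for this sketch.
    """
    # Attach the User-Agent header explicitly, as in the answer's code.
    req = Request(url, headers={"User-Agent": user_agent})
    try:
        with urlopen(req, timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 403:
            # The server refused the request; the User-Agent is a likely suspect.
            print("403 Forbidden for", url, "with User-Agent", repr(user_agent))
            return None
        # Any other HTTP error is re-raised unchanged.
        raise
```

Calling fetch_html("http://en.wikipedia.org/wiki/StackOverflow") would then return the page bytes, which can be passed to BeautifulSoup as before.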

Python 3.X with Selenium (executes JavaScript functions)

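from selenium import webdriver

# PhantomJS support was removed in Selenium 4, so headless Chrome is swapped in here
options = webdriver.ChromeOptions()
options.add_argument("--headless")
browser = webdriver.Chrome(options=options)
browser.get("http://en.wikipedia.org/wiki/StackOverflow")  # get() returns None
assert "Stack Overflow - Wikipedia" in browser.title
browser.quit()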

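The modified version works because Wikipedia checks whether the User-Agent belongs to a "popular browser". By default, urllib2 and urllib.request identify themselves as Python-urllib/x.y, and that is the request that was refused with 403 in the question.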
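
To see the difference without touching the network, the sketch below prints the User-Agent that a default urllib.request opener would add, and then the one carried by a request built as in the answer. The variable names are illustrative only.

```python
import urllib.request

# Headers that a default opener adds to requests that do not set them.
default_headers = dict(urllib.request.build_opener().addheaders)
print(default_headers["User-agent"])  # starts with "Python-urllib/"

# A request built as in the answer carries its own User-Agent,
# so the opener's default is not applied to it.
req = urllib.request.Request(
    "http://en.wikipedia.org/wiki/StackOverflow",
    headers={"User-Agent": "Mozilla/5.0"},
)
# Request stores header names capitalized, hence "User-agent".
print(req.get_header("User-agent"))
```

Nothing is sent here; the code only inspects the headers that would go out.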

Regarding python - HTTP Error 403: Forbidden, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13055208/
