
python - How to use both http and https proxies in Scrapy?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:22:47

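I am new to Scrapy. I found out how to use an http proxy, but I want to use both http and https proxies, because the links I crawl include both http and https URLs. How can I use http and https proxies together?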

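import base64

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        # Scrapy reads the proxy for this request from request.meta['proxy'].
        request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"
        # like here: request.meta['proxy'] = "https://YOUR_PROXY_IP:PORT"
        proxy_user_pass = "USERNAME:PASSWORD"
        # Set up basic authentication for the proxy.
        # b64encode replaces base64.encodestring, which was removed in
        # Python 3.9 and appended a trailing newline to its output.
        encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass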

Best answer

You can use the standard environment variables together with HttpProxyMiddleware:

This middleware sets the HTTP proxy to use for requests, by setting the proxy meta value for Request objects.

Like the Python standard library modules urllib and urllib2, it obeys the following environment variables:

http_proxy
https_proxy
no_proxy
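As a rough illustration of the variables listed above, the following sketch sets them from Python and reads them back with the standard library's `urllib.request.getproxies()`, which follows the same environment-variable convention the quoted documentation refers to. The proxy hosts, ports, and credentials are placeholders, not values from the question. The assumption here is that the variables are set before the crawl starts, since the middleware would need them at start-up.

```python
import os
from urllib.request import getproxies

# Placeholder proxy addresses (assumptions for illustration only).
HTTP_PROXY_URL = "http://USERNAME:PASSWORD@proxy.example.com:8080"
HTTPS_PROXY_URL = "http://USERNAME:PASSWORD@proxy.example.com:8443"

# One variable per URL scheme, plus hosts that should bypass the proxy.
os.environ["http_proxy"] = HTTP_PROXY_URL
os.environ["https_proxy"] = HTTPS_PROXY_URL
os.environ["no_proxy"] = "localhost,127.0.0.1"

# getproxies() returns a mapping from scheme to proxy URL.
proxies = getproxies()
print(proxies.get("http"))
print(proxies.get("https"))
```

In a shell, the same effect is usually achieved by exporting the variables before running `scrapy crawl`.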

You can also set the meta key proxy per-request, to a value like http://some_proxy_server:port.
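If environment variables are not an option, the per-request `proxy` meta key can be chosen from the URL scheme. This is only a minimal sketch: `PROXIES`, `proxy_for`, and `SchemeProxyMiddleware` are hypothetical names introduced for illustration, and the proxy addresses are placeholders. It also assumes that the proxy for https links is itself reached over plain http, which is common but depends on the proxy provider.

```python
from urllib.parse import urlparse

# Hypothetical proxy endpoints, keyed by the scheme of the target URL.
PROXIES = {
    "http": "http://YOUR_PROXY_IP:PORT",
    "https": "http://YOUR_HTTPS_PROXY_IP:PORT",
}


def proxy_for(url, proxies=PROXIES):
    """Return the proxy configured for the URL's scheme, or None."""
    scheme = urlparse(url).scheme.lower()
    return proxies.get(scheme)


class SchemeProxyMiddleware:
    """Downloader middleware sketch: pick request.meta['proxy'] by scheme."""

    def process_request(self, request, spider):
        proxy = proxy_for(request.url)
        # Leave requests alone if a proxy was already chosen elsewhere.
        if proxy and "proxy" not in request.meta:
            request.meta["proxy"] = proxy
        # Returning None lets Scrapy continue processing the request.
        return None
```

A middleware like this would be enabled through the `DOWNLOADER_MIDDLEWARES` setting, under whatever module path the project uses.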

Regarding python - How to use both http and https proxies in Scrapy?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31313760/
