
python - Web scraping error: Request and No Connection


I'm trying to write my first program in Python. The goal of this web-scraping program is to pull prices for several types of products from possibly 100 or more websites. I was able to write the program for a single website and export the results to an Excel file without any problems. However, I'm now running into trouble when trying to scrape multiple websites.

I tried putting the URLs into a list and then creating a for loop that runs the same code for each URL. Here is the code:

import pandas as pd
import requests
from bs4 import BeautifulSoup

#Aero Stripped Lowers
url = ['https://www.aeroprecisionusa.com/ar15/lower-receivers/stripped-lowers?product_list_limit=all', 'https://www.aeroprecisionusa.com/ar15/lower-receivers/complete-lowers?product_list_limit=all']
for website in url:
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0"}
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')

#Locating All Stripped Aero Lowers On Site
all_aero_stripped_lowers = soup.find(class_='products wrapper container grid products-grid')
items = all_aero_stripped_lowers.find_all(class_='product-item-info')

#Identifying All Aero Stipped Lower Names And Prices
aero_stripped_lower_names = [item.find(class_='product-item-link').text for item in items]
aero_stripped_lower_prices = [item.find(class_='price').text for item in items]


Aero_Stripped_Lowers_Consolidated = pd.DataFrame(
{'Aero Stripped Lower': aero_stripped_lower_names,
'Prices': aero_stripped_lower_prices,
})

Aero_Stripped_Lowers_Consolidated.to_csv('MasterPriceTracker.csv')

I get the following error:

Traceback (most recent call last):
  File "C:/Users/ComputerName/Documents/PyCharm_Projects/Aero Stripped Lower List/NewAeroStrippedLower.py", line 9, in <module>
    page = requests.get(url, headers=headers)
  File "C:\Python\Python38\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python\Python38\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python\Python38\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python\Python38\lib\site-packages\requests\sessions.py", line 640, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Python\Python38\lib\site-packages\requests\sessions.py", line 731, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '['https://www.aeroprecisionusa.com/ar15/lower-receivers/stripped-lowers?product_list_limit=all', 'https://www.aeroprecisionusa.com/ar15/lower-receivers/complete-lowers?product_list_limit=all']'

Thanks in advance for any help!

Best Answer

You are calling requests.get() on the whole list instead of on a single URL, which is why requests raises InvalidSchema: it cannot find a connection adapter for a list. Inside the loop, pass the loop variable website instead. It's a simple mistake:

# -- snip --

for website in url:
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0"}
page = requests.get(website, headers=headers) # not 'url'
soup = BeautifulSoup(page.content, 'html.parser')

# -- snip --
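
A side note beyond the accepted fix: once the correct URL is passed, each pass through the loop still overwrites the variables from the previous site, so the CSV written after the loop only holds data from the last URL. Below is a minimal sketch of one way to accumulate the names and prices from every URL before writing a single file; the 'Product'/'Price' column names, the .strip() calls, and index=False are illustrative choices, not part of the original code.

import pandas as pd
import requests
from bs4 import BeautifulSoup

urls = [
    'https://www.aeroprecisionusa.com/ar15/lower-receivers/stripped-lowers?product_list_limit=all',
    'https://www.aeroprecisionusa.com/ar15/lower-receivers/complete-lowers?product_list_limit=all',
]
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0"}

names, prices = [], []
for website in urls:
    page = requests.get(website, headers=headers)  # single URL, not the list
    soup = BeautifulSoup(page.content, 'html.parser')
    container = soup.find(class_='products wrapper container grid products-grid')
    for item in container.find_all(class_='product-item-info'):
        names.append(item.find(class_='product-item-link').text.strip())   # product name
        prices.append(item.find(class_='price').text.strip())              # listed price

# Build one DataFrame from all sites and write it once, after the loop
pd.DataFrame({'Product': names, 'Price': prices}).to_csv('MasterPriceTracker.csv', index=False)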

For python - Web scraping error: Request and No Connection, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59645753/
