python - Scraped Span Returning None Get_Text() Python Beautiful Soup

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:45:40

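I have already scraped the links for the cars, and now I want to follow each link and scrape some data about each car, but my code returns an empty array (or None if I print the results individually). Any ideas on how to fix this?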

import bs4 as bs
import urllib

source = urllib.request.urlopen('http://www.25thstauto.com/inventory.aspx?cursort=asc&pagesize=500').read()
soup = bs.BeautifulSoup(source, 'lxml')

car = soup.select('a[id*=ctl00_cphBody_inv1_rptInventoryNew]')
for a in car:
    source2 = urllib.request.urlopen('http://www.25thstauto.com/'+a.get('href')).read()
    price.append(soup.find('span', {'id': 'ctl00_cphBody_inv1_lblPrice'}))
    print(price)

Best answer

import bs4 as bs
import urllib.request  # importing the submodule explicitly makes urllib.request available

source = urllib.request.urlopen('http://www.25thstauto.com/inventory.aspx?cursort=asc&pagesize=500').read()
soup = bs.BeautifulSoup(source, 'lxml')
price = []  # the list must exist before the loop appends to it
car = soup.select('a[id*=ctl00_cphBody_inv1_rptInventoryNew]')
for a in car:
    source2 = urllib.request.urlopen('http://www.25thstauto.com/'+a.get('href')).read()
    # make a new soup based on the link, do not use the old soup
    soup2 = bs.BeautifulSoup(source2, 'lxml')
    price.append(soup2.find('span', {'id': 'ctl00_cphBody_inv1_lblPrice'}))
    print(price)

Output:

[<span id="ctl00_cphBody_inv1_lblPrice">$2,995</span>]
[<span id="ctl00_cphBody_inv1_lblPrice">$2,995</span>, <span id="ctl00_cphBody_inv1_lblPrice">$2,995</span>]
[<span id="ctl00_cphBody_inv1_lblPrice">$2,995</span>, <span id="ctl00_cphBody_inv1_lblPrice">$2,995</span>, <span id="ctl00_cphBody_inv1_lblPrice">$2,995</span>]
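
The output above still contains whole span tags, and the question title mentions Get_Text(). As a rough sketch of how the text could be pulled out safely: find() returns None when no element matches, so calling get_text() on the result directly would raise an AttributeError on any page without a price span. The PAGES dictionary and the extract_price helper below are hypothetical stand-ins added for illustration (inline HTML instead of live requests, and 'html.parser' instead of 'lxml' so no extra parser is needed); only the span id comes from the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the HTML of two detail pages; a real run would
# fetch these with urllib.request.urlopen(...).read() as in the answer.
PAGES = {
    "car1.aspx": '<html><body><span id="ctl00_cphBody_inv1_lblPrice">$2,995</span></body></html>',
    "car2.aspx": "<html><body><p>No price listed</p></body></html>",
}

PRICE_ID = "ctl00_cphBody_inv1_lblPrice"  # span id taken from the original code


def extract_price(html):
    """Return the price text from one detail page, or None if the span is absent."""
    soup = BeautifulSoup(html, "html.parser")  # a fresh soup for each page
    span = soup.find("span", {"id": PRICE_ID})
    # find() returns None when nothing matches, so guard before get_text()
    return span.get_text(strip=True) if span is not None else None


prices = {name: extract_price(html) for name, html in PAGES.items()}
print(prices)  # {'car1.aspx': '$2,995', 'car2.aspx': None}
```

In the answer's loop, the same guard would apply to soup2.find(...) before appending, so that a page without a price yields None rather than an exception.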

Regarding "python - Scraped Span Returning None Get_Text() Python Beautiful Soup", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/41784246/
