
Python, Beautiful Soup, web scraping, Pandas, DataFrame

Reposted. Author: 行者123. Updated: 2023-11-27 23:39:22

Complex Beautiful Soup query

I am getting familiar with Beautiful Soup and with Pandas DataFrames, but I can't seem to combine the two.

import urllib.request
from bs4 import BeautifulSoup
import pandas as pd


connection = urllib.request.urlopen('http://www.carfolio.com/specifications/models/?man=557')
soup = BeautifulSoup(connection, "html.parser", from_encoding='utf-7')

soup.decode('utf-7','ignore')

href_tag = soup.find_all(span="detail")
for href_tag in soup.body.stripped_strings:
    print(str(href_tag.encode('utf-7')))

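Ultimately, my goal is to scrape each car and build a DataFrame containing the relevant information (the "details"), such as horsepower, torque, weight, and so on. I just don't know how to "grab" the details.

Relevant HTML Code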

I have looked around and there are examples, but most of them don't access the "abbr title". Thanks.

Best answer

If you are okay with making an additional request for each car in the list, here is a working demo of how to get a car's characteristics:

>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> soup = BeautifulSoup(requests.get("http://www.carfolio.com/specifications/models/car/?car=427691").content)
>>> for item in soup.select("div.summary dl dt"):
... print(item.get_text(strip=True), item.find_next_sibling("dd").get_text(strip=True))
...
(u'What body style?', u'hatchback with 4/5 seats')
(u'How long?', u'3973mm')
(u'How heavy?', u'1110kg')
(u'What size engine?', u'1 litre, 999cm3')
(u'How many cylinders?', u'3, Straight')
(u'How much power?', u'95PS/ 94bhp/ 70kW@ 5000-5500rpm')
(u'How much torque?', u'160Nm/ 118ft.lb/ 16.3kgm@ 1500-3500rpm')
(u'How quick?', u'0-100km/h: 10.9s')
(u'How fast?', u'186km/h, 116mph')
(u'How economical?', u'5.0/3.7/4.2 l/100km urban/extra-urban/combined')
(u'Whatcarbon dioxide emissions?', u'97.0CO2g/km')
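
Since the question asks for a DataFrame, the dt/dd pairs above can be collected into one dictionary per car and handed to Pandas. The following is only a minimal sketch: `SAMPLE_HTML` is a hypothetical stand-in for one car page (the real markup may differ), and `parse_summary` is an illustrative helper name, not something from the original answer. It assumes the `div.summary dl dt` structure that the selector above relies on.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for a single car page; the real page may be structured differently.
SAMPLE_HTML = """
<div class="summary">
  <dl>
    <dt>How heavy?</dt><dd>1110kg</dd>
    <dt>How much power?</dt><dd>95PS/ 94bhp/ 70kW@ 5000-5500rpm</dd>
    <dt>How fast?</dt><dd>186km/h, 116mph</dd>
  </dl>
</div>
"""


def parse_summary(html):
    """Return {label: value} for every dt/dd pair inside div.summary."""
    soup = BeautifulSoup(html, "html.parser")
    details = {}
    for dt in soup.select("div.summary dl dt"):
        dd = dt.find_next_sibling("dd")
        if dd is not None:  # skip a label that has no matching value
            details[dt.get_text(strip=True)] = dd.get_text(strip=True)
    return details


# One dictionary per car; with real data, each HTML string would come
# from a separate request to that car's page.
rows = [parse_summary(SAMPLE_HTML)]
df = pd.DataFrame(rows)
print(df)
```

Each car becomes one row and each label becomes a column, so cars that lack a given label simply get a missing value in that column.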

For more on Python, Beautiful Soup, web scraping, Pandas, and DataFrame, see the similar question found on Stack Overflow: https://stackoverflow.com/questions/32343976/
