
python - Web scraping with BeautifulSoup 4

Reposted. Author: 太空宇宙. Updated: 2023-11-03 19:52:52

I am trying to read the HTML of forexfactory.com and return all span tags that have the class worse or better.

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.forexfactory.com/#closed")

soup = BeautifulSoup(r.text, 'lxml')

table = soup.find("table", class_="calendar__table")

Wnews = []
Bnews = []
Tnews = []

for row in table.find_all('tr', class_='calendar__row--grey'):

    currency = row.find("td", class_="currency")
    # print(currency.prettify()) # before get text
    currency = currency.get_text(strip=True)

    actual = row.find("td", class_="actual")
    actual = actual.get_text(strip=True)

    impact = row.find("span", class_="worse")
    try:
        impactW = impact.get_text(strip=True)
    except AttributeError:
        continue

    impact2 = row.find("span", class_="better")
    try:
        impactB = impact2.get_text(strip=True)
    except AttributeError:
        continue

    # print(impact)

    # news.append(currency)news.append(actual)

    if currency == "GBP":

        actual = row.find("td", class_="actual")
        actual = actual.get_text(strip=True)

        Tnews.append(currency)

        forecast = row.find("td", class_="forecast")
        forecast = forecast.get_text(strip=True)

        Wnews.append(impactW)
        Bnews.append(impactB)

    print(impact2)

print(impact2) also returns the span tags with class = "Revised Better", not only the ones with class better. What did I write wrong?

Best answer

To get all span tags whose class is worse (without the revised ones), try the code below. It uses a CSS selector.

worsedata=[item.text.strip() for item in soup.select('table.calendar__table tr.calendar__row--grey span.worse:not(.revised)')]
print(worsedata)

Output:

['0.0%', '-0.2%', '-0.3%', '-1.7%', '0.1%', '-1.2%']
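
For context on why the selector helps: in BeautifulSoup, class_="better" matches any tag that carries better as one of its classes, so a span with two classes (revised and better) is matched as well. The following is a minimal sketch on made-up markup, not the real forexfactory.com page; the previous cell and the sample values are assumptions for illustration, and html.parser is used only so the sketch does not need lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating one calendar row; not copied from the site.
html = """
<table class="calendar__table">
  <tr class="calendar__row--grey">
    <td class="actual"><span class="better">1.9%</span></td>
    <td class="previous"><span class="revised better">0.4%</span></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches any single class on the tag, so both spans are returned.
loose = [s.get_text(strip=True) for s in soup.find_all("span", class_="better")]

# The CSS selector excludes spans that also carry the "revised" class.
strict = [s.get_text(strip=True) for s in soup.select("span.better:not(.revised)")]

# Alternative without CSS: require the class list to be exactly ["better"].
exact = [
    s.get_text(strip=True)
    for s in soup.find_all(
        lambda tag: tag.name == "span" and tag.get("class") == ["better"]
    )
]

print(loose)   # ['1.9%', '0.4%']
print(strict)  # ['1.9%']
print(exact)   # ['1.9%']
```

The :not(.revised) form and the exact class-list comparison give the same result here; the selector is shorter, while the lambda also rejects any other extra class.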

To get only the span tags with class better:

betterdata=[item.text.strip() for item in soup.select('table.calendar__table tr.calendar__row--grey span.better:not(.revised)')]
print(betterdata)

Output:

['1.9%', '-5.3B']
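
If the values need to stay grouped by currency, as in the question's loop, the same selectors can be applied row by row. This is a rough sketch under assumptions: the sample rows, the previous cell, and the helper name collect are invented for illustration, and the live page markup may differ. It also avoids a side effect of the question's loop, where a continue in either except branch skips any row that lacks one of the two spans.

```python
from bs4 import BeautifulSoup

# Hypothetical sample rows; the real page markup may differ.
html = """
<table class="calendar__table">
  <tr class="calendar__row--grey">
    <td class="currency">GBP</td>
    <td class="actual"><span class="better">1.9%</span></td>
    <td class="previous"><span class="revised worse">-0.2%</span></td>
  </tr>
  <tr class="calendar__row--grey">
    <td class="currency">USD</td>
    <td class="actual"><span class="worse">-0.3%</span></td>
    <td class="previous">0.1%</td>
  </tr>
  <tr class="calendar__row--grey">
    <td class="currency">GBP</td>
    <td class="actual"><span class="worse">-1.7%</span></td>
    <td class="previous"></td>
  </tr>
</table>
"""


def collect(html, wanted_currency):
    """Return (worse, better) value lists for the rows of one currency."""
    soup = BeautifulSoup(html, "html.parser")
    worse, better = [], []
    for row in soup.select("table.calendar__table tr.calendar__row--grey"):
        currency = row.select_one("td.currency")
        if currency is None or currency.get_text(strip=True) != wanted_currency:
            continue
        # select_one returns None when nothing matches, so no try/except is needed.
        w = row.select_one("span.worse:not(.revised)")
        b = row.select_one("span.better:not(.revised)")
        if w is not None:
            worse.append(w.get_text(strip=True))
        if b is not None:
            better.append(b.get_text(strip=True))
    return worse, better


gbp_worse, gbp_better = collect(html, "GBP")
print(gbp_worse)   # ['-1.7%']
print(gbp_better)  # ['1.9%']
```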

Regarding python - Web scraping with BeautifulSoup 4, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59721267/
