python - How to skip or truncate characters or symbols from the text I need. Web scraping with BeautifulSoup

Reposted. Author: 行者123. Updated: 2023-12-01 06:57:23

I need to get the price (61,990) from inside the div tag, but how do I strip the currency symbol?

Similarly, I only need the rating (4.7) and nothing that comes after it in the same div, such as the img src. How can I ignore or skip it?

Code example:

from bs4 import BeautifulSoup
import requests

price = []
ratings=[]
response = requests.get("https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniq")
soup = BeautifulSoup(response.text, 'html.parser')
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34'})

Best answer

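Here you go. Just use the .text attribute and treat the result as an ordinary string. For the price, keep every character except the first, which is the currency symbol. For the rating, .text returns only the text inside the div, so the img tag is left out automatically.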

from bs4 import BeautifulSoup
import requests

price = []
ratings=[]
response = requests.get("https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniq")
soup = BeautifulSoup(response.text, 'html.parser')
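for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    # .text gives the tag's text as a plain string; [1:] drops the leading currency symbol
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'}).text[1:]
    # .text contains only text nodes, so the <img> inside the rating div is ignored
    rating=a.find('div', attrs={'class':'hGSR34'}).text
    print(price)
    print(rating)

Example values for one listing, as reported in the original answer: price '52,990', rating '4.3'.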
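
The slicing approach above assumes the currency symbol is always exactly one leading character and that every listing has both a price and a rating. As a rough sketch of a more defensive variant, the snippet below runs against a small hand-written HTML string instead of the live site. The markup in SAMPLE_HTML is an assumption modeled on the class names in the question, and parse_price, parse_rating, and extract_listings are hypothetical helper names, not part of BeautifulSoup. It also assumes prices are whole numbers, since it simply discards every non-digit character.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the question; the real page may differ.
SAMPLE_HTML = """
<a class="_31qSD5" href="/laptop-1">
  <div class="hGSR34">4.7<img src="star.svg"></div>
  <div class="_1vC4OE _2rQ-NK">₹61,990</div>
</a>
<a class="_31qSD5" href="/laptop-2">
  <div class="_1vC4OE _2rQ-NK">₹52,990</div>
</a>
"""

def parse_price(text):
    """Drop every non-digit (currency symbol, commas, spaces) and return an int.

    Assumes whole-number prices; a decimal point would be discarded too.
    """
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None

def parse_rating(text):
    """Return the first decimal number found in the text as a float, or None."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None

def extract_listings(html):
    """Return a list of (price, rating) tuples, using None for missing fields."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for a in soup.find_all("a", href=True, class_="_31qSD5"):
        price_tag = a.find("div", class_="_1vC4OE")
        rating_tag = a.find("div", class_="hGSR34")
        # find() returns None when nothing matches, so check before reading text
        price = parse_price(price_tag.get_text()) if price_tag else None
        rating = parse_rating(rating_tag.get_text()) if rating_tag else None
        results.append((price, rating))
    return results

listings = extract_listings(SAMPLE_HTML)
print(listings)  # [(61990, 4.7), (52990, None)]
```

Converting to int and float is a design choice that makes the values sortable and comparable; if only the display strings are needed, the .text[1:] slice from the answer is enough.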

Regarding "python - How to skip or truncate characters or symbols from the text I need. Web scraping with BeautifulSoup", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58751934/
