
python - BeautifulSoup find_all() returns nothing []

Reposted. Author: 行者123. Updated: 2023-12-01 00:52:46

I'm trying to scrape all the offers from this page and want to iterate over the <p class="white-strip"> tags, but page_soup.find_all("p", "white-strip") returns an empty list [].

My code so far -

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.sbicard.com/en/personal/offers.page#all-offers'

# Opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# html parsing
page_soup = soup(page_html, "lxml")
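For reference, find_all() with a class filter does work on static HTML that actually contains the tags; it returns [] here because the offers on the live page are rendered by JavaScript and are absent from the downloaded HTML. A minimal local sketch (the markup below is invented for illustration, not taken from the live page):

```python
from bs4 import BeautifulSoup

# Invented markup mimicking the offer tags the question looks for.
html = """
<div>
  <p class="white-strip"><span>Offer A</span></p>
  <p class="other">Not an offer</p>
</div>
"""

local_soup = BeautifulSoup(html, "html.parser")
# Same call as in the question; matches only tags present in the HTML.
offers = local_soup.find_all("p", {"class": "white-strip"})
print([p.span.text for p in offers])
```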

Edit: I got it working with Selenium; the code I used is below. However, I can't figure out any other way to accomplish the same thing.

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome(r"C:\chromedriver_win32\chromedriver.exe")
driver.get('https://www.sbicard.com/en/personal/offers.page#all-offers')

# html parsing
page_soup = BeautifulSoup(driver.page_source, 'lxml')

# grabs each offer
containers = page_soup.find_all("p", {'class':"white-strip"})

filename = "offers.csv"
f = open(filename, "w")

header = "offer-list\n"

f.write(header)

for container in containers:
    offer = container.span.text
    f.write(offer + "\n")

f.close()
driver.close()
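A side note on the CSV writing above: concatenating offer + "\n" produces a malformed row whenever an offer title itself contains a comma. Python's csv module quotes such fields automatically. A minimal sketch (the offer titles here are hypothetical):

```python
import csv
import io

# Hypothetical offer titles; the second contains a comma that would
# break a hand-rolled f.write(offer + "\n") CSV line.
offers = [
    "Upto Rs 8000 off on flights at Yatra",
    "25% off, online food ordering",
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["offer-list"])          # header row
for offer in offers:
    writer.writerow([offer])             # comma-containing field gets quoted

print(buf.getvalue())
```

With a real file, open it as open("offers.csv", "w", newline="") and pass it to csv.writer the same way.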

Best answer

If you search for any of those offers in the page source, you can find them inside a script tag containing var offerData. To get the required content out of that script, you can try the following.

import re
import json
import requests

url = "https://www.sbicard.com/en/personal/offers.page#all-offers"

res = requests.get(url)
p = re.compile(r"var offerData=(.*?);",re.DOTALL)
script = p.findall(res.text)[0].strip()
items = json.loads(script)
for item in items['offers']['offer']:
    print(item['text'])
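The regex-plus-json extraction can be checked against a small inline string that mimics the described structure (the script content and offer texts below are invented for illustration, not copied from the live page):

```python
import re
import json

# Invented page text mirroring the "var offerData={...};" pattern.
page_text = """
<script>
var offerData={"offers": {"offer": [
  {"text": "Offer one"},
  {"text": "Offer two"}
]}};
</script>
"""

# Non-greedy match up to the first ';' after the assignment;
# re.DOTALL lets '.' span the embedded newlines.
p = re.compile(r"var offerData=(.*?);", re.DOTALL)
script = p.findall(page_text)[0].strip()
items = json.loads(script)
texts = [item["text"] for item in items["offers"]["offer"]]
print(texts)
```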

Output:

Upto Rs 8000 off on flights at Yatra
Electricity Bill payment – Phonepe Offer
25% off on online food ordering
Get 5% cashback at Best Price stores
Get 5% cashback

Regarding "python - BeautifulSoup find_all() returns nothing []", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56455255/
