
python - Getting data from a dynamic class="????" with bs4

Reposted. Author: 行者123. Updated: 2023-12-01 09:04:31

How can I get the data from tags whose class is dynamic, e.g. class="cfBD1", class="cfJLC", or in general class="????"?

from bs4 import BeautifulSoup

soup=BeautifulSoup("""<div class="couponTable"><div id="tgCou1" class="tgCoupon couponRow"><span class="spBtnMinus"></span><!-- react-text: 67 -->Wednesday Matches<!-- /react-text --></div><div class="couponRow rAlt1 tgCou1" id="rmid20180905WED1"><img src="/ContentServer/jcbw/images/flag_JLC.gif?CV=L302R1g" alt="Japanese League Cup" title="Japanese League Cup" class="cfJLC"><img src="/ContentServer/jcbw/images/icon_tv-C661.gif?CV=L302R1g" alt="C661-i-CABLE 661 C601-i-CABLE 601" title="C661-i-CABLE 661 C601-i-CABLE 601"></span></span><img src="/football/info/images/btn_odds.gif?CV=L302R1g" alt="All Odds" title="All Odds"></a></div><div class="couponRow rAlt0 tgCou1" id="rmid20180905WED2"><img src="/ContentServer/jcbw/images/flag_JLC.gif?CV=L302R1g" alt="Japanese League Cup" title="Japanese League Cup" class="cfJLC"><img src="/ContentServer/jcbw/images/icon_tv-C662.gif?CV=L302R1g" alt="C662-i-CABLE 662 C602-i-CABLE 602" title="C662-i-CABLE 662 C602-i-CABLE 602"></span></span><img src="/football/info/images/btn_odds.gif?CV=L302R1g" alt="All Odds" title="All Odds"></a></div></div></div><div class="couponRow rAlt1 tgCou1" id="rmid20180905WED12"><img src="/ContentServer/jcbw/images/flag_BD1.gif?CV=L302R1g" alt="Brazilian Division 1" title="Brazilian Division 1" class="cfBD1"><img src="/football/info/images/btn_odds.gif?CV=L302R1g" alt="All Odds" title="All Odds"></a></div></div>""",'html.parser')

lines=soup.find_all('img')
for line in lines:
    print(line['alt'])

Output:

Japanese League Cup
C661-i-CABLE 661 C601-i-CABLE 601
All Odds
Japanese League Cup
C662-i-CABLE 662 C602-i-CABLE 602
All Odds
Brazilian Division 1
All Odds

Expected output:

Japanese League Cup
Japanese League Cup
Brazilian Division 1

Best answer

In this case, you only need to check whether the img tag has a class attribute at all:

soup.find_all('img', attrs={'class': True})

Example:

In [1570]: [img['alt'] for img in soup.find_all('img', attrs={'class': True})]
Out[1570]: ['Japanese League Cup', 'Japanese League Cup', 'Brazilian Division 1']
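
The same check can be written as a small self-contained script. This is only a sketch: the function name alts_of_classed_images and the shortened sample markup are made up for illustration, and class_=True is the keyword form of attrs={'class': True}.

```python
from bs4 import BeautifulSoup


def alts_of_classed_images(html):
    """Return the alt text of every <img> that has a class attribute.

    Hypothetical helper for illustration; not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    # class_=True matches any tag that carries a class attribute,
    # whatever its value is.
    return [img.get("alt", "") for img in soup.find_all("img", class_=True)]


# Shortened stand-in for the markup in the question (assumed, not the real page).
sample = (
    '<div class="couponRow">'
    '<img alt="Japanese League Cup" class="cfJLC">'
    '<img alt="All Odds">'
    '<img alt="Brazilian Division 1" class="cfBD1">'
    "</div>"
)

print(alts_of_classed_images(sample))
```

Using img.get("alt", "") instead of img["alt"] avoids a KeyError if an image happens to have no alt attribute.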

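For completeness: to match arbitrary dynamic attribute values, you need to find a common pattern in the naming. Here, for example, all the class names appear to start with the character c, so you can use a CSS selector: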

img[class^="c"]

Example:

In [1571]: [img['alt'] for img in soup.select('img[class^="c"]')]
Out[1571]: ['Japanese League Cup', 'Japanese League Cup', 'Brazilian Division 1']
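
If a single-letter prefix turns out to be too loose, a regular expression passed to find_all can pin the pattern down further. The sketch below assumes the flag classes always look like "cf" followed by uppercase letters or digits; that is a guess based on cfJLC and cfBD1, not something the page guarantees, and the sample markup is again a shortened stand-in.

```python
import re

from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (assumed, not the real page).
html = """
<div class="couponRow">
  <img src="flag_JLC.gif" alt="Japanese League Cup" class="cfJLC">
  <img src="icon_tv.gif" alt="C661-i-CABLE 661">
  <img src="btn_odds.gif" alt="All Odds">
  <img src="flag_BD1.gif" alt="Brazilian Division 1" class="cfBD1">
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Assumed naming pattern: "cf" plus one or more uppercase letters or digits.
flag_pattern = re.compile(r"^cf[A-Z0-9]+$")

# class_ accepts a compiled regular expression and tests it against
# each class on the tag.
alts = [img["alt"] for img in soup.find_all("img", class_=flag_pattern)]
print(alts)
```

The regular expression is stricter than img[class^="c"], so an unrelated class such as "caption" would not be picked up by accident.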

Source: a similar question on Stack Overflow about getting data from a dynamic class="????" with bs4: https://stackoverflow.com/questions/52170598/
