
python - Getting multiple elements with Beautiful Soup

Reposted. Author: 行者123. Updated: 2023-12-01 07:08:32

I have the following HTML and I am trying to get the "clients" for each specific "date", but I only get the first following element:

<div class="info">
<div class="left-wrap"><span class="date">DATE-1</span></div>
</div>

<div class="clients-list">
<div>
<span class="client" >client1</span>
<span class="client" >client2</span>
<span class="client" >client3</span>
</div>
</div>

<div class="clients-list">
<div>
<span class="client" >client4</span>
<span class="client" >client5</span>
<span class="client" >client6</span>
</div>
</div>

<div class="info">
<div class="left-wrap"><span class="date" >DATE-2</span></div>
</div>
<div class="clients-list">
<div>
<span class="client" >client7</span>
<span class="client" >client8</span>
</div>
</div>
<div class="clients-list">
<div>
<span class="client" >client9</span>
<span class="client" >client10</span>
</div>
</div>
<div class="clients-list">
<div>
<span class="client" >client11</span>
<span class="client" >client12</span>
</div>
</div>

I am using the following code:

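from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
dates = soup.find_all(class_='date')
for date in dates:
    print(date.text)
    # find_next() returns only the first matching element after the date
    for item in date.find_next(class_='clients-list').find_all(class_='client'):
        print(item.text)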

The output I get is:

DATE-1
client1
client2
client3
DATE-2
client7
client8

I tried using find_next_all, but got the same output.

Best answer

It is a bit tricky, but this gives you the output you want. Use find_next_siblings():

from bs4 import BeautifulSoup
html='''<div class="info">
<div class="left-wrap"><span class="date">DATE-1</span></div>
</div>

<div class="clients-list">
<div>
<span class="client" >client1</span>
<span class="client" >client2</span>
<span class="client" >client3</span>
</div>
</div>

<div class="clients-list">
<div>
<span class="client" >client4</span>
<span class="client" >client5</span>
<span class="client" >client6</span>
</div>
</div>

<div class="info">
<div class="left-wrap"><span class="date" >DATE-2</span></div>
</div>
<div class="clients-list">
<div>
<span class="client" >client7</span>
<span class="client" >client8</span>
</div>
</div>
<div class="clients-list">
<div>
<span class="client" >client9</span>
<span class="client" >client10</span>
</div>
</div>
<div class="clients-list">
<div>
<span class="client" >client11</span>
<span class="client" >client12</span>
</div>
</div>'''

soup = BeautifulSoup(html, 'html.parser')
dates = soup.find_all(class_='date')
for date in dates:
    print(date.text)
    # date.parent.parent is the enclosing <div class="info">
    for item in date.parent.parent.find_next_siblings(class_='clients-list'):
        # Keep only the lists whose nearest preceding "info" block holds this date
        if item.find_previous_sibling(class_='info').find_next(class_='date').text == date.text:
            for client in item.find_all(class_='client'):
                print(client.text)

Output:

DATE-1
client1
client2
client3
client4
client5
client6
DATE-2
client7
client8
client9
client10
client11
client12
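
The same grouping can also be sketched as a single pass that walks the siblings of each "info" block and stops at the next one, which avoids looking back with find_previous_sibling() for every list. This is only an illustrative variant, not part of the original answer: the helper name clients_by_date and the shortened sample markup are made up here, and it assumes the "info" and "clients-list" divs are siblings at the same level, as in the HTML above.

```python
from bs4 import BeautifulSoup


def clients_by_date(html):
    """Map each date's text to the client names that follow it (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    grouped = {}
    for info in soup.find_all("div", class_="info"):
        date = info.find(class_="date").get_text(strip=True)
        clients = grouped.setdefault(date, [])
        # Walk the following sibling <div> tags; stop at the next "info" block.
        for sibling in info.find_next_siblings("div"):
            classes = sibling.get("class", [])
            if "info" in classes:
                break
            if "clients-list" in classes:
                clients.extend(
                    span.get_text(strip=True)
                    for span in sibling.find_all(class_="client")
                )
    return grouped


# Shortened sample with the same structure as the HTML in the question.
sample = """
<div class="info"><div class="left-wrap"><span class="date">DATE-1</span></div></div>
<div class="clients-list"><div><span class="client">client1</span></div></div>
<div class="clients-list"><div><span class="client">client2</span></div></div>
<div class="info"><div class="left-wrap"><span class="date">DATE-2</span></div></div>
<div class="clients-list"><div><span class="client">client3</span></div></div>
"""

print(clients_by_date(sample))
# {'DATE-1': ['client1', 'client2'], 'DATE-2': ['client3']}
```

Returning a dict instead of printing makes the result easier to reuse. Note that if two "info" blocks carry the same date text, their clients are merged under one key.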

Regarding python - Getting multiple elements with Beautiful Soup, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58331914/
