python - BeautifulSoup: recursively getting text from nested divs

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:54:10
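I want to extract the data inside nested divs, but I can't get it.

The divs are nested, and I need the data in a properly formatted form.

I wrote some code with the bs4 module, but it raises an error:

BeautifulSoup: AttributeError: 'NavigableString' object has no attribute 'name'

Please help me!

My HTML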

<div id="new">
<div id="newDat">
<div class="Data">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>

</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>

</div>
</div>
<div class="DataOne">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>

</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>

</div>
</div>
<div class="DataTwo">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>

</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>
</div>
</div>
<div class="DataThree">
<div class="DataNew">
<div class="DataNew new">
<div class="Data Left">
<div class="name"><a class="name" href="">Jack Daniels</a></div>
<div class="details"><span class="loc">Barcelona</span></div>
<div class="header"><a class="looking"> Looking for meeting new people</a></div>
<div class="ideas"><a class="ideas">I have new ideas</a></div>
<div class="profile"> <em class="profilss"></em>MS in cs<br></div>

</div>
<div class="Data Right">
<a class="phone"><span class="txt">+123123123123123231</span></a>
</div>
</div>

</div>
</div>
</div>
</div>

My BeautifulSoup code

    li = page.find('div', {'id': 'new'})
    for tag in li:
        for i in tag.find_all("div", {"class": "name"}):
            print i.getText()
            break

        for i in tag.find_all("div", {"class": "details"}):
            print i.getText()
            break

        for i in tag.find_all("div", {"class": "header"}):
            print i.getText()
            break

        for i in tag.find_all("div", {"class": "ideas"}):
            print i.getText()
            break

        for i in tag.find_all("div", {"class": "profile"}):
            print i.getText()
            break

        for i in tag.find_all("div", {"class": "phone"}):
            print i.getText()
            break

I want output like this:

Div one 
Name : Jack Daniels
Details : Barcelona
header : Looking for meeting new people
ideas : I have new ideas
profile: MS in cs
tel : +123123123123123231

Div two
Name : Jack Daniels
Details : Barcelona
header : Looking for meeting new people
ideas : I have new ideas
profile: MS in cs
tel : +123123123123123231

and so on.

If there are 100 divs inside <div id="new">, I need this output for each of them.

Best answer

You can do it like this. It returns the data for each record div.

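    from bs4 import BeautifulSoup

    soup = BeautifulSoup(b, "html.parser")  # b is the HTML string

    # Match the full class string "DataNew new" so that only the inner record
    # containers are returned. Matching on "DataNew" alone also hits the outer
    # wrapper div, and every record would be printed twice.
    rows = soup.find_all('div', {'class': 'DataNew new'})
    for tag in rows:
        # Each loop prints the first match for one field, then stops.
        for i in tag.find_all("div", {"class": "name"}):
            print(i.getText())
            break

        for i in tag.find_all("div", {"class": "details"}):
            print(i.getText())
            break

        for i in tag.find_all("div", {"class": "header"}):
            print(i.getText())
            break

        for i in tag.find_all("div", {"class": "ideas"}):
            print(i.getText())
            break

        for i in tag.find_all("div", {"class": "profile"}):
            print(i.getText())
            break

        # The phone number sits in the "Data Right" div, not in a div.phone.
        for i in tag.find_all("div", {"class": "Data Right"}):
            print(i.getText())
            break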
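
As a complement to the answer, here is a minimal sketch of one way to produce the labelled output the question asks for. It is not the original answer's code. The names HTML, FIELDS, extract_records and format_records are hypothetical, the markup is a shortened two-record stand-in for the question's HTML, and the blocks are numbered "Div 1", "Div 2" rather than "Div one", "Div two". The error in the question is most likely related to iterating directly over a Tag: that yields its children, including whitespace text nodes (NavigableString), which are not tags and cannot be searched. Selecting the record containers explicitly avoids those text nodes.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Shortened stand-in for the question's markup (two records, assumed layout).
RECORD = """
  <div class="DataNew new">
    <div class="Data Left">
      <div class="name"><a class="name" href="">Jack Daniels</a></div>
      <div class="details"><span class="loc">Barcelona</span></div>
      <div class="header"><a class="looking"> Looking for meeting new people</a></div>
      <div class="ideas"><a class="ideas">I have new ideas</a></div>
      <div class="profile"> <em class="profilss"></em>MS in cs<br></div>
    </div>
    <div class="Data Right">
      <a class="phone"><span class="txt">+123123123123123231</span></a>
    </div>
  </div>
"""
HTML = '<div id="new">' + RECORD + RECORD + "</div>"

# Output label -> CSS selector, evaluated relative to one record container.
FIELDS = [
    ("Name", "div.name"),
    ("Details", "div.details"),
    ("header", "div.header"),
    ("ideas", "div.ideas"),
    ("profile", "div.profile"),
    ("tel", "a.phone"),
]


def extract_records(html):
    """Return one dict per record container found under div#new."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    # Only divs carrying both classes are records; the outer wrappers are skipped.
    for row in soup.select("div#new div.DataNew.new"):
        record = {}
        for label, selector in FIELDS:
            node = row.select_one(selector)
            # A missing field becomes an empty string instead of raising.
            record[label] = node.get_text(strip=True) if node else ""
        records.append(record)
    return records


def format_records(records):
    """Render the records as numbered, labelled text blocks."""
    blocks = []
    for number, record in enumerate(records, start=1):
        lines = ["Div %d" % number]
        lines.extend("%s : %s" % (label, record[label]) for label, _ in FIELDS)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


records = extract_records(HTML)
print(format_records(records))

# Iterating the container directly also yields text nodes, which is why
# looping over page.find(...) and calling tag methods on each item is fragile.
container = BeautifulSoup(HTML, "html.parser").find("div", {"id": "new"})
has_text_children = any(
    isinstance(child, NavigableString) for child in container.children
)
```

Keeping the label-to-selector pairs in one list means a new field only needs one extra entry, and the same loop handles any number of record divs inside <div id="new">.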

Regarding "python - BeautifulSoup: recursively getting text from nested divs", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47468194/
