
python - How do I select everything except a certain HTML element using beautifulsoup4?

Reposted. Author: 行者123. Updated: 2023-11-28 22:58:49

Example:

import bs4

html = '''
<div class="short-description std ">
<em>Android Apps Security</em> provides guiding principles for how to
best design and develop Android apps with security in mind. The book explores
techniques that developers can use to build additional layers of security into
their apps beyond the security controls provided by Android itself.
<p class="scroll-down">∨ <a href="#main-desc" onclick="Effect.ScrollTo(
'main-desc', { duration:'0.2'}); return false;">Full Description</a> ∨</p></div>
'''
soup = bs4.BeautifulSoup(html, 'html.parser')

How do I get the following from soup (as a BeautifulSoup object)?

<div class="short-description std ">
<em>Android Apps Security</em> provides guiding principles for how to
best design and develop Android apps with security in mind. The book explores
techniques that developers can use to build additional layers of security into
their apps beyond the security controls provided by Android itself.
</div>

Best answer

Just search for it:

soup.find('p', class_='scroll-down')

I used the class to narrow the search, but since there are no other p elements, that is somewhat redundant here.

If instead you need to remove the tag, find it as shown above, then call .extract() to remove it from the document:

>>> soup.find('p', class_='scroll-down').extract()
<p class="scroll-down">∨ <a href="#main-desc" onclick="Effect.ScrollTo(
'main-desc', { duration:'0.2'}); return false;">Full Description</a> ∨</p>
>>> print(soup)

<div class="short-description std ">
<em>Android Apps Security</em> provides guiding principles for how to
best design and develop Android apps with security in mind. The book explores
techniques that developers can use to build additional layers of security into
their apps beyond the security controls provided by Android itself.
</div>

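Two things to note: the removed tag is returned by the .extract() method, so you can save it for later use. The tag is removed from the document entirely; if you still need it in the document, you will have to re-add it manually later.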
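As a minimal sketch of saving the extracted tag and re-adding it later: the shortened HTML string and the variable names (removed, without_p, restored) are made up for illustration, and appending back onto soup.div is only one possible way to restore the tag.

```python
import bs4

# Shortened, hypothetical stand-in for the HTML in the question
html = ('<div class="short-description std"><em>Title</em> text '
        '<p class="scroll-down">Full Description</p></div>')
soup = bs4.BeautifulSoup(html, 'html.parser')

# extract() detaches the tag from the tree and returns it
removed = soup.find('p', class_='scroll-down').extract()
without_p = str(soup)  # the document no longer contains the <p>

# Re-attach the saved tag as the last child of the <div>
soup.div.append(removed)
restored = str(soup)
```

Here append() works because the p element was the last child of the div; for a tag removed from the middle of its parent, insert() with an explicit position would be needed instead.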

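Alternatively, you can use the .decompose() method, which removes the tag from the document entirely without returning a reference. The tag is then gone for good.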
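For comparison, a small sketch of the .decompose() route; the shortened HTML string and the variable names (tag, result, remaining) are assumptions for illustration only.

```python
import bs4

# Shortened, hypothetical stand-in for the HTML in the question
html = ('<div><em>Title</em> text '
        '<p class="scroll-down">Full Description</p></div>')
soup = bs4.BeautifulSoup(html, 'html.parser')

tag = soup.find('p', class_='scroll-down')
result = None
if tag is not None:  # find() returns None when nothing matches
    # decompose() destroys the tag and its contents; it returns None
    result = tag.decompose()

remaining = str(soup)
```

The None check guards against calling decompose() when find() matched nothing, which would otherwise raise an AttributeError.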

Regarding "python - How do I select everything except a certain HTML element using beautifulsoup4?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/13493579/
