python - Beautiful Soup: extra newline characters among children

Repost. Author: 行者123. Updated: 2023-11-30 23:13:35

I am using BeautifulSoup on an HTML snippet like this:

s = """<div class="views-row views-row-1 views-row-odd views-row-first">
<span class="views-field views-field-title">
<span class="field-content"><a href="/party-pictures/2015/love-heals">Love Heals</a>
</span>
</span>
<span class="views-field views-field-created">
<span class="field-content">Friday, March 20, 2015
</span>
</span>
</div>"""

from bs4 import BeautifulSoup

soup = BeautifulSoup(s, "html.parser")

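Why does s.span only return the first span tag?

Moreover, s.contents returns a list of length 4. Both span tags are in the list, but the 0th and 2nd indices are "\n" newline characters. The newline characters are useless. Is there a reason why this is done?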

Best answer

Why does s.span only return the first span tag?

s.span is a shortcut for s.find('span'), which finds only the first occurrence of the span tag.
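The shortcut can be checked directly. This is a minimal sketch, assuming a simplified version of the markup from the question and Python's bundled "html.parser"; the variable names are illustrative and not part of the original answer.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the question's markup (an assumption for brevity).
html = """<div class="views-row">
<span class="views-field views-field-title">Love Heals</span>
<span class="views-field views-field-created">Friday, March 20, 2015</span>
</div>"""

soup = BeautifulSoup(html, "html.parser")

first_by_attribute = soup.span      # attribute-style access
first_by_find = soup.find("span")   # the explicit call it stands for
all_spans = soup.find_all("span")   # every matching tag, in document order

print(first_by_attribute is first_by_find)
print(first_by_attribute.get_text())
print(len(all_spans))
```

Attribute access is convenient for drilling into a known structure, while find_all() is the call to reach for when every match is needed.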

Moreover, s.contents returns a list of length 4. Both span tags are in the list, but the 0th and 2nd indices are "\n" newline characters. The newline characters are useless. Is there a reason why this is done?

By definition, .contents returns a list of all of an element's children, including text nodes, which are instances of the NavigableString class.
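To see where the newlines come from, it can help to look at the type of each child. The sketch below assumes a simplified two-span div and the "html.parser" backend; it separates the whitespace text nodes from the element children and shows one way to keep only the tags.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Simplified stand-in for the question's markup (an assumption for brevity).
html = """<div class="views-row">
<span class="title">Love Heals</span>
<span class="created">Friday, March 20, 2015</span>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.div

# Every child of the div, text nodes included.
kinds = [type(child).__name__ for child in div.contents]

# The line breaks between tags in the source become NavigableString children.
text_nodes = [child for child in div.contents if isinstance(child, NavigableString)]

# Keep only element children by filtering on Tag.
tag_children = [child for child in div.contents if isinstance(child, Tag)]

# Passing True matches any tag; recursive=False limits the search to direct children.
direct_tags = div.find_all(True, recursive=False)

print(kinds)
print([tag["class"] for tag in tag_children])
```

The parser keeps the whitespace because it is part of the document's text; whether it matters depends on what is being extracted.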

If you want only the tags, you can use find_all():

soup.find_all()

And, if you want only the span tags:

soup.find_all('span')

Example:

>>> from bs4 import BeautifulSoup
>>> s = """<div class="views-row views-row-1 views-row-odd views-row-first">
... <span class="views-field views-field-title">
... <span class="field-content"><a href="/party-pictures/2015/love-heals">Love Heals</a>
... </span>
... </span>
... <span class="views-field views-field-created">
... <span class="field-content">Friday, March 20, 2015
... </span>
... </span>
... </div>"""
>>> soup = BeautifulSoup(s, "html.parser")
>>> for span in soup.find_all('span'):
...     print(span.text.strip())
...
Love Heals
Love Heals
Friday, March 20, 2015
Friday, March 20, 2015

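The duplicates appear because the span elements are nested: each outer span contains an inner span with the same text. You can fix this in different ways. For example, you can search only the direct children of the div by passing recursive=False: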

>>> for span in soup.find('div', class_='views-row-1').find_all('span', recursive=False):
...     print(span.text.strip())
...
Love Heals
Friday, March 20, 2015

Alternatively, you can use CSS selectors:

>>> for span in soup.select('div.views-row-1 > span'):
...     print(span.text.strip())
...
Love Heals
Friday, March 20, 2015
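The ">" in that selector is the child combinator, which is what keeps the nested spans out of the result. As a rough sketch (assuming the "html.parser" backend and illustrative variable names), the two selector forms can be compared side by side:

```python
from bs4 import BeautifulSoup

s = """<div class="views-row views-row-1 views-row-odd views-row-first">
<span class="views-field views-field-title">
<span class="field-content"><a href="/party-pictures/2015/love-heals">Love Heals</a>
</span>
</span>
<span class="views-field views-field-created">
<span class="field-content">Friday, March 20, 2015
</span>
</span>
</div>"""

soup = BeautifulSoup(s, "html.parser")

# Descendant combinator (a space): matches spans at any depth inside the div.
any_depth = soup.select("div.views-row-1 span")

# Child combinator (>): matches only spans that are direct children of the div.
direct_only = soup.select("div.views-row-1 > span")

# get_text(strip=True) trims the whitespace around each piece of text.
texts = [span.get_text(strip=True) for span in direct_only]

print(len(any_depth), len(direct_only))
print(texts)
```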

Regarding python - Beautiful Soup: extra newline characters among children, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29245069/
