python - BeautifulSoup: AttributeError: 'NavigableString' object has no attribute 'children'

Reposted by: 太空宇宙. Updated: 2023-11-03 17:41:07

With BeautifulSoup4, I can run this code to get a single "Shout" without any problem. When I use a for loop, I get the error AttributeError: 'NavigableString' object has no attribute 'children'

class Shout:
    def __init__(self, user, msg, date):
        self.user = user
        self.msg = msg
        self.date = date

def getShouts():
    #s is a requests Session()
    new_shouts = s.get(shouts_url).text
    #set shouts page as parsable object
    soup = BeautifulSoup(new_shouts)
    shouts = []
    shout_heads = soup.find_all("h2", {'class': 'A'})
    shout_feet = soup.find_all("h2", {'class': 'B'})
    for i in range(len(shout_heads)):
        shout = Shout('', '', '')
        shout.user = list(list(list(shout_heads[i].children)[0].children)[1].children)[1].get_text()
        foot = shout_feet[i].get_text().split('-')
        shout.msg = foot[1]
        foot[2] = foot[2].split()
        shout.date = foot[2][0] + " " + foot[2][1]
        shouts.append(shout)
    return shouts

What would cause this error to occur only during the loop?

Best answer

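.children includes not only the tags inside an element but also any text (modelled as NavigableString objects). Even whitespace alone produces a text node before the first element: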

<h2>
    <a href="...">Some text</a>
</h2>

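will have a text node as its first child. You have to filter out these text nodes, or use element.find_all(True, recursive=False) to list only the direct child tags. element.find(True) finds the first child tag, or returns None if there is no such tag.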
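The behaviour described above can be checked with a minimal sketch. It assumes the bs4 package is installed and uses the built-in "html.parser" backend; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# The same markup as in the answer: whitespace surrounds the <a> tag.
html = """<h2>
    <a href="...">Some text</a>
</h2>"""

soup = BeautifulSoup(html, "html.parser")
h2 = soup.h2

# .children yields text nodes as well as tags.
child_types = [type(child).__name__ for child in h2.children]
print(child_types)  # ['NavigableString', 'Tag', 'NavigableString']

# Option 1: filter out the text nodes by type.
tags_only = [child for child in h2.children if isinstance(child, Tag)]

# Option 2: ask for direct child tags only.
direct_tags = h2.find_all(True, recursive=False)

# Option 3: take the first tag below the element (None if there is none).
first_tag = h2.find(True)
print(first_tag.name)  # a
```

Indexing the children by position, as the question does, lands on the leading whitespace node, which is a NavigableString and therefore has no .children attribute.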

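Alternatively, you could look for more specific tags rather than taking the first child, then the second child, then the second child again; if you know the specific tags, just use their names: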

shout_heads[i].a.i.span.string

for example.
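As a rough illustration of navigating by tag name, the sketch below uses made-up markup: the real page structure is not shown in the question, so the nesting of a, i and span here is an assumption chosen to match the expression above.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Hypothetical markup; the actual shout page is not shown in the question.
html = """
<h2 class="A">
    <a href="#">
        <b>Shout by</b>
        <i>
            <em>user:</em>
            <span>alice</span>
        </i>
    </a>
</h2>
"""

soup = BeautifulSoup(html, "html.parser")
shout_head = soup.find("h2", {"class": "A"})

# Positional access hits the whitespace text node first.
first_child = shout_head.contents[0]
print(isinstance(first_child, NavigableString))  # True

# Access by tag name skips the text nodes entirely.
user = shout_head.a.i.span.string
print(user)  # alice
```

If any tag in the chain is missing, the attribute lookup returns None and the next step raises AttributeError, so this style suits markup whose structure is known in advance.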

Note that .children gives you an iterator; if you want a list, don't use list() on .children. Use the .contents attribute instead, which is a list object.
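A small sketch of the difference, using a throwaway snippet of markup (the markup is an assumption for illustration):

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>one</b>two<i>three</i></p>", "html.parser")
p = soup.p

# .contents is already a list, so it can be indexed directly.
contents = p.contents
second = contents[1]
print(second)  # two

# .children walks the same nodes, but only as an iterator.
same_nodes = list(p.children) == contents
print(same_nodes)  # True
```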

Last but not least, don't loop over range() when you can loop over the list directly:

for shout_head in shout_heads:
    shout = Shout('', '', '')
    shout.user = shout_head.find(True)  # first child tag (find() returns a single Tag, not a list); etc.

If you need to combine two lists, you can use zip():

for shout_head, shout_foot in zip(shout_heads, shout_feet):

although you could also use find_next_sibling() to find those extra h2 elements, if they alternate.
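Both pairing approaches can be sketched as follows. The markup is hypothetical (alternating h2 elements with classes A and B, as the question's find_all() calls suggest), and the user names and messages are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with alternating head (class A) and foot (class B) elements.
html = """
<h2 class="A"><a href="#"><span>alice</span></a></h2>
<h2 class="B">first message</h2>
<h2 class="A"><a href="#"><span>bob</span></a></h2>
<h2 class="B">second message</h2>
"""

soup = BeautifulSoup(html, "html.parser")
shout_heads = soup.find_all("h2", {"class": "A"})
shout_feet = soup.find_all("h2", {"class": "B"})

# Approach 1: walk both lists in step with zip().
paired = [
    (head.span.string, foot.get_text())
    for head, foot in zip(shout_heads, shout_feet)
]

# Approach 2: start from each head and look up the following foot.
via_sibling = []
for head in shout_heads:
    foot = head.find_next_sibling("h2", class_="B")
    via_sibling.append((head.span.string, foot.get_text()))

print(paired)  # [('alice', 'first message'), ('bob', 'second message')]
```

zip() stops at the shorter list, so a missing foot silently drops a shout; with find_next_sibling() a missing foot shows up as None and can be handled explicitly.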

A similar question about python - BeautifulSoup: AttributeError: 'NavigableString' object has no attribute 'children' can be found on Stack Overflow: https://stackoverflow.com/questions/30544622/


