python - BeautifulSoup: <div class <span class></span><span class>TEXT I WANT</span>

Reposted by: 行者123. Updated: 2023-11-28 19:37:23

I am trying to use BeautifulSoup to extract the string contained in the span with id="titleDescriptionID".

<div class="itemText">
    <div class="wrapper">
        <span class="itemPromo">Customer Choice Award Winner</span>
        <a href="http://www.newegg.com/Product/Product.aspx?Item=N82E16819116501" title="View Details" >
            <span class="itemDescription" id="titleDescriptionID" style="display:inline">Intel Core i7-3770K Ivy Bridge 3.5GHz &#40;3.9GHz Turbo&#41; LGA 1155 77W Quad-Core Desktop Processor Intel HD Graphics 4000 BX80637I73770K</span>
            <span class="itemDescription" id="lineDescriptionID" style="display:none">Intel Core i7-3770K Ivy Bridge 3.5GHz &#40;3.9GHz Turbo&#41; LGA 1155 77W Quad-Core Desktop Processor Intel HD Graphics 4000 BX80637I73770K</span>
        </a>
    </div>
</div>

Code snippet:

from bs4 import BeautifulSoup as bs

# Read the saved page and drop any non-ASCII characters.
with open('egg.data', 'rb') as f:
    content = f.read()
content = content.decode('utf-8', 'replace')
content = ''.join([x for x in content if ord(x) < 128])

soup = bs(content)

for itemText in soup.find_all('div', attrs={'class':'itemText'}):
    wrapper = itemText.div
    wrapper_href = wrapper.a
    for child in wrapper_href.descendants:
        # This is the line that raises the TypeError shown below.
        if child['id'] == 'titleDescriptionID':
            print(child, "\n")

Traceback:

Traceback (most recent call last):
  File "egg.py", line 66, in <module>
    if child['id'] == 'titleDescriptionID':
TypeError: string indices must be integers

Best answer

spans = soup.find_all('span', attrs={'id':'titleDescriptionID'})
for span in spans:
    print(span.string)
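
Since an id is normally unique within a page, a single-element lookup may be enough here. The following is a minimal sketch, not part of the original answer: it assumes the `bs4` package with the built-in `html.parser`, and the `html` string is a shortened, hypothetical copy of the markup from the question.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (illustrative only).
html = """
<div class="itemText">
    <div class="wrapper">
        <span class="itemPromo">Customer Choice Award Winner</span>
        <a href="http://www.newegg.com/Product/Product.aspx?Item=N82E16819116501" title="View Details">
            <span class="itemDescription" id="titleDescriptionID" style="display:inline">Intel Core i7-3770K &#40;3.9GHz Turbo&#41;</span>
            <span class="itemDescription" id="lineDescriptionID" style="display:none">Intel Core i7-3770K &#40;3.9GHz Turbo&#41;</span>
        </a>
    </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None when nothing matches.
span = soup.find("span", id="titleDescriptionID")
title = span.get_text(strip=True) if span is not None else None

# The same lookup written as a CSS selector.
span_css = soup.select_one("span#titleDescriptionID")
title_css = span_css.get_text(strip=True) if span_css is not None else None

print(title)
```

Checking for `None` before reading the text avoids an `AttributeError` on pages where the span is missing.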

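In your code, wrapper_href.descendants contains at least 4 elements: the 2 span tags and the 2 strings enclosed by those span tags, because descendants walks through the children recursively. Indexing one of those strings with child['id'] is what raises the TypeError.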
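
To make the mix of tags and strings visible, here is a small sketch under stated assumptions: the `html` string is a hypothetical reduction of the question's markup with the whitespace between tags removed, so the link has exactly four descendants. It also shows one possible way to keep the original loop by skipping anything that is not a `Tag`.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical reduced markup: one link wrapping two spans, no whitespace.
html = (
    '<a href="#">'
    '<span id="titleDescriptionID">first</span>'
    '<span id="lineDescriptionID">second</span>'
    '</a>'
)
soup = BeautifulSoup(html, "html.parser")
link = soup.a

# .descendants walks the tree depth-first, so each span is followed by its text.
kinds = [type(child).__name__ for child in link.descendants]
print(kinds)

# Skip text nodes, and use .get() so a tag without an id yields None
# instead of raising KeyError.
matches = [
    child.get_text()
    for child in link.descendants
    if isinstance(child, Tag) and child.get("id") == "titleDescriptionID"
]
print(matches)
```

With real pages the whitespace between tags also shows up as string descendants, which is why the count above is "at least" four.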

Regarding python - BeautifulSoup: <div class <span class></span><span class>TEXT I WANT</span>, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17613606/


