python - Removing the first tag from HTML using Python and Scrapy

Reposted by: 太空宇宙 · Updated: 2023-11-03 17:40:15


I have this HTML:

<div class="abc">
    <div class="xyz">
        <div class="needremove"></div>
        <p>text</p>
        <p>text</p>
        <p>text</p>
        <p>text</p>
    </div>
</div>

I used: response.xpath('//div[contains(@class,"abc")]/div[contains(@class,"xyz")]').extract()

Result:

[u'<div class="xyz">
        <div class="needremove"></div>
        <p>text</p>
        <p>text</p>
        <p>text</p>
        <p>text</p>
    </div>']

I want to remove <div class="needremove"></div> from the result. Can you help me?

Best answer

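You can select all child elements except the div with class="needremove". Note that the predicate below is slightly broader than that: it skips every div child, and also any child whose class contains "needremove". The result is a list of the remaining children, not the parent div with one child taken out: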

response.xpath('//div[contains(@class, "abc")]/div[contains(@class, "xyz")]/*[local-name() != "div" and not(contains(@class, "needremove"))]').extract()

Demo from the shell:

$ scrapy shell index.html
In [1]: response.xpath('//div[contains(@class, "abc")]/div[contains(@class, "xyz")]/*[local-name() != "div" and not(contains(@class, "needremove"))]').extract()
Out[1]: [u'<p>text</p>', u'<p>text</p>', u'<p>text</p>', u'<p>text</p>']
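If the goal is to keep the outer <div class="xyz"> wrapper and drop only the unwanted child, another option is to remove the node from the parsed tree and serialize the parent again. The sketch below is not the answer's method and does not use Scrapy: it uses Python's standard-library xml.etree.ElementTree as a stand-in, on the assumption that the snippet is well-formed enough to parse as XML (real pages often are not, in which case an HTML-aware parser would be needed). The helper name strip_children_with_class is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question; assumed well-formed enough to parse as XML.
HTML = """
<div class="abc">
    <div class="xyz">
        <div class="needremove"></div>
        <p>text</p>
        <p>text</p>
        <p>text</p>
        <p>text</p>
    </div>
</div>
"""


def strip_children_with_class(parent, class_name):
    """Remove direct children of `parent` whose class list includes `class_name`.

    Hypothetical helper for illustration only.
    """
    # Iterate over a copy, because removing while iterating would skip elements.
    for child in list(parent):
        if class_name in child.get("class", "").split():
            parent.remove(child)
    return parent


root = ET.fromstring(HTML)
# ElementTree supports only a small XPath subset; [@class='xyz'] is an exact match,
# unlike the contains(@class, ...) test used in the Scrapy selector.
xyz = root.find("./div[@class='xyz']")
strip_children_with_class(xyz, "needremove")

# Serialize the parent div, now without the unwanted child.
cleaned = ET.tostring(xyz, encoding="unicode")
print(cleaned)
```

Unlike the XPath in the answer, this matches the class name as a whole word and leaves other div children in place, so it removes only the element the question asks about.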

For this question (python - Removing the first tag from HTML using Python and Scrapy), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30661440/
