Python 3: getting child elements (lxml)

Reposted. Author: 行者123. Updated: 2023-11-28 02:27:57

I am using lxml to work with HTML:

from lxml import html
import requests

How can I check whether any child of an element has class = "nearby"? My code (essentially):

url = "https://www.example.com"  # requests needs the scheme in the URL
Page = requests.get(url)
Tree = html.fromstring(Page.content)
resultList = Tree.xpath('//p[@class="result-info"]')
i = len(resultList) - 1  # to go through the list backwards
while i >= 0:
    # HasChildWithClass() is a placeholder; lxml elements have no such method
    if resultList[i].HasChildWithClass("nearby"):
        print('This result has a child with the class "nearby"')
    i -= 1

How can I replace "HasChildWithClass()" so that it actually works?

Here is an example tree:

...
<p class="result-info">
    <span class="result-meta">
        <span class="nearby">
            ... #this SHOULD print something
        </span>
    </span>
</p>
<p class="result-info">
    <span class="result-meta">
        <span class="FAR-AWAY">
            ... # this should NOT print anything
        </span>
    </span>
</p>
...

Best answer

I'm trying to understand why you are using lxml to find the element. BeautifulSoup may be a better choice, though.

lxml = """
<p class="result-info">
<span class="result-meta">
<span class="nearby">
... #this SHOULD print something
</span>
</span>
</p>
<p class="result-info">
<span class="result-meta">
<span class="FAR-AWAY">
... # this should NOT print anything
</span>
</span>
</p>
"""

But I did what you wanted with lxml anyway.

from lxml import html

Tree = html.fromstring(lxml)  # `lxml` here is the HTML string defined above
resultList = Tree.xpath('//p[@class="result-info"]')
for result in resultList:
    # iter() walks the element itself and all of its descendants
    for e in result.iter():
        if e.attrib.get("class") == "nearby":
            print(e.text)
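As a possible alternative to the nested loop, the check can be pushed into a relative XPath expression, which gives something close to the "HasChildWithClass()" the question asks for. The sketch below is only one way to do it: the helper name `has_descendant_with_class` and the `SAMPLE` markup are made up for illustration, and the whitespace-padded `contains()` test is a common XPath idiom for matching one class among several rather than anything taken from the original answer.

```python
from lxml import html

# Illustrative markup modelled on the example tree in the question.
SAMPLE = """
<div>
  <p class="result-info">
    <span class="result-meta"><span class="nearby">close by</span></span>
  </p>
  <p class="result-info">
    <span class="result-meta"><span class="FAR-AWAY">far away</span></span>
  </p>
</div>
"""


def has_descendant_with_class(element, class_name):
    """Return True if any descendant of `element` carries `class_name`."""
    # Padding the class attribute with spaces lets "nearby" match
    # class="nearby highlighted" without matching class="nearby-ish".
    expr = './/*[contains(concat(" ", normalize-space(@class), " "), $needle)]'
    return bool(element.xpath(expr, needle=f" {class_name} "))


tree = html.fromstring(SAMPLE)
results = tree.xpath('//p[@class="result-info"]')
flags = [has_descendant_with_class(p, "nearby") for p in results]

# Walk the results backwards, as the question intended.
for p, flag in zip(reversed(results), reversed(flags)):
    if flag:
        print('This result has a child with the class "nearby"')
```

The leading `.//` keeps the search inside the current `<p>`; a bare `//` would search the whole document from every result.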

Try using bs4:

from bs4 import BeautifulSoup

# Parse the HTML string `lxml` using the "lxml" parser backend
soup = BeautifulSoup(lxml, "lxml")
result = soup.find_all("span", class_="nearby")
print(result[0].text)
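The snippet above finds every "nearby" span in the document, whereas the question asks about each "result-info" paragraph separately. A minimal sketch of that per-paragraph check with bs4 might look like the following. The `SAMPLE` markup and the `matches` list are illustrative assumptions, and the built-in "html.parser" backend is used here only so the example does not depend on lxml being installed.

```python
from bs4 import BeautifulSoup

# Illustrative markup modelled on the example tree in the question.
SAMPLE = """
<p class="result-info">
  <span class="result-meta"><span class="nearby">close by</span></span>
</p>
<p class="result-info">
  <span class="result-meta"><span class="FAR-AWAY">far away</span></span>
</p>
"""

soup = BeautifulSoup(SAMPLE, "html.parser")

matches = []
for p in soup.find_all("p", class_="result-info"):
    # find() searches all descendants of this <p> and returns None on a miss
    hit = p.find(class_="nearby")
    if hit is not None:
        matches.append(hit.get_text(strip=True))
        print('This result has a child with the class "nearby"')
```

Because `find()` is called on each paragraph rather than on `soup`, a "nearby" element in one result does not affect the check for another.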

Regarding "Python 3: getting child elements (lxml)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52771402/
