
python - lxml/python: get previous-sibling

Reposted. Author: 太空狗. Updated: 2023-10-30 02:21:26

I have the following HTML:

<div id = "big">
    <span>header 1</span>
    <ul id = "outer">
        <li id = "inner">aaa</li>
        <li id = "inner">bbb</li>
    </ul>

    <span>header 2</span>
    <ul id = "outer">
        <li id = "inner">ccc</li>
        <li id = "inner">ddd</li>
    </ul>
</div>

I want to loop over it in this order:

header 1
aaa
bbb
header 2
ccc
ddd

I tried iterating over each ul and then printing the header and the li values. However, I don't know how to get the span header associated with a ul.

sets = tree.xpath("//div[@id='big']//ul[@id='outer']")

for set in sets:
    # Print the header. Not sure how to get it
    header = set.xpath(".//li/preceding-sibling::span")
    print(header)

    # Print the texts. This works.
    values = set.xpath(".//li//text()")
    for v in values:
        print(v)

Simply looping over all the text nodes won't work, because I need to know whether each one is a header or an li value.

Best answer

This works (ingred_set here is the ul element, i.e. the loop variable named set in the question):

header = ingred_set.getprevious().xpath(".//text()")[0]
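
The attempt in the question returns an empty list because the span is a sibling of the ul, not of the li elements inside it. The following is a minimal sketch of the answer applied to the question's HTML, assuming the page is parsed with lxml.html.fromstring (the question does not say how tree was built); the names SOURCE, lines and header_via_xpath are made up for illustration. It also shows an XPath-only alternative, preceding-sibling::span[1], evaluated from the ul itself.

```python
from lxml import html

# The HTML from the question, unchanged (including its repeated id values).
SOURCE = """
<div id = "big">
    <span>header 1</span>
    <ul id = "outer">
        <li id = "inner">aaa</li>
        <li id = "inner">bbb</li>
    </ul>

    <span>header 2</span>
    <ul id = "outer">
        <li id = "inner">ccc</li>
        <li id = "inner">ddd</li>
    </ul>
</div>
"""

# Assumption: lxml.html is used for parsing; the question only shows tree.xpath(...).
tree = html.fromstring(SOURCE)

lines = []
for ul in tree.xpath("//div[@id='big']//ul[@id='outer']"):
    # Option 1 (the answer): take the element immediately before the <ul>.
    header = ul.getprevious().xpath(".//text()")[0]

    # Option 2: select the nearest preceding <span> sibling with XPath.
    # On a reverse axis such as preceding-sibling, [1] is the closest node.
    header_via_xpath = ul.xpath("preceding-sibling::span[1]/text()")[0]
    assert header == header_via_xpath

    lines.append(header)
    lines.extend(ul.xpath(".//li//text()"))

for line in lines:
    print(line)
```

getprevious() relies on the span sitting directly before the ul. If other elements could appear between them, the preceding-sibling::span[1] form is likely the safer choice, since it skips anything that is not a span.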

Regarding python - lxml/python: get previous-sibling, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16262532/
