
python - Parsing a broken XML file and using XPath

Reposted. Author: 行者123. Updated: 2023-12-01 05:56:17

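I am sniffing packets on a network and using Scapy and Python to recover XML data from the raw payloads. When I reassemble the frames, the XML data I get is missing some tags, so I cannot parse it with etree.parse(). Is there a way to parse a broken XML file and then use XPath expressions to traverse it and extract the data I want?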

Best answer

I'm sure my solution is too simple to cover every case, but it should handle the simple cases where closing tags are missing:

>>> import re
>>> from lxml import etree
>>> def fix_xml(string):
    """
    Tries to insert missing closing XML tags
    """
    # Error text reported when a closing tag does not match the open element
    pattern = r"Opening and ending tag mismatch: (\w+) line (\d+) and (\w+), line (\d+), column (\d+)"
    error = True
    while error:
        try:
            # Put one tag per line
            string = string.replace('>', '>\n').replace('\n\n', '\n')
            root = etree.fromstring(string)
            error = False
        except etree.XMLSyntaxError as exc:
            text = str(exc)
            m = re.match(pattern, text)
            if m:
                # Retrieve where error took place
                missing, l1, closing, l2, c2 = m.groups()
                l1, l2, c2 = int(l1), int(l2), int(c2)
                lines = string.split('\n')
                print('Adding closing tag <{0}> at line {1}'.format(missing, l2))
                missing_line = lines[l2 - 1]
                # Modified line goes back to where it was
                lines[l2 - 1] = missing_line.replace('</{0}>'.format(closing), '</{0}></{1}>'.format(missing, closing))
                string = '\n'.join(lines)
            else:
                # Any other syntax error is not handled here
                raise
    print(string)
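
The repair loop depends entirely on the wording of the parser's mismatch error. Here is a minimal sketch of just that extraction step, run against a sample message. The sample text is an assumption written to fit the pattern, not captured parser output, and the variable names are illustrative. The real wording comes from the underlying parser and may differ between versions.

```python
import re

# Same pattern as in fix_xml, written as a raw string so the
# backslashes reach the regex engine unchanged.
PATTERN = r"Opening and ending tag mismatch: (\w+) line (\d+) and (\w+), line (\d+), column (\d+)"

# Hypothetical sample message shaped the way the pattern expects.
sample = "Opening and ending tag mismatch: C line 3 and B, line 4, column 5"

m = re.match(PATTERN, sample)
# Groups: unclosed tag, line where it opened, closing tag actually found,
# line and column where that closing tag was found.
missing, open_line, closing, close_line, column = m.groups()
print(missing, closing, int(close_line))  # C B 4
```

If the message does not match, re.match returns None. That is why fix_xml re-raises the exception in that case rather than guessing.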

fix_xml appears to correctly add the missing closing tags for B and C:

>>> s = """<A>
<B>
<C>
</B>
<B></A>"""
>>> fix_xml(s)
Adding closing tag <C> at line 4
Adding closing tag <B> at line 7
<A>
<B>
<C>
</C>
</B>
<B>
</B>
</A>
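
The question also asks about XPath, which the answer does not show. As a rough sketch, the repaired text printed above can be parsed normally and then queried. The variable names here are illustrative. The second half uses lxml's recovering parser, `etree.XMLParser(recover=True)`, which is a different technique from the answer's and may place the missing closing tags differently than fix_xml does.

```python
from lxml import etree

# The well-formed text printed by fix_xml(s) in the example above.
repaired = """<A>
<B>
<C>
</C>
</B>
<B>
</B>
</A>
"""

root = etree.fromstring(repaired)

# xpath() returns a list of matching elements.
b_elements = root.xpath("/A/B")   # B elements directly under the root A
c_elements = root.xpath("//C")    # C elements anywhere in the tree
print(len(b_elements), len(c_elements))  # 2 1

# Alternative, not part of the answer above: let lxml recover from the
# broken input by itself. The tree it builds is the parser's best guess
# and should be inspected before relying on it.
broken = "<A><B><C></B><B></A>"
recovering_parser = etree.XMLParser(recover=True)
recovered_root = etree.fromstring(broken, parser=recovering_parser)
print(recovered_root.tag)
```

Note that fix_xml only prints the repaired text. To query the result directly it would need to return `string` (or `root`) as well.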

Regarding "python - Parsing a broken XML file and using XPath", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12494277/
