python - How to extract data from XML using Python minidom

Reposted. Author: 行者123. Updated: 2023-12-01 06:07:42

Given this XML file, I would like to extract data from it. However, I am unable to extract the data from <LandmarkPointListXml> onwards.

XML file:

    <?xml version="1.0" encoding="utf-8"?>
    <Map xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <MapName>er</MapName>
      <MapURL>er.gif</MapURL>
      <Name>er</Name>
      <URL>er.gif</URL>
      <LandmarkPointListXml>
        <anyType xsi:type="LandmarkPointProperty">
          <LandmarkPointX>400</LandmarkPointX>
          <LandmarkPointY>292</LandmarkPointY>
          <LandmarkDesc>my room door</LandmarkDesc>
        </anyType>
        <anyType xsi:type="LandmarkPointProperty">
          <LandmarkPointX>399</LandmarkPointX>
          <LandmarkPointY>219</LandmarkPointY>
          <LandmarkDesc>bro room door</LandmarkDesc>
        </anyType>
      </LandmarkPointListXml>
      <RegionPointListXml />
    </Map>

Python program:

    def GetMapData(self):
        result = ""
        haha = self.XMLdoc.firstChild #root node
        for child in haha.childNodes:
            if (cmp(child.nodeName,'LandmarkPointListXml')==0):
                result = result + '|' + self.loopLandmark(child.childNodes) + '|'
            else:
                result = result + child.firstChild.nodeValue + ','
        return result

    def loopLandmark(self, landmarks):
        result=""
        haha=landmarks.getElementsByTagName('anyType')
        for child in haha.childNodes:
            if (cmp(haha.firstChild.nodeName,'LandmarkPointX') == 0):
                result=result+child.firstChild.nodeValue+','
                ChildNode = ChildNode.nextSibling
                result=result+child.firstChild.nodeValue+','
                ChildNode = ChildNode.nextSibling
                result=result+child.firstChild.nodeValue
        return result

I am able to retrieve the result "er,er.gif,er,er.gif" until the program reaches <LandmarkPointListXml>.

Best answer

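This code is quite fragile. It makes strong assumptions about the XML input and will fail if the XML is modified in perfectly valid ways (for example, if one element no longer comes immediately after the element the code expects it to follow).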
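One concrete source of this kind of fragility, offered here as an illustration rather than as part of the original answer: minidom keeps the whitespace between tags as text nodes. If the file is laid out with line breaks as in the listing above, `childNodes` contains more than the elements, and `firstChild` of a text node is `None`. The sketch below uses a shortened, hypothetical copy of the XML (`SAMPLE`).

```python
from xml.dom import minidom

# Shortened, hypothetical copy of the question's XML, kept on separate lines
# so that the whitespace between tags is preserved.
SAMPLE = """<Map>
<MapName>er</MapName>
<MapURL>er.gif</MapURL>
</Map>"""

root = minidom.parseString(SAMPLE).documentElement

# childNodes mixes element nodes with whitespace-only text nodes.
node_names = [child.nodeName for child in root.childNodes]
print(node_names)

# Filtering on nodeType keeps only the real elements.
elements = [c for c in root.childNodes if c.nodeType == c.ELEMENT_NODE]
element_names = [e.nodeName for e in elements]
print(element_names)
```

Code that calls `child.firstChild.nodeValue` on every child without such a filter raises an AttributeError as soon as it meets one of the text nodes.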

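I suggest parsing the XML with an established library such as ElementTree ( http://docs.python.org/library/xml.etree.elementtree.html ) or lxml ( http://lxml.de ). Both reject input that is not well-formed, and lxml can additionally validate your XML input against a DTD or schema.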
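As a small illustration of the checking ElementTree itself provides: it raises `ET.ParseError` on input that is not well-formed, but it does not validate against a schema. The helper name `is_well_formed` is hypothetical and only serves this sketch.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

# The first document is well-formed; the second never closes <Name>.
print(is_well_formed("<Map><Name>er</Name></Map>"))
print(is_well_formed("<Map><Name>er</Map>"))
```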

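The code I wrote below uses ElementTree and works with your XML input (I have removed the "self" parameter that tied the functions to a parent class). It also tolerates (ignores) empty values in XML elements.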

    import xml.etree.ElementTree as ET

    def GetMapData( xmlfile ):
        result = ""
        try:
            tree = ET.parse( xmlfile )
        except (IOError, ET.ParseError) as e:
            # Without a parsed tree there is nothing to extract.
            print("Failure Parsing %s: %s" % (xmlfile, e))
            return result
        root = tree.getroot() # root node
        for child in root:
            if ( child.tag == 'LandmarkPointListXml' ):
                result += '|' + loopLandmark(child) + '|'
            elif child.text is not None:
                result += child.text + ','
        return result

    def loopLandmark( landmarks ):
        result=""
        for landmark in landmarks:
            if ( landmark.tag == 'anyType' ): # check also xsi:type="LandmarkPointProperty"?
                for child in landmark:
                    # Only the coordinates are collected; LandmarkDesc is skipped.
                    if ( child.text and child.tag in [ 'LandmarkPointX', 'LandmarkPointY' ] ):
                        result += child.text + ','
        return result

    print(GetMapData( 'xml.in' ))
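For readers who prefer to stay with minidom, as the question's title asks, the following is a minimal sketch and not part of the original answer. It assumes the XML layout shown in the question (inlined here without the declaration line), checks the `xsi:type` attribute, and returns a list of tuples instead of a delimited string. The names `SAMPLE`, `text_of` and `get_landmarks` are hypothetical.

```python
from xml.dom import minidom

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# The question's XML, inlined so the sketch is self-contained.
SAMPLE = """<Map xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <MapName>er</MapName>
  <MapURL>er.gif</MapURL>
  <Name>er</Name>
  <URL>er.gif</URL>
  <LandmarkPointListXml>
    <anyType xsi:type="LandmarkPointProperty">
      <LandmarkPointX>400</LandmarkPointX>
      <LandmarkPointY>292</LandmarkPointY>
      <LandmarkDesc>my room door</LandmarkDesc>
    </anyType>
    <anyType xsi:type="LandmarkPointProperty">
      <LandmarkPointX>399</LandmarkPointX>
      <LandmarkPointY>219</LandmarkPointY>
      <LandmarkDesc>bro room door</LandmarkDesc>
    </anyType>
  </LandmarkPointListXml>
  <RegionPointListXml />
</Map>"""

def text_of(parent, tag):
    """Return the text of the first <tag> under parent, or '' if missing or empty."""
    matches = parent.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return ""
    return matches[0].firstChild.nodeValue

def get_landmarks(xml_text):
    """Collect (x, y, description) for each LandmarkPointProperty entry."""
    doc = minidom.parseString(xml_text)
    landmarks = []
    for item in doc.getElementsByTagName("anyType"):
        # Skip entries of any other xsi:type.
        if item.getAttributeNS(XSI_NS, "type") != "LandmarkPointProperty":
            continue
        landmarks.append((
            text_of(item, "LandmarkPointX"),
            text_of(item, "LandmarkPointY"),
            text_of(item, "LandmarkDesc"),
        ))
    return landmarks

landmarks = get_landmarks(SAMPLE)
print(landmarks)
```

Looking elements up by tag name with `getElementsByTagName`, rather than walking `firstChild` and `nextSibling`, means the result does not depend on element order or on whitespace between tags.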

Regarding "python - How to extract data from XML using Python minidom", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/7342394/
