
python - Convert XML to CSV using Python XML libraries

Reposted. Author: 行者123. Updated: 2023-12-01 23:47:36

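Sorry if this has already been asked. I have the following XML file that I want to convert to CSV or Excel. I want to extract NodeName together with each DestIpAddress child under IpRouteList, plus the values under the Custom/Name tags.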

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<EnbConfigGetResponse xmlns="http://Airspan.Netspan.WebServices">
<EnbConfigGetResult>
<ErrorCode>OK</ErrorCode>
<NodeResult>
<NodeResultCode>OK</NodeResultCode>
<NodeName>IMUMB0899</NodeName>
<NodeDetail>
<Custom>
<Name>Circle</Name>
<Value>MU</Value>
</Custom>
<Custom>
<Name>GW VLAN 601</Name>
<Value>2405:200:101::</Value>
</Custom>
<Custom>
<Name>GW VLAN 602</Name>
<Value>2405:200:104::</Value>
</Custom>
</NodeDetail>
<EnbConfig>
<Name>IMUMB0899</Name>
<Hardware>1000 SFP</Hardware>
<Description>TT</Description>
<Site>DND</Site>
<Region>DND</Region>
<Altitude>0</Altitude>
<NbifEventAlarmForwarding>Enabled</NbifEventAlarmForwarding>
<ENodeBType>Macro</ENodeBType>
<ENodeBID>397063</ENodeBID>
<M1SubnetMask>120</M1SubnetMask>
<IpRouteList>
<IpRoute>
<DestIpAddress>172.172.6.20</DestIpAddress>
<IpSubnetMask>255.255.255.255</IpSubnetMask>
<GatewayIpAddress>172.21.200.1</GatewayIpAddress>
</IpRoute>
<IpRoute>
<DestIpAddress>2405:20:1::</DestIpAddress>
<IpSubnetMask>40</IpSubnetMask>
<GatewayIpAddress>2405:20:101:4:7:2:61:1</GatewayIpAddress>
</IpRoute>
</IpRouteList>
</EnbConfig>
</NodeResult>
</EnbConfigGetResult>
</EnbConfigGetResponse>
</soap:Body>
</soap:Envelope>

I tried the code below to extract the name and the IP routes, but when I merge the two, I get the NodeName with only one IP route even though two are available.

from bs4 import BeautifulSoup
import pandas as pd
import lxml
import xml.etree.ElementTree  # cElementTree was removed in Python 3.9
import openpyxl
import inspect
import os

sites = "xml"

with open(sites, "r", encoding='unicode_escape') as f:
    xml_data = f.read()

soup = BeautifulSoup(xml_data, "xml")

tag1 = input("Enter tagname1:")
tag2 = input("Enter tagname2:")

data = []
dd = []

for td in soup.find_all(tag1):
    data.append({"NodeName": td.text})

for iproute in soup.find_all(tag2):
    dd.append({"IpRoute": iproute.text})

df1 = pd.DataFrame(data)
df2 = pd.DataFrame(dd)

df = pd.merge(df1,df2,left_index=True, right_index=True)

df.to_excel(sites + '.xlsx', sheet_name='Detail', index = False)
print("*************Done*************")

Expected output: [image]

Best answer

Another approach.

from simplified_scrapy import SimplifiedDoc, utils, req
# xml = utils.getFileContent('file.xml')
xml = ''' Your xml string'''

doc = SimplifiedDoc(xml)
lstNodeResult = doc.selects('NodeResult')
data = [['NodeName','DestIpAddress','GatewayIpAddress','value1','value2','value3']]
for result in lstNodeResult:
    lstCustom = result.selects('NodeDetail>Custom')
    if lstCustom:
        lstCustom = lstCustom.Value.text
    NodeName = result.NodeName.text
    lstIpRoute = result.IpRoutes
    for IpRoute in lstIpRoute:
        row = [NodeName,IpRoute.DestIpAddress.text,IpRoute.GatewayIpAddress.text]
        if lstCustom: row.extend(lstCustom)
        data.append(row)
# print (data)
utils.save2csv('test.csv',data)

# Or
data = {
    'NodeName':lstNodeResult.NodeName.text,
    'DestIpAddress':lstNodeResult.select('IpRoute>DestIpAddress>text()'),
    'GatewayIpAddress':lstNodeResult.select('IpRoute>GatewayIpAddress>text()')
}
# print (data)

Result:

.......
I-MU-NVMB-OSC-0900-SMC004,2405:200:310:5a::,2405:200:101:500:7:2:602:3503
I-MU-NVMB-OSC-0900-SMC004,2405:200:310:1::,2405:200:101:500:7:2:602:3503
I-MU-NVMB-ISC-0181-SWC0002,2405:200:310:1::,2405:200:101:500:7:2:602:5d03,MU,2405:200:101:500:7:2:601:5d03,2405:200:101:500:7:2:602:5d03
I-MU-NVMB-ISC-0181-SWC0002,2405:200:311:2::,2405:200:101:500:7:2:602:5d03,MU,2405:200:101:500:7:2:601:5d03,2405:200:101:500:7:2:602:5d03
I-MU-NVMB-ISC-0181-SWC0002,2405:200:310:a152::,2405:200:101:500:7:2:602:5d03,MU,2405:200:101:500:7:2:601:5d03,2405:200:101:500:7:2:602:5d03
.......

More examples here: https://github.com/yiyedata/simplified-scrapy-demo/tree/master/doc_examples
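For comparison, the same extraction can be sketched with only the standard library (xml.etree.ElementTree and csv). In the question's code, pd.merge with left_index=True, right_index=True is an inner join on the row index by default, so a single NodeName row can only pair with the first IpRoute row. Walking each NodeResult and emitting one row per IpRoute avoids that. The helper names node_rows and to_csv, the header labels, and the trimmed SAMPLE document are illustrative assumptions, not part of the original answer.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Default namespace declared on EnbConfigGetResponse in the question's XML.
NS = {"a": "http://Airspan.Netspan.WebServices"}

# Trimmed, well-formed stand-in for the question's document (assumption).
SAMPLE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<EnbConfigGetResponse xmlns="http://Airspan.Netspan.WebServices">
<EnbConfigGetResult>
<NodeResult>
<NodeName>IMUMB0899</NodeName>
<NodeDetail>
<Custom><Name>Circle</Name><Value>MU</Value></Custom>
</NodeDetail>
<EnbConfig>
<IpRouteList>
<IpRoute><DestIpAddress>172.172.6.20</DestIpAddress><GatewayIpAddress>172.21.200.1</GatewayIpAddress></IpRoute>
<IpRoute><DestIpAddress>2405:20:1::</DestIpAddress><GatewayIpAddress>2405:20:101:4:7:2:61:1</GatewayIpAddress></IpRoute>
</IpRouteList>
</EnbConfig>
</NodeResult>
</EnbConfigGetResult>
</EnbConfigGetResponse>
</soap:Body>
</soap:Envelope>"""


def node_rows(xml_text):
    """Return one row per IpRoute, repeating NodeName and the Custom values."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.iter("{%s}NodeResult" % NS["a"]):
        name = result.findtext("a:NodeName", default="", namespaces=NS)
        customs = [
            custom.findtext("a:Value", default="", namespaces=NS)
            for custom in result.findall("a:NodeDetail/a:Custom", NS)
        ]
        for route in result.findall(".//a:IpRoute", NS):
            rows.append([
                name,
                route.findtext("a:DestIpAddress", default="", namespaces=NS),
                route.findtext("a:GatewayIpAddress", default="", namespaces=NS),
                *customs,
            ])
    return rows


def to_csv(rows, header):
    """Serialize the header and rows to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


rows = node_rows(SAMPLE)
csv_text = to_csv(rows, ["NodeName", "DestIpAddress", "GatewayIpAddress", "value1"])
```

To write a file instead of a string, the same csv.writer can be pointed at open('test.csv', 'w', newline='', encoding='utf-8').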

Removing empty rows.

def delEmptyRows(name, encoding="utf-8"):
    lines = utils.getFileLines(name, encoding=encoding)
    lines = [line for line in lines if line.strip()!='']
    utils.saveFile(name, "".join(lines), encoding=encoding)
# use
delEmptyRows('test.csv')
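A rough standard-library equivalent of the same cleanup, in case the simplified_scrapy helpers are not available. The function name drop_blank_lines is a hypothetical stand-in. Note the assumption that no CSV field contains a quoted, embedded blank line, since this filter works on raw lines and would remove it too.

```python
from pathlib import Path


def drop_blank_lines(path, encoding="utf-8"):
    """Rewrite the file without whitespace-only lines; return how many were dropped."""
    p = Path(path)
    lines = p.read_text(encoding=encoding).splitlines(keepends=True)
    kept = [line for line in lines if line.strip()]
    p.write_text("".join(kept), encoding=encoding)
    return len(lines) - len(kept)
```

When the CSV is written with Python's csv module, blank rows on Windows are commonly caused by opening the output file without newline='', so passing that argument to open() may remove the need for this step.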

How to get the IBridge2RfStatsRow data.

from simplified_scrapy import SimplifiedDoc, utils, req
xml = req.get('https://pastebin.com/raw/SWhDM1zq')

rows = []
doc = SimplifiedDoc(xml)
lstIBridge2RfStatsRow = doc.selects('IBridge2RfStatsRow').children
# Add header
header = []
for col in lstIBridge2RfStatsRow[0]:
    header.append(col.tag)
rows.append(header)

# Generate line
for row in lstIBridge2RfStatsRow:
    rows.append([col.text for col in row])

utils.save2csv('IBridge2RfStatsRow.csv',rows)
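The header-from-first-row idea above can also be sketched with xml.etree.ElementTree. The pastebin document is not reproduced here, so the SAMPLE below, its namespace, and its Name/Rssi fields are invented purely for illustration, and table_from_elements and local_name are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the real IBridge2RfStatsRow fields are not shown in this thread.
SAMPLE = """<Stats xmlns="http://example.invalid/ns">
  <IBridge2RfStatsRow><Name>A</Name><Rssi>-60</Rssi></IBridge2RfStatsRow>
  <IBridge2RfStatsRow><Name>B</Name><Rssi>-72</Rssi></IBridge2RfStatsRow>
</Stats>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def table_from_elements(xml_text, row_tag):
    """Build [header, row, row, ...] from every element whose local name is row_tag."""
    root = ET.fromstring(xml_text)
    elements = [el for el in root.iter() if local_name(el.tag) == row_tag]
    if not elements:
        return []
    # Column names come from the children of the first matching element.
    table = [[local_name(child.tag) for child in elements[0]]]
    for el in elements:
        table.append([(child.text or "").strip() for child in el])
    return table


table = table_from_elements(SAMPLE, "IBridge2RfStatsRow")
```

The resulting list of lists can be passed to csv.writer(...).writerows(table). This sketch assumes every row element has the same children in the same order as the first one.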

Regarding "python - Convert XML to CSV using Python XML libraries", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/63961184/
