python - How to extract from a nested dictionary using a list comprehension

Reposted. Author: 行者123. Updated: 2023-11-30 23:33:07

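I am trying to extract some data from XML. I use xmltodict to load the data into a dictionary, then use list comprehensions to pull the individual pieces out into separate lists. Later I will plot these with matplotlib.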

XML:

<?xml version="1.0" ?>
<MYDATA>
<SESSION ID="1234">
<INFO>
<BEGIN LOAD="23"/>
</INFO>
<TRANSACTION ID="2103645570">
<ANSWER>Hello</ANSWER>
</TRANSACTION>
<TRANSACTION ID="4315547431">
<ANSWER>This is an answer</ANSWER>
</TRANSACTION>
</SESSION>
<SESSION ID="5678">
<INFO>
<BEGIN LOAD="28"/>
</INFO>
<TRANSACTION ID="4099381642">
<ANSWER>Hello</ANSWER>
</TRANSACTION>
<TRANSACTION ID="1220404184">
<ANSWER>A Different answer</ANSWER>
</TRANSACTION>
<TRANSACTION ID="201506542">
<ANSWER>Yet another one</ANSWER>
</TRANSACTION>
</SESSION>
</MYDATA>

My code:

from collections import OrderedDict

# doc contains the xml exactly as loaded by xmltodict
doc = OrderedDict([(u'MYDATA', OrderedDict([(u'SESSION', [OrderedDict([(u'@ID', u'1234'), (u'INFO', OrderedDict([(u'BEGIN', OrderedDict([(u'@LOAD', u'23')]))])), (u'TRANSACTION', [OrderedDict([(u'@ID', u'2103645570'), (u'ANSWER', u'Hello')]), OrderedDict([(u'@ID', u'4315547431'), (u'ANSWER', u'This is an answer')])])]), OrderedDict([(u'@ID', u'5678'), (u'INFO', OrderedDict([(u'BEGIN', OrderedDict([(u'@LOAD', u'28')]))])), (u'TRANSACTION', [OrderedDict([(u'@ID', u'4099381642'), (u'ANSWER', u'Hello')]), OrderedDict([(u'@ID', u'1220404184'), (u'ANSWER', u'A Different answer')]), OrderedDict([(u'@ID', u'201506542'), (u'ANSWER', u'Yet another one')])])])])]))])

sess_ids = [i['@ID'] for i in doc['MYDATA']['SESSION']]
print sess_ids

sess_loads = [i['INFO']['BEGIN']['@LOAD'] for i in doc['MYDATA']['SESSION']]
print sess_loads

trans_ids = [[j['@ID'] for j in i['TRANSACTION']] for i in doc['MYDATA']['SESSION']]
print trans_ids

Output:

sess_ids:    [u'1234', u'5678']
sess_loads: [u'23', u'28']
trans_ids: [[u'2103645570', u'4315547431'], [u'4099381642', u'1220404184', u'201506542']]

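As you can see, I am able to access the ID attribute of the SESSION elements and the LOAD attribute of the BEGIN elements.

I need to get the ID attributes of the TRANSACTION elements as a single list. At the moment, the variable trans_ids holds a list of lists.

How can I get a simple flat list of values?

I have already tried: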

[j['@ID'] for j in i['TRANSACTION'] for i in doc['MYDATA']['SESSION']]

But this only returns the second session's transaction IDs, each repeated twice:

[u'4099381642',
u'4099381642',
u'1220404184',
u'1220404184',
u'201506542',
u'201506542']

Best answer

Do you need to go through a dictionary at all? This kind of thing is fairly simple to do on the XML directly:

import xml.etree.ElementTree as etree
# xml_text is assumed to hold the XML above as a string. etree.parse() expects
# a file name or file object, so a string is parsed with fromstring() instead.
txml = etree.fromstring(xml_text)
txml.findall('SESSION/TRANSACTION')
[<Element TRANSACTION at 0x4064f9d8>,
<Element TRANSACTION at 0x4064fa20>,
<Element TRANSACTION at 0x4064f990>,
<Element TRANSACTION at 0x4064fa68>,
<Element TRANSACTION at 0x4064fab0>]
[x.get('ID') for x in txml.findall('SESSION/TRANSACTION')]
['2103645570', '4315547431', '4099381642', '1220404184', '201506542']
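A self-contained sketch of the same lookup, in case it helps to run it directly. It assumes the XML from the question is held in a string called xml_text (a name chosen here, not from the original post) and extends the idea to the session IDs and LOAD values the question also extracts:

```python
import xml.etree.ElementTree as ET

# The XML from the question, held in a string (xml_text is a name chosen for this sketch).
xml_text = """<?xml version="1.0" ?>
<MYDATA>
  <SESSION ID="1234">
    <INFO><BEGIN LOAD="23"/></INFO>
    <TRANSACTION ID="2103645570"><ANSWER>Hello</ANSWER></TRANSACTION>
    <TRANSACTION ID="4315547431"><ANSWER>This is an answer</ANSWER></TRANSACTION>
  </SESSION>
  <SESSION ID="5678">
    <INFO><BEGIN LOAD="28"/></INFO>
    <TRANSACTION ID="4099381642"><ANSWER>Hello</ANSWER></TRANSACTION>
    <TRANSACTION ID="1220404184"><ANSWER>A Different answer</ANSWER></TRANSACTION>
    <TRANSACTION ID="201506542"><ANSWER>Yet another one</ANSWER></TRANSACTION>
  </SESSION>
</MYDATA>"""

# fromstring() parses a string and returns the root element (MYDATA here).
root = ET.fromstring(xml_text)

# Paths are relative to the root element; get() reads an attribute value.
sess_ids = [s.get('ID') for s in root.findall('SESSION')]
sess_loads = [b.get('LOAD') for b in root.findall('SESSION/INFO/BEGIN')]
trans_ids = [t.get('ID') for t in root.findall('SESSION/TRANSACTION')]

print(sess_ids)
print(sess_loads)
print(trans_ids)
```

Because findall() walks every SESSION in document order, the transaction IDs come back already flattened into one list.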

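At least, it looks more compact to me.

Regarding python - How to extract from a nested dictionary using a list comprehension, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19099344/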
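If staying with the dictionary is preferred, a flat list can also come from a single comprehension. The for clauses of a comprehension run left to right like nested loops, so the outer loop (sessions) has to come first. The attempt in the question put the inner loop first; under Python 2 the name i was still bound to the last session from the earlier comprehension, which would explain why only the second session's IDs showed up, each repeated once per session. The sketch below is written for Python 3 and uses a trimmed plain-dict stand-in for the xmltodict result (the names sessions, s, and t are chosen here). It also assumes every SESSION has several TRANSACTION children, since xmltodict returns a single dict rather than a list when there is only one.

```python
from itertools import chain

# Trimmed stand-in for the xmltodict result in the question
# (INFO and ANSWER are left out because they are not needed here).
doc = {'MYDATA': {'SESSION': [
    {'@ID': '1234', 'TRANSACTION': [
        {'@ID': '2103645570'}, {'@ID': '4315547431'}]},
    {'@ID': '5678', 'TRANSACTION': [
        {'@ID': '4099381642'}, {'@ID': '1220404184'}, {'@ID': '201506542'}]},
]}}

sessions = doc['MYDATA']['SESSION']

# Outer loop first (sessions), inner loop second (transactions of each session).
trans_ids = [t['@ID'] for s in sessions for t in s['TRANSACTION']]

# An equivalent form: chain the per-session lists together, then read the IDs.
trans_ids_chained = [
    t['@ID'] for t in chain.from_iterable(s['TRANSACTION'] for s in sessions)
]

print(trans_ids)
```

Both forms keep the transactions in document order; the chained version may read more clearly once more than two levels of nesting are involved.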
