
python - How to unpack multiple dictionary objects inside a list within a DataFrame row?

Reposted. Author: 行者123. Updated: 2023-11-30 22:11:04

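I have a DataFrame in which each row holds a single list of dictionaries. The lists vary in length and the dictionaries do not all share the same keys, as shown below: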

ID    unnest_column

1 [{'abc': 11, 'def': 1},{'abc': 15, 'def': 1},
{'abc': 16, 'def': 1},
{'abc': 17, 'def': 1},
{'abc': 18, 'def': 1, 'ghi': 'abc'},
{'abc': 23, 'def': 'xxx', 'def': 1},
{'abc': 23, 'def': 'xxx', 'def': 2},
{'abc': 23, 'def': 'xxx', 'def': 4}]


2 [{'abc': 11, 'def': 1}]

How can I unpack the dictionaries in each list and create one column per key?

The result could be a new df. I am not sure exactly what it should look like; I only need the keys turned into columns:

id    abc    def     ghi

1 2 3 abc

Best answer

IIUC, starting from

import pandas as pd

df = pd.DataFrame()
df['x'] = [[{'QuestionId': 11, 'ResponseId': 1},{'QuestionId': 15, 'ResponseId': 1},
{'QuestionId': 16, 'ResponseId': 1},
{'QuestionId': 17, 'ResponseId': 1},
{'QuestionId': 18, 'ResponseId': 1, 'Value': 'abc'},
{'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 1},
{'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 2},
{'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 4}],
[{'QuestionId': 11, 'ResponseId': 1}]]

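you can sum the lists to concatenate them into one flat list of dictionaries, then pass that list to the DataFrame constructor. Keys missing from a dictionary show up as NaN: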

new_df = pd.DataFrame(df.x.values.sum())


  DataLabel  QuestionId  ResponseId Value
0       NaN          11           1   NaN
1       NaN          15           1   NaN
2       NaN          16           1   NaN
3       NaN          17           1   NaN
4       NaN          18           1   abc
5       xxx          23           1   NaN
6       xxx          23           2   NaN
7       xxx          23           4   NaN
8       NaN          11           1   NaN
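
Summing lists builds a new list at every step, so it can get slow when there are many rows. As a hedged alternative (not part of the original answer), the sketch below flattens the lists with itertools.chain.from_iterable instead. The shortened sample data and the names records and flat_df are illustrative assumptions.

```python
from itertools import chain

import pandas as pd

# Shortened version of the sample data from the answer (illustrative only).
df = pd.DataFrame({
    "x": [
        [
            {"QuestionId": 11, "ResponseId": 1},
            {"QuestionId": 18, "ResponseId": 1, "Value": "abc"},
            {"QuestionId": 23, "DataLabel": "xxx", "ResponseId": 2},
        ],
        [{"QuestionId": 11, "ResponseId": 1}],
    ]
})

# Flatten the per-row lists into one list of dicts in a single pass.
records = list(chain.from_iterable(df["x"]))

# Keys that are missing from a given dict become NaN in the new frame.
flat_df = pd.DataFrame(records)
print(flat_df)
```

Like the sum version, this yields a fresh 0..n-1 index, so the link back to the original rows is lost.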

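If you want to keep the original index, build a list inds that repeats each row label once per dictionary in that row, and pass it to the constructor as the index argument: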

# Repeat each row label once per dict; Series.items() replaces iteritems(), which pandas 2.0 removed
inds = [i for i, v in df.x.items() for _ in v]
pd.DataFrame(df.x.values.sum(), index=inds)

  DataLabel  QuestionId  ResponseId Value
0       NaN          11           1   NaN
0       NaN          15           1   NaN
0       NaN          16           1   NaN
0       NaN          17           1   NaN
0       NaN          18           1   abc
0       xxx          23           1   NaN
0       xxx          23           2   NaN
0       xxx          23           4   NaN
1       NaN          11           1   NaN
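
The same index-preserving result can probably be reached without building the index list by hand. The sketch below is not from the original answer: it assumes pandas 0.25 or later (for Series.explode) and adds a hypothetical ID column to mirror the question's layout. Rows with empty lists are dropped here, which is an assumption about the desired behavior.

```python
import pandas as pd

# Shortened sample data; the ID column is a hypothetical stand-in
# for the question's "ID" column.
df = pd.DataFrame({
    "ID": [1, 2],
    "x": [
        [
            {"QuestionId": 11, "ResponseId": 1},
            {"QuestionId": 18, "ResponseId": 1, "Value": "abc"},
            {"QuestionId": 23, "DataLabel": "xxx", "ResponseId": 2},
        ],
        [{"QuestionId": 11, "ResponseId": 1}],
    ],
})

# One dict per row; the original row label is repeated for each dict.
# dropna() discards the NaN placeholder that explode emits for empty lists.
exploded = df["x"].explode().dropna()

# Turn each dict into a row of columns, reusing the repeated labels.
expanded = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Attach the ID of the originating row to every unpacked dict.
result = df[["ID"]].join(expanded)
print(result)
```

Because the index is carried through, each unpacked row can still be traced back to the row it came from.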

Regarding "python - How to unpack multiple dictionary objects inside a list within a DataFrame row?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51528643/
