
python - Convert dictionaries to a DataFrame in Python

Reposted. Author: 搜寻专家. Last updated: 2023-10-30 22:07:54

How can I read a file into a DataFrame with pandas in Python?

The file contains the following:

{"headers": {"ai5": "8fa683e59c02c04cb781ac689686db07", "debug": null, "random": null, "sdkv": "7.6"}, "post": {"event": "ggstart", "ts": "1462759195259"}, "params": {}, "bottle": {"timestamp": "2016-05-09 02:00:00.004906", "game_id": "55107008"}}
{"headers": {"ai5": "335644267c1d5f04eaea7bc6f51b1861", "debug": null, "random": null, "sdkv": "7.6"}, "post": {"event": "ggstart", "ts": "1462759189745"}, "params": {}, "bottle": {"timestamp": "2016-05-09 02:00:00.033775", "game_id": "55107008"}}

... (many more lines like these follow)

How can I load it into a DataFrame, with the dictionary keys as the column headers?

Best answer

First, use read_json with the parameter lines=True, so that each line of the file is parsed as a separate JSON object:

import pandas as pd

df = pd.read_json('file.json', lines=True)
print(df)
                                              bottle  \
0  {'timestamp': '2016-05-09 02:00:00.004906', 'g...
1  {'timestamp': '2016-05-09 02:00:00.033775', 'g...

                                             headers  params  \
0  {'ai5': '8fa683e59c02c04cb781ac689686db07', 'r...      {}
1  {'ai5': '335644267c1d5f04eaea7bc6f51b1861', 'r...      {}

                                          post
0  {'event': 'ggstart', 'ts': '1462759195259'}
1  {'event': 'ggstart', 'ts': '1462759189745'}
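
To try this step without a file on disk, here is a minimal sketch that feeds the two sample records from the question through io.StringIO. The variable name raw is only illustrative, and StringIO stands in for 'file.json'.

```python
import io

import pandas as pd

# The two sample records from the question, one JSON object per line.
raw = (
    '{"headers": {"ai5": "8fa683e59c02c04cb781ac689686db07", "debug": null, '
    '"random": null, "sdkv": "7.6"}, "post": {"event": "ggstart", '
    '"ts": "1462759195259"}, "params": {}, "bottle": {"timestamp": '
    '"2016-05-09 02:00:00.004906", "game_id": "55107008"}}\n'
    '{"headers": {"ai5": "335644267c1d5f04eaea7bc6f51b1861", "debug": null, '
    '"random": null, "sdkv": "7.6"}, "post": {"event": "ggstart", '
    '"ts": "1462759189745"}, "params": {}, "bottle": {"timestamp": '
    '"2016-05-09 02:00:00.033775", "game_id": "55107008"}}\n'
)

# StringIO stands in for the file; lines=True parses each line on its own.
df = pd.read_json(io.StringIO(raw), lines=True)

# Each top-level key becomes a column whose cells still hold nested dicts.
print(df.shape)
print(df.loc[0, "post"])
```

At this point the nested dictionaries have not been expanded yet, which is why the next step is needed.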

Then expand each column of nested dictionaries into its own DataFrame and concat the results. The output has a MultiIndex in the columns:

df = pd.concat([pd.DataFrame(df[x].values.tolist()) for x in df], axis=1, keys=df.columns)
print(df)
     bottle                                                       headers  \
    game_id                   timestamp                               ai5
0  55107008  2016-05-09 02:00:00.004906  8fa683e59c02c04cb781ac689686db07
1  55107008  2016-05-09 02:00:00.033775  335644267c1d5f04eaea7bc6f51b1861

                           post
   debug  random  sdkv    event             ts
0   None    None   7.6  ggstart  1462759195259
1   None    None   7.6  ggstart  1462759189745
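
Once the columns form a MultiIndex, individual fields are addressed by a (top level, sub-key) pair. The following is a rough sketch on a trimmed version of the sample data (the "params" and "bottle" keys and the null fields are left out for brevity); the names records, wide, ts and headers are illustrative only.

```python
import pandas as pd

# Trimmed records based on the question's sample data.
records = [
    {"headers": {"ai5": "8fa683e59c02c04cb781ac689686db07", "sdkv": "7.6"},
     "post": {"event": "ggstart", "ts": "1462759195259"}},
    {"headers": {"ai5": "335644267c1d5f04eaea7bc6f51b1861", "sdkv": "7.6"},
     "post": {"event": "ggstart", "ts": "1462759189745"}},
]
df = pd.DataFrame(records)

# Same expansion as in the answer: one small DataFrame per nested column,
# glued side by side, with the original column names as the outer level.
wide = pd.concat(
    [pd.DataFrame(df[col].values.tolist()) for col in df],
    axis=1,
    keys=df.columns,
)

ts = wide[("post", "ts")]   # a tuple selects a single column as a Series
headers = wide["headers"]    # a top-level key selects all of its sub-columns

print(ts.tolist())
print(list(headers.columns))
```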

A slower alternative uses apply(pd.Series):

# df here is the frame returned by read_json, before any expansion
df = pd.concat([df[x].apply(pd.Series) for x in df], axis=1, keys=df.columns)
print(df)
     bottle                                                       headers  \
    game_id                   timestamp                               ai5
0  55107008  2016-05-09 02:00:00.004906  8fa683e59c02c04cb781ac689686db07
1  55107008  2016-05-09 02:00:00.033775  335644267c1d5f04eaea7bc6f51b1861

                           post
   debug  random  sdkv    event             ts
0   None    None   7.6  ggstart  1462759195259
1   None    None   7.6  ggstart  1462759189745
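
The two expansions can be compared on a single column. This is a small sketch, using only the "post" values from the sample data, that checks both give the same columns and values; the names posts, fast and slow are illustrative. The apply(pd.Series) variant builds one Series object per row, which is the usual reason it is slower on large inputs.

```python
import pandas as pd

# The "post" dictionaries from the two sample records.
posts = pd.Series([
    {"event": "ggstart", "ts": "1462759195259"},
    {"event": "ggstart", "ts": "1462759189745"},
])

fast = pd.DataFrame(posts.values.tolist())  # one constructor call for all rows
slow = posts.apply(pd.Series)               # one Series built per row

# Compare labels and cell values of the two results.
same_columns = fast.columns.tolist() == slow.columns.tolist()
same_values = fast.values.tolist() == slow.values.tolist()
print(same_columns, same_values)
```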

To remove the MultiIndex, add map to join the two column levels into a single name:

# df here is the frame returned by read_json, before any expansion
df = pd.concat([pd.DataFrame(df[x].values.tolist()) for x in df], axis=1, keys=df.columns)
df.columns = df.columns.map('_'.join)
print(df)
   bottle_game_id            bottle_timestamp  \
0        55107008  2016-05-09 02:00:00.004906
1        55107008  2016-05-09 02:00:00.033775

                        headers_ai5  headers_debug  headers_random  headers_sdkv  \
0  8fa683e59c02c04cb781ac689686db07           None            None           7.6
1  335644267c1d5f04eaea7bc6f51b1861           None            None           7.6

   post_event        post_ts
0     ggstart  1462759195259
1     ggstart  1462759189745
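
As a different route to similarly named flat columns (not part of the answer above), pandas also provides pd.json_normalize, which flattens nested dictionaries directly. The sketch below assumes pandas 1.0 or later and parses the sample lines with json.loads; raw_lines, records and flat are illustrative names, and for a real file the list could be built by iterating over the open file instead.

```python
import json

import pandas as pd

# The two sample lines from the question, as they would appear in the file.
raw_lines = [
    '{"headers": {"ai5": "8fa683e59c02c04cb781ac689686db07", "debug": null, '
    '"random": null, "sdkv": "7.6"}, "post": {"event": "ggstart", '
    '"ts": "1462759195259"}, "params": {}, "bottle": {"timestamp": '
    '"2016-05-09 02:00:00.004906", "game_id": "55107008"}}',
    '{"headers": {"ai5": "335644267c1d5f04eaea7bc6f51b1861", "debug": null, '
    '"random": null, "sdkv": "7.6"}, "post": {"event": "ggstart", '
    '"ts": "1462759189745"}, "params": {}, "bottle": {"timestamp": '
    '"2016-05-09 02:00:00.033775", "game_id": "55107008"}}',
]

# Parse each line into a dict, then flatten nested keys joined with "_".
records = [json.loads(line) for line in raw_lines]
flat = pd.json_normalize(records, sep="_")

print(sorted(flat.columns))
print(flat["post_ts"].tolist())
```

The column order may differ from the concat-based result, so it is safer to select columns by name than by position.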

Regarding "python - Convert dictionaries to a DataFrame in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43368328/
