
python - Flatten multi-level nested JSON into a pandas DataFrame

Reposted. Author: 行者123. Updated: 2023-12-02 01:26:17

I'm trying to flatten this JSON response into a pandas DataFrame so I can export it to CSV.

It looks like this:

j = [
    {
        "id": 401281949,
        "teams": [
            {
                "school": "Louisiana Tech",
                "conference": "Conference USA",
                "homeAway": "away",
                "points": 34,
                "stats": [
                    {"category": "rushingTDs", "stat": "1"},
                    {"category": "puntReturnYards", "stat": "24"},
                    {"category": "puntReturnTDs", "stat": "0"},
                    {"category": "puntReturns", "stat": "3"},
                ],
            }
        ],
    }
]

...with more items in the stats list. If I run this to flatten down to the team level:

import pandas as pd
multiple_level_data = pd.json_normalize(j, record_path=['teams'])

I get:

           school      conference homeAway  points                                              stats
0  Louisiana Tech  Conference USA     away      34  [{'category': 'rushingTDs', 'stat': '1'}, {'ca...

How can I flatten it a second time so that each stat gets its own column on the team's row?

If I do this:

multiple_level_data = pd.json_normalize(j, record_path=['teams'])
multiple_level_data = multiple_level_data.explode('stats').reset_index(drop=True)
multiple_level_data = multiple_level_data.join(pd.json_normalize(multiple_level_data.pop('stats')))

I end up with more rows instead of more columns: one row per stat, with separate category and stat columns.

Best answer

You can try:

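import pandas as pd

# One row per team: explode the "teams" list, then expand each team dict into columns.
df = pd.DataFrame(j).explode("teams")
df = pd.concat([df, df.pop("teams").apply(pd.Series)], axis=1)

# Collapse each list of {"category": ..., "stat": ...} dicts into a single {category: stat} dict.
df["stats"] = df["stats"].apply(lambda x: {d["category"]: d["stat"] for d in x})

# Expand the stats dict into one column per category.
df = pd.concat(
    [
        df,
        df.pop("stats").apply(pd.Series),
    ],
    axis=1,
)

print(df)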

Prints:

          id          school      conference homeAway  points rushingTDs puntReturnYards puntReturnTDs puntReturns
0  401281949  Louisiana Tech  Conference USA     away      34          1              24             0           3
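
As an alternative that is not part of the original answer, the same wide table can probably be reached by letting pd.json_normalize walk both nesting levels and then pivoting the result. The sketch below is only one way to do it: the names team_cols, long_df, wide_df and csv_text are made up for illustration, it assumes pandas 1.1 or later (for a list-valued index in pivot), and it assumes each team has at most one entry per category, since pivot raises a ValueError on duplicates.

```python
import pandas as pd

# Sample data copied from the question.
j = [
    {
        "id": 401281949,
        "teams": [
            {
                "school": "Louisiana Tech",
                "conference": "Conference USA",
                "homeAway": "away",
                "points": 34,
                "stats": [
                    {"category": "rushingTDs", "stat": "1"},
                    {"category": "puntReturnYards", "stat": "24"},
                    {"category": "puntReturnTDs", "stat": "0"},
                    {"category": "puntReturns", "stat": "3"},
                ],
            }
        ],
    }
]

# Team-level fields to carry along (illustrative name, not from the original post).
team_cols = ["school", "conference", "homeAway", "points"]

# Long format: one row per stat, with the game id and team fields repeated as metadata.
# Team-level metadata columns are named "teams.<field>" by json_normalize.
long_df = pd.json_normalize(
    j,
    record_path=["teams", "stats"],
    meta=["id"] + [["teams", col] for col in team_cols],
)

# Wide format: one row per (game, team), one column per stat category.
index_cols = ["id"] + [f"teams.{col}" for col in team_cols]
wide_df = (
    long_df.pivot(index=index_cols, columns="category", values="stat")
    .reset_index()
    .rename_axis(None, axis=1)  # drop the leftover "category" label on the columns
    .rename(columns=lambda c: c[len("teams."):] if c.startswith("teams.") else c)
)

# With no path argument, to_csv returns the CSV text; pass a filename to write a file instead.
csv_text = wide_df.to_csv(index=False)
print(wide_df)
```

Note that pivot orders the stat columns alphabetically rather than in their original order, and the stat values stay as strings, exactly as they appear in the JSON.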

Regarding "python - Flatten multi-level nested JSON into a pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/74538822/
