
python - Normalizing complex JSON in Python

Reposted. Author: 行者123. Updated: 2023-12-01 07:15:08

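I am trying to use the pandas json_normalize function, but I cannot flatten the nested arrays inside the JSON.

I tried following one of the examples, but with this code it only gives me a single record: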

from pandas import json_normalize  # pandas >= 1.0; the pandas.io.json import path is deprecated

data = {
    "id": "001",
    "counties": [{"name": "y"}, {"name": "X"}],
    "extendedDescriptionFrench": "Fromage Brick Petit Gaspesien",
    "brand": "PETIT GASPESIEN",
    "brandFrench": "PETIT GASPESIEN",
    "productLife": "90",
    "digitalAssetFoodservice": [
        {
            "digitalAssetFormatFoodservice": "JPG",
            "digitalAssetGDTIFoodservice": "754000000016500000000002167445",
            "digitalAssetImageVersionDateTimeFoodservice": "2016-06-28T20:06:06.000-04:00",
            "digitalAssetStateFoodservice": "P",
        },
        {
            "digitalAssetFormatFoodservice": "JPG",
            "digitalAssetGDTIFoodservice": "754000000016500000000002167597",
            "digitalAssetImageTypeFoodservice": "M",
            "digitalAssetImageVersionDateTimeFoodservice": "2016-06-28T20:06:06.000-04:00",
        },
        {
            "digitalAssetFormatFoodservice": "JPG",
            "digitalAssetGDTIFoodservice": "754000000016500000000002167687",
            "digitalAssetImageTypeFoodservice": "C",
            "digitalAssetImageVersionDateTimeFoodservice": "2016-06-28T20:06:06.000-04:00",
            "digitalAssetStateFoodservice": "C",
        },
    ],
}

# Without a record path, the nested arrays stay as lists inside a single row
a = json_normalize(data)
print(a)

Is there a way to flatten the "digitalAssetFoodservice" array into columns?

This is the output I get: [image: Current output]

What if I have multiple nested array fields?

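Best answer

I believe you need to pass the key of the nested array as the record path (second argument), together with the keys of the non-nested fields as the meta list (third argument).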

a=json_normalize(data,'digitalAssetFoodservice',['id','extendedDescriptionFrench','brand','productLife'])

print(a)
print(a.columns)

Output:

  digitalAssetFormatFoodservice     digitalAssetGDTIFoodservice digitalAssetImageVersionDateTimeFoodservice  ...      extendedDescriptionFrench            brand productLife
0                           JPG  754000000016500000000002167445               2016-06-28T20:06:06.000-04:00  ...  Fromage Brick Petit Gaspesien  PETIT GASPESIEN          90
1                           JPG  754000000016500000000002167597               2016-06-28T20:06:06.000-04:00  ...  Fromage Brick Petit Gaspesien  PETIT GASPESIEN          90
2                           JPG  754000000016500000000002167687               2016-06-28T20:06:06.000-04:00  ...  Fromage Brick Petit Gaspesien  PETIT GASPESIEN          90

[3 rows x 9 columns]
Index(['digitalAssetFormatFoodservice', 'digitalAssetGDTIFoodservice',
       'digitalAssetImageVersionDateTimeFoodservice',
       'digitalAssetStateFoodservice', 'digitalAssetImageTypeFoodservice',
       'id', 'extendedDescriptionFrench', 'brand', 'productLife'],
      dtype='object')
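
The same kind of call can also be written with keyword arguments, which makes the role of each parameter explicit. The sketch below uses a small made-up record (the names `item`, `assets`, `fmt`, and `state` are illustrative and not from the question) and assumes pandas 1.0 or later, where `json_normalize` is available at the top level. A key that is missing from some array elements ends up as NaN in that row.

```python
import pandas as pd

# Hypothetical record: two scalar fields plus one nested array of dicts.
item = {
    "id": "001",
    "brand": "PETIT GASPESIEN",
    "assets": [
        {"fmt": "JPG", "state": "P"},
        {"fmt": "JPG"},  # "state" is missing in this element
    ],
}

flat = pd.json_normalize(
    item,
    record_path="assets",  # the nested array: one output row per element
    meta=["id", "brand"],  # top-level fields repeated on every row
)
print(flat)
```

The columns from the array elements come first, followed by the meta columns.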

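When handling both nested arrays:

This is what I came up with, since I could not find a way to produce the expected output in a single line. I also shortened the names in the script to make it easier to follow. (Based on this exchange: "Question: What is your expected output? Everything duplicated, once for name: x and once for name: y? OP: Yes, duplicated.")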

from pandas import json_normalize  # pandas >= 1.0; the pandas.io.json import path is deprecated

data = {
    "id": "001",
    "counties": [{"name": "y"}, {"name": "X"}],
    "eDF": "Fromage Brick Petit Gaspesien",
    "brand": "PETIT GASPESIEN",
    "brandFrench": "PETIT GASPESIEN",
    "productLife": "90",
    "dAF": [
        {
            "dAFF": "JPG",
            "dAGDTIF": "75401652167445",
            "dAIVDTF": "2016-06-28",
            "dASF": "P",
        },
        {
            "dAFF": "JPG",
            "dAGDTIF": "75401652167597",
            "dAITFa": "M",
            "dAIVDTF": "2016-06-28",
        },
        {
            "dAFF": "JPG",
            "dAGDTIF": "7540162167687",
            "dAITF": "C",
            "dAIVDTF": "2016-06-28",
            "dASF": "C",
        },
    ],
}

# Top-level fields that should repeat on every output row
repetitive = ['id', 'eDF', 'brand', 'brandFrench', 'productLife']
a = json_normalize(data, 'counties', repetitive)  # one row per county
b = json_normalize(data, 'dAF', repetitive)       # one row per asset
# Every row shares the same values in `repetitive`, so the inner merge
# pairs each county row with each asset row
c = a.merge(b, how='inner', on=repetitive)
print(a)

Output:

  name   id                            eDF            brand      brandFrench productLife
0    y  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90
1    X  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90

Now the other one, b:

print(b)

Output:

  dAFF         dAGDTIF     dAIVDTF dASF dAITFa dAITF   id                            eDF            brand      brandFrench productLife
0  JPG  75401652167445  2016-06-28    P    NaN   NaN  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90
1  JPG  75401652167597  2016-06-28  NaN      M   NaN  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90
2  JPG   7540162167687  2016-06-28    C    NaN     C  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90

And finally c:

print(c)

Output:

  name   id                            eDF            brand      brandFrench productLife dAFF         dAGDTIF     dAIVDTF dASF dAITFa dAITF
0    y  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90  JPG  75401652167445  2016-06-28    P    NaN   NaN
1    y  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90  JPG  75401652167597  2016-06-28  NaN      M   NaN
2    y  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90  JPG   7540162167687  2016-06-28    C    NaN     C
3    X  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90  JPG  75401652167445  2016-06-28    P    NaN   NaN
4    X  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90  JPG  75401652167597  2016-06-28  NaN      M   NaN
5    X  001  Fromage Brick Petit Gaspesien  PETIT GASPESIEN  PETIT GASPESIEN          90  JPG   7540162167687  2016-06-28    C    NaN     C
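
The normalize-then-merge approach can be generalized to any number of nested arrays. What follows is a minimal sketch, not part of the original answer: `flatten_arrays` is a hypothetical helper, and the sample record with its `assets` and `fmt` keys is made up for illustration. It assumes pandas 1.0 or later and that the nested arrays do not share field names.

```python
from functools import reduce

import pandas as pd


def flatten_arrays(record, array_keys, meta_keys):
    """Normalize each nested array separately, then merge on the meta columns.

    Every row of every frame carries the same meta values, so the inner
    merge pairs each row of one array with each row of the next, which
    yields the Cartesian product of the arrays.
    """
    frames = [
        pd.json_normalize(record, record_path=key, meta=meta_keys)
        for key in array_keys
    ]
    return reduce(
        lambda left, right: left.merge(right, on=meta_keys, how="inner"),
        frames,
    )


# Hypothetical record with two nested arrays.
record = {
    "id": "001",
    "brand": "PETIT GASPESIEN",
    "counties": [{"name": "y"}, {"name": "X"}],
    "assets": [{"fmt": "JPG"}, {"fmt": "PNG"}, {"fmt": "GIF"}],
}

result = flatten_arrays(record, ["counties", "assets"], ["id", "brand"])
print(result)
```

With two counties and three assets the result has six rows. If two arrays did share a field name, `merge` would append `_x` and `_y` suffixes to the clashing columns; passing a `record_prefix` to `json_normalize` for each array is one way to keep them apart.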

Regarding python - Normalizing complex JSON in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58014217/
