gpt4 book ai didi

python - 如何用 pandas 数据框中的多个值替换一列值

转载 作者:行者123 更新时间:2023-12-05 04:45:30 25 4
gpt4 key购买 nike

输入数据框

Date            Geo         Shipment
2020-01-01 USA 1000
2020-01-01 BRA 5865
2020-01-01 CHN 4789
2020-01-02 EU1 6541
2020-01-02 EU2 3258
..

dict = {"EU1":["ALA", "BEL", "AND", "AUT"] , "EU2": ["AUT", "BEL", "BGR", "HRV", "CZE"] , "EU3": ["EST", "HRV", "FRA", "DEU"]}

如何替换 Geo 列中的值,以便存在一对多映射和重复 Shipment 值?

输出df

Date            Geo         Shipment
2020-01-01 ALA 1000
2020-01-01 BEL 1000
2020-01-01 AND 1000
2020-01-01 AUT 1000
..
2020-01-01 AUT 5865
2020-01-01 BEL 5865
2020-01-01 HRV 5865
2020-01-01 BGR 5865
2020-01-01 CZE 5865
..
2020-01-01 EST 4789
2020-01-01 HRV 4789
2020-01-01 FRA 4789
2020-01-01 DEU 4789
..

最佳答案

通过构造函数使用DataFrame,然后通过DataFrame.merge 进行外部连接|并按 DataFrame.pop 重新分配列 Geo :

d =  {"EU1":["ALA", "BEL", "AND", "AUT"] , 
"EU2": ["AUT", "BEL", "BGR", "HRV", "CZE"] ,
"EU3": ["EST", "HRV", "FRA", "DEU"]}

df1 = pd.DataFrame(((k, x) for k, v in d.items() for x in v), columns=['Geo','New'])


df = df.merge(df1, on='Geo', how='outer')
df['Geo'] = df.pop('New')
df = df.sort_values('Date', ignore_index=True)

print (df)
Date Geo Shipment
0 2020-01-01 ALA 1000
1 2020-01-01 HRV 4789
2 2020-01-01 EST 4789
3 2020-01-01 CZE 5865
4 2020-01-01 HRV 5865
5 2020-01-01 FRA 4789
6 2020-01-01 BEL 5865
7 2020-01-01 AUT 5865
8 2020-01-01 BGR 5865
9 2020-01-01 AUT 1000
10 2020-01-01 AND 1000
11 2020-01-01 BEL 1000
12 2020-01-01 DEU 4789
13 2020-01-02 AND 6541
14 2020-01-02 BEL 6541
15 2020-01-02 ALA 6541
16 2020-01-02 AUT 3258
17 2020-01-02 BEL 3258
18 2020-01-02 BGR 3258
19 2020-01-02 HRV 3258
20 2020-01-02 CZE 3258
21 2020-01-02 AUT 6541

新数据的解决方案:

d =  {"EU1":["ALA", "BEL", "AND", "AUT"] , 
"EU2": ["AUT", "BEL", "BGR", "HRV", "CZE"] ,
"EU3": ["EST", "HRV", "FRA", "DEU"]}

df1 = pd.DataFrame(((k, x) for k, v in d.items() for x in v), columns=['Geo','New'])

df = df.merge(df1, on='Geo', how='left')
df['Geo'] = df.pop('New').fillna(df['Geo'])
print (df)
Date Geo Shipment
0 2020-01-01 USA 1000
1 2020-01-01 BRA 5865
2 2020-01-01 CHN 4789
3 2020-01-02 ALA 6541
4 2020-01-02 BEL 6541
5 2020-01-02 AND 6541
6 2020-01-02 AUT 6541
7 2020-01-02 AUT 3258
8 2020-01-02 BEL 3258
9 2020-01-02 BGR 3258
10 2020-01-02 HRV 3258
11 2020-01-02 CZE 3258

关于python - 如何用 pandas 数据框中的多个值替换一列值,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/69114105/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com