pandas - How to split one column into three columns in pandas

Reposted. Author: 行者123. Updated: 2023-12-01 22:54:15

I have a DataFrame that looks like this:

ID  Name     Address
1   Kohli    Country: India; State: Delhi; Sector: SE25
2   Sachin   Country: India; State: Mumbai; Sector: SE39
3   Ponting  Country: Australia; State: Tasmania
4   Ponting  State: Tasmania; Sector: SE27

From it, I want to build the following DataFrame:

ID  Name     Country    State     Sector
1   Kohli    India      Delhi     SE25
2   Sachin   India      Mumbai    SE39
3   Ponting  Australia  Tasmania  None
4   Ponting  None       Tasmania  SE27

I tried the following code:

df[['Country', 'State', 'Sector']] = pd.DataFrame(df['Address'].str.split(';', n=2).tolist(),
                                                  columns=['Country', 'State', 'Sector'])

With this approach, though, it looks like I still have to clean up the data by slicing each column. Is there a simpler way?
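To make the cleanup problem concrete, here is a small sketch that rebuilds the sample frame from the question and applies a purely positional split. The variable name `parts` is only illustrative. Because the pieces are assigned left to right, each cell keeps its `Key: ` prefix, and row 4 (which has no Country entry) ends up with its State value in the Country column.

```python
import pandas as pd

# Sample data retyped from the question above.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Kohli", "Sachin", "Ponting", "Ponting"],
    "Address": [
        "Country: India; State: Delhi; Sector: SE25",
        "Country: India; State: Mumbai; Sector: SE39",
        "Country: Australia; State: Tasmania",
        "State: Tasmania; Sector: SE27",
    ],
})

# Positional split: pieces are placed left to right, ignoring their labels.
parts = df["Address"].str.split("; ", expand=True)
parts.columns = ["Country", "State", "Sector"]

# The labels are still embedded in the values ...
print(parts.loc[0, "Country"])
# ... and the last row is shifted: its State landed in the Country column.
print(parts.loc[3, "Country"])
```

This is why the approaches below parse each `Key: value` pair by its key rather than by its position.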

Best answer

Use a list comprehension together with a dict comprehension to build a list of dictionaries, then pass it to the DataFrame constructor:

# One dict per row: each 'Key: value' piece becomes a key/value pair.
# Note that pop() removes the Address column from df.
L = [dict(y.split(': ') for y in x.split('; '))
     for x in df.pop('Address')]

df = df.join(pd.DataFrame(L, index=df.index))
print(df)
   ID     Name    Country     State Sector
0   1    Kohli      India     Delhi   SE25
1   2   Sachin      India    Mumbai   SE39
2   3  Ponting  Australia  Tasmania    NaN
3   4  Ponting        NaN  Tasmania   SE27
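For reference, the dictionary approach can be written as a self-contained script. The following is a minimal sketch, not the answer's exact code: the helper `parse_address` is a hypothetical name introduced here, and it assumes every entry has the `Key: value` shape separated by semicolons. It uses `str.partition` and `strip` so that extra whitespace around the separators does not matter, and it leaves the original `df` untouched instead of calling `pop`.

```python
import pandas as pd


def parse_address(text):
    """Turn 'Key: value; Key: value' into {'Key': 'value', ...} (hypothetical helper)."""
    pairs = {}
    for item in text.split(";"):
        key, sep, value = item.partition(":")
        if sep:  # skip fragments that do not contain a colon
            pairs[key.strip()] = value.strip()
    return pairs


# Sample data retyped from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Kohli", "Sachin", "Ponting", "Ponting"],
    "Address": [
        "Country: India; State: Delhi; Sector: SE25",
        "Country: India; State: Mumbai; Sector: SE39",
        "Country: Australia; State: Tasmania",
        "State: Tasmania; Sector: SE27",
    ],
})

# Keys missing from a row become NaN when the dicts are combined.
parsed = pd.DataFrame([parse_address(a) for a in df["Address"]], index=df.index)

# Join on the index and fix the column order explicitly.
result = df.drop(columns="Address").join(parsed)
result = result[["ID", "Name", "Country", "State", "Sector"]]
print(result)
```

Selecting the columns explicitly at the end keeps the layout stable even if the first row happens to list its keys in a different order.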

Alternatively, use split and reshape with stack:

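df1 = (df.pop('Address')
         .str.split('; ', expand=True)           # one column per 'Key: value' piece
         .stack()                                # long format, one piece per row
         .reset_index(level=1, drop=True)        # drop the piece-position level
         .str.split(': ', expand=True)           # column 0 = key, column 1 = value
         .set_index(0, append=True)[1]           # index by (row, key)
         .unstack()                              # keys become columns
       )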
print(df1)
0    Country Sector     State
0      India   SE25     Delhi
1      India   SE39    Mumbai
2  Australia    NaN  Tasmania
3        NaN   SE27  Tasmania

df = df.join(df1)
print(df)
   ID     Name    Country Sector     State
0   1    Kohli      India   SE25     Delhi
1   2   Sachin      India   SE39    Mumbai
2   3  Ponting  Australia    NaN  Tasmania
3   4  Ponting        NaN   SE27  Tasmania
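The reshape above returns the new columns in alphabetical order (Country, Sector, State) rather than the order requested in the question. As a possible variation, and not part of the original answer, the sketch below swaps the two `str.split` calls for a single `str.extractall` with a regular expression, then uses `reindex` to put the columns in the requested order. The pattern and the group names `key` and `value` are assumptions chosen for this sample data.

```python
import pandas as pd

# Sample data retyped from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Kohli", "Sachin", "Ponting", "Ponting"],
    "Address": [
        "Country: India; State: Delhi; Sector: SE25",
        "Country: India; State: Mumbai; Sector: SE39",
        "Country: Australia; State: Tasmania",
        "State: Tasmania; Sector: SE27",
    ],
})

# One row per 'Key: value' match, indexed by (original row, match number).
matches = df["Address"].str.extractall(r"(?P<key>\w+):\s*(?P<value>[^;]+)")

# Pivot the keys into columns, then request the column order explicitly.
wide = (matches.droplevel("match")
               .set_index("key", append=True)["value"]
               .unstack()
               .reindex(columns=["Country", "State", "Sector"]))

result = df.drop(columns="Address").join(wide)
print(result)
```

A side effect of `reindex(columns=...)` is that an expected key that never appears in the data still yields a column, filled with NaN, so the output schema stays fixed.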

For more on "pandas - How to split one column into three columns in pandas", see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/57588040/
