python - Create multiple columns in a Pandas DataFrame in one update

Reposted. Author: 行者123. Updated: 2023-11-28 22:16:11

I have a DataFrame as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Group': ['Fruit', 'Vegetable', 'Fruit', 'Vegetable', 'Fruit', 'Vegetable', 'Vegetable'],
                   'NId': ['Banana', 'Onion', 'Grapes', 'Potato', 'Apple', np.nan, np.nan],
                   'BName': [np.nan, 'GTwo', np.nan, 'GSix', np.nan, 'GOne', 'GNine'],
                   'BId': [np.nan, '5252', np.nan, '5678', np.nan, '5125', '5923']})
# Cast BId to str (NaN shows up as the string 'nan' in the output below)
df['BId'] = df['BId'].astype(str)
df = df[['Group', 'NId', 'BName', 'BId']]

The DataFrame looks like this:

       Group     NId  BName   BId
0      Fruit  Banana    NaN   nan
1  Vegetable   Onion   GTwo  5252
2      Fruit  Grapes    NaN   nan
3  Vegetable  Potato   GSix  5678
4      Fruit   Apple    NaN   nan
5  Vegetable     NaN   GOne  5125
6  Vegetable     NaN  GNine  5923

I then run the following to create new columns:

df.loc[df['NId'].notna(), 'Cat'] = df[df['NId'].notna()].apply(lambda x: 'NId', axis=1)
df.loc[df['NId'].isna(), 'Cat'] = df[df['NId'].isna()].apply(lambda x: 'GId', axis=1)

df.loc[df['NId'].notna(), 'Id'] = df[df['NId'].notna()].apply(lambda x: str(x['NId']), axis=1)
df.loc[df['NId'].isna(), 'Id'] = df[df['NId'].isna()].apply(lambda x: x['BName'], axis=1)

df.loc[df['NId'].notna(), 'IdQ'] = df[df['NId'].notna()].apply(lambda x: 'NId:' + str(x['NId']), axis=1)
df.loc[df['NId'].isna(), 'IdQ'] = df[df['NId'].isna()].apply(lambda x: 'BId:' + x['BId'], axis=1)

This produces the following DataFrame:

       Group     NId  BName   BId  Cat      Id         IdQ
0      Fruit  Banana    NaN   nan  NId  Banana  NId:Banana
1  Vegetable   Onion   GTwo  5252  NId   Onion   NId:Onion
2      Fruit  Grapes    NaN   nan  NId  Grapes  NId:Grapes
3  Vegetable  Potato   GSix  5678  NId  Potato  NId:Potato
4      Fruit   Apple    NaN   nan  NId   Apple   NId:Apple
5  Vegetable     NaN   GOne  5125  GId    GOne    BId:5125
6  Vegetable     NaN  GNine  5923  GId   GNine    BId:5923

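I'd like to know whether there is a way to combine these operations, or a better way to do this. Essentially, Id is NId when NId is not NaN, and BName otherwise. Cat is 'NId' when Id is taken from NId, and 'GId' otherwise. IdQ is either 'NId:' + NId or 'BId:' + BId, following the same condition.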

Best answer

Use numpy.where:

# True where NId is present; computed once and reused for every new column
mask = df['NId'].notna()
df['Cat'] = np.where(mask, 'NId', 'GId')
df['Id'] = np.where(mask, df['NId'].astype(str), df['BName'])
df['IdQ'] = np.where(mask, 'NId:' + df['NId'].astype(str), 'BId:' + df['BId'])
print(df)
       Group     NId  BName   BId  Cat      Id         IdQ
0      Fruit  Banana    NaN   nan  NId  Banana  NId:Banana
1  Vegetable   Onion   GTwo  5252  NId   Onion   NId:Onion
2      Fruit  Grapes    NaN   nan  NId  Grapes  NId:Grapes
3  Vegetable  Potato   GSix  5678  NId  Potato  NId:Potato
4      Fruit   Apple    NaN   nan  NId   Apple   NId:Apple
5  Vegetable     NaN   GOne  5125  GId    GOne    BId:5125
6  Vegetable     NaN  GNine  5923  GId   GNine    BId:5923
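
If the goal is literally a single statement, the same numpy.where expressions can probably be passed to DataFrame.assign, which returns a new DataFrame with all three columns added at once. This is only a sketch on a shortened, made-up three-row version of the data above, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Shortened illustrative data (an assumption, not the full frame from the question)
df = pd.DataFrame({
    'Group': ['Fruit', 'Vegetable', 'Vegetable'],
    'NId': ['Banana', 'Onion', np.nan],
    'BName': [np.nan, 'GTwo', 'GOne'],
    'BId': [np.nan, '5252', '5125'],
})

# True where NId is present
mask = df['NId'].notna()

# assign() adds all three columns in one call and returns a new DataFrame
df = df.assign(
    Cat=np.where(mask, 'NId', 'GId'),
    Id=np.where(mask, df['NId'].astype(str), df['BName']),
    IdQ=np.where(mask,
                 'NId:' + df['NId'].astype(str),
                 'BId:' + df['BId'].astype(str)),
)
print(df[['Cat', 'Id', 'IdQ']])
```

Because the mask is computed once and every np.where call works on whole columns, this avoids the row-wise apply calls from the question. Whether assign reads better than three separate assignments is mostly a matter of taste.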

Regarding "python - Create multiple columns in a Pandas DataFrame in one update", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52361701/
