
python - How to insert rows in a DataFrame based on a column value with a condition?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 12:53:52

I have a DataFrame of about 20k rows that looks like this:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Car_ID': ['B332', 'B332', 'B332', 'C315', 'C315', 'C315', 'C315', 'C315', 'F310', 'F310'],
    'Date': ['2018-03-12', '2018-03-14', '2018-03-15', '2018-03-17', '2018-03-13', '2018-03-15',
             '2018-03-18', '2018-03-21', '2018-03-10', '2018-03-13'],
    'Driver': ['Alex', 'Alex', 'Mick', 'Sara', 'Sara', 'Jean', 'Sara', 'Sara', 'Franck', 'Michel'],
})
df

Out:
  Car_ID        Date  Driver
0   B332  2018-03-12    Alex
1   B332  2018-03-14    Alex
2   B332  2018-03-15    Mick
3   C315  2018-03-17    Sara
4   C315  2018-03-13    Sara
5   C315  2018-03-15    Jean
6   C315  2018-03-18    Sara
7   C315  2018-03-21    Sara
8   F310  2018-03-10  Franck
9   F310  2018-03-13  Michel

I create a new column for the event of each row in the DataFrame like this:

df["Event"] = np.where(
    df.Car_ID.str.contains('B', case=True, na=False), 'Rent_Car_B',
    np.where(
        df.Car_ID.str.contains('C', case=True, na=False), 'Rent_Car_C',
        np.where(df.Car_ID.str.contains('F', case=True, na=False), 'Rent_Car_F', df.Car_ID)))
df

Out:
  Car_ID        Date  Driver       Event
0   B332  2018-03-12    Alex  Rent_Car_B
1   B332  2018-03-14    Alex  Rent_Car_B
2   B332  2018-03-15    Mick  Rent_Car_B
3   C315  2018-03-17    Sara  Rent_Car_C
4   C315  2018-03-13    Sara  Rent_Car_C
5   C315  2018-03-15    Jean  Rent_Car_C
6   C315  2018-03-18    Sara  Rent_Car_C
7   C315  2018-03-21    Sara  Rent_Car_C
8   F310  2018-03-10  Franck  Rent_Car_F
9   F310  2018-03-13  Michel  Rent_Car_F

In my Event column, I want to add a new row for each change of driver, like this:

Out:
   Car_ID        Date  Driver             Event
0    B332  2018-03-12    Alex        Rent_Car_B
1    B332  2018-03-14    Alex        Rent_Car_B
2    B332  2018-03-15    Mick        Rent_Car_B
3    B332  2018-03-15              Alex to Mick
4    C315  2018-03-17    Sara        Rent_Car_C
5    C315  2018-03-13    Sara        Rent_Car_C
6    C315  2018-03-15    Jean        Rent_Car_C
7    C315  2018-03-15              Sara to Jean
8    C315  2018-03-18    Sara        Rent_Car_C
9    C315  2018-03-18              Jean to Sara
10   C315  2018-03-21    Sara        Rent_Car_C
11   F310  2018-03-10  Franck        Rent_Car_F
12   F310  2018-03-13  Michel        Rent_Car_F
13   F310  2018-03-13          Franck to Michel

I'm not sure whether there is a trick to make this work. I would appreciate your suggestions!

Best answer

Use the shift method to first create a helper column holding the previous row's driver, which we will use afterwards:

df['Driver_shift'] = df['Driver'].shift()
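To see what shift does here, a minimal sketch on a three-row frame (the frame name demo is only for illustration; the values are the first three drivers from the question):

```python
import pandas as pd

# Small illustrative frame; 'demo' is a hypothetical name, not from the answer.
demo = pd.DataFrame({'Driver': ['Alex', 'Alex', 'Mick']})

# shift() moves every value down by one row. The first row has no
# predecessor, so it becomes NaN.
demo['Driver_shift'] = demo['Driver'].shift()
print(demo)
```

Each row now carries the driver of the row above it, so comparing the two columns row by row shows where the driver changed.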

With a mask, select the rows where the driver actually changes while the Car_ID stays the same as in the previous row:

mask = (df['Driver'] != df['Driver_shift']) & (df['Car_ID'] == df['Car_ID'].shift())
df_change = df[mask].copy()  # copy so the later assignments do not touch df
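As a possible variant (not part of the original answer), the same mask can be sketched with a per-car shift via groupby, which removes the need to compare Car_ID with its own shifted values. The names prev_driver, mask_grouped and mask_answer are made up for this example, and the two masks are only guaranteed to agree when all rows of a car are adjacent, as in the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Car_ID': ['B332', 'B332', 'B332', 'C315', 'C315', 'C315', 'C315', 'C315', 'F310', 'F310'],
    'Driver': ['Alex', 'Alex', 'Mick', 'Sara', 'Sara', 'Jean', 'Sara', 'Sara', 'Franck', 'Michel'],
})

# Previous driver within the same car; NaN on each car's first row.
prev_driver = df.groupby('Car_ID')['Driver'].shift()

# A change is a row that has a previous driver for the same car
# and whose driver differs from it.
mask_grouped = prev_driver.notna() & (df['Driver'] != prev_driver)

# The mask built as in the answer, for comparison.
mask_answer = (df['Driver'] != df['Driver'].shift()) & (df['Car_ID'] == df['Car_ID'].shift())

print(df.index[mask_grouped].tolist())  # [2, 5, 6, 9]
```

On the sample data both masks pick rows 2, 5, 6 and 9, which are exactly the rows where a new driver takes over a car.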

Now, to prepare for the concatenation and sorting that follow, shift the index by 0.5 and change the values of two columns:

df_change = df_change.set_index(df_change.index+0.5)
df_change.loc[:,'Event'] = df_change['Driver_shift'] + ' to ' + df_change['Driver']
df_change['Driver'] = '' # to replace the value
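The reason for the 0.5 offset is that a fractional index value sorts between two integer ones, so each new row lands right after the row it came from. A tiny sketch with made-up frames (base, extra and combined are illustrative names):

```python
import pandas as pd

base = pd.DataFrame({'value': ['a', 'b', 'c']})            # index 0, 1, 2
extra = pd.DataFrame({'value': ['after b']}, index=[1.5])  # sorts between 1 and 2

# Concatenate, sort by index so 1.5 falls between 1 and 2,
# then renumber the rows 0..n-1.
combined = pd.concat([base, extra]).sort_index().reset_index(drop=True)
print(combined['value'].tolist())  # ['a', 'b', 'after b', 'c']
```

This relies on the original frame having a plain integer index; with another kind of index the offset trick would need adapting.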

Now you can concatenate, sort by index, reset the index, and drop the helper column:

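# drop(columns=...) replaces the positional axis argument, which recent pandas versions no longer accept
pd.concat([df, df_change]).sort_index().reset_index(drop=True).drop(columns='Driver_shift')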

You get:

   Car_ID        Date  Driver             Event
0    B332  2018-03-12    Alex        Rent_Car_B
1    B332  2018-03-14    Alex        Rent_Car_B
2    B332  2018-03-15    Mick        Rent_Car_B
3    B332  2018-03-15              Alex to Mick
4    C315  2018-03-17    Sara        Rent_Car_C
5    C315  2018-03-13    Sara        Rent_Car_C
6    C315  2018-03-15    Jean        Rent_Car_C
7    C315  2018-03-15              Sara to Jean
8    C315  2018-03-18    Sara        Rent_Car_C
9    C315  2018-03-18              Jean to Sara
10   C315  2018-03-21    Sara        Rent_Car_C
11   F310  2018-03-10  Franck        Rent_Car_F
12   F310  2018-03-13  Michel        Rent_Car_F
13   F310  2018-03-13          Franck to Michel
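For convenience, the steps of this answer can be wrapped into one function. This is only a sketch: the function name add_driver_change_rows is hypothetical, it assumes a default integer index and that rows of the same car are adjacent, and the Event column is built with a shortcut that happens to match the sample data rather than with the np.where chain from the question.

```python
import pandas as pd

def add_driver_change_rows(df):
    """Return a new frame with an extra row after each driver change.

    Hypothetical helper; assumes a default integer index and that rows
    of the same Car_ID are adjacent, as in the question's sample.
    """
    prev_driver = df['Driver'].shift()
    same_car = df['Car_ID'] == df['Car_ID'].shift()
    mask = (df['Driver'] != prev_driver) & same_car

    change = df[mask].copy()
    change['Event'] = prev_driver[mask] + ' to ' + change['Driver']
    change['Driver'] = ''              # the inserted rows carry no driver
    change.index = change.index + 0.5  # sort right after the source row

    return pd.concat([df, change]).sort_index().reset_index(drop=True)

df = pd.DataFrame({
    'Car_ID': ['B332', 'B332', 'B332', 'C315', 'C315', 'C315', 'C315', 'C315', 'F310', 'F310'],
    'Date': ['2018-03-12', '2018-03-14', '2018-03-15', '2018-03-17', '2018-03-13', '2018-03-15',
             '2018-03-18', '2018-03-21', '2018-03-10', '2018-03-13'],
    'Driver': ['Alex', 'Alex', 'Mick', 'Sara', 'Sara', 'Jean', 'Sara', 'Sara', 'Franck', 'Michel'],
})
# Shortcut for the sample only: take the first letter of Car_ID.
df['Event'] = 'Rent_Car_' + df['Car_ID'].str[0]

result = add_driver_change_rows(df)
print(result)
```

Because the previous driver is kept in a local Series instead of a Driver_shift column, there is nothing to drop at the end and the input frame is left unchanged.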

Edit: to add a row before each driver and date:

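df1 = df.copy()
df1.index = df1.index + 0.5  # copies sort right after their originals
# the rows without 'Event' get NaN there, which is then filled with the driver's name
df2 = pd.concat([df.drop(columns='Event'), df1]).sort_index().reset_index(drop=True)
df2['Event'] = df2['Event'].fillna(df2['Driver'])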

The result is in df2.
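To make the effect of this edit visible, here is a small sketch on the first two rows of the sample (a reduced frame chosen for illustration, with the Event column typed in by hand):

```python
import pandas as pd

df = pd.DataFrame({
    'Car_ID': ['B332', 'B332'],
    'Date': ['2018-03-12', '2018-03-14'],
    'Driver': ['Alex', 'Alex'],
    'Event': ['Rent_Car_B', 'Rent_Car_B'],
})

df1 = df.copy()
df1.index = df1.index + 0.5  # the full rows sort after the Event-less rows

# Rows coming from df.drop(...) have no Event, so concat fills NaN there.
df2 = pd.concat([df.drop(columns='Event'), df1]).sort_index().reset_index(drop=True)
df2['Event'] = df2['Event'].fillna(df2['Driver'])

print(df2['Event'].tolist())  # ['Alex', 'Rent_Car_B', 'Alex', 'Rent_Car_B']
```

Every original row is thus preceded by a row with the same Car_ID, Date and Driver whose Event is the driver's name.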

Regarding "python - How to insert rows in a DataFrame based on a column value with a condition?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50910334/
