
python - Calculate the difference between consecutive dates in one column, grouped by another column, in pandas?

Reposted. Author: 行者123. Updated: 2023-12-02 19:42:51

I have a pandas DataFrame:

import pandas as pd

data = pd.DataFrame([['Car','2019-01-06T21:44:09Z'],
                     ['Train','2019-01-06T19:44:09Z'],
                     ['Train','2019-01-02T19:44:09Z'],
                     ['Car','2019-01-08T06:44:09Z'],
                     ['Car','2019-01-06T18:44:09Z'],
                     ['Train','2019-01-04T19:44:09Z'],
                     ['Car','2019-01-05T16:34:09Z'],
                     ['Train','2019-01-08T19:44:09Z'],
                     ['Car','2019-01-07T14:44:09Z'],
                     ['Car','2019-01-06T11:44:09Z'],
                     ['Train','2019-01-10T19:44:09Z'],
                    ],
                    columns=['Type', 'Date'])

After sorting by date, I need to find the difference between consecutive dates within each Type.

The final data should look like this:

data = pd.DataFrame([['Car','2019-01-06T21:44:09Z',1],
['Train','2019-01-06T19:44:09Z',4],
['Train','2019-01-02T19:44:09Z',0],
['Car','2019-01-08T06:44:09Z',3],
['Car','2019-01-06T18:44:09Z',1],
['Train','2019-01-04T19:44:09Z',2],
['Car','2019-01-05T16:34:09Z',0],
['Train','2019-01-08T19:44:09Z',6],
['Car','2019-01-07T14:44:09Z',2],
['Car','2019-01-06T11:44:09Z',1],
['Train','2019-01-10T19:44:09Z',8],
],
columns=['Type', 'Date','diff'])

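Here, for Type Car, min(Date) is 2019-01-05T16:34:09Z, so the diff starts at 0. The next dates are 2019-01-06T18:44:09Z and 2019-01-06T11:44:09Z, so their diff is 1 day (I am not sure whether the time of day can be included here), and so on. For Type Train, min(Date) is 2019-01-02T19:44:09Z, so the diff is 0; the next date is 2019-01-04T19:44:09Z, so the diff is 2 days.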
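On the side question of whether the time of day can be included: one possible sketch is to subtract each group's earliest timestamp and divide by a one-day `pd.Timedelta`, which gives fractional days. The column name `diff_days` and the four-row sample are assumptions made for illustration; they are not part of the original question.

```python
import pandas as pd

# Small subset of the question's data, kept short for illustration.
data = pd.DataFrame(
    [['Car', '2019-01-06T21:44:09Z'],
     ['Car', '2019-01-05T16:34:09Z'],
     ['Train', '2019-01-04T19:44:09Z'],
     ['Train', '2019-01-02T19:44:09Z']],
    columns=['Type', 'Date'])

# Parse the ISO 8601 strings into timezone-aware timestamps.
data['Date'] = pd.to_datetime(data['Date'])

# Earliest timestamp of each Type, broadcast back to every row.
first = data.groupby('Type')['Date'].transform('min')

# Elapsed time since the group's first timestamp, in fractional days.
data['diff_days'] = (data['Date'] - first) / pd.Timedelta(days=1)
print(data)
```

With this approach the first Car row comes out a little above 1 day rather than exactly 1, because the 5 hours and 10 minutes between the two timestamps are kept.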

I tried groupby, but I am not sure how to include the sorting by date:

data['diff'] = data.groupby('Type')['Date'].diff() / np.timedelta64(1, 'D')
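For reference, here is a hedged sketch of how the attempt above could be made to run: parse `Date` first, sort by `Type` and `Date`, then take the per-group `diff()`. Note that this yields the gap to the previous row of the same Type, which is not the same as the expected table (that table counts days since the group's earliest date); a per-group cumulative sum of the gaps recovers the latter. The column names `gap` and `since_first` are hypothetical.

```python
import numpy as np
import pandas as pd

# Train rows only, in unsorted order, to keep the example small.
data = pd.DataFrame(
    [['Train', '2019-01-06T19:44:09Z'],
     ['Train', '2019-01-02T19:44:09Z'],
     ['Train', '2019-01-04T19:44:09Z']],
    columns=['Type', 'Date'])
data['Date'] = pd.to_datetime(data['Date'])

# Sort so that diff() compares chronologically adjacent rows per Type.
ordered = data.sort_values(['Type', 'Date'])

# Gap to the previous row of the same Type, in days (first row: NaN -> 0).
gap = (ordered.groupby('Type')['Date'].diff() / np.timedelta64(1, 'D')).fillna(0)

# Assignment aligns on the index, so the original row order is preserved.
data['gap'] = gap

# Running total of the gaps per Type = days since the group's first date.
data['since_first'] = gap.groupby(ordered['Type']).cumsum()
print(data)
```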

Best answer

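Use pandas.DataFrame.groupby together with dt.date, after converting the Date column with pd.to_datetime:

df['Date'] = pd.to_datetime(df['Date'])  # the .dt accessor needs datetimes, not strings
df['diff'] = df.groupby('Type')['Date'].apply(lambda x: x.dt.date - x.min().date())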

Output:

     Type                       Date    diff
0     Car  2019-01-06 21:44:09+00:00  1 days
1   Train  2019-01-06 19:44:09+00:00  4 days
2   Train  2019-01-02 19:44:09+00:00  0 days
3     Car  2019-01-08 06:44:09+00:00  3 days
4     Car  2019-01-06 18:44:09+00:00  1 days
5   Train  2019-01-04 19:44:09+00:00  2 days
6     Car  2019-01-05 16:34:09+00:00  0 days
7   Train  2019-01-08 19:44:09+00:00  6 days
8     Car  2019-01-07 14:44:09+00:00  2 days
9     Car  2019-01-06 11:44:09+00:00  1 days
10  Train  2019-01-10 19:44:09+00:00  8 days

If you want them as int, add dt.days:

df['diff'] = df.groupby('Type')['Date'].apply(lambda x: x.dt.date - x.min().date()).dt.days

Output:

     Type                       Date  diff
0     Car  2019-01-06 21:44:09+00:00     1
1   Train  2019-01-06 19:44:09+00:00     4
2   Train  2019-01-02 19:44:09+00:00     0
3     Car  2019-01-08 06:44:09+00:00     3
4     Car  2019-01-06 18:44:09+00:00     1
5   Train  2019-01-04 19:44:09+00:00     2
6     Car  2019-01-05 16:34:09+00:00     0
7   Train  2019-01-08 19:44:09+00:00     6
8     Car  2019-01-07 14:44:09+00:00     2
9     Car  2019-01-06 11:44:09+00:00     1
10  Train  2019-01-10 19:44:09+00:00     8
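As an alternative sketch (not the answer's own code), the same integer result can be obtained with `dt.normalize()` and `transform('min')` instead of `apply` with a lambda. Depending on the pandas version, the result of `groupby(...).apply(...)` may carry the group key in its index, which can get in the way of assigning it back to a column; `transform` always returns a result aligned to the original rows. The intermediate name `day` is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame([['Car', '2019-01-06T21:44:09Z'],
                   ['Train', '2019-01-06T19:44:09Z'],
                   ['Train', '2019-01-02T19:44:09Z'],
                   ['Car', '2019-01-08T06:44:09Z'],
                   ['Car', '2019-01-06T18:44:09Z'],
                   ['Train', '2019-01-04T19:44:09Z'],
                   ['Car', '2019-01-05T16:34:09Z'],
                   ['Train', '2019-01-08T19:44:09Z'],
                   ['Car', '2019-01-07T14:44:09Z'],
                   ['Car', '2019-01-06T11:44:09Z'],
                   ['Train', '2019-01-10T19:44:09Z']],
                  columns=['Type', 'Date'])
df['Date'] = pd.to_datetime(df['Date'])

# Drop the time of day: every timestamp becomes midnight of its date.
day = df['Date'].dt.normalize()

# Subtract each Type's earliest day and keep the whole number of days.
df['diff'] = (day - day.groupby(df['Type']).transform('min')).dt.days
print(df)
```

Because the subtraction happens on datetime values rather than on Python `date` objects, the intermediate result is a timedelta column, so `.dt.days` applies directly.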

Regarding "python - Calculate the difference between consecutive dates in one column, grouped by another column, in pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59834527/
