
python - Find the closest time value for each pandas group

Reposted. Author: 行者123. Updated: 2023-12-01 02:11:08

import pandas as pd
df = pd.DataFrame({'date': ['2014-06-22 17:46:00', '2014-06-24 16:52:00', '2014-06-25 20:02:00', '2014-06-25 17:55:00', '2014-07-02 11:36:00', '2014-07-06 12:40:00', '2014-07-05 12:46:00', '2014-07-27 15:12:00'],
'type': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C']})

>>> df
                  date type
0  2014-06-22 17:46:00    A
1  2014-06-24 16:52:00    A
2  2014-06-25 20:02:00    A
3  2014-06-25 17:55:00    B
4  2014-07-02 11:36:00    B
5  2014-07-06 12:40:00    C
6  2014-07-05 12:46:00    C
7  2014-07-27 15:12:00    C

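How can I get, for each group, the index of the element whose time of day is closest to a given time (for example 17:00), ignoring the date? The desired result is: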

>>> df.groupby('type').date. ???
type
A 1
B 3
C 7
Name: date, dtype: int64

Also, what if I want the time that is closest to, but earlier than, the given time? Again for 17:00, this should return:

>>> df.groupby('type').date. ???
type
A 1
B 4
C 7
Name: date, dtype: int64

Best answer

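Put every time on the same default date by formatting only the time part and parsing it again, then subtract the target time t.

For the first result, take the absolute value of the differences and get the index of the minimum per group with SeriesGroupBy.idxmin. For the second result, replace the positive differences with NaT using mask, then get the index of the largest remaining (negative) difference per group with SeriesGroupBy.idxmax: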

df = pd.DataFrame({'date': ['2014-06-22 17:46:00', '2014-06-22 16:52:00', 
'2014-06-25 20:02:00', '2014-06-25 17:55:00',
'2014-07-02 11:36:00', '2014-07-06 12:40:00',
'2014-07-05 12:46:00', '2014-07-27 15:12:00'],
'type': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C']})
#convert column to datetimes
df['date'] = pd.to_datetime(df.date)

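#target time of day
t = '17:00:00'
#parse the time-only strings and t without a date part, so both get the same default date, then subtract
a = pd.to_datetime(df['date'].dt.strftime('%H:%M:%S')) - pd.to_datetime(t)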
print (a)
0            00:46:00
1   -1 days +23:52:00
2            03:02:00
3            00:55:00
4   -1 days +18:36:00
5   -1 days +19:40:00
6   -1 days +19:46:00
7   -1 days +22:12:00
Name: date, dtype: timedelta64[ns]
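
The subtraction above depends on pd.to_datetime giving time-only strings the same default date on both sides. As a minimal alternative sketch, not part of the original answer, the same signed differences can be computed without the string round trip by taking each timestamp's offset from midnight as a timedelta. The names time_of_day, diff and closest are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'date': ['2014-06-22 17:46:00', '2014-06-22 16:52:00',
                            '2014-06-25 20:02:00', '2014-06-25 17:55:00',
                            '2014-07-02 11:36:00', '2014-07-06 12:40:00',
                            '2014-07-05 12:46:00', '2014-07-27 15:12:00'],
                   'type': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C']})
df['date'] = pd.to_datetime(df['date'])

# offset of each timestamp from midnight of its own day
time_of_day = df['date'] - df['date'].dt.normalize()

# signed difference to the target time (negative means earlier than 17:00)
diff = time_of_day - pd.Timedelta(hours=17)

# index of the smallest absolute difference per group: A -> 1, B -> 3, C -> 7
closest = diff.abs().groupby(df['type']).idxmin()
print(closest)
```

Here diff holds the same values as a in the answer, so the idxmin and mask steps apply to it unchanged.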


b = a.abs().groupby(df['type']).idxmin()
print (b)
type
A 1
B 3
C 7
Name: date, dtype: int64

c = a.mask(a > pd.Timedelta(0)).groupby(df['type']).idxmax()
print (c)
type
A 1
B 4
C 7
Name: date, dtype: int64
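
The sample data always has at least one time at or before 17:00 in every group. If a group had none, the mask step would leave that group with only NaT values, a case the answer does not cover. The following is a minimal sketch under the assumption that such groups should simply be left out of the result. It uses hypothetical data and computes the differences as offsets from midnight, which is a choice made for this sketch rather than the answer's method:

```python
import pandas as pd

# hypothetical data: group 'B' has no time at or before 17:00
df = pd.DataFrame({'date': ['2014-06-22 17:46:00', '2014-06-22 16:52:00',
                            '2014-06-25 17:55:00', '2014-07-02 18:36:00'],
                   'type': ['A', 'A', 'B', 'B']})
df['date'] = pd.to_datetime(df['date'])

# signed difference between each time of day and 17:00
diff = (df['date'] - df['date'].dt.normalize()) - pd.Timedelta(hours=17)

# keep only rows at or before the target time (like the mask above, an exact match is kept)
earlier = diff[diff <= pd.Timedelta(0)]

# group the remaining rows by their own 'type' labels; 'B' has no rows left
result = earlier.groupby(df.loc[earlier.index, 'type']).idxmax()
print(result)
```

Only group A appears in result, with index 1. Whether a missing group is the right outcome depends on the use case; reindexing result against all group labels would make the gaps explicit.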

Details:

df1 = pd.concat([df, a, a.abs(), a.mask(a >  pd.Timedelta(0))], axis=1)
df1.columns = ['date','type','diff','absolute diff','max negative']
print (df1)
                 date type              diff absolute diff      max negative
0 2014-06-22 17:46:00    A          00:46:00      00:46:00               NaT
1 2014-06-22 16:52:00    A -1 days +23:52:00      00:08:00 -1 days +23:52:00
2 2014-06-25 20:02:00    A          03:02:00      03:02:00               NaT
3 2014-06-25 17:55:00    B          00:55:00      00:55:00               NaT
4 2014-07-02 11:36:00    B -1 days +18:36:00      05:24:00 -1 days +18:36:00
5 2014-07-06 12:40:00    C -1 days +19:40:00      04:20:00 -1 days +19:40:00
6 2014-07-05 12:46:00    C -1 days +19:46:00      04:14:00 -1 days +19:46:00
7 2014-07-27 15:12:00    C -1 days +22:12:00      01:48:00 -1 days +22:12:00

Regarding "python - Find the closest time value for each pandas group", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48692745/
