python - Find the daily maximum value and its timestamp (yyyy:mm:dd hh:mm:ss) in Python Pandas

Reposted. Author: 太空宇宙. Updated: 2023-11-04 00:11:23

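I have about 150 MB of per-minute measurement data covering two years; a sample is shown below. I want to create a new DataFrame that contains each day's maximum value together with its timestamp. My sample data: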

    DateTime            Power
01-Aug-16 10:43:00.000  229.9607961
01-Aug-16 10:43:23.000  230.9030781
01-Aug-16 10:44:00.000  231.716212
01-Aug-16 10:45:00.000  232.4485882
01-Aug-16 10:46:00.000  233.2739154
02-Aug-16 09:42:00.000  229.6851724
02-Aug-16 09:43:00.000  230.9163998
02-Aug-16 09:43:06.000  230.9883337
02-Aug-16 09:44:00.000  231.2569098
02-Aug-16 09:49:00.000  229.5774805
02-Aug-16 09:50:00.000  229.8758693
02-Aug-16 09:51:00.000  229.9825204
03-Aug-16 10:09:00.000  231.3605982
03-Aug-16 10:10:00.000  231.6827163
03-Aug-16 10:11:00.000  231.1580262
03-Aug-16 10:12:00.000  230.4054286
03-Aug-16 10:13:00.000  229.6507959
03-Aug-16 10:13:02.000  229.6268353
03-Aug-16 10:14:00.000  230.4584964
03-Aug-16 10:15:00.000  230.9004206
03-Aug-16 10:16:00.000  231.189036

My current code is:

max_per_day = df.groupby(pd.Grouper(key='time',freq='D')).max()
print(max_per_day)

My current output is:

    time                  
2016-08-01  237.243835
2016-08-02  239.658539
2016-08-03  237.424683
2016-08-04  236.790695
2016-08-05  240.163910

Currently it outputs yyyy:mm:dd and the value, but I also want hh:mm (or hh:mm:ss) for each maximum. I tried the following code:

max_pmpp_day = df.loc[df.groupby(pd.Grouper(freq='D')).idxmax().iloc[:,0]]

The output is:

 TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

I tried @jezrael's answer:

df['DateTime'] = pd.to_datetime(df['time'])
s = df.groupby(pd.Grouper(key='DateTime', freq='D'))['Pmpp'].transform('max')
df = df[df['Pmpp'] == s]    
print(df)

The output is:

                     time        Pmpp            DateTime
34    2016-08-01 11:11:00  237.243835 2016-08-01 11:11:00
434   2016-08-02 13:30:02  239.658539 2016-08-02 13:30:02
648   2016-08-03 12:39:00  237.424683 2016-08-03 12:39:00

Best answer

You can use GroupBy.transform or Resampler.transform to return the per-day max values as a new Series, then compare it with the original column:

df['DateTime'] = pd.to_datetime(df['DateTime'])
s = df.groupby(pd.Grouper(key='DateTime', freq='D'))['Power'].transform('max')
#alternative
#s = df.resample('D', on='DateTime')['Power'].transform('max')
df = df[df['Power'] == s]
print (df)
              DateTime       Power
4  2016-08-01 10:46:00  233.273915
8  2016-08-02 09:44:00  231.256910
13 2016-08-03 10:10:00  231.682716
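
To try the transform approach without the full file, here is a minimal self-contained sketch on a few rows of the sample data. The explicit `format` string and the helper name `daily_max_rows` are my own assumptions, not part of the answer above. Keep in mind that comparing against the per-day maximum keeps every row that ties for it, so a day can produce more than one row.

```python
import pandas as pd

# A few rows copied from the sample data in the question.
raw = pd.DataFrame({
    "DateTime": [
        "01-Aug-16 10:45:00.000", "01-Aug-16 10:46:00.000",
        "02-Aug-16 09:43:06.000", "02-Aug-16 09:44:00.000",
        "03-Aug-16 10:10:00.000", "03-Aug-16 10:11:00.000",
    ],
    "Power": [232.4485882, 233.2739154, 230.9883337,
              231.2569098, 231.6827163, 231.1580262],
})


def daily_max_rows(df):
    """Return rows whose Power equals that day's maximum (hypothetical helper)."""
    out = df.copy()
    # Assumed format for strings such as '01-Aug-16 10:43:00.000'.
    out["DateTime"] = pd.to_datetime(out["DateTime"], format="%d-%b-%y %H:%M:%S.%f")
    # Broadcast each day's maximum back onto every row of that day.
    day_max = out.groupby(pd.Grouper(key="DateTime", freq="D"))["Power"].transform("max")
    return out[out["Power"] == day_max]


result = daily_max_rows(raw)
print(result)
```

Passing an explicit `format` also tends to make parsing a large file noticeably faster than letting pandas infer it, though I have not measured that on the 150 MB data set.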

Or create a DatetimeIndex and select the column after the groupby so that idxmax can be used:

df['DateTime'] = pd.to_datetime(df['DateTime'])
df = df.set_index('DateTime')
df = df.loc[df.groupby(pd.Grouper(freq='D'))['Power'].idxmax()]
print (df)
                          Power
DateTime                       
2016-08-01 10:46:00  233.273915
2016-08-02 09:44:00  231.256910
2016-08-03 10:10:00  231.682716
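
The TypeError in the question appears because `pd.Grouper(freq='D')` without `key` needs a DatetimeIndex, while the frame still had its default integer index. If you would rather keep that integer index, the following sketch may help. It swaps in a different grouping key, the timestamp truncated to midnight via `dt.normalize()`, which is my assumption and not what the answer uses. With this key, days that have no rows simply do not appear as groups.

```python
import pandas as pd

# Rows taken from the sample data, already parsed to datetimes.
df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2016-08-01 10:45:00", "2016-08-01 10:46:00",
        "2016-08-02 09:44:00", "2016-08-02 09:49:00",
        "2016-08-03 10:10:00", "2016-08-03 10:11:00",
    ]),
    "Power": [232.4485882, 233.2739154, 231.2569098,
              229.5774805, 231.6827163, 231.1580262],
})

# Group by calendar day (each timestamp truncated to midnight).
day = df["DateTime"].dt.normalize()
# idxmax returns the integer row label of each day's first maximum.
idx = df.groupby(day)["Power"].idxmax()
# Look those rows up; DateTime stays an ordinary column.
daily_max = df.loc[idx].reset_index(drop=True)
print(daily_max)
```

Unlike the comparison with `transform('max')`, idxmax yields exactly one row per day, the first occurrence when several rows tie.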

A solution from @Jon Clements, thanks:

df = (df.sort_values('Power')
        .groupby(df.DateTime.dt.to_period('D'))
        .last()
        .reset_index(drop=True))
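
A runnable sketch of this sort-then-take-last idea on a few sample rows follows. Two details are my own additions rather than part of the solution above: the period key is computed from the already sorted frame, and `kind="stable"` is passed so that rows with equal Power keep their original relative order (the default sort algorithm does not promise that).

```python
import pandas as pd

# Rows taken from the sample data, already parsed to datetimes.
df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2016-08-01 10:45:00", "2016-08-01 10:46:00",
        "2016-08-02 09:44:00", "2016-08-02 09:49:00",
        "2016-08-03 10:10:00", "2016-08-03 10:11:00",
    ]),
    "Power": [232.4485882, 233.2739154, 231.2569098,
              229.5774805, 231.6827163, 231.1580262],
})

# Sort ascending by Power so each day's maximum is its last row.
by_power = df.sort_values("Power", kind="stable")
daily_max = (
    by_power.groupby(by_power["DateTime"].dt.to_period("D"))
            .last()                  # last row per day = largest Power
            .reset_index(drop=True)  # discard the daily Period index
)
print(daily_max)
```

The result keeps both the DateTime and Power columns, with one row per day in date order.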

Regarding "python - Find the daily maximum value and its timestamp (yyyy:mm:dd hh:mm:ss) in Python Pandas", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52405615/
