
python - Applying multiple rolling functions to multiple columns of Pandas groups via a rolling object?

Reposted. Author: 行者123. Updated: 2023-11-28 21:40:37

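I would like to do the following:

  1. Group a DataFrame.

  2. Generate time windows for each group (given a unit of time).

  3. In the resulting structure, take each column and apply multiple rolling summary-statistic functions, so that the result holds summary statistics for every group/time-window combination.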

Here is a sample dataset:

gps_time,name,val_x,val_y
2017-07-04 11:20:23.423,bob,0.963,0.201
2017-07-04 11:20:24.492,bob,0.964,0.203
2017-07-04 11:20:24.499,bob,0.962,0.210
2017-07-04 11:20:25.627,sarah,0.893,0.010
2017-07-04 11:20:28.627,sarah,0.894,0.012
2017-07-04 11:20:29.613,sarah,0.895,0.014
2017-07-04 11:20:29.630,larry,-0.423,0.231
2017-07-04 11:20:30.423,larry,-0.431,0.22
2017-07-04 11:20:30.428,larry,-0.432,0.222

And the desired output for the data above, grouped by name, with a window of 1 second:

name,gps_time,val_x_mean,val_x_med,val_y_mean,val_y_med
bob,2017-07-04 11:20:23.423,0.963,0.963,0.201,0.201
bob,2017-07-04 11:20:24.492,0.963,0.963,0.2065,0.2065
sarah,2017-07-04 11:20:25.627,0.893,0.89,0.010,0.010
sarah,2017-07-04 11:20:28.627,0.8945,0.8945,0.013,0.013
larry,2017-07-04 11:20:30.423,-0.4287,-0.431,0.336,0.222

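I have tried using list comprehensions to generate a bunch of DataFrames, but the process is really slow, and I have to call it separately for each column.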
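Read literally, the three steps describe a trailing time window per group. The following is a minimal sketch of that reading, not the asker's code: it assumes `gps_time` is parsed as a datetime column, and the names `csv_text`, `roll`, and `rolled` are made up for illustration. It computes the mean and median separately on one grouped rolling object and joins them, so no per-column loop is needed.

```python
import io

import pandas as pd

# Sample data copied from the question.
csv_text = """gps_time,name,val_x,val_y
2017-07-04 11:20:23.423,bob,0.963,0.201
2017-07-04 11:20:24.492,bob,0.964,0.203
2017-07-04 11:20:24.499,bob,0.962,0.210
2017-07-04 11:20:25.627,sarah,0.893,0.010
2017-07-04 11:20:28.627,sarah,0.894,0.012
2017-07-04 11:20:29.613,sarah,0.895,0.014
2017-07-04 11:20:29.630,larry,-0.423,0.231
2017-07-04 11:20:30.423,larry,-0.431,0.22
2017-07-04 11:20:30.428,larry,-0.432,0.222
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["gps_time"])

# Time-based rolling windows need a sorted DatetimeIndex.
roll = (
    df.sort_values("gps_time")
      .set_index("gps_time")
      .groupby("name")[["val_x", "val_y"]]
      .rolling("1s")  # trailing window covering the last second, per name
)

# One pass per statistic; suffixes keep the column names distinct.
rolled = pd.concat(
    [roll.mean().add_suffix("_mean"), roll.median().add_suffix("_med")],
    axis=1,
)
print(rolled)
```

This yields one row per input row, indexed by (name, gps_time). Each row summarizes the observations for that name in the second leading up to, and including, its own timestamp. For larry's last row the `val_x` mean is about -0.4287 and the median is -0.431, which agrees with the larry row of the desired output.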

Best answer

Let's use groupby together with pd.Grouper:

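# gps_time must be a datetime64 column (e.g. read with parse_dates=['gps_time'])
df_out = df.groupby([pd.Grouper(freq='S', key='gps_time'), 'name']).agg(['mean', 'median'])
# flatten the (column, statistic) MultiIndex into names such as val_x_mean
df_out.columns = df_out.columns.map('_'.join)
df_out.reset_index()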

Output:

             gps_time   name  val_x_mean  val_x_median  val_y_mean  val_y_median
0 2017-07-04 11:20:23    bob      0.9630        0.9630      0.2010        0.2010
1 2017-07-04 11:20:24    bob      0.9630        0.9630      0.2065        0.2065
2 2017-07-04 11:20:25  sarah      0.8930        0.8930      0.0100        0.0100
3 2017-07-04 11:20:28  sarah      0.8940        0.8940      0.0120        0.0120
4 2017-07-04 11:20:29  larry     -0.4230       -0.4230      0.2310        0.2310
5 2017-07-04 11:20:29  sarah      0.8950        0.8950      0.0140        0.0140
6 2017-07-04 11:20:30  larry     -0.4315       -0.4315      0.2210        0.2210
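For anyone who wants to reproduce this end to end, here is a self-contained sketch of the same fixed-bin grouping. It is a variation rather than the answer's exact code: it uses named aggregation so the result carries the column names asked for in the question (`val_x_med` instead of `val_x_median`), and it writes the frequency as `'1s'` because newer pandas releases prefer the lowercase alias. The variable `csv_text` is introduced here only for illustration.

```python
import io

import pandas as pd

# Sample data copied from the question.
csv_text = """gps_time,name,val_x,val_y
2017-07-04 11:20:23.423,bob,0.963,0.201
2017-07-04 11:20:24.492,bob,0.964,0.203
2017-07-04 11:20:24.499,bob,0.962,0.210
2017-07-04 11:20:25.627,sarah,0.893,0.010
2017-07-04 11:20:28.627,sarah,0.894,0.012
2017-07-04 11:20:29.613,sarah,0.895,0.014
2017-07-04 11:20:29.630,larry,-0.423,0.231
2017-07-04 11:20:30.423,larry,-0.431,0.22
2017-07-04 11:20:30.428,larry,-0.432,0.222
"""
# parse_dates makes gps_time a datetime64 column, which pd.Grouper requires.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["gps_time"])

df_out = (
    df.groupby([pd.Grouper(key="gps_time", freq="1s"), "name"])
      .agg(
          # output_name=(source_column, statistic)
          val_x_mean=("val_x", "mean"),
          val_x_med=("val_x", "median"),
          val_y_mean=("val_y", "mean"),
          val_y_med=("val_y", "median"),
      )
      .reset_index()
)
print(df_out)
```

Note that pd.Grouper assigns each row to a fixed one-second bin starting on a whole second, so `gps_time` in the result is the bin start rather than an original timestamp. Rows that fall in different calendar seconds are never combined, even when they are less than a second apart.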

Regarding "python - Applying multiple rolling functions to multiple columns of Pandas groups via a rolling object?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45445314/
