
python - Pandas: resample data to seconds, grouping roughly every 10 seconds

Reposted. Author: 行者123. Updated: 2023-12-01 07:15:54

Suppose I have the following DataFrame:

>>> df
a
2019-04-05 00:00:00 2.0
2019-04-05 00:00:01 1.0
2019-04-05 00:00:02 NaN
2019-04-05 00:00:03 NaN
2019-04-05 00:00:04 NaN
2019-04-05 00:00:05 NaN
2019-04-05 00:00:06 NaN
2019-04-05 00:00:07 NaN
2019-04-05 00:00:08 3.0
2019-04-05 00:00:09 4.0
2019-04-05 00:00:10 NaN
2019-04-05 00:00:11 NaN
2019-04-05 00:00:12 NaN
2019-04-05 00:00:13 NaN
2019-04-05 00:00:14 NaN
2019-04-05 00:00:15 NaN
2019-04-05 00:00:16 NaN
2019-04-05 00:00:17 NaN
2019-04-05 00:00:18 NaN
2019-04-05 00:00:19 NaN
2019-04-05 00:00:20 4.0
2019-04-05 00:00:21 5.0
2019-04-05 00:00:22 NaN
2019-04-05 00:00:23 NaN
2019-04-05 00:00:24 NaN
2019-04-05 00:00:25 NaN
2019-04-05 00:00:26 6.0
2019-04-05 00:00:27 NaN
2019-04-05 00:00:28 4.0
2019-04-05 00:00:29 NaN
2019-04-05 00:00:30 NaN
2019-04-05 00:00:31 NaN
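For anyone who wants to reproduce the problem, the frame above can be rebuilt roughly as follows. This is a minimal sketch: the names `index`, `values`, and `a` are illustrative, and the values are copied from the printed table.

```python
import numpy as np
import pandas as pd

# One row per second: 32 rows starting at 2019-04-05 00:00:00.
index = pd.date_range("2019-04-05 00:00:00", periods=32, freq="s")

# Non-NaN readings from the printed table, keyed by offset in seconds.
values = {0: 2.0, 1: 1.0, 8: 3.0, 9: 4.0, 20: 4.0, 21: 5.0, 26: 6.0, 28: 4.0}

# Start from all-NaN and fill in the known readings.
a = np.full(len(index), np.nan)
for offset, value in values.items():
    a[offset] = value

df = pd.DataFrame({"a": a}, index=index)
```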

I would like at most one value per 7 seconds (if a value exists; otherwise just NaN), so that the DataFrame looks like this:

>>> df
a
2019-04-05 00:00:00 2.0
2019-04-05 00:00:01 NaN
2019-04-05 00:00:02 NaN
2019-04-05 00:00:03 NaN
2019-04-05 00:00:04 NaN
2019-04-05 00:00:05 NaN
2019-04-05 00:00:06 NaN
2019-04-05 00:00:07 NaN
2019-04-05 00:00:08 3.0
2019-04-05 00:00:09 NaN
2019-04-05 00:00:10 NaN
2019-04-05 00:00:11 NaN
2019-04-05 00:00:12 NaN
2019-04-05 00:00:13 NaN
2019-04-05 00:00:14 NaN
2019-04-05 00:00:15 NaN
2019-04-05 00:00:16 NaN
2019-04-05 00:00:17 NaN
2019-04-05 00:00:18 NaN
2019-04-05 00:00:19 NaN
2019-04-05 00:00:20 4.0
2019-04-05 00:00:21 NaN
2019-04-05 00:00:22 NaN
2019-04-05 00:00:23 NaN
2019-04-05 00:00:24 NaN
2019-04-05 00:00:25 NaN
2019-04-05 00:00:26 NaN
2019-04-05 00:00:27 NaN
2019-04-05 00:00:28 4.0
2019-04-05 00:00:29 NaN
2019-04-05 00:00:30 NaN
2019-04-05 00:00:31 NaN

The 7-second figure is arbitrary; in practice I will take a value roughly once a minute. Here is what I have tried so far:

df = df.resample('7s').first()

But this produces the following DataFrame:

                       a
2019-04-05 00:00:00 2.0
2019-04-05 00:00:07 3.0
2019-04-05 00:00:14 4.0
2019-04-05 00:00:21 5.0
2019-04-05 00:00:28 4.0

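Note: I am not bothered by the missing NaNs between these points, since they are implied. What I am unhappy with is the timestamps: this forces a value onto every 7-second mark, whereas I only want to disallow values that are within 7 seconds of each other, not require a value every 7 seconds.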

Edit, for clarity:

The DataFrame I do not want:

                       a
2019-04-05 00:00:00 2.0
2019-04-05 00:00:07 3.0
2019-04-05 00:00:14 4.0
2019-04-05 00:00:21 5.0
2019-04-05 00:00:28 4.0

The DataFrame I want:

>>> df
a
2019-04-05 00:00:00 2.0
2019-04-05 00:00:01 NaN
2019-04-05 00:00:02 NaN
2019-04-05 00:00:03 NaN
2019-04-05 00:00:04 NaN
2019-04-05 00:00:05 NaN
2019-04-05 00:00:06 NaN
2019-04-05 00:00:07 NaN
2019-04-05 00:00:08 3.0
2019-04-05 00:00:09 NaN
2019-04-05 00:00:10 NaN
2019-04-05 00:00:11 NaN
2019-04-05 00:00:12 NaN
2019-04-05 00:00:13 NaN
2019-04-05 00:00:14 NaN
2019-04-05 00:00:15 NaN
2019-04-05 00:00:16 NaN
2019-04-05 00:00:17 NaN
2019-04-05 00:00:18 NaN
2019-04-05 00:00:19 NaN
2019-04-05 00:00:20 4.0
2019-04-05 00:00:21 NaN
2019-04-05 00:00:22 NaN
2019-04-05 00:00:23 NaN
2019-04-05 00:00:24 NaN
2019-04-05 00:00:25 NaN
2019-04-05 00:00:26 NaN
2019-04-05 00:00:27 NaN
2019-04-05 00:00:28 4.0
2019-04-05 00:00:29 NaN
2019-04-05 00:00:30 NaN
2019-04-05 00:00:31 NaN

Or:

>>> df
a
2019-04-05 00:00:00 2.0
2019-04-05 00:00:08 3.0
2019-04-05 00:00:20 4.0
2019-04-05 00:00:28 4.0

Best answer

You were very close; you can upsample the DataFrame afterwards:

df = df.resample('7s').first()
df = df.resample(rule='1s').asfreq()

This creates a DataFrame in which the newly inserted per-second rows contain NaN.
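Note that resampling labels each value with the start of its 7-second bin (00:00:00, 00:00:07, 00:00:14, ...), so the original timestamps are not preserved. If the goal is to keep the original timestamps and simply drop any value that arrives less than 7 seconds after the previously kept one, a single greedy pass over the non-NaN rows is one possible alternative. The sketch below is not part of the answer above. `keep_min_gap` is a hypothetical helper, not a pandas function, and it assumes that a gap of exactly 7 seconds is acceptable. The sample data is rebuilt inline so the block runs on its own.

```python
import numpy as np
import pandas as pd


def keep_min_gap(series, min_gap):
    """Keep non-NaN values at least `min_gap` after the previously kept value.

    Hypothetical helper for illustration; assumes a sorted DatetimeIndex.
    """
    min_gap = pd.Timedelta(min_gap)
    kept = []
    last = None
    for ts in series.dropna().index:
        # Keep the first value, then only values far enough from the last kept one.
        if last is None or ts - last >= min_gap:
            kept.append(ts)
            last = ts
    return series.loc[kept]


# Rebuild the sample data from the question (values copied from the table).
index = pd.date_range("2019-04-05 00:00:00", periods=32, freq="s")
a = np.full(len(index), np.nan)
for offset, value in {0: 2.0, 1: 1.0, 8: 3.0, 9: 4.0,
                      20: 4.0, 21: 5.0, 26: 6.0, 28: 4.0}.items():
    a[offset] = value
df = pd.DataFrame({"a": a}, index=index)

# Compact form: only the kept rows, at their original timestamps.
compact = keep_min_gap(df["a"], "7s").to_frame()

# Full form: same per-second index as the input, NaN everywhere else.
full = compact.reindex(df.index)
```

On the sample data, `compact` holds the rows at 00:00:00, 00:00:08, 00:00:20 and 00:00:28, which matches the compact frame the question asks for. The loop is plain Python, so for very long series a vectorised or compiled approach may be preferable.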

Regarding "python - Pandas: resample data to seconds, grouping roughly every 10 seconds", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57950074/
