
python - Upsampling a Pandas time series by interpolation

Reposted. Author: 行者123. Updated: 2023-11-28 22:23:57

I have a series:

import pandas as pd
index = pd.date_range('1/1/2000', periods=9, freq='0.9S')
series = pd.Series(range(9), index=index)

>>> series
2000-01-01 00:00:00.000 0
2000-01-01 00:00:00.900 1
2000-01-01 00:00:01.800 2
2000-01-01 00:00:02.700 3
2000-01-01 00:00:03.600 4
2000-01-01 00:00:04.500 5
2000-01-01 00:00:05.400 6
2000-01-01 00:00:06.300 7
2000-01-01 00:00:07.200 8
Freq: 900L, dtype: int64

Resampling it to 0.5 s gives me:

>>> series.resample(rule='0.5S').head(100)
2000-01-01 00:00:00.000 0.0
2000-01-01 00:00:00.500 1.0
2000-01-01 00:00:01.000 NaN
2000-01-01 00:00:01.500 2.0
2000-01-01 00:00:02.000 NaN
2000-01-01 00:00:02.500 3.0
2000-01-01 00:00:03.000 NaN
2000-01-01 00:00:03.500 4.0
2000-01-01 00:00:04.000 NaN
2000-01-01 00:00:04.500 5.0
2000-01-01 00:00:05.000 6.0
2000-01-01 00:00:05.500 NaN
2000-01-01 00:00:06.000 7.0
2000-01-01 00:00:06.500 NaN
2000-01-01 00:00:07.000 8.0
Freq: 500L, dtype: float64

which is what I would expect. But interpolating gives me:

>>> series.resample(rule='0.5S').interpolate(method='linear')
2000-01-01 00:00:00.000 0.000000
2000-01-01 00:00:00.500 0.555556
2000-01-01 00:00:01.000 1.111111
2000-01-01 00:00:01.500 1.666667
2000-01-01 00:00:02.000 2.222222
2000-01-01 00:00:02.500 2.777778
2000-01-01 00:00:03.000 3.333333
2000-01-01 00:00:03.500 3.888889
2000-01-01 00:00:04.000 4.444444
2000-01-01 00:00:04.500 5.000000
2000-01-01 00:00:05.000 5.000000
2000-01-01 00:00:05.500 5.000000
2000-01-01 00:00:06.000 5.000000
2000-01-01 00:00:06.500 5.000000
2000-01-01 00:00:07.000 5.000000
Freq: 500L, dtype: float64

I expected the last value to still be 8.0, and the value at the 6.5-second timestamp to still be 7.0. What is going on?

Best answer

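One approach that is at least partly correct is to call mean() between the two methods (on real data the results are not great; I had better success with scipy's interp1d):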

>>> series.resample(rule='0.5S').mean().interpolate(method='linear')
2000-01-01 00:00:00.000 0.0
2000-01-01 00:00:00.500 1.0
2000-01-01 00:00:01.000 1.5
2000-01-01 00:00:01.500 2.0
2000-01-01 00:00:02.000 2.5
2000-01-01 00:00:02.500 3.0
2000-01-01 00:00:03.000 3.5
2000-01-01 00:00:03.500 4.0
2000-01-01 00:00:04.000 4.5
2000-01-01 00:00:04.500 5.0
2000-01-01 00:00:05.000 6.0
2000-01-01 00:00:05.500 6.5
2000-01-01 00:00:06.000 7.0
2000-01-01 00:00:06.500 7.5
2000-01-01 00:00:07.000 8.0
Freq: 500L, dtype: float64
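A likely explanation for the flat run of 5.0 in the question, at least for the pandas version that produced that output: calling interpolate() directly on the resampler first aligns the series to the 0.5 s grid without aggregating. Only samples whose timestamps fall exactly on the grid survive (here 00:00:00.000 and 00:00:04.500), and everything after the last survivor is padded with its value. The sketch below is not part of the original answer. It reproduces that alignment step with a plain reindex, then shows one possible alternative: interpolate on the union of the original and target indexes, weighted by time. The names grid, survivors, combined, upsampled and check are illustrative, the 'ms' frequency strings replace the '0.9S' and '0.5S' spellings above, and numpy.interp stands in for scipy's interp1d as a cross-check.

```python
import numpy as np
import pandas as pd

# Same data as in the question, using 'ms' frequency strings.
index = pd.date_range("2000-01-01", periods=9, freq="900ms")
series = pd.Series(range(9), index=index, dtype="float64")

# Target 0.5 s grid covering the span of the original samples.
grid = pd.date_range(index[0], index[-1], freq="500ms")

# Step 1: align to the grid without aggregating. Only timestamps that
# coincide exactly with an original sample keep a value
# (00:00:00.000 and 00:00:04.500); the rest become NaN.
survivors = series.reindex(grid).dropna()

# Step 2: interpolate on the union of both indexes so every original
# sample takes part, weight by elapsed time, then keep the grid points.
combined = series.reindex(series.index.union(grid))
upsampled = combined.interpolate(method="time").reindex(grid)

# Step 3: cross-check with numpy.interp on elapsed seconds.
t_src = (index - index[0]).total_seconds().to_numpy()
t_grid = (grid - index[0]).total_seconds().to_numpy()
check = np.interp(t_grid, t_src, series.to_numpy())

print(survivors)
print(upsampled)
```

Note that this is true linear interpolation in time, so it does not reproduce the mean() output above: the value at 00:00:07.000 falls between the samples at 6.3 s and 7.2 s rather than being exactly 8.0.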

A similar question about "python - Upsampling a Pandas time series by interpolation" can be found on Stack Overflow: https://stackoverflow.com/questions/46728152/
