
python - Filling timestamp NaT values with linear interpolation

Reposted. Author: 行者123. Last updated: 2023-12-02 07:13:35

I have a DataFrame df like this:

                                  t        pos
frame
0 2015-11-21 14:46:32.843517000 0.000000
1 NaT 0.000000
2 NaT 0.000000
3 NaT 0.000000
4 NaT 0.000000
5 NaT 0.000000
6 NaT 0.000000
7 NaT 0.000000
8 NaT 0.000000
9 NaT 0.000000
10 NaT 0.000000
11 NaT 0.000000
12 NaT 0.000000
13 NaT 0.000000
14 NaT 0.000000
15 NaT 0.000000
16 NaT 0.000000
17 NaT 0.000000
18 NaT 0.000000
19 NaT 0.000000
... ... ...
304 2015-11-21 14:46:54.255383750 12.951807
305 2015-11-21 14:46:54.312271250 5.421687
306 2015-11-21 14:46:54.343288000 3.614458
307 2015-11-21 14:46:54.445307000 1.204819
308 2015-11-21 14:46:54.477091000 0.000000
309 NaT 0.000000
310 NaT 0.000000
311 NaT 0.000000
312 NaT 0.000000
313 NaT 0.000000
314 2015-11-21 14:46:54.927361000 1.204819
315 2015-11-21 14:46:55.003917250 4.819277
316 2015-11-21 14:46:55.058081500 12.048193
317 2015-11-21 14:46:55.112070500 24.698795
318 2015-11-21 14:46:55.167366000 34.538153
319 2015-11-21 14:46:55.252116750 29.718876
320 2015-11-21 14:46:55.325177750 16.064257
321 2015-11-21 14:46:55.396772000 6.927711
322 2015-11-21 14:46:55.448250000 3.614458
323 2015-11-21 14:46:55.559872500 0.602410

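I want to fill the NaT entries with pandas.tslib.Timestamp values, using linear interpolation.

I found http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.fillna.html

but I could not find a method there that does this.

There may be a workaround, though.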

Best answer

You are right that the interpolate method currently does not work on Timestamp values. One solution is to convert the column to float, interpolate it, and convert it back to Timestamp:

In [63]:

print(df)
pos t
0 0 2015-11-21 14:46:54.445307000
1 1 2015-11-21 14:46:54.477091000
2 2 NaT
3 3 NaT
4 4 NaT
5 5 NaT
6 6 2015-11-21 14:46:54.927361000
7 7 2015-11-21 14:46:55.003917250
In [64]:

pd.to_datetime(pd.to_numeric(df.t).interpolate())
Out[64]:
0 2015-11-21 14:46:54.445306880
1 2015-11-21 14:46:54.477091072
2 2015-11-21 14:46:54.567144960
3 2015-11-21 14:46:54.657199104
4 2015-11-21 14:46:54.747252992
5 2015-11-21 14:46:54.837307136
6 2015-11-21 14:46:54.927361024
7 2015-11-21 14:46:55.003917312
Name: t, dtype: datetime64[ns]
In [65]:

df.loc[df.t.isnull(), 't'] = pd.to_datetime(pd.to_numeric(df.t).interpolate())[df.t.isnull()]
print(df)
pos t
0 0 2015-11-21 14:46:54.445307000
1 1 2015-11-21 14:46:54.477091000
2 2 2015-11-21 14:46:54.567144960
3 3 2015-11-21 14:46:54.657199104
4 4 2015-11-21 14:46:54.747252992
5 5 2015-11-21 14:46:54.837307136
6 6 2015-11-21 14:46:54.927361000
7 7 2015-11-21 14:46:55.003917250
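
The session above dates from an older pandas release. As a minimal sketch of the same recipe for a more recent pandas, the block below rebuilds the eight-row sample, converts the timestamps to float nanoseconds, interpolates, and writes the result back only where the original was NaT. It assumes the column is datetime64[ns], and it masks NaT to NaN explicitly instead of relying on how pd.to_numeric treats NaT, since that is an assumption that may not hold in every version. The names is_nat, as_float and interpolated are illustrative only.

```python
import pandas as pd

# Rebuild the small sample from the session above.
df = pd.DataFrame({
    "pos": range(8),
    "t": pd.to_datetime([
        "2015-11-21 14:46:54.445307000",
        "2015-11-21 14:46:54.477091000",
        None, None, None, None,
        "2015-11-21 14:46:54.927361000",
        "2015-11-21 14:46:55.003917250",
    ]),
})
df["t"] = df["t"].astype("datetime64[ns]")

# Remember which rows were missing before touching anything.
is_nat = df["t"].isna()

# Nanoseconds since the epoch as float64, with NaT replaced by NaN
# so that interpolate() treats those rows as gaps.
as_float = df["t"].astype("int64").astype("float64").mask(is_nat)

# Linear interpolation by row position, then back to timestamps.
interpolated = pd.to_datetime(as_float.interpolate())

# Fill only the rows that were NaT; known timestamps stay untouched.
df.loc[is_nat, "t"] = interpolated[is_nat]
print(df)
```

Writing back through the is_nat mask keeps the measured timestamps bit-for-bit identical, so only the filled rows carry the float rounding.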

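Note, however, that the numbers come out slightly off (by around ±1e-6 seconds or less), which I guess is due to loss of precision in the float conversion. It is probably wise to fill only the NaT entries with the interpolated values and leave the non-NaT entries as they are.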
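
The precision loss has a concrete cause: a float64 has a 53-bit significand, and epoch nanoseconds for 2015 are about 1.4e18, so adjacent representable floats are 256 ns apart. One possible way around this, not part of the original answer, is to interpolate small offsets from a base timestamp instead of absolute epoch values. The sketch below assumes a datetime64[ns] Series and uses the hypothetical names base, offset_ns and result.

```python
import numpy as np
import pandas as pd

t = pd.Series(pd.to_datetime([
    "2015-11-21 14:46:54.445307000",
    "2015-11-21 14:46:54.477091000",
    None, None, None, None,
    "2015-11-21 14:46:54.927361000",
    "2015-11-21 14:46:55.003917250",
])).astype("datetime64[ns]")

# Gap between neighbouring float64 values near the epoch-nanosecond magnitude.
spacing_ns = np.spacing(float(t.iloc[0].value))
print(spacing_ns)

# Work with offsets from the earliest timestamp: these are small numbers
# that float64 can hold exactly at nanosecond resolution.
base = t.min()
offset_ns = (t - base) / pd.Timedelta(1, unit="ns")  # NaN where t is NaT

# Interpolate the offsets, round to whole nanoseconds, and shift back.
filled = base + pd.to_timedelta(offset_ns.interpolate().round(), unit="ns")

# Keep the original values wherever they were present.
result = t.where(t.notna(), filled)
print(result)
```

With this variant the filled rows land on evenly spaced nanosecond values between the two neighbouring known timestamps, and the known rows are unchanged.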

For more on python - Filling timestamp NaT values with linear interpolation, see the similar question on Stack Overflow: https://stackoverflow.com/questions/33921795/
