
python - Pandas: checking the continuity of a time series

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:29:28

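I have a DataFrame with a monthly index. I want to check whether the time index is continuous at monthly frequency and, if possible, find where it becomes discontinuous, i.e. where one or more "gap months" lie between two adjacent entries of the index.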

Example: the following time series data

1964-07-31    100.00
1964-08-31 98.81
1964-09-30 101.21
1964-11-30 101.42
1964-12-31 101.45
1965-03-31 91.49
1965-04-30 90.33
1965-05-31 85.23
1965-06-30 86.10
1965-08-31 84.26

The missing months are 1964/10 and 1965/[1, 2, 7].

Best answer

Use asfreq to add the missing month-end datetimes, filter them into a new Series and, if necessary, group by year to build lists of months:

# 'M' is the month-end alias; pandas 2.2+ prefers 'ME'
s = s.asfreq('M')
# months inserted by asfreq carry NaN values
s1 = pd.Series(s[s.isnull()].index)
print(s1)
0 1964-10-31
1 1965-01-31
2 1965-02-28
3 1965-07-31
dtype: datetime64[ns]

# group the missing month numbers by year
out = s1.dt.month.groupby(s1.dt.year).apply(list)
print(out)
1964 [10]
1965 [1, 2, 7]
dtype: object
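
The question also asks for a plain yes/no check. A minimal sketch of one way to get it, assuming a non-empty DatetimeIndex: the helper name missing_months is hypothetical, not part of pandas. It compares the index, reduced to monthly periods, against the full pd.period_range between the first and last month. This leaves the original Series unmodified, ignores the day of the month, and does not mistake values that were already NaN for missing months.

```python
import pandas as pd


def missing_months(index):
    """Return the monthly periods absent between the first and last entry.

    Hypothetical helper (not part of pandas); assumes a non-empty DatetimeIndex.
    """
    months = index.to_period('M')  # drop the day, keep year-month
    full = pd.period_range(months.min(), months.max(), freq='M')
    return full.difference(months)


# Shortened version of the example data, for illustration only.
s = pd.Series(
    [100.0, 98.81, 101.21, 101.42, 101.45],
    index=pd.to_datetime(['1964-07-31', '1964-08-31', '1964-09-30',
                          '1964-11-30', '1964-12-31']),
)

gaps = missing_months(s.index)
is_continuous = len(gaps) == 0
print(is_continuous)           # False
print([str(p) for p in gaps])  # ['1964-10']
```

The index is continuous at monthly frequency exactly when the returned PeriodIndex is empty.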

Setup:

import pandas as pd

s = pd.Series({pd.Timestamp('1964-07-31 00:00:00'): 100.0,
pd.Timestamp('1964-08-31 00:00:00'): 98.81,
pd.Timestamp('1964-09-30 00:00:00'): 101.21,
pd.Timestamp('1964-11-30 00:00:00'): 101.42,
pd.Timestamp('1964-12-31 00:00:00'): 101.45,
pd.Timestamp('1965-03-31 00:00:00'): 91.49,
pd.Timestamp('1965-04-30 00:00:00'): 90.33,
pd.Timestamp('1965-05-31 00:00:00'): 85.23,
pd.Timestamp('1965-06-30 00:00:00'): 86.1,
pd.Timestamp('1965-08-31 00:00:00'): 84.26})

print(s)
1964-07-31 100.00
1964-08-31 98.81
1964-09-30 101.21
1964-11-30 101.42
1964-12-31 101.45
1965-03-31 91.49
1965-04-30 90.33
1965-05-31 85.23
1965-06-30 86.10
1965-08-31 84.26
dtype: float64

Edit:

If the datetimes are not always the last day of the month:

s = pd.Series({pd.Timestamp('1964-07-31 00:00:00'): 100.0, 
pd.Timestamp('1964-08-31 00:00:00'): 98.81,
pd.Timestamp('1964-09-01 00:00:00'): 101.21,
pd.Timestamp('1964-11-02 00:00:00'): 101.42,
pd.Timestamp('1964-12-05 00:00:00'): 101.45,
pd.Timestamp('1965-03-31 00:00:00'): 91.49,
pd.Timestamp('1965-04-30 00:00:00'): 90.33,
pd.Timestamp('1965-05-31 00:00:00'): 85.23,
pd.Timestamp('1965-06-30 00:00:00'): 86.1,
pd.Timestamp('1965-08-31 00:00:00'): 84.26})
print(s)
1964-07-31 100.00
1964-08-31 98.81
1964-09-01 101.21
1964-11-02 101.42
1964-12-05 101.45
1965-03-31 91.49
1965-04-30 90.33
1965-05-31 85.23
1965-06-30 86.10
1965-08-31 84.26
dtype: float64

# convert every timestamp to the first day of its month
s.index = s.index.to_period('M').to_timestamp()
# 'MS' is the month-start frequency
s = s.asfreq('MS')
# months inserted by asfreq carry NaN values
s1 = pd.Series(s[s.isnull()].index)
print(s1)
0 1964-10-01
1 1965-01-01
2 1965-02-01
3 1965-07-01
dtype: datetime64[ns]
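
To see between which two adjacent entries each gap falls, as the question asks, one possible sketch numbers every month as year * 12 + month and looks at the differences between neighbours. It assumes the index is sorted in ascending order; the variable names are illustrative, and the data is a shortened version of the example above.

```python
import pandas as pd

s = pd.Series(
    [101.21, 101.42, 101.45, 91.49],
    index=pd.to_datetime(['1964-09-01', '1964-11-02',
                          '1964-12-05', '1965-03-31']),
)

# Number each month so that consecutive months differ by exactly 1.
month_no = (s.index.year * 12 + s.index.month).to_numpy()
step = month_no[1:] - month_no[:-1]

# A step larger than 1 means (step - 1) months are missing between neighbours.
gaps = [
    (str(s.index[i].date()), str(s.index[i + 1].date()), int(step[i]) - 1)
    for i in range(len(step))
    if step[i] > 1
]

for before, after, n_missing in gaps:
    print(f"{before} -> {after}: {n_missing} month(s) missing")
# 1964-09-01 -> 1964-11-02: 1 month(s) missing
# 1964-12-05 -> 1965-03-31: 2 month(s) missing
```

Because only the year and month are compared, this works whether or not the dates fall on the last day of the month.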

This post covers python - Pandas: checking the continuity of a time series. A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54039062/
