Python Pandas Timeseries: how to find the longest sequence of values above a specific value

Reposted. Author: 行者123. Updated: 2023-12-01 05:09:40

How can I find the longest sequence in a time series? For example, I have a DataFrame like this:

index      Value 
1-1-2012 10
1-2-2012 14
1-3-2012 15
1-4-2012 8
1-5-2012 7
1-6-2012 16
1-7-2012 17
1-8-2012 18

Now I want to get the longest sequence: here that is the sequence from 1-6-2012 to 1-8-2012, which has 3 entries.

Thanks, Anya

Best answer

This is a bit clunky, but it gets the job done. Since you didn't specify the "specific value" mentioned in the title, I picked 12.

import pandas as pd

time_indices = pd.date_range(start='2012-01-01', end='2012-08-01', freq='MS')
data = [10, 14, 15, 8, 7, 16, 17, 18]
df = pd.DataFrame({'vals': data, 't_indices': time_indices})

threshold = 12
df['tag'] = df.vals > threshold

# make another DF to hold info about each region

# first row of a consecutive region is a True preceded by a False (or by nothing)
starts = df.index[df['tag'] & ~df['tag'].shift(1, fill_value=False)]

# last row of a consecutive region is a True followed by a False (or by nothing)
ends = df.index[df['tag'] & ~df['tag'].shift(-1, fill_value=False)]

regs_above_thresh = pd.DataFrame({'start_idx': starts, 'end_idx': ends})

# how many rows each region spans (relies on df having the default integer index)
regs_above_thresh['spans'] = \
regs_above_thresh['end_idx'] - regs_above_thresh['start_idx'] + 1

# label of the region with the longest span
max_idx = regs_above_thresh['spans'].idxmax()

# we can get the start and end rows of the longest region from the original dataframe
# (.ix was removed from pandas; .loc does the label-based lookup)
df.loc[regs_above_thresh.loc[max_idx, ['start_idx', 'end_idx']].values]

The consecutive-region logic is borrowed from a solution by behzad.nouri.
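For comparison, the same result can be sketched more compactly by labelling each run of consecutive rows and counting the rows per label. This is only one possible approach, not the answer's method: the helper name `longest_run_above` is hypothetical, and the sample data assumes monthly dates and a threshold of 12, as in the answer above.

```python
import pandas as pd


def longest_run_above(series: pd.Series, threshold: float) -> pd.Series:
    """Return the longest consecutive slice of `series` whose values exceed `threshold`."""
    above = series > threshold
    # A new run starts wherever the flag differs from the previous row,
    # so the cumulative sum of those change points gives each run its own id.
    run_id = (above != above.shift()).cumsum()
    # Count the rows in each run, keeping only runs that are above the threshold.
    run_lengths = above[above].groupby(run_id[above]).size()
    if run_lengths.empty:
        # Nothing exceeds the threshold: return an empty slice of the input.
        return series.iloc[0:0]
    # idxmax picks the first run if several share the maximum length.
    return series[run_id == run_lengths.idxmax()]


# Assumed sample data: the question's values on a monthly DatetimeIndex.
values = pd.Series(
    [10, 14, 15, 8, 7, 16, 17, 18],
    index=pd.date_range("2012-01-01", periods=8, freq="MS"),
)
longest = longest_run_above(values, 12)
print(longest)
```

Because the function returns the matching slice itself, `longest.index[0]` and `longest.index[-1]` give the start and end dates, and `len(longest)` gives the number of entries, without relying on an integer index.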

Regarding "Python Pandas Timeseries: how to find the longest sequence of values above a specific value", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24432605/
