python - pytz error on a DST change when the DataFrame frequency is finer than 1 hour [multi-index pandas]

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:07:08

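I have a problem changing the time zone of a DataFrame whose frequency is finer than 1 hour. In my case, I get a quarter-hourly DataFrame from a CSV source and have to delete the DST hour in March and add the DST hour in October. The function below works well if the frequency is hourly, but it fails with any finer frequency.

Does anyone have a solution to this problem?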

import pandas as pd
import numpy as np
from pytz import timezone

def DST_Paris(NH, NH_str):
    ## Suppose that I do not create the dataframe here but import one from a CSV file
    df = pd.DataFrame(np.random.randn(NH * 365), index=pd.date_range(start="01/01/2014", freq=NH_str, periods=NH * 365))

    ## I need to delete the hour in March and duplicate the hour in October
    ## If freq is finer than 1 hour, I need to duplicate all the rows inside the hour concerned

    tz = timezone('Europe/Paris')
    change_date = tz._utc_transition_times  # private pytz attribute: UTC instants of every transition
    GMT1toGMT2_dates = [datei.date() for datei in list(change_date) if datei.month == 3]
    GMT2toGMT1_dates = [datei.date() for datei in list(change_date) if datei.month == 10]
    ind_March = np.logical_and(np.isin(df.index.date, GMT1toGMT2_dates), (df.index.hour == 2))
    ind_October = np.logical_and(np.isin(df.index.date, GMT2toGMT1_dates), (df.index.hour == 2))
    df['ind_March'] = (1 - ind_March)
    df['ind_October'] = ind_October * 1
    df = df[df.ind_March == 1]
    df = pd.concat([df, df[df.ind_October == 1]])
    del df['ind_March']
    del df['ind_October']
    df = df.sort_index()

    ## Error if granularity is below 1 hour
    df = df.tz_localize('Europe/Paris', ambiguous='infer')
    return df

try:
    DST_Paris(24, "1h")
    print("dataframe freq = 1h    ==> no pb")
except Exception:
    print("dataframe freq = 1h    ==> error")

try:
    DST_Paris(96, "15min")
    print("dataframe freq = 15min ==> no pb")
except Exception:
    print("dataframe freq = 15min ==> error")

The output is:

dataframe freq = 1h    ==> no pb
dataframe freq = 15min ==> error

Best answer

A workaround is to use

is_dst = False  # or True
df = df.tz_localize('Europe/Paris', ambiguous=[is_dst]*len(df))

to state explicitly whether ambiguous local times should be interpreted as being in DST.
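
As a small illustration of what the flag does, the sketch below localizes a made-up quarter-hourly index spanning the 2014-10-26 fall-back in Paris. The frame and its `value` column are hypothetical and stand in for the CSV data; a NumPy boolean array is used for the flags, one entry per row.

```python
import numpy as np
import pandas as pd

# Hypothetical naive quarter-hourly index around the 2014-10-26 fall-back in Paris.
# Wall-clock times 02:00-02:45 happen twice that night, so they are ambiguous.
idx = pd.date_range("2014-10-26 01:30", periods=8, freq="15min")
df = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)

# One flag per row; pandas only consults it for ambiguous wall-clock times.
as_standard = df.tz_localize("Europe/Paris", ambiguous=np.zeros(len(df), dtype=bool))
as_dst = df.tz_localize("Europe/Paris", ambiguous=np.ones(len(df), dtype=bool))

print(as_standard.index[2])  # 02:00 read as standard time (UTC+01:00)
print(as_dst.index[2])       # 02:00 read as summer time (UTC+02:00)
```

The unambiguous rows (01:30, 01:45, 03:00, 03:15) get the same offset in both results, since the flag only matters where the wall-clock time occurs twice.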

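By the way,

df['ind_March'] = (1-ind_March)
df['ind_October'] = ind_October * 1
df = df[df.ind_March == 1]
df = pd.concat([df, df[df.ind_October == 1]])
del df['ind_March']
del df['ind_October']
df = df.sort_index()

can be simplified to

# keep everything outside the skipped March hour, then append a copy of the October hour
df = pd.concat([df.loc[~ind_March], df.loc[ind_October]])
df = df.sort_index()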
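
One possible way to pair the drop-and-repeat step with explicit flags, rather than a single `is_dst` value for every row, is sketched below. It assumes the first pass through the repeated October hour should be read as DST and the appended copy as standard time. The 2014 transition dates are hard-coded for brevity, and the frame and its `value` column are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical quarter-hourly input for 2014, standing in for the CSV data.
idx = pd.date_range("2014-01-01", periods=96 * 365, freq="15min")
df = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)

# Europe/Paris transition dates for 2014, hard-coded as an assumption of this sketch.
in_hour_2 = df.index.hour == 2
ind_march = in_hour_2 & (df.index.normalize() == pd.Timestamp("2014-03-30"))
ind_october = in_hour_2 & (df.index.normalize() == pd.Timestamp("2014-10-26"))

kept = df[~ind_march]        # 02:00-02:59 does not exist locally on 2014-03-30
repeated = df[ind_october]   # second pass through 02:00-02:59 on 2014-10-26
combined = pd.concat([kept, repeated])

# True for the original rows (first pass, DST), False for the appended copy.
# The flag is ignored for rows whose wall-clock time is not ambiguous.
is_dst = np.concatenate([
    np.ones(len(kept), dtype=bool),
    np.zeros(len(repeated), dtype=bool),
])

# Sorting after localization orders rows by actual instant, not wall-clock time.
localized = combined.tz_localize("Europe/Paris", ambiguous=is_dst).sort_index()
```

Because the sort happens on the time-zone-aware index, the two passes through 02:00-02:45 end up one after the other instead of interleaved.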

Regarding "python - pytz error on a DST change when the DataFrame frequency is finer than 1 hour [multi-index pandas]", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26671112/
