
Python resampling: pad() does not fill NaN

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:41:04

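I am trying to fill NaN values with resample's pad() function after upsampling a time series.

I upsampled hourly data to one-minute intervals with resample('1min').asfreq(), and then called resample('1min').pad(), but it does not fill the NaN values with the previous value the way the pandas.DataFrame.resample tutorial shows.

Run this to create a DataFrame with a datetime index: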

import numpy as np
import pandas as pd
from urllib.request import urlopen

url = "https://www.ndbc.noaa.gov/view_text_file.php?filename=42887h2016.txt.gz&dir=data/historical/stdmet/"
data_csv = urlopen(url)
df = pd.read_csv(data_csv, delim_whitespace=True, index_col=0, parse_dates=True)
df.drop(['WDIR', 'WSPD', 'GST', 'WVHT', 'DPD', 'APD', 'MWD', 'PRES', 'VIS', 'TIDE', 'ATMP', 'WTMP'],
        axis=1, inplace=True)

#Data Preparation
df.reset_index(level=0, inplace=True)
df = df.iloc[1:]
df = df.rename(columns={'#YY': 'YY'})

#Create datetime variable
df['Date'] = df[df.columns[0:3]].apply(lambda x: '/'.join(x.dropna().astype(int).astype(str)),axis=1)
df['Time'] = df[df.columns[3:5]].apply(lambda x: ':'.join(x.dropna().astype(int).astype(str)),axis=1)
df['Date.Time'] = df['Date'] + ':' + df['Time']
df['Date'] = pd.to_datetime(df['Date'], format = '%Y/%m/%d')
df['Date.Time'] = pd.to_datetime(df['Date.Time'], format='%Y/%m/%d:%H:%M', utc=True)

#Remaining data prep for the dataframe and create index w/ time date
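# convert_objects was removed in pandas 1.0; coerce only the object columns to numeric
obj_cols = df.select_dtypes(include='object').columns
df[obj_cols] = df[obj_cols].apply(pd.to_numeric, errors='coerce')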
df = df[(df['MM'] == 2.0) | (df['MM'] == 3.0)]
df = df.replace(999, np.nan)
df = df.set_index('Date.Time')
df.drop(['hh', 'mm', 'Time', 'Date'], axis = 1, inplace = True)

The result is the DataFrame we want:

                             YY  MM  DD  DEWP
Date.Time
2016-12-01 00:00:00+00:00 2016 12 1 11.3
2016-12-01 01:00:00+00:00 2016 12 1 9.0
2016-12-01 02:00:00+00:00 2016 12 1 11.0
2016-12-01 03:00:00+00:00 2016 12 1 10.8
2016-12-01 04:00:00+00:00 2016 12 1 6.5

Now upsample from hourly to one-minute intervals:

df = df.resample('1min').asfreq()
df.head()

Result:

                               YY    MM   DD  DEWP
Date.Time
2016-12-01 00:00:00+00:00 2016.0 12.0 1.0 11.3
2016-12-01 00:01:00+00:00 NaN NaN NaN NaN
2016-12-01 00:02:00+00:00 NaN NaN NaN NaN
2016-12-01 00:03:00+00:00 NaN NaN NaN NaN
2016-12-01 00:04:00+00:00 NaN NaN NaN NaN

Fill the NaN values with pad():

df = df.resample('1min').pad()
df.head()

Result:

                               YY    MM   DD  DEWP
Date.Time
2016-12-01 00:00:00+00:00 2016.0 12.0 1.0 11.3
2016-12-01 00:01:00+00:00 NaN NaN NaN NaN
2016-12-01 00:02:00+00:00 NaN NaN NaN NaN
2016-12-01 00:03:00+00:00 NaN NaN NaN NaN
2016-12-01 00:04:00+00:00 NaN NaN NaN NaN

The DEWP variable should look like this:

                               YY    MM   DD  DEWP
Date.Time
2016-12-01 00:00:00+00:00 2016.0 12.0 1.0 11.3
2016-12-01 00:01:00+00:00 2016.0 12.0 1.0 11.3
2016-12-01 00:02:00+00:00 2016.0 12.0 1.0 11.3
2016-12-01 00:03:00+00:00 2016.0 12.0 1.0 11.3
2016-12-01 00:04:00+00:00 2016.0 12.0 1.0 11.3

Any help would be appreciated!

Best answer

Calling df.resample('1min').fillna("pad") works; see the pandas documentation for details.
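To illustrate the behavior behind the question, here is a minimal sketch using a small, made-up hourly frame (the three DEWP values and the `hourly`, `sparse`, and `repaired` names are assumptions for illustration, not the NDBC data). It assumes the cause is that forward-filling during a resample only fills rows created by that resample, so NaN rows already introduced by `asfreq()` are left alone. It uses `ffill()`, since recent pandas releases appear to have moved away from `pad()` and `fillna("pad")` on resamplers.

```python
import pandas as pd

# Hypothetical hourly data standing in for the DEWP column.
idx = pd.date_range("2016-12-01", periods=3, freq="60min", tz="UTC")
hourly = pd.DataFrame({"DEWP": [11.3, 9.0, 11.0]}, index=idx)

# Option 1: forward-fill while upsampling the original hourly data.
filled = hourly.resample("1min").ffill()

# What the question did: asfreq() first, which stores NaN rows in the frame.
sparse = hourly.resample("1min").asfreq()

# Resampling again at the same frequency creates no new rows,
# so the existing NaN rows are not filled.
still_sparse = sparse.resample("1min").ffill()

# Option 2: a plain DataFrame.ffill() fills the NaN rows that already exist.
repaired = sparse.ffill()

print(filled.head())
print(still_sparse["DEWP"].isna().sum())
print(repaired.head())
```

In short, either skip the `asfreq()` step and forward-fill during the first resample, or keep `asfreq()` and call `ffill()` directly on the resulting frame.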

A similar question, "Python resampling: pad() does not fill NaN", can be found on Stack Overflow: https://stackoverflow.com/questions/52862750/
