python - Using Prophet on a netCDF file with xarray

Reposted by: 太空宇宙 | Updated: 2023-11-03 20:45:36

I have a netCDF file that I have read with xarray, and I want to use it to generate a forecast for every pixel in the file.

import pandas as pd
import xarray as xr
from fbprophet import Prophet
import time

# Chunk along the spatial dimensions shown in the output below (lat, lon)
with xr.open_dataset('avi.nc',
                     chunks={'lat': 2, 'lon': 2}) as avi:
    print(avi)

<xarray.Dataset>
Dimensions:  (ds: 104, lat: 213, lon: 177)
Coordinates:
  * lat      (lat) float64 -2.711e+06 -2.711e+06 -2.711e+06 -2.711e+06 ...
  * lon      (lon) float64 1.923e+06 1.924e+06 1.924e+06 1.924e+06 1.924e+06 ...
  * ds       (ds) object '1999-07-16T23:46:04.500000000' ...
Data variables:
    y        (ds, lat, lon) float64 dask.array<shape=(104, 213, 177),
        chunksize=(104, 2, 2)>

The way I build a model for each pixel is:
* loop over every pixel in the array (for i in range(dataset.sizes['lat']):),
* create the model (m1),
* send the model output to a pandas DataFrame (output).

I have tried chunking the netCDF file, but I see no difference in efficiency. Below is the code I am currently using.

columns = ('Year', 'lat', 'lon')
dates = list(range(1996, 1999))
forecast2 = pd.DataFrame()

def GAM2(dataset):
    output = pd.DataFrame(columns=columns)
    # The loop indices replace the manually incremented px_lat / px_lon counters
    for px_lat in range(dataset.sizes['lat']):
        for px_lon in range(dataset.sizes['lon']):
            # Time series of a single pixel, indexed by (ds, lat, lon)
            df1 = dataset.y.isel(lat=slice(px_lat, px_lat + 1),
                                 lon=slice(px_lon, px_lon + 1)).to_dataframe()

            df1['ds'] = pd.to_datetime(df1.index.get_level_values(0), dayfirst=True)
            df1['doy'] = df1['ds'].dt.dayofyear

            m1 = Prophet(weekly_seasonality=False).fit(df1)
            # make_future_dataframe() requires `periods`; the question does not give a value
            future1 = m1.make_future_dataframe()
            output_data = {
                'Year': year,  # `year` is not defined in the question's snippet
                'lat': dataset.lat[px_lat].values,
                'lon': dataset.lon[px_lon].values}

            # DataFrame.append was removed in pandas 2.0; it is kept here as in the question
            output = output.append(output_data, ignore_index=True)

    return output

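Problems:

I loop over the array manually (i.e. for i in range(dataset.sizes['lat']): ...).
The output currently goes to a pandas DataFrame. I need it in a DataArray with the same coordinates (lat, lon), as part of the dataset, for further analysis and visualization.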
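As a rough illustration of the second problem, the sketch below collects one value per pixel in a NumPy array and then wraps it in a DataArray that reuses the dataset's lat/lon coordinates. The toy dataset, the per-pixel mean (a placeholder for a model output) and the name per_pixel are assumptions for illustration only; they are not part of the original question.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the question's dataset: 4 time steps on a 3 x 2 grid.
dataset = xr.Dataset(
    {"y": (("ds", "lat", "lon"), np.arange(24, dtype=float).reshape(4, 3, 2))},
    coords={"ds": np.arange(4), "lat": [10.0, 20.0, 30.0], "lon": [100.0, 110.0]},
)

# Pre-allocate one value per pixel instead of appending rows to a DataFrame.
result = np.full((dataset.sizes["lat"], dataset.sizes["lon"]), np.nan)

for i in range(dataset.sizes["lat"]):
    for k in range(dataset.sizes["lon"]):
        series = dataset["y"].isel(lat=i, lon=k).values
        # Placeholder statistic; a per-pixel model output would go here.
        result[i, k] = series.mean()

# Wrap the array with the original lat/lon coordinates.
per_pixel = xr.DataArray(
    result,
    dims=("lat", "lon"),
    coords={"lat": dataset["lat"], "lon": dataset["lon"]},
    name="per_pixel_mean",
)
```

Because per_pixel shares lat and lon with the dataset, it could then be attached with something like dataset["per_pixel_mean"] = per_pixel.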

Questions:

Can dataset.apply() be used with this kind of function? For example:

def GAM2(dataset, index_name, site_name):
    # df1, year, px_lat and px_lon are left undefined here, as in the question
    m1 = Prophet(weekly_seasonality=False).fit(df1)
    future1 = m1.make_future_dataframe()
    output_data = {
        'Year': year,
        'lat': dataset.lat[px_lat].values,
        'lon': dataset.lon[px_lon].values}
    return output_data

ds.apply(GAM2)

Can I store the output directly in a DataArray as a variable? Or do I have to keep using a pandas DataFrame and then try to convert it to a DataArray?

Best answer

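I believe I have found the answer you are looking for.

Instead of running a double loop over every coordinate point of the xarray DataArray, you can use xarray's vectorized apply_ufunc, which also allows parallel computation.

If Prophet (fbprophet) is called inside the function passed to apply_ufunc, it produces a forecast specific to each coordinate point.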
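To show the mechanism without Prophet, here is a minimal sketch in which a plain NumPy function consumes the time axis of each pixel and returns a longer axis. The function extend_linear, the toy array and the straight-line fit are assumptions chosen only to keep the example small; they stand in for the Prophet call.

```python
import numpy as np
import xarray as xr

def extend_linear(y, periods=2):
    """Fit a straight line to a 1-D series and return it extended by `periods` steps."""
    t = np.arange(y.size)
    slope, intercept = np.polyfit(t, y, 1)
    t_all = np.arange(y.size + periods)
    return slope * t_all + intercept

# Hypothetical toy data: 5 time steps on a 2 x 3 grid, each pixel a linear ramp.
time = np.arange(5)
base = np.arange(6, dtype=float).reshape(2, 3)
da = xr.DataArray(
    base[None, :, :] + time[:, None, None],
    dims=("time", "y", "x"),
)

out = xr.apply_ufunc(
    extend_linear,
    da,
    input_core_dims=[["time"]],        # the function consumes the time axis
    output_core_dims=[["predicted"]],  # and returns a new, longer axis
    vectorize=True,                    # call the function once per (y, x) pixel
    kwargs={"periods": 2},
)
```

The result has dimensions (y, x, predicted): the core dimension is moved to the end, and its length is the original number of time steps plus periods.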

Here is a reproducible example:

import pandas as pd
pd.set_option('display.width', 50000)
pd.set_option('display.max_rows', 50000)
pd.set_option('display.max_columns', 5000)

import numpy as np
import xarray as xr

from dask.diagnostics import ProgressBar
from fbprophet import Prophet  # from Prophet 1.0 on, the package is named `prophet`

# https://stackoverflow.com/questions/56626011/using-prophet-on-netcdf-file-using-xarray
# https://gist.github.com/scottyhq/8daa7290298c9edf2ef1eb05dc3b6c60

# Load the xarray tutorial dataset into memory
ds = xr.tutorial.open_dataset('rasm').load()

def parse_datetime(time):
    # Convert the dataset's time values to pandas datetimes via their string form
    return pd.to_datetime([str(x) for x in time])

ds.coords['time'] = parse_datetime(ds.coords['time'].values)

# Keep a small spatial window so the example runs quickly
ds = ds.isel({'x': slice(175, 180), 'y': slice(160, 170)})
ds.isel({'time': 0}).Tair.plot()

ds = ds.chunk({'x': 40, 'y': 40})

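def fillna_in_array(x):
    # Replace +/-inf and NaN with 0
    y = np.where(np.abs(x) == np.inf, 0, x)
    y = np.where(np.isnan(y), 0, y)

    # If the whole series is 0, substitute a simple ramp so the fit
    # does not receive constant data
    if np.all(y == 0):
        y = np.arange(len(y))
    return y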



def xarray_Prophet(y, time, periods=30, freq='D'):
    '''
    Per-pixel Prophet forecast, meant to be called through xr.apply_ufunc
    with vectorize=True.

    It returns an array containing the fitted and the predicted values
    for the provided time sequence.

    Parameters:

        y (array): the past values of y used to fit the model.

        time (array): the timestamp of each entry of y.

        periods (positive int): the number of future steps to predict.

        freq (str): the frequency of the predicted steps
            (e.g. 'D', 'M', 'Y', 'H'...)

    Returns:

        array of predicted values of y (yhat)

    '''

    # Make sure the series has no gaps: NaN and inf values are replaced
    # before fitting (see fillna_in_array).
    y = fillna_in_array(y)

    # Prophet expects a DataFrame with the columns 'ds' (dates) and 'y' (values)
    history = pd.DataFrame()
    history['ds'] = pd.to_datetime(time)
    history['y'] = y

    m1 = Prophet(weekly_seasonality=True,
                 daily_seasonality=False).fit(history)

    future = m1.make_future_dataframe(periods=periods, freq=freq)

    # apply_ufunc needs a plain 1-D array back, so only the 'yhat' column
    # of the Prophet prediction DataFrame is returned.
    return m1.predict(future)['yhat'].values

def predict_y(ds,
              dim=['time'],
              dask='allowed',
              new_dim_name=['predicted'],
              periods=30, freq='D'):

    '''
    Function Description:

        This function is a vectorized wrapper around "xarray_Prophet".

        It returns a new xarray object (DataArray or Dataset) with the new
        dimension attached.

    Parameters:
        ds (xarray Dataset/DataArray)

        dim (list of strings): the dimension consumed by the prediction
                               (the temporal dimension).

        dask (str): passed to xr.apply_ufunc, e.g. 'allowed'

        new_dim_name (list of strings): the name of the dimension that holds
                                        the fitted and predicted values.

        periods (positive int): the number of steps to be predicted based
                                on the parameter "freq".

        freq (str): the frequency that will be used in the prediction
            (e.g. 'D', 'M', 'Y', 'H'...)

    Returns:

        xarray object (Dataset or DataArray): the type depends solely on
                                              the type of ds.

    '''
    with ProgressBar():
        # Sort ascending so the values line up with np.unique's sorted output
        ds = ds.sortby('time')

        time = np.unique(ds['time'].values)

        kwargs = {'time': time,
                  'periods': periods,
                  'freq': freq}

        filtered = xr.apply_ufunc(xarray_Prophet,
                                  ds,
                                  dask=dask,
                                  vectorize=True,
                                  input_core_dims=[dim],
                                  # exclude_dims=dim,  # This must not be set.
                                  output_core_dims=[new_dim_name],
                                  kwargs=kwargs,
                                  output_dtypes=[float],
                                  join='outer',
                                  dataset_fill_value=np.nan,
                                  ).compute()

    return filtered



da_binned = predict_y(ds=ds['Tair'],
                      dim=['time'],
                      dask='allowed',
                      new_dim_name=['predicted'],
                      periods=30).rename({'predicted': 'time'})

print(da_binned)
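After the rename above, the new time dimension has no coordinate values attached. One possible way to label it is to rebuild the timestamps yourself: the observed times followed by the future steps. The helper forecast_index below is hypothetical and only approximates the dates that make_future_dataframe is expected to produce, so treat it as a sketch and check the lengths before relying on it.

```python
import pandas as pd

def forecast_index(history, periods=30, freq="D"):
    """Return the observed timestamps followed by `periods` future ones."""
    history = pd.DatetimeIndex(history).sort_values()
    last = history[-1]
    # Generate one extra stamp, then keep only those strictly after the last observation.
    future = pd.date_range(start=last, periods=periods + 1, freq=freq)
    future = future[future > last][:periods]
    return history.append(future)

# Hypothetical example: 5 daily observations extended by 3 days.
observed = pd.date_range("2000-01-01", periods=5, freq="D")
full_index = forecast_index(observed, periods=3, freq="D")
```

Assuming the lengths match, the labels could then be attached with da_binned.assign_coords(time=forecast_index(ds['time'].values, periods=30)).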

For "python - Using Prophet on a netCDF file with xarray", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56626011/
