r - Time series forecasting with Fable in R; determining the best combination of models for a mixed model

Reposted by: 行者123  Updated: 2023-12-05 03:34:19

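I'm using the fable and fabletools packages for some time series forecasting analysis, and I'm interested in comparing individual models against a mixed model (made up of the individual models I'm using).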

Here is some example code with a mock data frame:

library(fable)
library(fabletools)
library(distributional)
library(tidyverse)
library(imputeTS)

#creating mock dataframe
set.seed(1)  

Date<-seq(as.Date("2018-01-01"), as.Date("2021-03-19"), by = "1 day")

Count<-rnorm(length(Date),mean = 2086, sd= 728)

Count<-round(Count)

df<-data.frame(Date,Count)

df

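#===================redoing with new model================

df$Count<-abs(df$Count)#in case there are any negative values, force them to be absolute

count_data<-as_tsibble(df, index = Date)

#na.mean() was renamed to na_mean() in imputeTS 3.0
count_data<-imputeTS::na_mean(count_data)

#note: sample_frac() draws rows at random, so lastdate depends on the
#random draw and is not a fixed chronological 80% cutoff
testfrac<-count_data%>%arrange(Date)%>%sample_frac(0.8)
lastdate<-last(testfrac$Date)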

#train data
train <- count_data %>%
  #sample_frac(0.8)
  filter(Date<=as.Date(lastdate))
set.seed(1)
fit <- train %>%
  model(
    ets = ETS(Count),
    arima = ARIMA(Count),
    snaive = SNAIVE(Count),
    croston= CROSTON(Count),
    ave=MEAN(Count),
    naive=NAIVE(Count),
    neural=NNETAR(Count),
    lm=TSLM(Count ~ trend()+season())
  ) %>%
  mutate(mixed = (ets + arima + snaive + croston + ave + naive + neural + lm) /8)# creates a combined model using the averages of all individual models 


fc <- fit %>% forecast(h = 7)

accuracy(fc,count_data)

fc_accuracy <- accuracy(fc, count_data,
                        measures = list(
                          point_accuracy_measures,
                          interval_accuracy_measures,
                          distribution_accuracy_measures
                        )
)

fc_accuracy

# A tibble: 9 x 13
#  .model  .type     ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1 winkler percentile  CRPS
#  <chr>   <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>      <dbl> <dbl>
#1 arima   Test  -191.   983.  744. -38.1  51.8 0.939 0.967 -0.308   5769.       567.  561.
#2 ave     Test  -191.   983.  744. -38.1  51.8 0.939 0.967 -0.308   5765.       566.  561.
#3 croston Test  -191.   983.  745. -38.2  51.9 0.940 0.968 -0.308  29788.       745.  745.
#4 ets     Test  -189.   983.  743. -38.0  51.7 0.938 0.967 -0.308   5759.       566.  560.
#5 lm      Test  -154.  1017.  742. -36.5  51.1 0.937 1.00  -0.307   6417.       583.  577.
#6 mixed   Test  -173.   997.  747. -36.8  51.1 0.944 0.981 -0.328  29897.       747.  747.
#7 naive   Test    99.9  970.  612. -19.0  38.7 0.772 0.954 -0.308   7856.       692.  685.
#8 neural  Test  -322.  1139.  934. -49.6  66.3 1.18  1.12  -0.404  26361.       852.  848.
#9 snaive  Test  -244   1192.  896. -37.1  55.5 1.13  1.17  -0.244   4663.       690.  683.

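I've shown how to create the mixed model. However, some individual models may hurt the mixed model's performance when they are added to it; in other words, the mixed model might improve if it excluded the individual models that skew its accuracy in a harmful way.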
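To make that concern concrete, here is a minimal sketch in base R. All numbers are made up for illustration and do not come from the data above; `good_a`, `good_b` and `bad` are hypothetical point forecasts. It compares the MAE of an equal-weight average with and without the poor forecast.

```r
# Toy values (hypothetical, not taken from the mock data in the question)
actual <- c(100, 110, 120)
good_a <- c(98, 112, 119)
good_b <- c(103, 108, 123)
bad    <- c(140, 150, 160)   # a forecast that is consistently far too high

# Mean absolute error of a point forecast against the actual values
mae <- function(forecast) mean(abs(actual - forecast))

# Equal-weight averages, with and without the poor forecast
all_three <- (good_a + good_b + bad) / 3
good_only <- (good_a + good_b) / 2

mae(all_three)
mae(good_only)
```

In this toy case the average that leaves out `bad` has the lower MAE, which is the effect described above.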

Desired outcome

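What I'd like to achieve is to test every possible combination of the individual models and return the mixed model that performs best on one of the accuracy measures, for example mean absolute error (MAE). But I'm not sure how to do this in an automated way, as there are many potential combinations.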

Can anyone suggest or share some code showing how I could do this?
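As a rough sense of scale, the number of equal-weight combinations of the eight models above can be counted in base R. This sketch assumes that only combinations of two or more models are of interest; `n_models` and `n_combos` are names introduced here for illustration.

```r
# Eight individual models are fitted in the question
n_models <- 8

# Number of subsets of size 2, 3, ..., 8
n_combos <- sum(choose(n_models, 2:n_models))
n_combos

# Same count another way: all subsets minus the empty set and the singletons
2^n_models - 1 - n_models
```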

Best answer

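A couple of things to consider:

While quickly evaluating the performance of many combination models is certainly desirable, it is pretty impractical. The best option is to evaluate your models individually and then build a simpler combination from, say, the 2 or 3 best ones.
As an example, consider that you could also have weighted combinations, e.g. 0.75 * ets + 0.25 * arima. The possibilities are now literally endless, so you start to see the limits of the brute-force approach (N.B. I don't think fable actually supports this kind of combination yet).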
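The first suggestion could be sketched roughly as follows, in base R. The MAE values are copied as printed from the accuracy table in the question (the "mixed" row is left out); `k = 3` is an arbitrary choice, and `acc`, `best` and `combo_expr` are names introduced here for illustration. Ideally the ranking would come from a validation window separate from the final test period.

```r
# MAE per individual model, as printed in the question's accuracy table
acc <- data.frame(
  .model = c("arima", "ave", "croston", "ets", "lm", "naive", "neural", "snaive"),
  MAE    = c(744, 744, 745, 743, 742, 612, 934, 896)
)

# Keep the k models with the lowest MAE (k = 3 is an arbitrary choice)
k <- 3
best <- head(acc[order(acc$MAE), ".model"], k)

# Build the expression for an equal-weight combination of those models
combo_expr <- paste0("(", paste(best, collapse = " + "), ") / ", length(best))
combo_expr
```

The resulting string has the same shape as the `(ets + arima + ...) / 8` expression passed to `mutate()` in the question, so it could be typed there by hand for the handful of models that were selected.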

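That said, here is one approach you could use to generate all possible combinations. Note that it may take a long time to run, but it should give you what you're after.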

# Get a table of models to get combinations from
fit <- train %>%
  model(
    ets = ETS(Count),
    arima = ARIMA(Count),
    snaive = SNAIVE(Count),
    croston= CROSTON(Count),
    ave=MEAN(Count),
    naive=NAIVE(Count),
    neural=NNETAR(Count),
    lm=TSLM(Count ~ trend()+season())
  )

# Start with a vector containing all the models we want to combine
models <- c("ets", "arima", "snaive", "croston", "ave", "naive", "neural", "lm")

# Generate a table of combinations - if a value is 1, that indicates that
# the model should be included in the combinations
combinations <- models %>% 
  purrr::set_names() %>% 
  map(~0:1) %>% 
  tidyr::crossing(!!!.)

combinations
#> # A tibble: 256 x 8
#>      ets arima snaive croston   ave naive neural    lm
#>    <int> <int>  <int>   <int> <int> <int>  <int> <int>
#>  1     0     0      0       0     0     0      0     0
#>  2     0     0      0       0     0     0      0     1
#>  3     0     0      0       0     0     0      1     0
#>  4     0     0      0       0     0     0      1     1
#>  5     0     0      0       0     0     1      0     0
#>  6     0     0      0       0     0     1      0     1
#>  7     0     0      0       0     0     1      1     0
#>  8     0     0      0       0     0     1      1     1
#>  9     0     0      0       0     1     0      0     0
#> 10     0     0      0       0     1     0      0     1
#> # ... with 246 more rows

# Keep only the combinations that contain at least 2 models
# (across() needs an explicit column selection in dplyr >= 1.1.0)
relevant_combinations <- combinations %>% 
  filter(rowSums(across(everything())) > 1)

# We can use this table to generate the code we would put in a call to `mutate()`
# to generate the combination. {fable} does something funny with code
# evaluation here, meaning that more elegant approaches are more trouble 
# than they're worth
specs <- relevant_combinations %>% 
  mutate(id = row_number()) %>% 
  pivot_longer(-id, names_to = "model", values_to = "flag_present") %>% 
  filter(flag_present == 1) %>% 
  group_by(id) %>% 
  summarise(
    desc = glue::glue_collapse(model, "_"),
    model = glue::glue(
      "({model_sums}) / {n_models}",
      model_sums = glue::glue_collapse(model, " + "),
      n_models = n()
    )
  ) %>% 
  select(-id) %>% 
  pivot_wider(names_from = desc, values_from = model)

# This is what the `specs` table looks like:
specs
#> # A tibble: 1 x 247
#>   neural_lm         naive_lm  naive_neural  naive_neural_lm   ave_lm  ave_neural
#>   <glue>            <glue>    <glue>        <glue>            <glue>  <glue>    
#> 1 (neural + lm) / 2 (naive +~ (naive + neu~ (naive + neural ~ (ave +~ (ave + ne~
#> # ... with 241 more variables: ave_neural_lm <glue>, ave_naive <glue>,
#> #   ave_naive_lm <glue>, ave_naive_neural <glue>, ave_naive_neural_lm <glue>,
#> #   croston_lm <glue>, croston_neural <glue>, croston_neural_lm <glue>,
#> #   croston_naive <glue>, croston_naive_lm <glue>, croston_naive_neural <glue>,
#> #   croston_naive_neural_lm <glue>, croston_ave <glue>, croston_ave_lm <glue>,
#> #   croston_ave_neural <glue>, croston_ave_neural_lm <glue>,
#> #   croston_ave_naive <glue>, croston_ave_naive_lm <glue>, ...

# We can combine our two tables and evaluate the generated code to produce 
# combination models as follows:
combinations <- fit %>% 
  bind_cols(rename_with(specs, ~paste0("spec_", .))) %>% 
  mutate(across(starts_with("spec"), ~eval(parse(text = .))))

# Compute the accuracy for 2 random combinations to demonstrate:
combinations %>% 
  select(sample(seq_len(ncol(.)), 2)) %>% 
  forecast(h = 7) %>% 
  accuracy(count_data, measures = list(
    point_accuracy_measures,
    interval_accuracy_measures,
    distribution_accuracy_measures
  ))
#> # A tibble: 2 x 13
#>   .model          .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1 winkler
#>   <chr>           <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>
#> 1 spec_ets_arima~ Test  -209. 1014.  771. -40.1  54.0 0.973 0.998 -0.327  30825.
#> 2 spec_ets_snaiv~ Test  -145.  983.  726. -34.5  48.9 0.917 0.967 -0.316  29052.
#> # ... with 2 more variables: percentile <dbl>, CRPS <dbl>
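If only a point-forecast measure such as MAE matters, a cheaper brute-force search can work directly on the point forecasts instead of building every combination model. The sketch below is base R with made-up numbers: `point_fc` is a hypothetical matrix with one column of point forecasts per model and `actual` holds the observed values. In the fable workflow, such a matrix could presumably be assembled from the `.mean` column of the single-model forecasts. It does not cover interval or distribution measures such as winkler or CRPS.

```r
# Hypothetical point forecasts over a 4-step horizon (toy numbers only)
point_fc <- cbind(
  ets   = c(10, 12, 11, 13),
  arima = c(11, 11, 12, 12),
  naive = c(9, 9, 9, 9),
  lm    = c(14, 15, 16, 17)
)
actual <- c(10, 11, 12, 12)

# Enumerate every subset that contains at least two models
models <- colnames(point_fc)
subsets <- unlist(
  lapply(2:length(models), function(k) combn(models, k, simplify = FALSE)),
  recursive = FALSE
)

# MAE of the equal-weight average of each subset's point forecasts
mae <- vapply(
  subsets,
  function(s) mean(abs(actual - rowMeans(point_fc[, s, drop = FALSE]))),
  numeric(1)
)
names(mae) <- vapply(subsets, paste, character(1), collapse = "_")

# Rank the combinations from best to worst and show the top few
head(sort(mae), 3)
```

Because averaging point forecasts is plain arithmetic, this runs quickly even for all 247 combinations of eight models; the winning subset can then be rebuilt as a proper combination model to obtain its forecast distribution.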

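This post on "r - Time series forecasting with Fable in R; determining the best combination of models for a mixed model" is based on a similar question we found on Stack Overflow: https://stackoverflow.com/questions/70183054/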
