
r - Benchmarking multiple AutoTuning instances

Reposted. Author: 行者123. Updated: 2023-12-04 07:54:25

I have been trying to do some hyperparameter tuning of xgboost with mlr3. I want to compare three different models:

  • xgboost with only the alpha hyperparameter tuned
  • xgboost with the alpha and lambda hyperparameters tuned
  • xgboost with the alpha, lambda, and max_depth hyperparameters tuned

After reading the mlr3 book, I thought that nested resampling and benchmarking with AutoTuner would be the best way to do this. Here is what I tried:
    task_mpcr <- TaskRegr$new(id = "mpcr", backend = data.numeric, target = "n_reads")

    measure <- msr("poisson_loss")

    xgb_learn <- lrn("regr.xgboost")

    set.seed(103)
    fivefold.cv = rsmp("cv", folds = 5)

    param.list <- list(
      alpha = p_dbl(lower = 0.001, upper = 100, logscale = TRUE),
      lambda = p_dbl(lower = 0.001, upper = 100, logscale = TRUE),
      max_depth = p_int(lower = 2, upper = 10)
    )

    model.list <- list()
    for (model.i in 1:length(param.list)) {
      param.list.subset <- param.list[1:model.i]
      search_space <- do.call(ps, param.list.subset)

      model.list[[model.i]] <- AutoTuner$new(
        learner = xgb_learn,
        resampling = fivefold.cv,
        measure = measure,
        search_space = search_space,
        terminator = trm("none"),
        tuner = tnr("grid_search", resolution = 10),
        store_tuning_instance = TRUE
      )
    }

    grid <- benchmark_grid(
      task = task_mpcr,
      learner = model.list,
      resampling = rsmp("cv", folds = 3)
    )

    bmr <- benchmark(grid, store_models = TRUE)
Note that I added poisson loss as the measure because I am working with count data.
For some reason, after running the benchmark function, the poisson loss of all my models is nearly identical in every fold, which makes me think that no tuning took place.
I also cannot find a way to access the hyperparameters that achieved the lowest loss in each train/test iteration.
Am I misusing the benchmark function entirely?
Also, this is my first question on SO, so any formatting advice would be appreciated!

Best Answer

To see whether tuning has an effect, you can simply add the untuned learner to the benchmark. Otherwise, the conclusion might just be that tuning alpha alone is sufficient for your example.
I modified the code so that it runs with an example task.

    library(mlr3verse)

    task <- tsk("mtcars")

    measure <- msr("regr.rmse")

    xgb_learn <- lrn("regr.xgboost")

    param.list <- list(
      alpha = p_dbl(lower = 0.001, upper = 100, logscale = TRUE),
      lambda = p_dbl(lower = 0.001, upper = 100, logscale = TRUE)
    )

    model.list <- list()
    for (model.i in 1:length(param.list)) {
      param.list.subset <- param.list[1:model.i]
      search_space <- do.call(ps, param.list.subset)

      at <- AutoTuner$new(
        learner = xgb_learn,
        resampling = rsmp("cv", folds = 5),
        measure = measure,
        search_space = search_space,
        terminator = trm("none"),
        tuner = tnr("grid_search", resolution = 5),
        store_tuning_instance = TRUE
      )
      at$id <- paste0(at$id, model.i)

      model.list[[model.i]] <- at
    }

    model.list <- c(model.list, list(xgb_learn)) # add baseline learner

    grid <- benchmark_grid(
      task = task,
      learner = model.list,
      resampling = rsmp("cv", folds = 3)
    )

    bmr <- benchmark(grid, store_models = TRUE)

    autoplot(bmr)
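To compare the learners numerically rather than visually, the benchmark result can also be aggregated per learner over the outer folds; a minimal sketch using the standard `BenchmarkResult$aggregate()` method:

```r
library(mlr3)

# One row per learner, with the measure averaged over the
# 3 outer resampling folds; higher-variance learners stand
# out here if tuning made no real difference.
aggr <- bmr$aggregate(msr("regr.rmse"))
print(aggr[, c("learner_id", "regr.rmse")])
```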

    bmr_data <- bmr$data$as_data_table() # convert benchmark result to a handy data.table
    bmr_data$learner[[1]]$learner$param_set$values # the final learner used by AutoTuner is nested in $learner

    # best found value during grid search
    bmr_data$learner[[1]]$archive$best()

    # transformed value (the one that is used for the learner)
    bmr_data$learner[[1]]$archive$best()$x_domain
The last lines show how to access the individual runs of the benchmark. In my example there are 9 runs, resulting from 3 learners and 3 outer resampling folds.
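To collect the winning hyperparameters of every inner tuning run at once, instead of indexing each run by hand, mlr3tuning also provides a convenience function; a sketch, assuming a reasonably current mlr3tuning version:

```r
library(mlr3tuning)

# One row per AutoTuner and outer resampling iteration,
# containing the best hyperparameter configuration found
# during that inner grid search and its inner performance.
inner_results <- extract_inner_tuning_results(bmr)
print(inner_results)
```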

Regarding "r - Benchmarking multiple AutoTuning instances", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66774423/
