
r - Nested branching (and dependencies) in mlr3

Reposted. Author: 行者123. Updated: 2023-12-03 23:01:52

I am trying to use information_gain and mrmr feature filtering, each on its own and also in combination (the union of the features selected by both). I have tried to create a reprex below.

library("mlr3verse")
task <- tsk('sonar')


filters = list(
  "nop" = po("nop"),
  "information_gain" = po("filter", flt("information_gain")),
  "mrmr" = po("filter", flt("mrmr")),
  "ig_mrmr" = po("branch", c("ig2", "mrmr2"), id = "ig_mrmr") %>>%
    gunion(list("ig2" = po("filter", flt("information_gain")),
                "mrmr2" = po("filter", flt("mrmr")))) %>>%
    po("featureunion", id = "union_igmrmr"))

pipe =
  po("branch", names(filters), id = "branch1") %>>%
  gunion(unname(filters)) %>>%
  po("unbranch", names(filters), id = "unbranch1") %>>%
  po(lrn('classif.rpart'))

pipe$plot()
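[Figure: plot of the pipeline]
So far this looks fine; the plot shows that I am trying to combine the features selected by ig and mrmr.
Next I set up the parameters, which I believe is correct: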
ps <- ParamSet$new(list(
  ParamDbl$new("classif.rpart.cp", lower = 0, upper = 0.05),
  ParamInt$new("information_gain.filter.nfeat", lower = 20L, upper = 60L),
  ParamFct$new("information_gain.type", levels = c("infogain", "symuncert")),
  ParamInt$new("ig2.filter.nfeat", lower = 20L, upper = 60L),
  ParamFct$new("ig2.type", levels = c("infogain", "symuncert")),
  ParamInt$new("mrmr.filter.nfeat", lower = 20L, upper = 60L),
  ParamInt$new("mrmr2.filter.nfeat", lower = 20L, upper = 60L),
  ParamFct$new("branch1.selection", levels = names(filters)),
  ParamFct$new("ig_mrmr.selection", levels = c("ig2", "mrmr2"))
))
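The dependencies are where I am struggling. I can make the "nested" parameters depend on either the outer branch or the inner branch, but I am not sure how to make them depend on both. In the example below they are set on the outer branch.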
ps$add_dep("information_gain.filter.nfeat", "branch1.selection", CondEqual$new("information_gain"))
ps$add_dep("information_gain.type", "branch1.selection", CondEqual$new("information_gain"))
ps$add_dep("mrmr.filter.nfeat", "branch1.selection", CondEqual$new("mrmr"))
ps$add_dep("ig2.filter.nfeat", "branch1.selection", CondEqual$new("ig_mrmr"))
ps$add_dep("ig2.type", "branch1.selection", CondEqual$new("ig_mrmr"))
ps$add_dep("mrmr2.filter.nfeat", "branch1.selection", CondEqual$new("ig_mrmr"))

ps

glrn <- GraphLearner$new(pipe)

glrn$predict_type <- "prob"

cv5 <- rsmp("cv", folds = 5)

task$col_roles$stratum <- task$target_names

instance <- TuningInstanceSingleCrit$new(
  task = task,
  learner = glrn,
  resampling = cv5,
  measure = msr("classif.auc"),
  search_space = ps,
  terminator = trm("evals", n_evals = 5)
)

tuner <- tnr("random_search")
tuner$optimize(instance)
Note that I do not get any error until I call the tuner's optimize method.
Error message:
Error in self$assert(xs) : 
Assertion on 'xs' failed: Parameter 'ig2.filter.nfeat' not available. Did you mean 'branch1.selection' / 'information_gain.filter.nfeat' / 'information_gain.filter.frac'?.

Best answer

From your description, it seems you do not actually want a branch for c("ig2", "mrmr2"):

po("branch", c("ig2", "mrmr2"), id = "ig_mrmr") %>>%
gunion(list("ig2" = po("filter", flt("information_gain")),
"mrmr2" = po("filter", flt("mrmr")))) %>>%
po("featureunion", id = "union_igmrmr")
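because you intend to combine the output of the two filters. In other words, you want both of them applied within the same resampling iteration, whereas a branch runs only one of its paths. Replacing the inner branch with po("copy", 2) sends the task to both filters: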
library("mlr3verse")
task <- tsk('sonar')
filters = list(
  "nop" = po("nop"),
  "information_gain" = po("filter", flt("information_gain")),
  "mrmr" = po("filter", flt("mrmr")),
  "ig_mrmr" = po("copy", 2) %>>%
    gunion(list("ig2" = po("filter", flt("information_gain")),
                "mrmr2" = po("filter", flt("mrmr")))) %>>%
    po("featureunion", id = "union_igmrmr"))

pipe = po("branch", names(filters), id = "branch1") %>>%
  gunion(unname(filters)) %>>%
  po("unbranch", names(filters), id = "unbranch1") %>>%
  po(lrn('classif.rpart'))

pipe$plot()
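To see what the copy / featureunion part does on its own, here is a minimal sketch that trains only that sub-graph on the sonar task. The filter.nfeat value of 5 is an arbitrary choice for illustration (the answer leaves it to the tuner), and the sketch assumes the packages behind the two filters (FSelectorRcpp and praznik) are installed.

```r
library(mlr3verse)

task <- tsk("sonar")

# Both filters receive the same input task; their selections are merged afterwards.
combined <- po("copy", 2) %>>%
  gunion(list(
    "ig2"   = po("filter", flt("information_gain"), filter.nfeat = 5),
    "mrmr2" = po("filter", flt("mrmr"), filter.nfeat = 5)
  )) %>>%
  po("featureunion", id = "union_igmrmr")

# Training a Graph returns a list of outputs; the first one is the merged task.
out <- combined$train(task)[[1]]

# A feature chosen by both filters should appear only once in the union,
# so the count is expected to lie between 5 and 10.
n_selected <- length(out$feature_names)
n_selected
```

With a branch in place of the copy, only one of the two filters would run per evaluation, so there would be nothing to unite.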
The easiest way to see which parameters can be tuned is:
pipe$param_set
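The output shows that some of the parameters you specified were not given by their full names. Because the filters sit in a named list inside gunion(), their ids are prefixed, so the parameter is ig2.information_gain.filter.nfeat rather than ig2.filter.nfeat. For example: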
15: ig2.information_gain.filter.nfeat   ParamInt 0    Inf                               <NoDefault[3]>
16: ig2.information_gain.filter.frac    ParamDbl 0    1                                 <NoDefault[3]>
17: ig2.information_gain.filter.cutoff  ParamDbl -Inf Inf                               <NoDefault[3]>
18: ig2.information_gain.type           ParamFct NA   NA   infogain,gainratio,symuncert infogain
19: ig2.information_gain.equal          ParamLgl NA   NA   TRUE,FALSE                   FALSE
20: ig2.information_gain.discIntegers   ParamLgl NA   NA   TRUE,FALSE                   TRUE
21: ig2.information_gain.threads        ParamInt 0    Inf                               1
22: ig2.information_gain.affect_columns ParamUty NA   NA                                <Selector[1]>
23: mrmr2.mrmr.filter.nfeat             ParamInt 0    Inf                               <NoDefault[3]>
24: mrmr2.mrmr.filter.frac              ParamDbl 0    1                                 <NoDefault[3]>
25: mrmr2.mrmr.filter.cutoff            ParamDbl -Inf Inf                               <NoDefault[3]>
26: mrmr2.mrmr.threads                  ParamInt 0    Inf                               0
27: mrmr2.mrmr.affect_columns           ParamUty NA   NA                                <Selector[1]>
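Since the full listing is long, it can help to search the parameter ids directly. The following is a small sketch on just the two prefixed filters; the object name g and the search pattern are illustrative choices, not part of the original answer.

```r
library(mlr3verse)

# The same named list as in the answer: the names become id prefixes.
g <- gunion(list(
  "ig2"   = po("filter", flt("information_gain")),
  "mrmr2" = po("filter", flt("mrmr"))
))

# All tunable parameter ids of the graph, as a character vector.
ids <- g$param_set$ids()

# Keep only the ids that end in "filter.nfeat".
nfeat_ids <- grep("filter\\.nfeat$", ids, value = TRUE)
nfeat_ids
```

The ids found this way are the ones that have to be used verbatim in the search space.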
Let's specify the parameters with their correct names:
ps = ParamSet$new(list(
  ParamDbl$new("classif.rpart.cp", lower = 0, upper = 0.05),
  ParamInt$new("information_gain.filter.nfeat", lower = 20L, upper = 60L),
  ParamFct$new("information_gain.type", levels = c("infogain", "symuncert")),
  ParamInt$new("ig2.information_gain.filter.nfeat", lower = 20L, upper = 60L),
  ParamFct$new("ig2.information_gain.type", levels = c("infogain", "symuncert")),
  ParamInt$new("mrmr.filter.nfeat", lower = 20L, upper = 60L),
  ParamInt$new("mrmr2.mrmr.filter.nfeat", lower = 20L, upper = 60L),
  ParamFct$new("branch1.selection", levels = names(filters))
))

ps$add_dep("information_gain.filter.nfeat", "branch1.selection", CondEqual$new("information_gain"))
ps$add_dep("information_gain.type", "branch1.selection", CondEqual$new("information_gain"))
ps$add_dep("mrmr.filter.nfeat", "branch1.selection", CondEqual$new("mrmr"))
ps$add_dep("ig2.information_gain.filter.nfeat", "branch1.selection", CondEqual$new("ig_mrmr"))
ps$add_dep("ig2.information_gain.type", "branch1.selection", CondEqual$new("ig_mrmr"))
ps$add_dep("mrmr2.mrmr.filter.nfeat", "branch1.selection", CondEqual$new("ig_mrmr"))
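A name mismatch like the one in the question only surfaces once the tuner runs, but it can be caught earlier by comparing the ids of the search space with those of the graph. The sketch below is an assumption-laden illustration: it uses the ps() / p_dbl() / p_int() shorthand from a recent paradox release instead of the ParamSet$new() style above, and the names graph, search_space and unknown are hypothetical.

```r
library(mlr3verse)
library(paradox)

graph <- po("copy", 2) %>>%
  gunion(list(
    "ig2"   = po("filter", flt("information_gain")),
    "mrmr2" = po("filter", flt("mrmr"))
  )) %>>%
  po("featureunion", id = "union_igmrmr") %>>%
  po("learner", lrn("classif.rpart"))

# One id is deliberately wrong: it lacks the "information_gain" part.
search_space <- ps(
  classif.rpart.cp        = p_dbl(lower = 0, upper = 0.05),
  ig2.filter.nfeat        = p_int(lower = 20, upper = 60),
  mrmr2.mrmr.filter.nfeat = p_int(lower = 20, upper = 60)
)

# Ids in the search space that the graph does not know about.
unknown <- setdiff(search_space$ids(), graph$param_set$ids())
unknown
```

An empty result means every id in the search space exists in the graph; anything listed needs to be renamed before tuning.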
Now everything runs without problems:
glrn <- GraphLearner$new(pipe) 

glrn$predict_type <- "prob"

cv5 <- rsmp("cv", folds = 5)

task$col_roles$stratum <- task$target_names

instance <- TuningInstanceSingleCrit$new(
  task = task,
  learner = glrn,
  resampling = cv5,
  measure = msr("classif.auc"),
  search_space = ps,
  terminator = trm("evals", n_evals = 5)
)

tuner <- tnr("random_search")
tuner$optimize(instance)

instance$result
classif.rpart.cp information_gain.filter.nfeat information_gain.type ig2.information_gain.filter.nfeat ig2.information_gain.type mrmr.filter.nfeat mrmr2.mrmr.filter.nfeat branch1.selection
1: 0.01956043 NA <NA> 44 symuncert NA 34 ig_mrmr
learner_param_vals x_domain classif.auc
1: <list[6]> <list[5]> 0.7187196
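The NA entries in this result are the dependencies at work: a parameter whose condition on branch1.selection is not met is left unset. Here is a minimal sketch of that behaviour on a reduced, hypothetical search space with only two branch levels, again assuming the ps() shorthand with its depends argument from a recent paradox release.

```r
library(paradox)

# nfeat is only active when the "mrmr" branch is selected.
search_space <- ps(
  branch1.selection = p_fct(levels = c("nop", "mrmr")),
  mrmr.filter.nfeat = p_int(lower = 20, upper = 60,
                            depends = branch1.selection == "mrmr")
)

# Draw random configurations, as random search would.
design <- generate_design_random(search_space, n = 20)$data

# nfeat should be NA exactly in the rows where another branch was drawn.
consistent <- all(
  is.na(design$mrmr.filter.nfeat) == (design$branch1.selection != "mrmr")
)
consistent
```

This is why the result row for ig_mrmr shows values for the ig2 and mrmr2 parameters and NA for the others.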
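This gallery post will be useful:
https://mlr3gallery.mlr-org.com/posts/2020-04-23-pipelines-selectors-branches/
as will the other posts in the gallery:
https://mlr3gallery.mlr-org.com/
If you find some aspect of mlr3 hard to understand and cannot find a suitable gallery post or book example, you should request one.
Book link: https://mlr3book.mlr-org.com/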

Regarding r - Nested branching (and dependencies) in mlr3, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65259899/
