
r - Feature selection in caret rfe + SVM with ROC

Reposted. Author: 行者123. Updated: 2023-12-04 10:32:56

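I have been trying to apply recursive feature selection using the caret package. What I need is for rfe to use the AUC as the performance measure. After a month of googling, I still cannot get the process to work. Here is the code I used: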

library(caret)
library(doMC)
registerDoMC(cores = 4)

data(mdrr)

subsets <- c(1:10)

ctrl <- rfeControl(functions=caretFuncs,
method = "cv",
repeats =5, number = 10,
returnResamp="final", verbose = TRUE)

trainctrl <- trainControl(classProbs= TRUE)

caretFuncs$summary <- twoClassSummary

set.seed(326)

rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass, sizes=subsets,
rfeControl=ctrl,
method="svmRadial",
metric="ROC",
trControl=trainctrl)

When I run this script, I get the following results:
Recursive feature selection

Outer resampling method: Cross-Validation (10 fold)

Resampling performance over subset size:

Variables Accuracy Kappa AccuracySD KappaSD Selected
1 0.7501 0.4796 0.04324 0.09491
2 0.7671 0.5168 0.05274 0.11037
3 0.7671 0.5167 0.04294 0.09043
4 0.7728 0.5289 0.04439 0.09290
5 0.8012 0.5856 0.04144 0.08798
6 0.8049 0.5926 0.02871 0.06133
7 0.8049 0.5925 0.03458 0.07450
8 0.8124 0.6090 0.03444 0.07361
9 0.8181 0.6204 0.03135 0.06758 *
10 0.8069 0.5971 0.04234 0.09166
342 0.8106 0.6042 0.04701 0.10326

The top 5 variables (out of 9):
nC, X3v, Sp, X2v, X1v

The process always uses accuracy as the performance metric. Another problem arises when I try to predict from the resulting model using:
predictions <- predict(rf.profileROC.Radial$fit,mdrrDescr)

I get the following message:
In predictionFunction(method, modelFit, tempX, custom = models[[i]]$control$custom$prediction) :
kernlab class prediction calculations failed; returning NAs

As a result, it is not possible to get any predictions from the model.

Here is the information obtained from sessionInfo():
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8
[4] LC_COLLATE=es_ES.UTF-8 LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
[7] LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid parallel splines stats graphics grDevices utils datasets methods base

other attached packages:
[1] e1071_1.6-2 class_7.3-9 pROC_1.6.0.1 doMC_1.3.2 iterators_1.0.6 foreach_1.4.1
[7] caret_6.0-21 ggplot2_0.9.3.1 lattice_0.20-24 kernlab_0.9-19

loaded via a namespace (and not attached):
[1] car_2.0-19 codetools_0.2-8 colorspace_1.2-4 compiler_3.0.2 dichromat_2.0-0
[6] digest_0.6.4 gtable_0.1.2 labeling_0.2 MASS_7.3-29 munsell_0.4.2
[11] nnet_7.3-7 plyr_1.8 proto_0.3-10 RColorBrewer_1.0-5 Rcpp_0.10.6
[16] reshape2_1.2.2 scales_0.2.3 stringr_0.6.2 tools_3.0.2

Best answer

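One problem is a small typo (the argument should be 'trControl=' instead of 'trainControl='). Also, you change caretFuncs after you have already attached it to rfe's control function, so the change never reaches ctrl. Lastly, you need to tell trainControl to calculate the ROC curves.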
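The point about changing caretFuncs too late follows from R's copy semantics: a list passed to a function is captured by value, so later edits to the original list are not seen by the object that was built from it. Here is a minimal base-R sketch of that behaviour. The names `funcs` and `make_control` are hypothetical stand-ins for caretFuncs and rfeControl, not caret code, and the sketch assumes only that the control object stores the list it was given.

```r
# Hypothetical stand-ins: `funcs` plays the role of caretFuncs and
# make_control() plays the role of rfeControl(); neither is caret code.
funcs <- list(summary = function(x) "Accuracy")

make_control <- function(functions) {
  # The list is captured by value at call time.
  list(functions = functions)
}

# Wrong order: build the control object first, modify the list afterwards.
ctrl_late <- make_control(funcs)
funcs$summary <- function(x) "ROC"
late_result <- ctrl_late$functions$summary(NULL)

# Right order: the list is already modified when the control object is built.
ctrl_early <- make_control(funcs)
early_result <- ctrl_early$functions$summary(NULL)

print(c(late = late_result, early = early_result))
```

The control object built before the assignment still carries the old summary function, which is why the assignment to caretFuncs$summary has to come before the call to rfeControl.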

This code works:

caretFuncs$summary <- twoClassSummary

ctrl <- rfeControl(functions = caretFuncs,
                   method = "cv",
                   repeats = 5, number = 10,
                   returnResamp = "final", verbose = TRUE)

trainctrl <- trainControl(classProbs = TRUE,
                          summaryFunction = twoClassSummary)

rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass,
                            sizes = subsets,
                            rfeControl = ctrl,
                            method = "svmRadial",
                            ## I also added this line to
                            ## avoid a warning:
                            metric = "ROC",
                            trControl = trainctrl)


> rf.profileROC.Radial

Recursive feature selection

Outer resampling method: Cross-Validated (10 fold)

Resampling performance over subset size:

Variables ROC Sens Spec ROCSD SensSD SpecSD Selected
1 0.7805 0.8356 0.6304 0.08139 0.10347 0.10093
2 0.8340 0.8491 0.6609 0.06955 0.10564 0.09787
3 0.8412 0.8491 0.6565 0.07222 0.10564 0.09039
4 0.8465 0.8491 0.6609 0.06581 0.09584 0.10207
5 0.8502 0.8624 0.6652 0.05844 0.08536 0.09404
6 0.8684 0.8923 0.7043 0.06222 0.06893 0.09999
7 0.8642 0.8691 0.6913 0.05655 0.10837 0.06626
8 0.8697 0.8823 0.7043 0.05411 0.08276 0.07333
9 0.8792 0.8753 0.7348 0.05414 0.08933 0.07232 *
10 0.8622 0.8826 0.6696 0.07457 0.08810 0.16550
342 0.8650 0.8926 0.6870 0.07392 0.08140 0.17367

The top 5 variables (out of 9):
nC, X3v, Sp, X2v, X1v
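The ROC column in the output above is the area under the ROC curve (AUC) that twoClassSummary reports. To illustrate what that number measures, here is a small base-R sketch of the rank-based (Mann-Whitney) AUC estimate. The function name `auc_rank` and the example scores are made up for illustration; this is not how caret computes the value internally.

```r
# Rank-based (Mann-Whitney) estimate of the area under the ROC curve.
# `scores` are predicted probabilities for the positive class;
# `is_positive` marks which observations truly belong to it.
auc_rank <- function(scores, is_positive) {
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  r <- rank(scores)  # average ranks, so ties count as half a win
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up example data, not taken from mdrr.
scores      <- c(0.9, 0.8, 0.4, 0.7, 0.3, 0.2)
is_positive <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

auc_value <- auc_rank(scores, is_positive)
print(auc_value)
```

The result is the fraction of positive/negative pairs in which the positive case receives the higher score, so 0.5 corresponds to random ranking and 1 to perfect separation.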

For the prediction problem, you should use rf.profileROC.Radial itself instead of its fit component:
 > predict(rf.profileROC.Radial, head(mdrrDescr))
pred Active Inactive
1 Inactive 0.4392768 0.5607232
2 Active 0.6553482 0.3446518
3 Active 0.6387261 0.3612739
4 Inactive 0.3060582 0.6939418
5 Active 0.6661557 0.3338443
6 Active 0.7513180 0.2486820

Max

Regarding "r - Feature selection in caret rfe + SVM with ROC", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21088825/
